[HN Gopher] Chinchilla's Wild Implications
___________________________________________________________________
Chinchilla's Wild Implications
Author : ctoth
Score : 30 points
Date : 2022-08-02 17:14 UTC (5 hours ago)
(HTM) web link (www.lesswrong.com)
(TXT) w3m dump (www.lesswrong.com)
| gfodor wrote:
 | My understanding is that all of these models were trained for a
 | single epoch. Not knowing how the loss in these LMs changes on a
 | second pass over the data seems like a huge blind spot. What's
 | the best example to point to that could help us understand how
 | likely a second epoch is to move the needle comparably to adding
 | more unique tokens?
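One hedged way to make this question concrete is to plug numbers into
the Chinchilla parametric loss fit from Hoffmann et al. 2022,
L(N, D) = E + A/N^a + B/D^b (fitted constants E=1.69, A=406.4,
B=410.7, a=0.34, b=0.28, with N = parameters and D = unique training
tokens), and treat a second epoch as worth a purely hypothetical
fraction r of fresh tokens. The fit was estimated on single-epoch
runs, so r here is an assumption, not a published number. A minimal
Python sketch:

    # Chinchilla parametric loss fit (Hoffmann et al. 2022):
    #   L(N, D) = E + A / N**a + B / D**b
    # N = parameters, D = unique training tokens, each seen once.
    E, A, B, a, b = 1.69, 406.4, 410.7, 0.34, 0.28

    def loss(n_params: float, eff_tokens: float) -> float:
        return E + A / n_params**a + B / eff_tokens**b

    # Hypothetical: a second pass over D tokens counts as r*D fresh
    # tokens (r is speculation; the fit covers single-epoch runs).
    N, D = 70e9, 1.4e12  # Chinchilla's model size and token budget
    fresh = loss(N, 2 * D)  # doubling *unique* data instead
    for r in (0.0, 0.5, 1.0):
        print(f"r={r}: two-epoch loss ~ {loss(N, D * (1 + r)):.3f}"
              f" vs fresh-data loss ~ {fresh:.3f}")

Under this toy model, a second epoch at r=1.0 matches doubling the
data (~1.91 predicted loss vs ~1.94 at one epoch), and at r=0.0 it
does nothing; the open empirical question is where real LMs fall in
that range.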
| visarga wrote:
 | TL;DR: in large language models, what matters most is the data,
 | not the model size. And data is finite.
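A quick numeric sketch of this TL;DR under the same published fit:
hold the token budget at Chinchilla's 1.4T and grow the model
tenfold, and the predicted loss barely moves, because the data term
B/D**b is the binding one.

    # Same Chinchilla fit as above; here we vary model size N with
    # the token budget D held fixed at Chinchilla's 1.4T tokens.
    E, A, B, a, b = 1.69, 406.4, 410.7, 0.34, 0.28

    def loss(n_params: float, n_tokens: float) -> float:
        return E + A / n_params**a + B / n_tokens**b

    D = 1.4e12
    for N in (70e9, 700e9):
        print(f"N={N:.0e} params: predicted loss ~ {loss(N, D):.3f}")
    # 10x the parameters: ~1.937 -> ~1.891. The data term
    # B / D**b ~ 0.163 is untouched; more unique tokens, not more
    # parameters, is what pushes the loss down from here.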
___________________________________________________________________
(page generated 2022-08-02 23:01 UTC)