[HN Gopher] Just Ask for Generalization (2021)
___________________________________________________________________
Just Ask for Generalization (2021)
Author : jxmorris12
Score : 31 points
Date : 2025-07-03 20:31 UTC (2 days ago)
(HTM) web link (evjang.com)
| xg15 wrote:
| (2021), but still very interesting. The "post-overfitting"
| training strategy is especially unexpected.
| luckystarr wrote:
| I vaguely remember that this was observed when training GPT-3
| (probably?) as well. Training just went on and on, and the error
| went up and then down again, like a phase transition in the model.
| esafak wrote:
| The low sample efficiency of RL is well explained.
___________________________________________________________________
(page generated 2025-07-05 23:01 UTC)