[HN Gopher] On the Difficulty of Extrapolation with NN Scaling
___________________________________________________________________
On the Difficulty of Extrapolation with NN Scaling
Author : ericjang
Score : 18 points
Date : 2022-01-25 17:56 UTC (1 days ago)
(HTM) web link (lukemetz.com)
(TXT) w3m dump (lukemetz.com)
| nerdponx wrote:
| It seems like fancy hyperparameter optimization techniques (e.g.
| Bayesian black-box optimization) probably don't help here either,
| because they don't solve the problem of extrapolating _outside_
| the range of hyperparameter values that have already been tried.
| Is that a valid conclusion?
| cgearhart wrote:
| I think in theory those techniques should still work in the
| sense that they give you the best prediction (for some
| definition of "best") of the next point to test given all the
| previous information, but the more hyperparameters you can vary
| and the further you extrapolate from observations, the more
| likely it is that something surprising happens. You should not
| expect a fancy tuning algorithm to anticipate surprises;
| they're designed to do the opposite by exploiting predictable
| trends.
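
A minimal sketch of the point made in this exchange, assuming a zero-mean Gaussian-process surrogate with a squared-exponential kernel (a common choice in Bayesian optimization, though the thread does not name one). The tried values, length scale, and function names are hypothetical. It compares the surrogate's posterior standard deviation inside the range of tried hyperparameter values with the value far outside it, where the model reverts to its prior and has little to say.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar hyperparameter values."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def posterior_std(x_obs, x_new, length_scale=1.0, noise=1e-6):
    """Posterior standard deviation of the GP surrogate at x_new.

    For a GP this depends only on where points were observed,
    not on the objective values measured there.
    """
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(x_obs)]
         for i, a in enumerate(x_obs)]
    k_star = [rbf(a, x_new, length_scale) for a in x_obs]
    alpha = solve(K, k_star)  # K^{-1} k_star
    var = rbf(x_new, x_new, length_scale) - sum(
        k * a for k, a in zip(k_star, alpha))
    return math.sqrt(max(var, 0.0))


# Hypothetical hyperparameter values already tried (e.g. on a log scale).
tried = [0.0, 1.0, 2.0, 3.0, 4.0]

inside = posterior_std(tried, 2.5)   # between tried values
outside = posterior_std(tried, 8.0)  # far beyond the tried range

print("std inside tried range: ", inside)
print("std outside tried range:", outside)
```

Under these assumptions the uncertainty outside the tried range approaches the prior standard deviation of 1, so the surrogate can still propose a next point there, but its prediction carries almost no information from the earlier trials.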
___________________________________________________________________
(page generated 2022-01-26 23:01 UTC)