[HN Gopher] On the Difficulty of Extrapolation with NN Scaling
       ___________________________________________________________________
        
       On the Difficulty of Extrapolation with NN Scaling
        
       Author : ericjang
       Score  : 18 points
       Date   : 2022-01-25 17:56 UTC (1 days ago)
        
 (HTM) web link (lukemetz.com)
 (TXT) w3m dump (lukemetz.com)
        
       | nerdponx wrote:
       | It seems like fancy hyperparameter optimization techniques (e.g.
       | Bayesian black-box optimization) probably don't help here either,
       | because they don't solve the problem of extrapolating _outside_
       | the range of hyperparameter values that have already been tried.
       | Is that a valid conclusion?
        
         | cgearhart wrote:
         | I think in theory those techniques should still work in the
         | sense that they give you the best prediction (for some
         | definition of "best") of the next point to test given all the
         | previous information, but the more hyperparameters you can vary
         | and the further you extrapolate from observations the more
         | likely it is that something surprising happens. You should not
         | expect a fancy tuning algorithm to anticipate surprises--
         | they're designed to do the opposite by exploiting predictable
         | trends.
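
A minimal sketch of the point made in the two comments above, under loudly stated assumptions: `true_loss` is a made-up objective (quadratic in log10 of a learning rate, with an invented divergence penalty above 0.1), and `fit_quadratic` is a simple three-point interpolant standing in for the surrogate model that a Bayesian optimizer would use. Neither comes from the linked article or from any particular tuning library. The surrogate is fit only on learning rates that were "tried", then queried inside and outside that range.

```python
import math

def true_loss(lr):
    # Hypothetical objective: quadratic in log10(lr), plus an assumed
    # divergence penalty once lr exceeds 0.1 (a regime never sampled below).
    x = math.log10(lr)
    return (x + 2.0) ** 2 + (50.0 if lr > 0.1 else 0.0)

def fit_quadratic(xs, ys):
    # Lagrange interpolation through exactly three points; returns a predictor.
    # This is a stand-in surrogate, not a Gaussian process.
    def predict(x):
        total = 0.0
        for i in range(3):
            term = ys[i]
            for j in range(3):
                if j != i:
                    term *= (x - xs[j]) / (xs[i] - xs[j])
            total += term
        return total
    return predict

# Hyperparameter values that have already been tried.
tried_lrs = [1e-4, 1e-3, 1e-2]
xs = [math.log10(lr) for lr in tried_lrs]
ys = [true_loss(lr) for lr in tried_lrs]
surrogate = fit_quadratic(xs, ys)

# One query between tried values, one far beyond them.
inside_lr, outside_lr = 3e-3, 1.0
inside_error = abs(surrogate(math.log10(inside_lr)) - true_loss(inside_lr))
outside_error = abs(surrogate(math.log10(outside_lr)) - true_loss(outside_lr))
print(f"error inside tried range:  {inside_error:.3f}")
print(f"error outside tried range: {outside_error:.3f}")
```

With these invented numbers the surrogate is essentially exact between the tried values and misses the divergence entirely outside them, because it simply continues the trend it was fit on. A real surrogate would also report higher uncertainty far from the data, but it still has no way to know where a regime change sits.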
        
       ___________________________________________________________________
       (page generated 2022-01-26 23:01 UTC)