[HN Gopher] Prediction Games
       ___________________________________________________________________
        
       Prediction Games
        
       Author : todsacerdoti
       Score  : 58 points
       Date   : 2025-02-05 15:40 UTC (7 hours ago)
        
 (HTM) web link (www.argmin.net)
 (TXT) w3m dump (www.argmin.net)
        
       | pitt1980 wrote:
       | How many $1 million prizes were given out?
        
       | optimalsolver wrote:
       | >This is a bitter lesson about the interplay between techlash
       | activism and big tech power structures. Twenty years of privacy
       | complaints have only made tech companies more powerful.
       | 
       | So we should have done what, exactly, Ben?
        
       | derbOac wrote:
        | I have a hard time keeping up with the literature on this and
        | it's not exactly my area of research, but the "overfitting is
        | ok" claim has always seemed off and handwavy to me. For one
        | thing, it seems to run against some pretty basic
        | information-theoretic results.
       | 
        | I guess parameters need to be "counted" differently, or
        | there's something misunderstood about what a parameter is, or
        | about whether and how it's being adjusted for somewhere. Some
        | of the gradient descent literature I've read makes it seem
        | like there are sometimes adjustments for parameters as part of
        | the optimization process, so saying "overfitting doesn't mean
        | anything" is misleading.
       | 
        | It just seems like an area with a lot of imprecision in
        | critically important terms, no definitive explanations for
        | anything, and so forth.
       | 
        | The results are the results, but then again we have
        | hallucinations and weird adversarial probe glitches suggestive
        | of overfitting (see also, e.g.,
        | http://proceedings.mlr.press/v119/rice20a). I might even
        | suggest that the definition of overfitting in a DL context has
        | been poorly operationalized. Sure, you can have a training set
        | and a test set, but if the test set isn't sufficiently
        | differentiated from the training set, are you going to detect
        | overfitting? Even with a traditional statistical model, if I
        | define the test set a certain way, I can minimize the apparent
        | overfitting.
       | 
        | I guess I just feel like a lot of overfitting discussions are
        | handwavy or misleading, and I wish they were different. The
        | number of parameters has never really been the correct metric
        | for overfitting; it just happens to align nicely with the
        | correct metric in conventional models.
        
         | sdwr wrote:
         | How does "overfitting is ok" violate information theory?
         | 
         | How are hallucinations suggestive of overfitting?
         | 
         | Overfitting is a tactical term, not a strategic one, and is
         | heavily coupled to the specific implementation.
        
           | AlotOfReading wrote:
            | I _suspect_ they're trying to relate the pigeonhole
           | principle to overparameterization, but those pieces don't
           | really fit together into a coherent argument for me.
        
           | genewitch wrote:
           | What is the difference between a strategy and a tactic?
        
         | freeone3000 wrote:
         | The definition of overfitting is handwavy. It's a failure to
         | generalize outside of the observed data. The current batch of
         | LLMs is trained on essentially all of the internet; what would
         | something outside of the observed data even look like? What
         | does it mean there?
         | 
          | Conversely, if a printing press controller "overfits" to
          | the printing press it's installed on, that's actually pretty
          | desirable!
         | 
         | So what are you actually trying to prevent when you want to
         | prevent "overfitting", and why?
        
           | esafak wrote:
           | All of the Internet does not include everything you can
           | extrapolate from it. When I ask it to help with my code or
           | writing, I am not asking it to reproduce anything.
        
       | kombine wrote:
        | I recently read another great post by the same author, about
        | the connection between constrained optimisation and the
        | backpropagation algorithm:
        | https://archives.argmin.net/2016/05/18/mates-of-costate/
        | Apparently it's based on an older paper by LeCun.
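For readers who haven't clicked through, here is a sketch of that construction as I recall it (notation mine, not necessarily the post's): treat the forward pass as equality constraints, and the backprop deltas emerge as the Lagrange multipliers (the "costates" of optimal control).

```latex
% Minimize the final-layer loss subject to the forward pass:
%   min_w  \ell(x_N)   s.t.   x_{l+1} = f_l(x_l, w_l),   l = 0, ..., N-1
\mathcal{L} = \ell(x_N)
  + \sum_{l=0}^{N-1} \lambda_{l+1}^\top \bigl( f_l(x_l, w_l) - x_{l+1} \bigr)
% Stationarity in each x_l gives the backward recursion:
\lambda_N = \nabla \ell(x_N), \qquad
\lambda_l = \Bigl( \tfrac{\partial f_l}{\partial x_l} \Bigr)^{\!\top} \lambda_{l+1}
% and the weight gradients fall out as
\nabla_{w_l} \mathcal{L} = \Bigl( \tfrac{\partial f_l}{\partial w_l} \Bigr)^{\!\top} \lambda_{l+1}.
```

The multipliers \lambda_l are exactly the deltas the backward pass propagates, which is the adjoint/costate connection the post's title alludes to.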
        
       | jfkrrorj wrote:
       | > Netflix launched an open competition ... in-house
       | recommendation system by 10%.
       | 
        | It worked great for them. Netflix's current masterpiece has 13
        | Oscar nominations! Every AI company should learn from and
        | apply this lesson!
        
         | hobs wrote:
          | They pretty quickly and publicly abandoned that algorithm.
          | They may have recreated it since (the core stuff is pretty
          | reproducible, as the blog states), but yeah, it's
          | interesting that the competition gets brought up without
          | mentioning that they abandoned the winning solution.
        
       ___________________________________________________________________
       (page generated 2025-02-05 23:01 UTC)