hngopher.com

       ___________________________________________________________________
        
       Atlas: Learning to Optimally Memorize the Context at Test Time
        
       Author : og_kalu
       Score  : 26 points
       Date   : 2025-05-31 14:13 UTC (8 hours ago)
        
 (HTM) web link (arxiv.org)
        
       | arjvik wrote:
       | The Titans papers and the Test-Time Training papers
       | (https://arxiv.org/abs/2407.04620) both have the same premise:
       | models should "learn" from their context rather than memorize
       | it. Very promising direction!
        
       | cgearhart wrote:
       | It seems like there's been a lot of progress here, but there's
       | also an elephant in the room: RNNs will _always_ have worse
       | memory than self-attention, since the latter always has complete
       | access to the full context. We pay for that in other ways, but
       | the unstated hypothesis behind RNNs seems to be that, in the
       | long run, they will be "good enough" and their other performance
       | benefits will eventually prevail. I'm not convinced that
       | humanity will ever sink resources into optimizing this family of
       | models comparable to those that have gone into making
       | Transformers practical at the scale they run today.
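
The trade-off described in the comment above can be made concrete with a rough count of what each approach has to store. This is only a sketch under simplifying assumptions: the function names and the layer, head, and state sizes are hypothetical, not taken from the Atlas paper or any particular model. Self-attention keeps keys and values for every past token, so its memory grows with context length; a recurrent model compresses the context into a fixed-size state.

```python
def kv_cache_floats(num_tokens, num_layers, num_heads, head_dim):
    """Numbers stored by a self-attention KV cache (assumed layout).

    Each layer keeps one key and one value vector per head for every
    token seen so far, so the total grows linearly with num_tokens.
    """
    return 2 * num_layers * num_tokens * num_heads * head_dim


def recurrent_state_floats(num_tokens, num_layers, state_dim):
    """Numbers stored by a recurrent model (assumed layout).

    The state has a fixed size per layer; num_tokens is accepted only
    to make clear that it does not affect the result.
    """
    return num_layers * state_dim


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only for illustration.
    layers, heads, head_dim, state_dim = 24, 16, 64, 16 * 64 * 64
    for tokens in (1_000, 10_000, 100_000):
        kv = kv_cache_floats(tokens, layers, heads, head_dim)
        rnn = recurrent_state_floats(tokens, layers, state_dim)
        print(f"{tokens:>7} tokens: KV cache {kv:>13,} floats, "
              f"recurrent state {rnn:>11,} floats")
```

The count says nothing about quality. It only shows why attention can look up any past token exactly, at a cost that keeps growing, while a fixed-size state must decide what to keep.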
        
       ___________________________________________________________________
       (page generated 2025-05-31 23:00 UTC)