[HN Gopher] Log-Linear Attention
       ___________________________________________________________________
        
       Log-Linear Attention
        
       Author : sva_
       Score  : 18 points
       Date   : 2025-06-07 16:01 UTC (7 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | iknownothow wrote:
       | > Log-linear attention replaces the fixed-size hidden state with
       | a logarithmically growing set of hidden states
       | 
       | Does this mean the models can be smaller too (on top of the
       | primary benefit of being faster)?
        
          | Lerc wrote:
          | Reduced memory consumption for context, perhaps, but the
          | hidden state is different from the weights. I don't think this
          | would improve the model's capability per parameter (though, as
          | with everything in ML, I wouldn't bet against it until it's
          | been tested).
        
        | btilly wrote:
        | I think it would be very good if they can make this work. I
        | suspect that we do something not entirely unlike this, and that
        | is why spaced repetition is so good for stuffing things into our
        | long-term memories.
        
       ___________________________________________________________________
       (page generated 2025-06-07 23:01 UTC)