[HN Gopher] Log-Linear Attention
___________________________________________________________________
Log-Linear Attention
Author : sva_
Score : 18 points
Date : 2025-06-07 16:01 UTC (7 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| iknownothow wrote:
| > Log-linear attention replaces the fixed-size hidden state with
| a logarithmically growing set of hidden states
|
| Does this mean the models can be smaller too (on top of the
| primary benefit of being faster)?
| Lerc wrote:
| Reduced memory consumption for context, perhaps, but the
| hidden state is separate from the weights. I don't think this
| would improve the model's capability per parameter (though, as
| with everything in ML, I wouldn't bet against it until it's
| been tested).
| btilly wrote:
| I think it would be very good if they can make this work. I
| suspect that we do something not entirely unlike this, and that
| is why spaced repetition is so good for stuffing things into our
| long-term memories.
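
A minimal sketch of the "logarithmically growing set of hidden
states" idea quoted above, assuming a Fenwick-tree-style bucketing
of the prefix into power-of-two chunks. The bucketing, the per-level
scalar weights, and all names here are illustrative assumptions, not
the paper's exact formulation: each bucket carries a linear-attention
state S = sum_i outer(k_i, v_i), and each step touches at most
O(log T) of them.

    import numpy as np

    def log_linear_attention(q, k, v, level_weights):
        """Causal attention with O(log T) hidden states per step.

        q, k, v: (T, d) arrays. level_weights: per-level mixing
        scalars (an illustrative stand-in for learned gating).
        Buckets follow the binary (Fenwick-style) decomposition of
        the prefix length, so at most floor(log2(t)) + 1 states
        exist after step t.
        """
        T, d = q.shape
        states = []  # (bucket_size, S) pairs, sizes strictly decreasing
        out = np.zeros_like(v)
        for t in range(T):
            # Write: insert token t as a size-1 bucket, then merge
            # equal-sized buckets (binary carry), which keeps the
            # number of live states logarithmic in t.
            size, S = 1, np.outer(k[t], v[t])
            while states and states[-1][0] == size:
                prev_size, prev_S = states.pop()
                size, S = prev_size + size, prev_S + S
            states.append((size, S))
            # Read: the query mixes all current bucket states with
            # per-level weights, costing O(log t) matrix-vector
            # products instead of one fixed-size state read.
            for lvl, (_, S_lvl) in enumerate(states):
                w = level_weights[min(lvl, len(level_weights) - 1)]
                out[t] += w * (q[t] @ S_lvl)
        return out

    T, d = 16, 8
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(T, d)) for _ in range(3))
    y = log_linear_attention(q, k, v, level_weights=np.ones(5))

The point of the carry-merge is that total state memory grows as
O(log T) rather than O(T) (full attention) or O(1) (pure linear
attention), which is the middle ground the quoted sentence describes.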
___________________________________________________________________
(page generated 2025-06-07 23:01 UTC)