[HN Gopher] An Empirical Study of Mamba-Based Language Models
       ___________________________________________________________________
        
       An Empirical Study of Mamba-Based Language Models
        
       Author : panabee
       Score  : 32 points
       Date   : 2024-06-13 17:57 UTC (5 hours ago)
        
       | downvotetruth wrote:
       | Have to redo this after
       | https://github.com/state-spaces/mamba/issues/175 is completed.
        
       | jiggawatts wrote:
       | What's the largest Mamba model that has been trained so far?
       | 
       | Seems like it scales better than transformers, but this would
       | only be really obvious at parameter counts far in excess of the
       | experiments in this paper.
        
       ___________________________________________________________________
       (page generated 2024-06-13 23:00 UTC)