[HN Gopher] M1: Towards Scalable Test-Time Compute with Mamba Re...
       ___________________________________________________________________
        
       M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
        
       Author : dpstart01
       Score  : 26 points
       Date   : 2025-04-15 17:00 UTC (6 hours ago)
        
 (HTM) web link (arxiv.org)
        
       | ed wrote:
       | Interesting direction for research, but not a model you'd want to
       | use today. The paper looks at a 3B model built on Llama-3.2-3B,
       | modified for Mamba, and they're comparing to a distilled version
       | of R1 with 1.5B params.
        
       | solomatov wrote:
       | Does anyone know if there have been any attempts to test Mamba at
       | a really large scale? To me this model looks like the most
       | promising successor to the transformer architecture. Does anyone
       | know why that might not be the case, or what the other
       | alternatives are?
        
       ___________________________________________________________________
       (page generated 2025-04-15 23:01 UTC)