[HN Gopher] An Empirical Study of Mamba-Based Language Models
___________________________________________________________________
An Empirical Study of Mamba-Based Language Models
Author : panabee
Score : 32 points
Date : 2024-06-13 17:57 UTC (5 hours ago)
(HTM) web link (arxiv.org)
| downvotetruth wrote:
| Have to redo this after
| https://github.com/state-spaces/mamba/issues/175 is completed.
| jiggawatts wrote:
| What's the largest Mamba model that has been trained so far?
|
| It seems to scale better than transformers, but that would only be
| really obvious at parameter counts far in excess of those used in
| the experiments in this paper.
___________________________________________________________________
(page generated 2024-06-13 23:00 UTC)