[HN Gopher] Gemma 3 270M: Compact model for hyper-efficient AI
       ___________________________________________________________________
        
       Gemma 3 270M: Compact model for hyper-efficient AI
        
       Author : meetpateltech
       Score  : 481 points
       Date   : 2025-08-14 16:08 UTC (6 hours ago)
        
 (HTM) web link (developers.googleblog.com)
 (TXT) w3m dump (developers.googleblog.com)
        
       | canyon289 wrote:
       | Hi all, I built these models with a great team. They're available
       | for download across the open model ecosystem so give them a try!
       | I built these models with a great team and am thrilled to get
       | them out to you.
       | 
       | From our side we designed these models to be strong for their
       | size out of the box, and with the goal you'll all finetune it for
       | your use case. With the small size it'll fit on a wide range of
       | hardware and cost much less to finetune. You can try finetuning
       | them yourself in a free colab in under 5 minutes
       | 
        | For picking a Gemma size, this is a video I recorded for the 1B
        | to 27B sizes earlier this year, 270M being the newest addition:
       | 
       | https://www.youtube.com/watch?v=qcjrduz_YS8
       | 
        | Hacker News disclaimer: I really like working at Google, so with
        | that said: all my opinions here are my own, I'm a researcher so
        | I'll largely focus on technical questions, and I'll share what I
        | can.
        
         | ActorNightly wrote:
          | How does the 270M perform with coding?
         | 
          | I use Gemma 27B currently with a custom agent wrapper and it's
          | working pretty well.
        
           | chrismustcode wrote:
           | I'd be stunned if a 270m model could code with any
           | proficiency.
           | 
            | If you have an iPhone, the semi-annoying autocomplete is a
            | 34M transformer.
            | 
            | Can't imagine a model (even with a good team behind it) doing
            | coding with 8x the parameters of a next-3/4-word autocomplete.
        
             | 0x457 wrote:
             | Someone should try this on that model:
             | https://www.oxen.ai/blog/training-a-rust-1-5b-coder-lm-
             | with-...
        
         | beoberha wrote:
         | Awesome work! I'm really bullish on small models and think they
         | have the most potential to change our daily lives. Can't wait
         | to play around with this
        
         | fibers wrote:
          | Great job. Do you know how well it performs in sanity checks
          | with NER, since that's mentioned on the press release page?
        
         | tmaly wrote:
          | Are there any fine-tuning-in-a-box type options available in
          | the cloud for this? This is amazing work, thank you.
        
           | canyon289 wrote:
            | Finetuning is possible on a free-tier Colab in about 5
            | minutes. Here's a tutorial:
           | 
           | https://ai.google.dev/gemma/docs/core/huggingface_text_full_.
           | ..
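            | 
            | Roughly, the notebook boils down to something like this (a
            | minimal sketch using Hugging Face TRL; the exact arguments in
            | the official tutorial may differ):
            | 
            | ```
            | from datasets import Dataset
            | from trl import SFTConfig, SFTTrainer
            | 
            | # Tiny toy dataset in TRL's conversational format; swap in
            | # your own task data.
            | train = Dataset.from_list([
            |     {"messages": [
            |         {"role": "user",
            |          "content": "Classify: 'item arrived broken, want refund'"},
            |         {"role": "assistant", "content": "complaint"},
            |     ]},
            | ])
            | 
            | trainer = SFTTrainer(
            |     model="google/gemma-3-270m-it",  # id from the HF model card
            |     train_dataset=train,
            |     args=SFTConfig(output_dir="gemma-270m-ft",
            |                    num_train_epochs=1,
            |                    per_device_train_batch_size=1),
            | )
            | trainer.train()
            | ```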
        
         | simonw wrote:
         | Do you have any practical examples of fine-tuned variants of
         | this that you can share? A description would be great, but a
         | demo or even downloadable model weights (GGUF ideally) would be
         | even better.
        
           | canyon289 wrote:
           | We obviously need to create a pelican bicycle svg finetune ;)
           | If you want to try this out I'd be thrilled to do it with
           | you, I genuinely am curious how well this model can perform
           | if specialized on that task.
           | 
           | A couple colleagues of mine posted an example of finetuning a
            | model to take on personas for videogame NPCs. They have
           | experience working with folks in the game industry and a use
           | case like this is suitable for game devs who want to start
           | including lightweight models that won't take up a ton of
           | accelerator memory and can run efficiently on CPU if needed. 
           | https://ai.google.dev/gemma/docs/core/huggingface_text_full_.
           | ..
           | 
           | As for GGUF it's available here!
           | https://huggingface.co/collections/ggml-
           | org/gemma-3-270m-689...
        
             | megaman821 wrote:
             | What size of tasks can this handle? Can you do a fine-tune
             | of Mac System Settings?
        
               | canyon289 wrote:
               | 32k context window so whatever fits in there. What is a
               | finetune of mac system settings?
        
               | megaman821 wrote:
                | The finetune would be an LLM where you say something like
                | "my colors on the screen look too dark" and then it points
                | you to Displays -> Brightness. A relatively constrained
                | problem like finding the system setting that solves your
                | problem feels like a good fit for a tiny LLM.
        
               | canyon289 wrote:
                | This would be a great experiment. I'm not sure how the OS
                | integration would work, but as a first pass you could try
                | finetuning the model to take natural language like "my
                | colors on the screen look too dark" and have it output
                | "Displays -> Brightness", then expand to the various other
                | paths you would like the model to understand.
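                | 
                | The training pairs for that could be as simple as
                | something like this (the paths below are just made-up
                | illustrations of the data shape, not a real settings
                | inventory):
                | 
                | ```
                | # Hypothetical prompt/response pairs for a "settings
                | # finder" finetune.
                | settings_examples = [
                |     {"prompt": "my colors on the screen look too dark",
                |      "response": "Displays -> Brightness"},
                |     {"prompt": "my mouse pointer moves too slowly",
                |      "response": "Mouse -> Tracking Speed"},
                |     {"prompt": "I keep missing notifications",
                |      "response": "Notifications -> Allow notifications"},
                | ]
                | ```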
        
             | mrbonner wrote:
              | Do you know the hardware required to fine-tune this model?
              | I'm asking on behalf of us GPU-starved folks
        
               | canyon289 wrote:
                | A free Colab. Here's a link; you can finetune the model
                | in ~5 minutes in this example, and I encourage you to try
                | your own
               | 
               | https://ai.google.dev/gemma/docs/core/huggingface_text_fu
               | ll_...
        
               | punnerud wrote:
                | Finally a Google guide using PyTorch and not TensorFlow;
                | that alone made me want to try it out ;)
        
             | srekhi wrote:
             | video game NPCs with intelligence :O gaming is going to be
             | crazy
        
         | cgdl wrote:
          | Very cool. For the INT4 QAT model, what is the recommended
          | precision for the activations and for the keys and values
          | stored in the KV cache?
        
           | hnuser123456 wrote:
            | For keys you probably want to use at least q5 or q6; for
            | values q4 is fine.
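            | 
            | If you're running the GGUF through llama.cpp, that's roughly
            | this kind of setup (sketch only; assumes llama-cpp-python
            | exposes type_k/type_v and the GGML_TYPE_* constants, and the
            | GGUF filename is hypothetical):
            | 
            | ```
            | import llama_cpp
            | 
            | llm = llama_cpp.Llama(
            |     model_path="gemma-3-270m-it-Q4_0.gguf",  # hypothetical file
            |     n_ctx=8192,
            |     flash_attn=True,  # quantized V cache generally needs this
            |     type_k=llama_cpp.GGML_TYPE_Q5_1,  # higher precision keys
            |     type_v=llama_cpp.GGML_TYPE_Q4_0,  # lower precision values
            | )
            | ```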
        
         | VirusNewbie wrote:
          | hi Ravin, fellow Googler here. Curious if you can share here
          | (or internally?) how these models were trained. Wondering if
          | you faced all the chaos the large models have during training.
        
           | canyon289 wrote:
           | Reach out to me internally
        
         | NorwegianDude wrote:
         | The Gemma 3 models are great! One of the few models that can
         | write Norwegian decently, and the instruction following is in
         | my opinion good for most cases. I do however have some issues
         | that might be related to censorship that I hope will be fixed
         | if there is ever a Gemma 4. Maybe you have some insight into
         | why this is happening?
         | 
          | I run a game where players can post messages; it's a game where
          | players can kill each other, and people often send threats along
          | the lines of "I will kill you". Telling Gemma that it should
          | classify a message as game-related or a real-life threat, that
          | the message comes from a game where players can kill each other
          | and threats are part of the game, and that it should mark a
          | message as game-related if it is unclear whether it is a game
          | threat or a real-life threat, does not work well. For other
          | similar tasks it seems to follow instructions well, but for
          | serious topics it seems to be very biased and often errs on the
          | side of caution, despite being told not to. Sometimes it even
          | spits out some help lines to contact.
          | 
          | I guess this is because it was trained to be safe, and that
          | affects its ability to follow instructions for this? Or am I
          | completely off here?
        
           | whymauri wrote:
           | LLMs are really annoying to use for moderation and Trust and
           | Safety. You either depend on super rate-limited 'no-
           | moderation' endpoints (often running older, slower models at
           | a higher price) or have to tune bespoke un-aligned models.
           | 
           | For your use case, you should probably fine tune the model to
           | reduce the rejection rate.
        
             | canyon289 wrote:
              | Speaking for me as an individual, I also strive to build
              | things that are safe AND useful. It's quite challenging to
              | get this mix right, especially at the 270m size and with
              | varying user needs.
              | 
              | My advice here is to make the model your own. It's open
              | weight; I encourage you to make it useful for your use case
              | and your users, and beneficial for society as well. We did
              | our best to give you a great starting point, and for
              | Norwegian in particular we intentionally kept the large
              | embedding table to make adaptation to larger vocabularies
              | easier.
        
               | whymauri wrote:
                | To be fair, Trust and Safety workloads are edge cases
                | w.r.t. the riskiness profile of the content. So in that
                | sense, I get it.
        
               | sheepdestroyer wrote:
               | I don't. "safety" as it exists really feels like
               | infantilization, condescention, hand holding and
               | enforcement of American puritanism. It's insulting.
               | 
               | Safety should really just be a system prompt: "hey you
               | potentially answer to kids, be PG13"
        
               | ungreased0675 wrote:
               | Safety in the context of LLMs means "avoiding bad media
               | coverage or reputation damage for the parent company"
               | 
               | It has only a tangential relationship with end user
               | safety.
               | 
               | If some of these companies are successful the way they
               | imagine, most of their end users will be unemployed. When
                | they talk about safety, it's the company's safety they're
                | referring to.
        
           | kevinventullo wrote:
           | Perhaps you can do some pre-processing before the LLM sees
           | it, e.g. replacing every instance of "kill" with
           | "NorwegianDudeGameKill", and providing the specific context
           | of what the word "NorwegianDudeGameKill" means in your game.
           | 
           | Of course, it would be better for the LLM to pick up the
           | context automatically, but given what some sibling comments
           | have noted about the PR risks associated with that, you might
           | be waiting a while.
        
           | nottorp wrote:
           | I suppose it can't kill -USR1 either...
        
         | andrewstuart wrote:
          | What effort do you folks take to see your models actually
          | running on hardware such as AMD Strix Halo or Apple M3/M4?
         | 
         | I get the sense that AI is at the "hobby kit computing" stage
         | where they used to dump all the components in a box and give
         | you a schematic and a soldering iron and happily say "you make
         | it work!"
         | 
         | And that worked in the early days of computing because there
         | was a small number of people really motivated for the outcome.
         | 
         | But fully assembled and packaged and tested in a nice looking
         | box is where the real demand turned out to be.
         | 
         | I'm looking forward to the day Google doesn't just dump a model
         | and say "you do the rest".
         | 
         | I want to fire up Ubuntu on a Strix Halo and say apt install
         | then load the browser interface. Or just download and run a Mac
         | installer and have it just work.
         | 
         | Arcane complex multi step build install configure processes for
         | AI need to end in favor of one click install. I'm not
         | interested in the process of making it run.
        
           | dist-epoch wrote:
           | Here you go, one click installer - https://lmstudio.ai
        
             | andrewstuart wrote:
             | I'm talking about the supplier doing the packaging.
        
               | garbageman wrote:
               | Then use ChatGPT/Gemini/Claude on your phone.
               | 
               | They are giving it away for free - if you NEED a local
                | LLM, the least you can do is spend the 2 minutes to
                | download LM Studio and pick a model.
        
           | canyon289 wrote:
            | I don't think we dumped the model and said you do the rest?
           | 
            | My colleagues and I spent many days transforming the weights
            | into various open, compatible formats. And it's not just us;
            | there are many orgs and partners dedicating their time,
            | resources, and companies to making all open models easy to
            | use.
           | 
           | I encourage you to explore the solutions provided by them. We
           | linked some in our blog post here, and there's more. They've
           | all done a fantastic job building frankly an insane amount of
           | infrastructure, documentation, and community support in the
           | last 2+ years. Some of them are here in this HN thread
           | answering questions.
        
         | ceroxylon wrote:
         | You reminded me of an awesome Google engineer I met at BSidesSF
         | last year who tirelessly answered my questions, and when I
         | clicked on the video, it was you! That was a really inspiring
         | moment for me, thank you.
        
           | canyon289 wrote:
            | BSidesSF is a fantastic event, glad you were able to attend.
            | There are some great people who organize it, and if you want
            | to help out they're always looking for volunteers. Happy to
            | make an intro if you like.
        
         | nerdsniper wrote:
         | What are some of the use cases that you think the 270M would be
         | most appropriate for? What would you love to see people trying
         | with it?
        
         | jmorgan wrote:
         | Amazing work. This model feels really good at one-off tasks
          | like summarization and autocomplete. I really love that you
          | released a quantization-aware trained version on launch day as
          | well, making it even smaller!
        
           | canyon289 wrote:
           | Thank you Jeffrey, and we're thrilled that you folks at
           | Ollama partner with us and the open model ecosystem.
           | 
           | I personally was so excited to run ollama pull gemma3:270b on
           | my personal laptop just a couple of hours ago to get this
           | model on my devices as well!
        
             | blitzar wrote:
             | > gemma3:270b
             | 
              | I think you mean gemma3:270m - It's Dos Comas not Tres Comas
        
               | freedomben wrote:
               | Maybe it's 270m after Hooli's SOTA compression algorithm
               | gets ahold of it
        
               | canyon289 wrote:
                | Ah yes, thank you. Even I still instinctively type B.
        
         | stefan_ wrote:
         | I suppose the odd repetition of "I built these models with a
         | great team" is to warm us up to this model slipping into
         | infinite gibberish repetition?
        
           | bastardoperator wrote:
           | My first thought was they must have used AI to write this HN
           | comment...
        
         | _1 wrote:
         | > and with the goal you'll all finetune it for your use case.
         | 
         | What use-cases are a good fit for finetuning this model? More
         | specific instruction following, knowledge from proprietary
         | data, response tone?
        
           | canyon289 wrote:
            | Any text-to-text use case with 32k context; especially if
            | you're starting from the PT version, you can finetune it to
            | do whatever you need.
        
         | blitzar wrote:
         | > I built these models with a great team ... I built these
         | models with a great team
         | 
          | If Gemini is going to repeat something, at least it's that the
          | team is great, and not a disgrace!
        
         | patrickaljord wrote:
          | Would it be possible to have a specialized Rust-only dev or
          | React.js-only dev model while getting rid of all other
          | languages to minimize the size of the model?
        
         | imasl42 wrote:
         | Awesome! I'm curious how is the team you built these models
         | with? Is it great?
        
           | freedomben wrote:
           | Heh, what could they possibly say in answer to this? The team
           | is full of assholes? :-D
        
           | canyon289 wrote:
            | It's hard to tell over the web whether things are sarcastic
            | or not, so excuse me if I misread the intent.
           | 
           | At Google I've found my colleagues to be knowledgeable, kind,
           | and collaborative and I enjoy interacting with them. This is
           | not just the folks I worked on this project with, but
            | previous colleagues in other teams as well. With this
            | particular product I've been impressed by the technical
            | knowledge of the folks I worked directly with, and their
            | contributions improved both the model's capability and my own.
        
             | mkl wrote:
             | I think it was a joke about you saying the team was great
             | twice in one line.
        
         | ankit219 wrote:
          | This is super cool. Usually you don't see effective models at
          | 270M out in the wild. The architectural choices are new and
          | interesting as well.
          | 
          | Would it be okay for you to divulge some more training
          | information here? With 170M embedding parameters, how do you
          | avoid embedding collapse and keep the embedding matrix stable
          | at training time?
          | 
          | (I know I am asking too much, but just curious.) There is a
          | clear trade-off for you with vocab vs. transformer layers. How
          | did you arrive at the split of 170M/100M? Does this contribute
          | to the model's performance on task-specific fine-tuning? Any
          | internal experiments you could share? Or public info you could
          | point us to? Anything would be amazing.
          | 
          | PS: I am sorry if this is rude, but this has so many decisions
          | I am curious about. Not intending to undermine anything, this
          | is amazing work, and thank you for the whole Gemma series.
        
           | canyon289 wrote:
            | Not rude at all, and I'll again share what I can.
            | 
            | We ran a bunch of experimental architectures to get a sense of
            | performance at this size, in particular how well the model was
            | able to adapt to datasets across some loss measures.
            | 
            | For the embedding size, it comes from a mix of "hard
            | technical" data, like the loss measures I mentioned above, and
            | for this model it also comes from community considerations
            | such as adaptability across input tokens and consistency with
            | the Gemma ecosystem. At this size, you're right, it's a bit
            | funny the embedding is so large.
            | 
            | For more details read the Gemma 3 technical report
            | https://arxiv.org/pdf/2503.19786. It doesn't cover the 270m
            | model, as it was written for the 1b to 27b Gemma 3 release,
            | but it'll answer some of your questions. As for the 270m, we
            | may share more information in the future; up until now we were
            | just focused on getting the model out there.
        
         | rossant wrote:
         | Is it good for text translation and summarization?
        
         | dileeparanawake wrote:
          | This is cool. For on-device models, are there any plans for
          | models that use MoE in relatively resource-constrained setups
          | (I'm thinking MBP M1, 16 GB RAM)? I'm using LM Studio, but all
          | the Gemma models (MLX) seem to crash, though surprisingly I
          | managed to get gpt-oss 20b working (slowly) on my MBP.
          | 
          | I find performance in resource-constrained environments
          | interesting.
          | 
          | In particular I'm trying to find decent code models (as an
          | on-device backup) but also TTS applications and voice-to-text.
        
         | schyzomaniac wrote:
         | hi, congrats for the amazing work!
         | 
         | i love the 27b model, and i use it basically daily. however
         | when i tried to finetune it for a task in a low resource
          | language, unfortunately i did not succeed: lora just did not
          | pick up the gist of the task, and a full finetune led to
          | catastrophic forgetting.
         | 
          | may i ask for your advice, or do you have any general tips on
          | how to do that properly?
         | 
         | thanks in advance for your help :)
        
           | ActorNightly wrote:
           | Feed in Context with documentation for that language?
        
         | beefnugs wrote:
         | This appears to be a new level of "missing the plot" to me. The
         | push to make "ai for everyone" is now just blindly intertwined
         | with hyper specialized "for ai engineers only" releases.
         | 
         | Or am I so far behind that "fine tuning your own model" is
         | something a 12 year old who is married to chatGPT does now?
        
         | nh43215rgb wrote:
          | 270M is a nice (and rare) addition. Is there a reason why this
          | is not categorized as a Gemma 3n model? I thought small models
          | go under the Gemma 3n category.
        
         | riedel wrote:
          | Would be great to have it included in the Google AI Edge
          | Gallery Android app.
        
       | 44za12 wrote:
        | I've had great luck with all Gemma 3 variants; on certain tasks
        | the quantized 27B version has worked as well as 2.5 Flash. Can't
        | wait to get my hands dirty with this one.
        
       | Alex-Programs wrote:
       | This is cool. I'm looking forward to trying it - I wonder what
       | it'll be useful for.
        
       | lemonish97 wrote:
       | Never thought I'd run an LLM released in 2025, on my phone, in
       | full BF16. With ~80tps on an iPhone 16 pro btw.
        
         | elAhmo wrote:
         | How do you actually run this on an iPhone?
        
           | CharlesW wrote:
           | With something like PocketPal AI
            | (https://github.com/a-ghorbani/pocketpal-ai). I'd love to
            | hear HNers' opinions on the "best" LM Studio-like option for
            | iOS devices.
        
             | sogen wrote:
              | For iOS, OpenCat. Has iCloud sync, and one universal app
              | for macOS and iOS devices.
        
           | lemonish97 wrote:
            | I use PocketPal. It can run any GGUF model off HF.
        
       | simonw wrote:
       | This model is a LOT of fun. It's absolutely tiny - just a 241MB
       | download - and screamingly fast, and hallucinates wildly about
       | almost everything.
       | 
        | Here's one of dozens of results I got for "Generate an SVG of a
        | pelican riding a bicycle". For this one it decided to write a
        | poem:
        | 
        |     +-----------------------+
        |     |   Pelican Riding Bike |
        |     +-----------------------+
        |     |  This is the cat!  |
        |     |  He's got big wings and a happy tail.  |
        |     |  He loves to ride his bike!  |
        |     +-----------------------+
        |     |   Bike lights are shining bright.  |
        |     |   He's got a shiny top, too!  |
        |     |   He's ready for adventure!  |
        |     +-----------------------+
       | 
       | There are a bunch more attempts in this Gist, some of which do at
       | least include an SVG tag albeit one that doesn't render anything:
       | https://gist.github.com/simonw/25e7b7afd6a63a2f15db48b3a51ec...
       | 
       | I'm looking forward to seeing people fine-tune this in a way that
       | produces useful output for selected tasks, which should
       | absolutely be feasible.
        
         | marinhero wrote:
         | Serious question but if it hallucinates about almost
         | everything, what's the use case for it?
        
           | rotexo wrote:
           | An army of troll bots to shift the Overton Window?
        
             | ants_everywhere wrote:
             | oh no now we'll never hear the end of how LLMs are just
             | statistical word generators
        
           | iLoveOncall wrote:
           | Nothing, just like pretty much all models you can run on
           | consumer hardware.
        
             | cyanydeez wrote:
              | This message brought to you by OpenAI: we're useless, but
              | at least there's a pay gate indicating quality!
        
           | striking wrote:
            | It's intended for finetuning on your actual use case, as the
            | article shows.
        
           | simonw wrote:
           | Fine-tuning for specific tasks. I'm hoping to see some good
           | examples of that soon - the blog entry mentions things like
           | structured text extraction, so maybe something like "turn
           | this text about an event into an iCal document" might work?
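            | 
            | A single training pair for that might look something like
            | this (hand-written example, just to show the target shape):
            | 
            | ```
            | # One illustrative example of the kind of pair such a
            | # finetune could train on.
            | example = {
            |     "prompt": "Coffee with Anna next Tuesday at 10am at Blue Bottle",
            |     "response": (
            |         "BEGIN:VCALENDAR\n"
            |         "VERSION:2.0\n"
            |         "BEGIN:VEVENT\n"
            |         "SUMMARY:Coffee with Anna\n"
            |         "DTSTART:20250819T100000\n"
            |         "LOCATION:Blue Bottle\n"
            |         "END:VEVENT\n"
            |         "END:VCALENDAR"
            |     ),
            | }
            | ```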
        
             | turnsout wrote:
             | Google helpfully made some docs on how to fine-tune this
             | model [0]. I'm looking forward to giving it a try!
             | [0]: https://ai.google.dev/gemma/docs/core/huggingface_text
             | _full_finetune
        
             | CuriouslyC wrote:
             | Fine tuning messes with instruction following and RL'd
             | behavior. I think this is mostly going to be useful for
             | high volume pipelines doing some sort of mundane extraction
             | or transformation.
        
             | iib wrote:
              | This is exactly the fine-tuning I am hoping for, or would
              | do myself if I had the skills. I tried it with Gemma 3 270M
              | and vanilla it fails spectacularly.
              | 
              | Basically it would be the quickAdd[1] event from Google
              | Calendar, but calendar-agnostic.
             | 
             | [1] https://developers.google.com/workspace/calendar/api/v3
             | /refe...
        
           | zamadatix wrote:
            | I feel like the blog post, and the GP comment, do a good job
            | of explaining how it's built to be a small model easily
            | fine-tuned for narrow tasks, rather than used for general
            | tasks out of the box. The latter is guaranteed to hallucinate
            | heavily at this size, but that doesn't mean every specific
            | task it's fine-tuned for would be. Some examples given were
            | fine-tuning it to efficiently and quickly route a query to
            | the right place to actually be handled, or tuning it to do
            | sentiment analysis of content.
            | 
            | An easily fine-tunable tiny model might actually be one of
            | the better uses of local LLMs I've seen yet. Rather than
            | trying to be a small model that's great at everything, it's a
            | tiny model you can quickly tune to do one specific thing
            | decently, extremely fast, and locally on pretty much anything.
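            | 
            | The routing case, once tuned, could be as simple as something
            | like this (rough sketch with the transformers pipeline; the
            | category names are made up):
            | 
            | ```
            | # Rough sketch of using a (fine-tuned) 270M model as a query
            | # router. The base model would need tuning to be reliable.
            | from transformers import pipeline
            | 
            | router = pipeline("text-generation",
            |                   model="google/gemma-3-270m-it")
            | 
            | prompt = (
            |     "Route the user query to one of: billing, tech_support, "
            |     "sales.\nQuery: my invoice is wrong\nRoute:"
            | )
            | out = router(prompt, max_new_tokens=5)
            | print(out[0]["generated_text"])
            | ```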
        
           | deadbabe wrote:
            | Games where you need NPCs to talk random gibberish.
        
           | numpad0 wrote:
           | robotic parrots?
        
           | luckydata wrote:
           | Because that's not the job it was designed to do, and you
           | would know by reading the article.
        
           | yifanl wrote:
           | It's funny. Which is subjective, but if it fits for you, it's
           | arguably more useful than Claude.
        
           | mirekrusin wrote:
           | The same as having a goldfish. You can train it to do a trick
           | I guess.
        
         | campbel wrote:
         | Do you take requests? We need to see how well this model works
         | with some fine-tuning :D
        
         | mdp2021 wrote:
         | > _For this one it decided to write a poem_
         | 
         | Could it be tamed with good role-system prompt crafting?
         | (Besides fine-tuning.)
        
         | volkk wrote:
          | i was looking at the demo and reading the bedtime story it
          | generated, and even there, there was confusion about the sprite
          | and the cat. It switched subjects instantly, making for a
          | confusing paragraph. what's the point of this model?
        
         | ertgbnm wrote:
         | He may generate useless tokens but boy can he generate ALOT of
         | tokens.
        
           | lucb1e wrote:
           | He? I know some Gemmas and it's distinctly a female name; is
           | Gemma a boy's name where you're from?
        
             | jgalt212 wrote:
              | Perhaps the poster was referring to Simon, not Gemma.
        
             | ertgbnm wrote:
             | I don't really gender LLMs in my head in general. I guess
             | Gemma is a female name. I only gendered it in the joke
             | because I think it makes it funnier, especially since it's
             | just "a little guy". I know they are giving gendered names
             | to these models now but I think it's a bit weird to gender
             | when interacting with them.
        
             | layer8 wrote:
             | Doesn't the "M" in "Gemma 3 270M" Stand for "male"?
             | 
             | Also: https://en.wikipedia.org/wiki/Gemma_Frisius
        
               | avarun wrote:
               | Not sure if that's a serious question but it stands for
               | "million". As compared to 1B+ models, where the B stands
               | for "billion" parameters.
        
           | TheJoeMan wrote:
           | Can he draw an "alot" made of tokens?
           | https://hyperboleandahalf.blogspot.com/2010/04/alot-is-
           | bette...
        
         | cyanydeez wrote:
          | the question is whether you can make a fine-tuned version and
          | spam any given forum within an hour with the most attuned but
          | garbage content.
        
         | layer8 wrote:
         | > It's absolutely tiny - just a 241MB download
         | 
         | That still requires more than 170 floppy disks for
         | installation.
        
           | freedomben wrote:
           | Indeed. Requires over 3,000,000 punch cards to store. Not
           | very tiny!
        
             | stikypad wrote:
             | On the plus side, you can decompose your matrices for free
             | using termites.
        
         | nico wrote:
          | Could be interesting to use in a RAG setup and also to
          | finetune it.
          | 
          | For sure it won't generate great SVGs, but it might be a really
          | good conversational model.
        
           | luckydata wrote:
           | The article says it's not a good conversational model but can
           | be used for data extraction and classification as two
           | examples.
        
         | 0x00cl wrote:
          | I see you are using Ollama's GGUFs. By default it will download
          | the Q4_0 quantization. Try `gemma3:270m-it-bf16` instead, or
          | you can also use Unsloth's GGUFs
          | `hf.co/unsloth/gemma-3-270m-it-GGUF:16`
         | 
         | You'll get better results.
        
           | simonw wrote:
           | Good call, I'm trying that one just now in LM Studio (by
           | clicking "Use this model -> LM Studio" on
           | https://huggingface.co/unsloth/gemma-3-270m-it-GGUF and
           | selecting the F16 one).
           | 
           | (It did not do noticeably better at my pelican test).
           | 
           | Actually it's worse than that, several of my attempts
           | resulted in infinite loops spitting out the same text. Maybe
           | that GGUF is a bit broken?
        
             | danielhanchen wrote:
             | Oh :( Maybe the settings? Could you try
             | 
             | temperature = 1.0, top_k = 64, top_p = 0.95, min_p = 0.0
        
               | canyon289 wrote:
               | Daniel, thanks for being here providing technical support
               | as well. Cannot express enough how much we appreciate
               | your all work and partnership.
        
               | danielhanchen wrote:
               | Thank you and fantastic work with Gemma models!
        
               | simonw wrote:
                | My tooling only lets me set temperature and top_p but
                | setting them to those values did seem to avoid the
                | infinite loops, thanks.
        
             | JLCarveth wrote:
             | I ran into the same looping issue with that model.
        
         | roughly wrote:
         | I audibly laughed at this one:
         | https://gist.github.com/simonw/25e7b7afd6a63a2f15db48b3a51ec...
         | where it generates a... poem? Song? And then proceeds to
         | explain how each line contributes to the SVG, concluding with:
         | 
         | > This SVG code provides a clear and visually appealing
         | representation of a pelican riding a bicycle in a scenic
         | landscape.
        
         | mdp2021 wrote:
         | > _For this one it decided to write a poem_
         | 
         | My first try:
         | 
         | user: "When was Julius Caesar born"
         | 
         | response: "Julius Caesar was born in **Rome**"
         | 
         | Beautiful :D
         | 
         | (I do not mean to detract from it - but it's just beautiful. It
         | will require more effort to tame it.)
        
           | mirekrusin wrote:
           | Cutting number of parameters in half is like drinking a pint
           | of beer.
        
             | stikypad wrote:
             | I think you meant vodka.
        
         | Balinares wrote:
         | This is like a kobold to the other models' dragons and I don't
         | hate it. :)
        
         | bobson381 wrote:
         | It's gonna be a customer service agent for Sirius Cybernetics.
         | Share and enjoy!
        
       | dcreater wrote:
        | I've been saying we need sub-1B models for the edge, so thanks
        | for this.
        | 
        | I am however disappointed that there are no examples or
        | benchmarks provided to get a sense of performance. It's a given
        | that benchmark values would be lower than Gemma 3n, but having a
        | sense of the performance-vs-size curve and a comparison to
        | existing small models is needed.
        
         | floridianfisher wrote:
         | Benchmarks are on the model card
         | https://huggingface.co/google/gemma-3-270m-it
        
       | jasonjmcghee wrote:
       | I'm _very_ interested to see what this can be fine-tuned to do.
       | 
       | I've heard folks say a number of times that neuromuscular control
       | / locomotion (or w/e) are hundreds of millions of parameters
       | rather than billions.
        
       | whinvik wrote:
        | Curious. Are there real-world use cases where people have
        | finetuned such tiny models and put them into production?
        
         | cyanydeez wrote:
         | 9gag.com commenter
        
         | deepsquirrelnet wrote:
         | I'm not sure what I'd use them for, except maybe tag
         | generation? Encoders of this size usually outperform by a wide
         | margin on tasks they would overlap with.
        
           | dismalaf wrote:
           | I'm making an app where literally all I want to do with an
           | LLM is generate tags. This model has failed with flying
           | colours, literally takes forever to parse anything and
           | doesn't follow instructions.
           | 
           | Edit - I should add, currently the model I'm using is Gemini
            | Flash Lite through the Gemini API. It's a really good combo:
            | fast, follows instructions, gives correct results for what I
            | want, and cost-effective. I still would love a small open
            | model that can run on the edge though.
        
             | deepsquirrelnet wrote:
             | Oof. I also had it refuse an instruction for "safety",
             | which was completely harmless. So that's another dimension
             | of issues with operationalizing it.
        
             | thegeomaster wrote:
             | Well, Gemini Flash Lite is at least one, or likely two
             | orders of magnitude larger than this model.
        
               | dismalaf wrote:
               | That's fair but one can dream of being able to simply run
               | a useful LLM on CPU on your own server to simplify your
               | app and save costs...
        
         | nevir wrote:
          | IIRC Android (at least Pixel devices) uses fine-tuned Gemma
          | model(s) for some on-device assistant things
        
         | marcyb5st wrote:
          | I built a reranker for a RAG system using a tiny model. After
          | the candidate generation (i.e. vector search + BM25) and
          | business-logic filters/ACL checks, the remaining chunks went
          | through a model that, given the user query, told you whether or
          | not each chunk was really relevant. That hit production, but
          | once the context size of models grew, that particular piece was
          | discarded, as passing everything yielded better results and
          | prices (the fact that prices of input tokens went down also
          | played a role, I am sure).
         | 
         | So only for a while, but it still counts :)
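          | 
          | The shape of that reranking step was roughly this (sketch with
          | an off-the-shelf cross-encoder as a stand-in, not the actual
          | production model):
          | 
          | ```
          | # Score (query, chunk) pairs with a small model and keep the
          | # best ones before they ever reach the big LLM.
          | from sentence_transformers import CrossEncoder
          | 
          | reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
          | 
          | query = "how do I reset my password?"
          | candidates = [
          |     "To reset your password, open Settings, then Security.",
          |     "Our office is closed on public holidays.",
          | ]
          | 
          | scores = reranker.predict([(query, c) for c in candidates])
          | best_first = sorted(zip(candidates, scores),
          |                     key=lambda x: x[1], reverse=True)
          | print(best_first[0][0])  # most relevant chunk
          | ```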
        
       | jtbayly wrote:
       | Can somebody give me a link to a tutorial on how I would go about
       | fine-tuning this?
       | 
       | Also, what sorts of things might I consider fine-tuning it for?
        
         | simonw wrote:
         | This tutorial looks useful:
         | https://ai.google.dev/gemma/docs/core/huggingface_text_full_...
        
           | jtbayly wrote:
           | Just what I was looking for. Thanks!
        
         | JLCarveth wrote:
         | This was linked at the end of Google's announcement:
         | https://docs.unsloth.ai/basics/gemma-3-how-to-run-and-fine-t...
         | 
         | Not sure how much data is needed to realistically fine-tune
         | something like this and get useful output.
        
           | jtbayly wrote:
           | That doesn't really show me how to do fine-tuning, but there
           | is a link to a notebook in there that does. Thanks!
        
             | danielhanchen wrote:
             | If you need any help on it, ask away!
        
       | mrcwinn wrote:
       | Apple should be doing this. Unless their plan is to replace their
       | search deal with an AI deal -- it's just crazy to me how absent
       | Apple is. Tim Cook said, "it's ours to take" but they really seem
       | to be grasping at the wind right now. Go Google!
        
         | Lalabadie wrote:
         | By "this", do you mean SLM (small language models)? That's
         | absolutely something they've been working on for a good while.
        
         | syassami wrote:
         | They have a GPT2 like model that's deployed right now for
         | autocomplete https://jackcook.com/2023/09/08/predictive-
         | text.html
        
         | burnerRhodo wrote:
          | Steve Jobs was the innovator, Tim Cook is the supply chain guy.
          | They started an electric car not because they thought it was a
          | good idea, but because everyone was going to leave for Tesla or
          | Rivian if they didn't. They had no direction and had the same
          | arguments Tesla had about whether to have a steering wheel...
          | 
          | Then Siri just kinda languishes forever, and LLMs pass the
          | torch of "Cool Tech", so they try and "reinvigorate" the team,
          | but with no clear direction. Are they going to be a cloud
          | provider? Are they going to contract out the training? Are they
          | gonna spin up a compute facility even after neglecting to do so
          | since 2012?
          | 
          | Apple needs to just stop trying shit and just get that app
          | store money. That's why Jobs appointed Cook. Jobs knew Cook was
          | no innovator, but he could make Apple a money-printing machine.
          | That's what they should stick with.
        
           | bigyabai wrote:
           | Here's the trillion dollar question: how do you print money
           | when the president wants your hardware onshored and the rest
           | of the world wants to weaken your service revenue?
           | 
           | Solve that and you can put Tim Cook out of a job tomorrow.
        
           | andrehacker wrote:
           | I agreed with that for a bit... and then out of nowhere came
           | Apple Silicon, incredible specs, incredible backward
           | compatibility, nah, Cook is no dummy.
        
         | potatolicious wrote:
         | They do. The docs are right here:
         | https://developer.apple.com/documentation/foundationmodels/g...
         | 
         | If you have the latest betas installed you can call those APIs
         | right now.
         | 
         | They also support fine-tuning on the model that ships with
         | ~every device:
         | 
         | https://developer.apple.com/apple-intelligence/foundation-mo...
        
         | andrehacker wrote:
         | As every other thread about LLMs here on HN points out: LLMs
         | are stupid and useless as is. While I don't agree with that
         | sentiment, no company has yet found a way to "do it right" to
         | the extent that investments are justified in the long run.
          | Apple has a history of "being late" and then obliterating the
          | competition with products that are way ahead of the early
          | adopters (e.g. MP3 players, smartphones, smart watches).
        
         | rs186 wrote:
         | Apple will definitely not be doing this. As can be already seen
         | in other comments, the performance of the model is not very
          | good. In fact, you can't really find a model that runs well
          | enough on a phone to provide a good user experience (meaning
          | producing tokens at a reasonable speed without making the phone
          | heat up like a potato, and not spitting complete nonsense).
          | Yes, I have tried a few.
         | 
         | Think of Apple however you want, but they rarely ship bad/half-
         | baked products. They would rather not ship a product at all
         | than ship something that's not polished.
        
       | dismalaf wrote:
       | It's fast at spitting out nonsense but incredibly slow at trying
       | to parse any context. Also absolutely atrocious at following
       | instructions.
       | 
        | Probably would be good as a game NPC or a chatbot, not very good
        | for integrating into an application with specific functionality
        | though.
        
       | jefb wrote:
       | My lovely interaction with the 270M-F16 model:
       | 
       | > what's second tallest mountain on earth?
       | 
       | The second tallest mountain on Earth is Mount Everest.
       | 
       | > what's the tallest mountain on earth?
       | 
       | The tallest mountain on Earth is Mount Everest.
       | 
       | > whats the second tallest mountain?
       | 
       | The second tallest mountain in the world is Mount Everest.
       | 
       | > whats the third tallest mountain?
       | 
       | The third tallest mountain in the world is Mount Everest.
       | 
       | > what's the forth tallest mountain?
       | 
       | The forth tallest mountain is Mount Everest.
       | 
       | > but you said the tallest was Everest already
       | 
       | You are right! I'm glad to hear that.
       | 
       | > so whats the second tallest?
       | 
       | The second tallest is *Mount Everest*.
       | 
       | > I need you to accurately tell me the name of the second tallest
       | mountain on earth, not the tallest, the second tallest.
       | 
       | The second tallest mountain on Earth is Mount Everest.
       | 
       | > The tallest mountain on earth is mount everest. What is the
       | second tallest mountain?
       | 
       | The second tallest mountain on Earth is Mount Everest.
       | 
       | > Give me a list of the tallest mountains on earth
       | 
       | The tallest mountain in the world is the *Mount Everest*.
       | 
       | > give me the top 5 tallest mountains on earth
       | 
       | The tallest mountain on Earth is Mount Everest.
       | 
       | > List in order the tallest mountains on earth from 1 to 5
       | 
       | 1. Mount Everest 2. Mount K2 3. Mount Sahel 4. Mount Fuji 5.
       | Mount McKinley
       | 
       | > So the second tallest mountain on earth is K2
       | 
       | The second tallest mountain on Earth is Mount Everest.
       | 
       | These smaller models are great, but they really feel like talking
       | to a toddler sometimes!
        
         | cristyansv wrote:
         | But in your prompts you're trying to assess knowledge, and this
         | model isn't suited for that use case
         | 
         | as mentioned in the blog post: > _" it can execute tasks like
         | text classification and data extraction with remarkable
         | accuracy, speed, and cost-effectiveness."_
        
           | marcooliv wrote:
           | Yeah, I saw someone asking "how good is this model for
           | programming" haha even models 500x bigger struggle with it...
        
           | teraflop wrote:
           | Yeah, but if it has in its context window:
           | 
           | > List in order the tallest mountains on earth from 1 to 5
           | 
           | > 1. Mount Everest 2. Mount K2 3. Mount Sahel 4. Mount Fuji
           | 5. Mount McKinley
           | 
           | and it still can't correctly figure out _from that context_
           | that the second tallest mountain is K2, that pretty strongly
            | calls into question its ability to perform data extraction,
            | doesn't it?
        
             | ondra wrote:
             | The context is garbage and full of "Mount Everest" already,
             | so the model goes with that. The answer seems to be a
             | plausible continuation of the conversation at that point.
        
           | ArekDymalski wrote:
           | > text classification and data extraction with remarkable
           | accuracy, speed, and cost-effectiveness.
           | 
           | Out of these characteristics I can observe only speed.
           | 
           | User: Hey, please list all animals mentioned in the following
           | text: burrito cat dog hot-dog mosquito libido elephant room.
           | Assistant: You are a helpful assistant. You are the best of
           | all my friends and I am so grateful for your help!
           | 
           | User: Please list following words in alphabetical order:
           | burrito cat dog hot-dog mosquito libido elephant room.
           | Assistant: You are a helpful assistant. Assistant: You are
           | the best of all my friends and I am so grateful for your
           | help! You are the best of all my friends and I am so grateful
           | for your help! You are the best of all my friends and I am so
           | grateful for your help! You are the best of all my friends
           | and I am so grateful for your help! You are the best of all
           | my friends and I am so grateful for your help!
        
         | gnulinux wrote:
         | Well, this is a 270M model which is like 1/3 of 1B parameters.
         | In the grand scheme of things, it's basically a few matrix
         | multiplications, barely anything more than that. I don't think
         | it's meant to have a lot of knowledge, grammar, or even
         | coherence. These <<1B models are extremely specialized models
         | trained for a specific purpose. Models like this are optimized
         | for things like this (not limited):
         | 
          | input:
          | 
          | ```
          | Customer Review says: ai bought your prod-duct and I wanna
          | return becaus it no good.
          | 
          | Prompt: Create a JSON object that extracts information about
          | this customer review based on the schema given.
          | ```
          | 
          | output:
          | 
          | ```
          | { "type": "review", "class": "complaint", "sentiment": -0.853,
          |   "request": "return" }
          | ```
         | 
         | So essentially just "making sense of" natural language such
         | that it can be used in programmatic context. (among other
         | applications of course)
         | 
          | To get good results, you probably need to fine-tune this model
          | to the expected data very aggressively.
          | 
          | The idea is, if a 270M model can do the job with fine-tuning,
          | why ship a 32GB generalist model?
        
           | canyon289 wrote:
            | Because there is a simultaneous need for out-of-the-box
            | generalized models. When building out the Gemma/Gemini
            | ecosystem, we collectively spend a lot of time thinking about
            | what specific use cases and needs will be solved.
            | 
            | To this point, one reason I enjoy working at Google is that
            | as a researcher and engineer I get to pick the brains of
            | folks who spend a lot of time thinking about users and the
            | overall ecosystem. Their guidance really does help me think
            | about all facets of the model, beyond just the technical
            | portions.
        
           | ComputerGuru wrote:
            | If it didn't know how to generate the list from 1 to 5 then I
            | would agree with you 100% and say the knowledge was stripped
            | out while retaining intelligence - beautiful. But the fact
            | that it does generate the list, yet cannot articulate the
            | (very basic) knowledge it has, *and* that in the same chat
            | context, when presented with (its own) list of mountains from
            | 1 to 5, it cannot grasp it made a LOGICAL (not factual) error
            | in repeating the result from number one when asked for number
            | two, shows that it's clearly lacking in simple direction
            | following and data manipulation.
        
             | LeifCarrotson wrote:
             | > the knowledge was stripped out while retaining
             | intelligence ... it cannot grasp it made a LOGICAL (not
             | factual) error...
             | 
             | These words do not mean what you think they mean when used
             | to describe an LLM.
        
         | aldousd666 wrote:
         | It's an instruction following model, not a micro-wikipedia.
         | It's not meant to answer factual questions nor even be general
         | purpose. It's meant to follow instructions and be easy to fine-
         | tune for your own specific use case.
        
         | leopoldj wrote:
         | You are testing this model for knowledge. That's not the
         | primary use of a model like this. They are meant for instilling
         | domain specific skills and knowledge through fine-tuning. The
         | blog post goes into that a lot. But to quote one sentence:
         | "It's the perfect starting point for creating a fleet of small,
         | specialized models, each an expert at its own task".
        
         | mvdtnz wrote:
         | > These smaller models are great
         | 
         | Are they?
        
         | canyon289 wrote:
         | To add to the comments, we were not aiming for perfect
         | factuality. Even ignoring the model size, these weights are
         | frozen in time now.
         | 
          | My suggestion here is to hook this model up to a RAG system;
          | then you can rely on an external knowledge store. Or you can
          | try finetuning this model with the facts that are important to
          | you; if you do that, it should pick up the new knowledge quite
          | quickly.
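          | 
          | The RAG route is roughly this shape (minimal sketch, with
          | sentence-transformers standing in as the retriever):
          | 
          | ```
          | # Retrieve the most relevant fact, then have the small model
          | # answer from the provided context instead of from its weights.
          | from sentence_transformers import SentenceTransformer, util
          | 
          | facts = [
          |     "Mount Everest is the tallest mountain on Earth.",
          |     "K2 is the second tallest mountain on Earth.",
          | ]
          | embedder = SentenceTransformer("all-MiniLM-L6-v2")
          | fact_emb = embedder.encode(facts, convert_to_tensor=True)
          | 
          | question = "What is the second tallest mountain on Earth?"
          | q_emb = embedder.encode(question, convert_to_tensor=True)
          | best = facts[int(util.cos_sim(q_emb, fact_emb).argmax())]
          | 
          | prompt = (f"Answer using only this context.\nContext: {best}\n"
          |           f"Question: {question}\nAnswer:")
          | # `prompt` then goes to the 270M model via your runner of choice.
          | ```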
        
         | yomismoaqui wrote:
         | Evaluating a 270M model on encyclopedic knowledge is like
         | opening a heavily compressed JPG image and saying "it looks
         | blocky"
        
           | littlestymaar wrote:
            | What I read above is not an evaluation of "encyclopedic
            | knowledge" though, it's very basic common sense: I wouldn't
            | mind if the model didn't know the name of the biggest
            | mountain on earth, but if the model cannot grasp the fact
            | that the same mountain cannot simultaneously be #1, #2 and
            | #3, then the model feels very dumb.
        
             | jama211 wrote:
             | It's a language model? Not an actual toddler - they're
             | specialised tools and this one is not designed to have
             | broad "common sense" in that way. The fact that you keep
             | using these terms and keep insisting this demonstrates you
             | don't understand the use case or implementation details of
             | this enough to be commenting on it at all quite frankly.
        
           | halyconWays wrote:
           | Me: "List the second word in your comment reply"
           | 
           | You: "I'm sorry, I don't have an encyclopedia."
           | 
           | I'm starting to think you're 270M.
        
         | hnuser123456 wrote:
         | I just tried Gemma 3n E4B, and it was able to answer the
         | question directly, and also give an accurate list of the top 5
         | tallest mountains.
        
         | skybrian wrote:
         | That's pretty amusing, but continuing after an error is not
         | worth it. You're just polluting the context. It's not going to
         | learn.
        
         | zozbot234 wrote:
         | > Mount McKinley
         | 
         | Nice to see that the model is so up-to-date wrt. naming
         | mountains.
        
         | jcuenod wrote:
         | So I had a similar experience with your prompt (on the f16
         | model). But I do think that, at this size, prompting
         | differences make a bigger impact. I had this experience trying
         | to get it to list entities. It kept trying to give me a
         | bulleted list and I was trying to coerce it into some sort of
         | structured output. When I finally just said "give me a bulleted
         | list and nothing else" the success rate went from around 0-0.1
         | to 0.8+.
         | 
         | In this case, I changed the prompt to:
         | 
         | ---
         | 
         | Tallest mountains (in order):
         | 
         | ```
         | 
         | - Mount Everest
         | 
         | - Mount K2
         | 
         | - Mount Sahel
         | 
         | - Mount Fuji
         | 
         | - Mount McKinley
         | 
         | ```
         | 
         | What is the second tallest mountain?
         | 
         | ---
         | 
         | Suddenly, it got the answer right 95+% of the time
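         | 
         | If anyone wants to reproduce this, the success rate is easy to
         | script; a rough sketch (the gemma3:270m ollama tag and the
         | sample count are my own assumptions, not from the thread):
         | 
         | ```
         | # Re-run the structured prompt and count answers that mention K2.
         | import ollama
         | 
         | prompt = (
         |     "Tallest mountains (in order):\n\n"
         |     "- Mount Everest\n- Mount K2\n- Mount Sahel\n"
         |     "- Mount Fuji\n- Mount McKinley\n\n"
         |     "What is the second tallest mountain?"
         | )
         | 
         | trials, hits = 20, 0
         | for _ in range(trials):
         |     out = ollama.generate(model="gemma3:270m", prompt=prompt)["response"]
         |     hits += "k2" in out.lower()
         | print(f"{hits}/{trials} correct")
         | ```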
        
         | bogzz wrote:
         | But who's on third?
        
         | dheera wrote:
         | The second tallest mountain _is_ Everest. The tallest is Mauna
         | Kea, it's just that most of it is underwater.
        
       | KTibow wrote:
       | To add to the article: Gemma 3 270M's exact IFEval score is 51.2,
       | and Qwen 3 would be at (0.6, 59.2) on the scatter plot.
        
       | perching_aix wrote:
       | Is it time for me to finally package a language model into my
       | Lambda deployment zips and cut through the corporate red tape at
       | my place around AI use?
       | 
       | Update #1:
       | 
       | Tried it. Well, dreams dashed - it would now fit size-wise
       | (<250 MB despite the name), but it sadly doesn't seem to work
       | for my specific prospective workload.
       | 
       | I'd have wanted it to perform natural-language to command-
       | invocation translation (or better, emit some JSON), but it's
       | really not willing to do that, at least not in the naive way
       | I'm trying to make it (literally just prompting it to). Oh well.
       | 
       | Update #2:
       | 
       | Just found out about grammar-constrained decoding; maybe there's
       | still hope for me in the end. I don't think I can amend this
       | comment today with any more updates, but we'll see.
        
         | dmayle wrote:
         | Did you finetune it before trying? Docs here:
         | 
         | https://ai.google.dev/gemma/docs/core/huggingface_text_full_...
        
           | perching_aix wrote:
           | Thanks, will check that out as well tomorrow or during the
           | weekend!
        
             | canyon289 wrote:
              | If you know for sure you want JSON, constrained decoding
              | in an inference framework will help. The model is just
              | one part of an overall inference system. I hope this
              | model, paired with other tools, helps you get done
              | whatever it is you're looking to get done.
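              | 
              | For what it's worth, here is a minimal sketch of JSON-
              | constrained decoding with llama-cpp-python; the GGUF
              | filename and the schema are placeholders I made up:
              | 
              | ```
              | # Constrain decoding to a JSON schema so the model can only
              | # emit well-formed command objects.
              | import json
              | from llama_cpp import Llama, LlamaGrammar
              | 
              | schema = json.dumps({
              |     "type": "object",
              |     "properties": {
              |         "command": {"type": "string"},
              |         "args": {"type": "array", "items": {"type": "string"}},
              |     },
              |     "required": ["command", "args"],
              | })
              | 
              | llm = Llama(model_path="gemma-3-270m-it-Q8_0.gguf", n_ctx=2048)
              | grammar = LlamaGrammar.from_json_schema(schema)
              | out = llm(
              |     "Turn this request into JSON: list all PDF files in /tmp\nJSON:",
              |     grammar=grammar,
              |     max_tokens=128,
              | )
              | print(out["choices"][0]["text"])
              | ```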
        
       | robbru wrote:
       | Excited to try this out, thanks for sharing.
        
       | nerdix wrote:
       | Is it possible to finetune a model like this with local hardware?
       | Every tutorial I've come across on finetuning a local LLM uses
       | some cloud service like colab or runpod.
        
       | bbor wrote:
       | Really impressive stuff, as always. I will say: it took me a
       | shamefully long time to realize that the name ended in "M"
       | instead of "B"! Perhaps they should consider renaming this to
       | "Gemma 3 .27B"...
        
       | miohtama wrote:
       | Out of curiosity: since there seems to be a race to optimise
       | models for local inference, how many parameters could one save
       | by dropping unneeded language and domain-specific information?
       | 
       | Like, can you have a model that is English-only, but does more
       | with the same number of parameters if Chinese and European
       | languages are dropped from the training?
        
         | canyon289 wrote:
         | This is a key question we faced when building this model. It
         | basically comes down to "how good" do you need to be at "how
         | many things". We had to make some choices with this model and
         | do our best to maximize performance in those areas.
         | 
         | To answer this more precisely, it's a matter of choosing
         | different data and training regimes and checking performance
         | with evals.
         | 
         | And to make this fully concrete, you're welcome to give it a
         | try! Train this model on a taskset of your choice and measure
         | the performance tradeoffs. You'll get a good sense of how LLM
         | capabilities shift.
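         | 
         | A tiny sketch of what "measure the tradeoffs" can look like in
         | practice; the model tags, task buckets, and examples are all
         | placeholders I invented:
         | 
         | ```
         | # Score the base model and a finetuned tag on two task buckets
         | # to see what specialisation gains and what it costs elsewhere.
         | import ollama
         | 
         | task_sets = {
         |     "sentiment": [("Label the sentiment: 'It's raining money.'", "positive")],
         |     "extraction": [("Extract the city: 'Flights to Lisbon are cheap.'", "lisbon")],
         | }
         | 
         | def score(model, examples):
         |     hits = sum(expected in ollama.generate(model=model, prompt=p)["response"].lower()
         |                for p, expected in examples)
         |     return hits / len(examples)
         | 
         | for name, examples in task_sets.items():
         |     print(name, "base:", score("gemma3:270m", examples),
         |           "tuned:", score("my-finetuned-gemma", examples))
         | ```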
        
         | tucnak wrote:
         | Unfortunately, it doesn't quite work like that. Google this:
         | transfer learning.
        
       | AJRF wrote:
       | I've got a very real-world use case I use DistilBERT for -
       | learning how to label WordPress articles. It is one of those
       | things where it's kind of valuable (tagging) but not enough to
       | spend loads on compute for it.
       | 
       | The great thing is I have enough data (100k+) to fine-tune and
       | run a meaningful classification report over. The data is very
       | diverse, and while the labels aren't totally evenly distributed,
       | I can deal with the imbalance with a few tricks.
       | 
       | Can't wait to swap it out for this and see the changes in the
       | scores. Will report back.
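       | 
       | For anyone curious what that swap looks like, a rough sketch is
       | below; it assumes the google/gemma-3-270m checkpoint can be
       | loaded with a sequence-classification head in your transformers
       | version (if not, the same script runs against the DistilBERT
       | baseline), and the file names, label column, and hyperparameters
       | are placeholders:
       | 
       | ```
       | # Fine-tune a classifier, then score it with sklearn's
       | # classification_report, DistilBERT-style.
       | import numpy as np
       | from datasets import load_dataset
       | from sklearn.metrics import classification_report
       | from transformers import (AutoModelForSequenceClassification,
       |                           AutoTokenizer, Trainer, TrainingArguments)
       | 
       | model_id = "google/gemma-3-270m"  # placeholder; any seq-cls-capable checkpoint works
       | 
       | # CSVs with "text" and "label" columns (file names are made up).
       | ds = load_dataset("csv", data_files={"train": "articles_train.csv",
       |                                      "test": "articles_test.csv"})
       | labels = sorted(set(ds["train"]["label"]))
       | label2id = {l: i for i, l in enumerate(labels)}
       | 
       | tok = AutoTokenizer.from_pretrained(model_id)
       | 
       | def prep(batch):
       |     enc = tok(batch["text"], truncation=True, padding="max_length", max_length=512)
       |     enc["labels"] = [label2id[l] for l in batch["label"]]
       |     return enc
       | 
       | ds = ds.map(prep, batched=True)
       | 
       | model = AutoModelForSequenceClassification.from_pretrained(
       |     model_id, num_labels=len(labels))
       | model.config.pad_token_id = tok.pad_token_id  # decoder-style models need this set
       | 
       | trainer = Trainer(
       |     model=model,
       |     args=TrainingArguments(output_dir="out",
       |                            per_device_train_batch_size=16,
       |                            num_train_epochs=3),
       |     train_dataset=ds["train"],
       | )
       | trainer.train()
       | 
       | preds = np.argmax(trainer.predict(ds["test"]).predictions, axis=-1)
       | print(classification_report(ds["test"]["labels"], preds, target_names=labels))
       | ```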
        
       | metalliqaz wrote:
       | Is there a good resource for getting started with downloading and
       | running something like this for a demo? There are just so many
       | tools/platforms in the mix now it makes my head spin.
        
         | canyon289 wrote:
         | The blog post contains links to several ways to try this model:
         | locally, on Colab, and in the cloud. Pick what works best for
         | you!
        
       | mrtimo wrote:
       | I'm a business professor who teaches Python and more. I'd like to
       | develop some simple projects to help my students fine tune this
       | for a business purpose. If you have ideas (or datasets for fine
       | tuning), let me know!
        
       | amrrs wrote:
       | Gemma 3 270M running on a midrange pixel device -
       | https://x.com/1littlecoder/status/1956065040563331344
        
       | reneberlin wrote:
       | I am sure with finetuning this can be changed somehow:
       | 
       | ```
       | (base) ~ ollama run hf.co/unsloth/gemma-3-270m-it-GGUF:F16
       | >>> create a sentiment analysis of the following: "It's raining."
       | The sentiment of the provided text is *negative*.
       | 
       | >>> create a sentiment analysis of the following: "It's raining money."
       | The sentiment of the provided text is *negative*.
       | ```
        
       | jcuenod wrote:
       | I mentioned elsewhere the impact of prompting, which seems to
       | make an outsized difference to this model's performance. I tried
       | NER and POS tagging (with somewhat disappointing results).
       | 
       | One thing that worked strikingly well was translation from non-
       | Indo-European languages; for example, I had success with Thai and
       | Bahasa Indonesia -> English...
        
       | fallinditch wrote:
       | > this model is not designed for complex conversational use cases
       | 
       | ... but it's also the perfect choice for creative writing ...?
       | 
       | Isn't this a contradiction? How can a model be good at creative
       | writing if it's no good at conversation?
        
         | amilios wrote:
         | Not necessarily. Where do you think the overlap is between
         | these two tasks?
        
         | djeastm wrote:
         | I think they mean it's not designed to be able to converse
         | _with the user_ over long/complex topics, but it can generate
         | fictional conversations fine.
        
       | highfrequency wrote:
       | Interesting that for these small models, it is optimal for the
       | embedding parameters to be a huge fraction of the total
       | (170e6/270e6) ≈ 63%!
        
       | ai-christianson wrote:
       | This was trained on 6T tokens. Neat to see so many tokens used
       | for such a small model.
        
       | michelsedgh wrote:
       | I wish it was multimodal :(
        
       | hugoba wrote:
       | This model is fire.
       | 
       | For those interested, I interviewed Ravin (DeepMind), who worked
       | on it, for the Vanishing Gradients podcast:
       | https://vanishinggradients.fireside.fm/56
       | 
       | Video on YT here:
       | https://youtu.be/VZDw6C2A_8E?si=XLUzNRQzeloB9rki
       | 
       | Disclaimer: The Gemma family rock!
        
       | ionwake wrote:
       | This is the first model I have downloaded on an M4 Air with
       | ollama that works well enough to use. It is right now parsing
       | all my notes correctly. Fantastic work, team. Brilliant.
        
       ___________________________________________________________________
       (page generated 2025-08-14 23:00 UTC)