[HN Gopher] Show HN: Llama Running on a Microcontroller
       ___________________________________________________________________
        
       Author : maxbbraun
       Score  : 28 points
       Date   : 2023-11-15 04:34 UTC (18 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | RecycledEle wrote:
       | The "microcontroller" is a Coral AI accelerator.
        
         | maxbbraun wrote:
          | Just to clarify: Inference is happening on the Arm Cortex-M7.
          | The Coral TPU chip is not used in this implementation.
        
       | AMICABoard wrote:
       | epic!
        
       | maxbbraun wrote:
       | I was wondering if it's possible to fit a non-trivial language
       | model on a microcontroller. Turns out the answer is some version
       | of yes!
       | 
       | This project is using the Coral Dev Board Micro with its FreeRTOS
       | toolchain. The board has a number of neat hardware features not
       | currently being used here (notably a TPU, sensors, and a second
       | CPU core). It does, however, also have 64MB of RAM. That's tiny
       | for LLMs, which are typically measured in the GBs, but
       | comparatively huge for a microcontroller.
       | 
       | The LLM implementation itself is an adaptation of llama2.c and
       | the tinyllamas checkpoints trained on the TinyStories dataset.
       | The quality of the smaller model versions isn't ideal, but good
       | enough to generate somewhat coherent (and occasionally weird)
       | stories.
        
       ___________________________________________________________________
       (page generated 2023-11-15 23:02 UTC)