# SmolLM2

SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters. They are capable of solving a wide range of tasks while being lightweight enough to run on-device. You can find our most capable model, SmolLM2-1.7B-Instruct, here.

**New:** Introducing SmolTalk, the SFT dataset of SmolLM2.

**Evaluation Results**

*Figure: benchmark results for the SmolLM2 models.*

## Table of Contents

1. Usage
   - Transformers
   - Chat in TRL
   - Local applications
2. Smol-tools
3. Pre-training
4. Fine-tuning
5. Evaluation
6. Synthetic data pipelines
## Usage

Our most powerful model is SmolLM2-1.7B-Instruct, which you can use as an assistant with `transformers` and `trl`, or run in quantized form with tools like llama.cpp, MLX, and transformers.js. For lighter applications, you can also use the smaller models SmolLM2-360M and SmolLM2-135M, which are suitable for on-device usage and can be integrated similarly. All models are available in this collection.

### Transformers

```bash
pip install transformers
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# for multiple GPUs install accelerate and do `model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")`
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [{"role": "user", "content": "Write a 100-word article on 'Benefits of Open-Source in AI research'"}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=50, temperature=0.2, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0]))
```

### Chat in TRL

You can also use the TRL CLI to chat with the model from the terminal:

```bash
pip install trl
trl chat --model_name_or_path HuggingFaceTB/SmolLM2-1.7B-Instruct --device cpu
```

You can find more details on how to leverage the model for use cases such as text summarization, text rewriting, and function calling in the model card: https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct

### Local applications

You can use the models locally with frameworks like llama.cpp, MLX, and transformers.js, which support SmolLM2. All models are available in this collection. A minimal llama-cpp-python sketch is given at the end of this README.

## Smol-tools

A collection of lightweight AI-powered tools built with LLaMA.cpp and small language models. These tools are designed to run locally on your machine without requiring expensive GPU resources. Further instructions on how to use the tools can be found in the smol-tools README.

## Pre-training

You can find scripts for launching pre-training with nanotron under `pre-training`. We share the exact configs used for training SmolLM1 and will upload SmolLM2's configs soon.

## Fine-tuning

You can find an example script to finetune SmolLM2 using TRL and PEFT in the `finetuning` folder. We also link to our post-training scripts for SmolLM2 using the alignment handbook. A rough SFT sketch is given at the end of this README.

## Evaluation

*Figure: evaluation results for the SmolLM2 models.*

You can find a more detailed evaluation of each model size in the model cards in this collection. We use lighteval for all our evaluations; for more details, refer to the evaluation README.

## Synthetic data pipelines

We released SmolTalk, the SFT dataset used for building the SmolLM2 instruct models. It was created with distilabel, and you can inspect and execute the synthetic data pipelines in the distilabel_pipelines README. You can also load the dataset directly, as shown below.

*Figure: comparison of models finetuned on SmolTalk and Orca AgentInstruct 1M. For more details, refer to the dataset card.*
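For a quick look at SmolTalk itself, the dataset can be loaded with the `datasets` library. This is a minimal sketch; the `all` config name and the `messages` column are assumptions based on the dataset card, so check it for the exact configs.

```python
# Minimal sketch: inspect SmolTalk with the `datasets` library.
# The "all" config name and the "messages" column are assumptions;
# see the dataset card for the exact configs and schema.
from datasets import load_dataset

smoltalk = load_dataset("HuggingFaceTB/smoltalk", "all", split="train")
print(smoltalk)                 # dataset size and column names
print(smoltalk[0]["messages"])  # one chat-formatted SFT example
```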
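As one concrete way to run a quantized SmolLM2 locally (see the Local applications section), here is a minimal sketch using the `llama-cpp-python` bindings. The GGUF repo id and the quantization filename pattern are assumptions; substitute the actual file from the model's GGUF repository.

```python
# Minimal sketch: run a quantized SmolLM2 locally with llama-cpp-python
# (pip install llama-cpp-python). The repo id and filename pattern below
# are assumptions; pick the GGUF repo/file you actually want.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="HuggingFaceTB/SmolLM2-1.7B-Instruct-GGUF",  # assumed GGUF repo id
    filename="*q4_k_m.gguf",                             # assumed quant file pattern
    n_ctx=2048,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me one benefit of open-source AI."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```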
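Finally, as a rough illustration of what fine-tuning SmolLM2 on SmolTalk with TRL and PEFT can look like (see the Fine-tuning section), here is a hedged sketch, not the script shipped in the `finetuning` folder. It assumes a recent `trl` version; the `everyday-conversations` config name and all hyperparameters are placeholders.

```python
# Hedged sketch of supervised fine-tuning with TRL + PEFT (LoRA); this is
# an illustration, not the repo's finetuning script.
# pip install trl peft datasets
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Assumed dataset config name; check the SmolTalk dataset card.
train_ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-360M-Instruct",  # smaller model keeps the demo cheap
    train_dataset=train_ds,                       # chat-format ("messages") examples
    args=SFTConfig(
        output_dir="smollm2-sft",        # placeholder path
        per_device_train_batch_size=4,   # placeholder hyperparameters
        num_train_epochs=1,
        learning_rate=1e-4,
    ),
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
)
trainer.train()
```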