https://ethz.ch/en/news-and-events/eth-news/news/2025/07/a-language-model-built-for-the-public-good.html


A language model built for the public good

  * Machine learning
  * Innovation & Industry

ETH Zurich and EPFL will release a large language model (LLM)
developed on public infrastructure. Trained on the "Alps"
supercomputer at the Swiss National Supercomputing Centre (CSCS), the
new LLM marks a milestone in open-source AI and multilingual
excellence.

09.07.2025 by Florian Meyer, Corporate Communications and Melissa
Anchisi, Head of AI Communication EPFL


An illustration of a Swiss cross. The cross consists of cables; one
side is red and the other blue.
Researchers from EPFL, ETH Zurich, and CSCS have developed a fully
open large language model from scratch, which is set to be released
in late summer 2025. (Image: AI-generated)

In brief

  * In late summer 2025, a publicly developed large language model
    (LLM) will be released -- co-created by researchers at EPFL, ETH
    Zurich, and the Swiss National Supercomputing Centre (CSCS).
  * This LLM will be fully open, an approach designed to support
    broad adoption and foster innovation across science, society, and
    industry.
  * A defining feature of the model is its multilingual fluency in
    over 1,000 languages.

Earlier this week in Geneva, around 50 leading global initiatives and
organisations dedicated to open-source LLMs and trustworthy AI
convened at the International Open-Source LLM Builders Summit. Hosted
by the AI centres of EPFL and ETH Zurich, the event marked a
significant step in building a vibrant and collaborative
international ecosystem for open foundation models. Open LLMs are
increasingly viewed as credible alternatives to commercial systems,
most of which are developed behind closed doors in the United States
or China.

Participants of the summit previewed the forthcoming release of a
fully open, publicly developed LLM -- co-created by researchers at
EPFL, ETH Zurich and other Swiss universities in close collaboration
with engineers at CSCS. Currently in final testing, the model will be
downloadable under an open license. The model focuses on
transparency, multilingual performance, and broad accessibility.

The model will be fully open: source code and weights will be
publicly available, and the training data will be transparent and
reproducible, supporting adoption across science, government,
education, and the private sector. This approach is designed to
foster both innovation and accountability.

"Fully open models enable high-trust applications and are necessary
for advancing research about the risks and opportunities of AI.
Transparent processes also enable regulatory compliance," says Imanol
Schlag, research scientist at the ETH AI Center, who is leading the
effort alongside EPFL AI Center faculty members and professors
Antoine Bosselut and Martin Jaggi.

Multilingual by design

A defining characteristic of the LLM is its fluency in over 1,000
languages. "We have emphasised making the models massively
multilingual from the start," says Antoine Bosselut.

The base model was trained on a large text dataset in over 1,500
languages -- approximately 60% English and 40% non-English
languages -- as well as code and mathematics data. Because content
from so many languages and cultures is represented, the resulting
model is intended to be broadly applicable worldwide.

Designed for scale and inclusion

The model will be released in two sizes -- 8 billion and 70 billion
parameters -- to meet a broad range of users' needs. The 70B version
will rank among the most powerful fully open models worldwide. The
number of parameters reflects a model's capacity to learn and
generate complex responses.
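
To give a feel for what the two sizes mean in practice, the following
minimal sketch estimates how much storage the weights alone would
need. The parameter counts (8 billion and 70 billion) come from the
text above; the numeric precisions and their byte widths are
assumptions made for illustration, not details published for this
model, and the function name is hypothetical.

```python
# Rough weight-storage estimate. The parameter counts (8B, 70B) come from the
# article; the bytes-per-parameter values are assumed numeric precisions,
# not details the project has confirmed.
PARAMS = {"8B": 8_000_000_000, "70B": 70_000_000_000}
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}  # assumed precisions


def weight_gigabytes(num_params: int, bytes_per_param: int) -> float:
    """Return the storage needed for the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, count in PARAMS.items():
        for precision, width in BYTES_PER_PARAM.items():
            size = weight_gigabytes(count, width)
            print(f"{name} @ {precision}: {size:.0f} GB")
```

Under the 2-bytes-per-parameter assumption, the smaller model's
weights would occupy about 16 GB and the larger model's about 140 GB.
Running a model also needs memory for activations and caches, which
this estimate leaves out.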

High reliability is achieved by training on over 15 trillion
high-quality tokens (units representing a word or part of a word),
enabling robust language understanding and versatile use cases.

Responsible data practices

The LLM is being developed with due consideration to Swiss data
protection laws, Swiss copyright laws, and the transparency
obligations under the EU AI Act. In a recent study, the
project leaders demonstrated that for most everyday tasks and general
knowledge acquisition, respecting web crawling opt-outs during data
acquisition produces virtually no performance degradation.
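
As an illustration of what respecting a crawling opt-out can look
like, the sketch below checks a robots.txt file before a page is
fetched. It is not the project's data pipeline, which is not described
here. The robots.txt content, the crawler name "ExampleAIBot", the
URLs, and the helper name are all hypothetical; only Python's standard
urllib.robotparser module is used.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is opted out of the whole site,
# every other crawler is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/article"
    for agent in ("ExampleAIBot", "OtherBot"):
        verdict = "fetch" if is_allowed(ROBOTS_TXT, agent, page) else "skip"
        print(f"{agent}: {verdict} {page}")
```

A crawler that honours opt-outs would skip any page for which the
check returns False and leave it out of the training data.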

Supercomputer as an enabler of sovereign AI

The model is trained on the "Alps" supercomputer at CSCS in Lugano,
one of the world's most advanced AI platforms, equipped with over
10,000 NVIDIA Grace Hopper Superchips. The system's scale and
architecture made it possible to train the model efficiently using
100% carbon-neutral electricity.

The successful realisation of "Alps" was significantly facilitated by
a long-standing collaboration spanning over 15 years with NVIDIA and
HPE/Cray. This partnership has been pivotal in shaping the
capabilities of "Alps", ensuring it meets the demanding requirements
of large-scale AI workloads, including the pre-training of complex
LLMs.

"Training this model is only possible because of our strategic
investment in 'Alps', a supercomputer purpose-built for AI," says
Thomas Schulthess, Director of CSCS and professor at ETH Zurich. "Our
enduring collaboration with NVIDIA and HPE exemplifies how joint
efforts between public research institutions and industry leaders can
drive sovereign infrastructure, fostering open innovation -- not just
for Switzerland, but for science and society worldwide."

Public access and global reuse

In late summer, the LLM will be released under the Apache 2.0
License. Accompanying documentation will detail the model
architecture, training methods, and usage guidelines to enable
transparent reuse and further development.

"As scientists from public institutions, we aim to advance open
models and enable organisations to build on them for their own
applications," says Antoine Bosselut.

"By embracing full openness -- unlike commercial models that are
developed behind closed doors -- we hope that our approach will drive
innovation in Switzerland, across Europe, and through multinational
collaborations. Furthermore, it is a key factor in attracting and
nurturing top talent," says EPFL professor Martin Jaggi.

About the Swiss AI Initiative

Launched in December 2023 by EPFL and ETH Zurich, the Swiss AI
Initiative is supported by more than 10 academic
institutions across Switzerland. With over 800 researchers involved
and access to over 20 million yearly GPU hours on CSCS's
supercomputer "Alps", it stands as the world's largest open science
and open source effort dedicated to AI foundation models.

The Swiss AI Initiative is receiving financial support from the ETH
Board -- the strategic management and supervisory body of the ETH
Domain (ETH, EPFL, PSI, WSL, Empa, Eawag) -- for the period 2025 to
2028.

About ELLIS

The Swiss AI Initiative is led by researchers from the ETH AI Center
and the EPFL AI Center, both of which serve as regional units of
ELLIS (the European Laboratory for Learning and Intelligent Systems)
-- a pan-European AI network focused on fundamental research in
trustworthy AI, technical innovation, and societal impact within
Europe's open societies.

About CSCS

The Swiss National Supercomputing Centre (CSCS) is a member and
partner of the LUMI Consortium, granting Swiss scientists access to
leading infrastructure in Kajaani, Finland. This aligns with CSCS'
strategy to scale out future, significantly larger extreme-scale
computing infrastructures through multinational collaborations,
leveraging regions abundant in hydroelectric and cooling resources
and positioning AI research and innovation for global relevance and
regional impact.
