Forget ChatGPT: why researchers now run small AIs on their laptops

TECHNOLOGY FEATURE | 16 September 2024

Artificial-intelligence models are typically used online, but a host of openly available tools is changing that. Here's how to get started with local AIs.

By Matthew Hutson, a science writer based in New York City.

[Illustration: The Project Twins. Cartoon showing a hand holding a small, pixelated head with pixelated speech bubbles coming from it.]

The website histo.fyi is a database of structures of immune-system proteins called major histocompatibility complex (MHC) molecules. It includes images, data tables and amino-acid sequences, and is run by bioinformatician Chris Thorpe, who uses artificial-intelligence (AI) tools called large language models (LLMs) to convert those assets into readable summaries. But he doesn't use ChatGPT, or any other web-based LLM. Instead, Thorpe runs the AI on his laptop.

Over the past couple of years, chatbots based on LLMs have won praise for their ability to write poetry or engage in conversations. Some LLMs have hundreds of billions of parameters -- the more parameters, the greater the complexity -- and can be accessed only online. But two more recent trends have blossomed. First, organizations are making 'open weights' versions of LLMs, in which the weights and biases that define a trained model are made publicly available, so that users can download and run them locally if they have the computing power. Second, technology firms are making scaled-down versions that can be run on consumer hardware -- and that rival the performance of older, larger models.

Researchers might use such tools to save money, protect the confidentiality of patients or corporations, or ensure reproducibility. Thorpe, who's based in Oxford, UK, and works at the European Molecular Biology Laboratory's European Bioinformatics Institute in Hinxton, UK, is just one of many researchers exploring what the tools can do. That trend is likely to grow, Thorpe says: as computers get faster and models become more efficient, people will increasingly have AIs running on their laptops or mobile devices for all but the most intensive needs. Scientists will finally have AI assistants at their fingertips -- the actual algorithms, not just remote access to them.
Big things in small packages

Several large technology firms and research institutes have released small and open-weights models over the past few years, including Google DeepMind in London; Meta in Menlo Park, California; and the Allen Institute for Artificial Intelligence in Seattle, Washington (see 'Some small open-weights models'). ('Small' is relative -- these models can contain some 30 billion parameters, which is large by comparison with earlier models.)

Some small open-weights models

Developer                   Model                         Parameters
Allen Institute for AI      OLMo-7B                       7 billion
Alibaba                     Qwen2-0.5B                    0.5 billion
Apple                       DCLM-Baseline-7B              7 billion
Google DeepMind             Gemma-2-9B                    9 billion
Google DeepMind             CodeGemma-7B                  7 billion
Meta                        Llama 3.1-8B                  8 billion
Microsoft                   Phi-3-medium-128K-Instruct    14 billion
Mistral AI                  Mistral-Nemo-Base-2407        12 billion

Although the California tech firm OpenAI hasn't open-weighted its current GPT models, its partner Microsoft in Redmond, Washington, has been on a spree, releasing the small language models Phi-1, Phi-1.5 and Phi-2 in 2023, then four versions of Phi-3 and three versions of Phi-3.5 this year. The Phi-3 and Phi-3.5 models have between 3.8 billion and 14 billion active parameters, and two models (Phi-3-vision and Phi-3.5-vision) handle images^1. By some benchmarks, even the smallest Phi model outperforms OpenAI's GPT-3.5 Turbo from 2023, rumoured to have 20 billion parameters.

Sébastien Bubeck, Microsoft's vice-president for generative AI, attributes Phi-3's performance to its training data set. LLMs initially train by predicting the next 'token' (iota of text) in long text strings. To predict the name of the killer at the end of a murder mystery, for instance, an AI needs to 'understand' everything that came before, but such consequential predictions are rare in most text. To get around this problem, Microsoft used LLMs to write millions of short stories and textbooks in which one thing builds on another. The result of training on this text, Bubeck says, is a model that fits on a mobile phone but has the power of the initial 2022 version of ChatGPT. "If you are able to craft a data set that is very rich in those reasoning tokens, then the signal will be much richer," he says.

Phi-3 can also help with routing -- deciding whether a query should go to a larger model. "That's a place where Phi-3 is going to shine," Bubeck says.
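Routing of this kind is simple to prototype. The sketch below asks a small local model to grade a query's difficulty and escalates hard queries to a larger model, with both served through Ollama (a tool described under 'Access points' below). The model names, prompt and escalation rule are illustrative assumptions, not a recipe from Microsoft.

```python
# A minimal routing sketch: assumes a local Ollama server on its default
# port, with the 'phi3' and 'llama3.1:70b' models already pulled.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(model: str, prompt: str) -> str:
    """Send one prompt to a locally served model and return its reply."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

def route(query: str) -> str:
    """Let the small model decide whether the big model is needed."""
    verdict = ask(
        "phi3",
        "Reply with one word, SIMPLE or COMPLEX: how hard is this query?\n"
        + query,
    )
    # Hypothetical policy: escalate anything the small model calls complex.
    model = "llama3.1:70b" if "COMPLEX" in verdict.upper() else "phi3"
    return ask(model, query)

print(route("What is the capital of France?"))
```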
Small models can also help scientists in remote regions that have little cloud connectivity. "Here in the Pacific Northwest, we have amazing places to hike, and sometimes I just don't have network," Bubeck says. "And maybe I want to take a picture of some flower and ask my AI some information about it."

Researchers can build on these tools to create custom applications. The Chinese e-commerce firm Alibaba, for instance, has built models called Qwen with 500 million to 72 billion parameters. A biomedical scientist in New Hampshire fine-tuned the largest Qwen model on scientific data to create Turbcat-72b, which is available on the model-sharing site Hugging Face. (The researcher goes only by the name Kal'tsit on the Discord messaging platform, because AI-assisted work in science is still controversial.) Kal'tsit says she created the model to help researchers to brainstorm, proofread manuscripts, prototype code and summarize published papers; it has been downloaded thousands of times.

Preserving privacy

Beyond the ability to fine-tune open models for focused applications, Kal'tsit says, another advantage of local models is privacy. Sending personally identifiable data to a commercial service could run foul of data-protection regulations. "If an audit were to happen and you show them you're using ChatGPT, the situation could become pretty nasty," she says.

Cyril Zakka, a physician who leads the health team at Hugging Face, uses local models to generate training data for other models (which are sometimes local, too). In one project, he uses them to extract diagnoses from medical reports so that another model can learn to predict those diagnoses on the basis of echocardiograms, which are used to monitor heart disease. In another, he uses the models to generate questions and answers from medical textbooks to test other models. "We are paving the way towards fully autonomous surgery," he explains; a robot trained to answer questions would be able to communicate better with doctors.

Zakka uses local models -- he prefers Mistral 7B, released by the tech firm Mistral AI in Paris, or Meta's Llama-3 70B -- because they're cheaper than subscription services such as ChatGPT Plus, and because he can fine-tune them. But privacy is also key, because he's not allowed to send patients' medical records to commercial AI services.

Johnson Thomas, an endocrinologist at the health system Mercy in Springfield, Missouri, is likewise motivated by patient privacy. Clinicians rarely have time to transcribe and summarize patient interviews, but most commercial services that use AI to do so are either too expensive or not approved to handle private medical data. So Thomas is developing an alternative. Based on Whisper -- an open-weights speech-recognition model from OpenAI -- and on Gemma 2 from Google DeepMind, the system will allow physicians to transcribe conversations and convert them to medical notes, and also to summarize data from medical-research participants.
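The core of such a system -- transcribe locally, then summarize locally -- can be sketched in a few lines with OpenAI's open-source whisper package and a locally served model. The file name, Whisper model size and prompt below are illustrative assumptions, not details of Thomas's tool.

```python
# Local transcribe-then-summarize sketch: assumes 'pip install
# openai-whisper requests' plus a running Ollama server with a Gemma 2
# model pulled. The audio file name is hypothetical.
import requests
import whisper

# 1. Transcribe the recording entirely on the local machine.
stt = whisper.load_model("small")  # larger Whisper sizes are more accurate
transcript = stt.transcribe("consultation.wav")["text"]

# 2. Summarize with a local LLM, so no patient data leaves the machine.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma2",
        "prompt": "Summarize this clinical conversation as concise "
        "medical notes:\n" + transcript,
        "stream": False,
    },
    timeout=600,
)
print(resp.json()["response"])
```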
Privacy is also a consideration in industry. CELLama, developed at the South Korean pharmaceutical company Portrai in Seoul, exploits local LLMs such as Llama 3.1 to reduce information about a cell's gene expression and other characteristics to a summary sentence^2. It then creates a numerical representation of this sentence, which can be used to cluster cells into types. The developers highlight privacy as one advantage on their GitHub page, noting that CELLama "operates locally, ensuring no data leaks".
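The underlying pattern -- turn structured measurements into a sentence, embed the sentence, cluster the embeddings -- is easy to illustrate. The schematic sketch below uses the sentence-transformers and scikit-learn libraries with invented marker genes; it shows the general technique, not CELLama's actual code.

```python
# Schematic of sentence-based cell clustering: assumes
# 'pip install sentence-transformers scikit-learn'.
# The cells and marker genes below are invented for illustration.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# 1. Each cell's expression profile, reduced to a summary sentence.
cells = [
    "This cell highly expresses CD3D and CD3E, with low CD19.",
    "This cell highly expresses CD19 and MS4A1, with low CD3D.",
    "This cell highly expresses CD3E and IL7R, with low MS4A1.",
    "This cell highly expresses MS4A1 and CD79A, with low CD3E.",
]

# 2. Embed each sentence as a numerical vector, computed locally.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = embedder.encode(cells)

# 3. Cluster the vectors into putative cell types.
labels = KMeans(n_clusters=2, n_init="auto").fit_predict(vectors)
print(labels)  # e.g. [0 1 0 1]: a T-cell-like and a B-cell-like group
```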
Putting models to good use

As the LLM landscape evolves, scientists face a fast-changing menu of options. "I'm still at the tinkering, playing stage of using LLMs locally," Thorpe says. He tried ChatGPT, but felt it was expensive and the tone of its output wasn't right. Now he uses Llama locally, with either 8 billion or 70 billion parameters, both of which can run on his Mac laptop.

Another benefit, Thorpe says, is that local models don't change. Commercial developers, by contrast, can update their models at any moment, leading to different outputs and forcing Thorpe to alter his prompts or templates. "In most of science, you want things that are reproducible," he explains. "And it's always a worry if you're not in control of the reproducibility of what you're generating."

For another project, Thorpe is writing code that aligns MHC molecules on the basis of their 3D structure. To develop and test his algorithms, he needs lots of diverse proteins -- more than exist naturally. To design plausible new proteins, he uses ProtGPT2, an open-weights model with 738 million parameters that was trained on about 50 million sequences^3.

Sometimes, however, a local app won't do. For coding, Thorpe uses the cloud-based GitHub Copilot as a partner. "It kind of feels like my arm's chopped off when for some reason I can't actually use Copilot," he says. Local LLM-based coding tools do exist (such as Google DeepMind's CodeGemma, and Continue, from a California-based developer), but in his experience they can't compete with Copilot.

Access points

So, how do you run a local LLM? Software called Ollama (available for the Mac, Windows and Linux operating systems) lets users download open models, including Llama 3.1, Phi-3, Mistral and Gemma 2, and access them through a command line. Other options include the cross-platform app GPT4All, and Llamafile, which can transform LLMs into a single file that runs on any of six operating systems, with or without a graphics processing unit.
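As a concrete starting point, here is a minimal sketch of an Ollama session driven from Python, assuming Ollama is installed and its official Python client is available ('pip install ollama'); the model choice and prompt are arbitrary.

```python
# Minimal local chat via Ollama's Python client. At a terminal,
# 'ollama pull llama3.1' first fetches the 8-billion-parameter weights.
import ollama

response = ollama.chat(
    model="llama3.1",
    messages=[
        {
            "role": "user",
            "content": "In two sentences, what is an MHC molecule?",
        }
    ],
)
print(response["message"]["content"])
```

The client also wraps Ollama's other operations, such as pulling models and computing embeddings, so a whole pipeline can stay in one script.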
Sharon Machlis, a former editor at the website InfoWorld who lives in Framingham, Massachusetts, wrote a guide to using LLMs locally, covering a dozen options. "The first thing I would suggest," she says, "is to have the software you choose fit your level of how much you want to fiddle." Some people prefer the ease of apps, whereas others prefer the flexibility of the command line.

Whichever approach you choose, local LLMs should soon be good enough for most applications, says Stephen Hood, who heads open-source AI at the tech firm Mozilla in San Francisco. "The rate of progress on those over the past year has been astounding," he says.

As for what those applications might be, that's for users to decide. "Don't be afraid to get your hands dirty," Zakka says. "You might be pleasantly surprised by the results."

Nature 633, 728-729 (2024)
doi: https://doi.org/10.1038/d41586-024-02998-y

References

1. Abdin, M. et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2404.14219 (2024).
2. Choi, H. et al. Preprint at bioRxiv https://doi.org/10.1101/2024.05.08.593094 (2024).
3. Ferruz, N. et al. Nature Commun. 13, 4348 (2022).