# Petals

Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading.

[petals.ml](https://petals.ml) · [GitHub](https://github.com/bigscience-workshop/petals)

Generate text using distributed BLOOM and fine-tune it for your own tasks:

```python
import torch
from torch.nn.functional import cross_entropy
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"
tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME, tuning_mode="ptune", pre_seq_len=16)
# Embeddings & prompts are on your device, BLOOM blocks are distributed across the Internet

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))  # A cat sat on a mat...

# Fine-tuning (updates only prompts or adapters hosted locally)
optimizer = torch.optim.AdamW(model.parameters())
for input_ids, labels in data_loader:  # data_loader yields (input_ids, labels) batches for your task
    outputs = model.forward(input_ids)
    loss = cross_entropy(outputs.logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Try now in Colab.
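Each `generate()` call above runs one or more distributed inference steps under the hood. As a rough illustration, here is a naive loop (built only on `generate`, with `tokenizer` and `model` from the snippet above) that requests a single token at a time. The real client is much faster because servers keep attention caches between steps instead of recomputing the growing prefix:

```python
# Naive step-by-step decoding; for illustration only. The real client reuses
# server-side attention caches rather than resending the whole prefix each step.
import torch

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
for _ in range(5):
    step = model.generate(inputs, max_new_tokens=1)  # one distributed step
    inputs = torch.cat([inputs, step[:, -1:]], dim=1)
print(tokenizer.decode(inputs[0]))
```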
Connect your own GPU and increase Petals capacity:

```bash
# In an Anaconda env
conda install pytorch cudatoolkit=11.3 -c pytorch
pip install -U petals
python -m petals.cli.run_server bigscience/bloom-petals

# Or using our GPU-enabled Docker image
sudo docker run --net host --ipc host --gpus all --volume petals-cache:/cache --rm \
    learningathome/petals:main python -m petals.cli.run_server bigscience/bloom-petals
```

If you have any issues or feedback, please join our Discord server!

Check out more examples and tutorials:

* Chatbot web app: link, source code
* Training a personified chatbot: notebook
* Fine-tuning BLOOM for text semantic classification: notebook
* Launching your own swarm: tutorial
* Running a custom foundation model: tutorial

## How does it work?

* Petals runs large language models like BLOOM-176B collaboratively -- you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning.
* Inference runs at ≈ 1 sec per step (token) -- 10x faster than possible with offloading, enough for chatbots and other interactive apps. Parallel inference reaches hundreds of tokens/sec.
* Beyond classic language model APIs -- you can employ any fine-tuning and sampling methods by executing custom paths through the model or accessing its hidden states (see the sketch at the end of this section). You get the comforts of an API with the flexibility of PyTorch.

![](https://i.imgur.com/RTYF3yW.png)

Read paper
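To make the last point concrete, here is a minimal sketch of reading per-token hidden states instead of calling `generate()`. It assumes the distributed model exposes its BLOOM backbone as `model.transformer`, following the usual Hugging Face causal-LM layout; the attribute name is an assumption, not a documented guarantee:

```python
# Minimal sketch: pull hidden states out of the distributed model.
# Assumption: the BLOOM backbone is reachable as `model.transformer`.
import torch
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"
tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
with torch.no_grad():
    hidden_states = model.transformer(inputs).last_hidden_state

# One vector per input token, computed by servers across the swarm; you can
# feed these into your own classification head, adapter, or custom sampler.
print(hidden_states.shape)  # (batch_size, sequence_length, hidden_size)
```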
## Privacy and security

The Petals public swarm is designed for research and academic use. Please do not use the public swarm to process sensitive data. We ask this because it is an open network, and it is technically possible for peers serving model layers to recover input data and model outputs, or to modify them in a malicious way. Instead, you can set up a private Petals swarm hosted by people and organizations you trust, who are authorized to process your data. We discuss privacy and security in more detail here.

## Model's terms of use

Before building your own application that runs a language model with Petals, please check out the model's terms of use, risks, and limitations. In the case of BLOOM, they are described in its model card and license.

## FAQ

1. **What's the motivation for people to host model layers in the public swarm?**

   People who run inference and fine-tuning themselves get a certain speedup if they host a part of the model locally. Some may also be motivated to "give back" to the community by helping others run the model (similarly to how BitTorrent users help others by sharing data they have already downloaded).

   Since this may not be enough for everyone, we are also working on introducing explicit incentives ("bloom points") for people donating their GPU time to the public swarm. Once this system is ready, people who earned these points will be able to spend them on inference/fine-tuning with higher priority or increased security guarantees, or (maybe) exchange them for other rewards.

2. **Why is the platform named "Petals"?**

   "Petals" is a metaphor for people serving different parts of the model. Together, they host the entire language model -- BLOOM.

   While our platform focuses on BLOOM now, we aim to support more foundation models in the future.

## Installation

Here's how to install Petals with conda:

```bash
conda install pytorch cudatoolkit=11.3 -c pytorch
pip install -U petals
```

This script uses Anaconda to install CUDA-enabled PyTorch. If you don't have Anaconda, you can get it from here. If you don't want Anaconda, you can install PyTorch any other way. If you want to run models with 8-bit weights, please install PyTorch with CUDA 11 or newer for compatibility with bitsandbytes.

System requirements: Petals only supports Linux for now. If you don't have a Linux machine, consider running Petals in Docker (see our image) or, in the case of Windows, in WSL2 (read more). A CPU is enough to run a client, but you probably need a GPU to run a server efficiently.

## Development

Petals uses pytest with a few plugins. To install them, run:

```bash
conda install pytorch cudatoolkit=11.3 -c pytorch
git clone https://github.com/bigscience-workshop/petals.git && cd petals
pip install -e .[dev]
```

To run minimal tests, you need to make a local swarm with a small model and a few servers. You may find more information about how local swarms work and how to run them in this tutorial. (A client-side smoke test for such a swarm is sketched at the end of this section.)

```bash
export MODEL_NAME=bloom-testing/test-bloomd-560m-main
python -m petals.cli.run_server $MODEL_NAME --block_indices 0:12 \
    --identity tests/test.id --host_maddrs /ip4/127.0.0.1/tcp/31337 --new_swarm &> server1.log &
sleep 5  # wait for the first server to initialize DHT
python -m petals.cli.run_server $MODEL_NAME --block_indices 12:24 \
    --initial_peers SEE_THE_OUTPUT_OF_THE_1ST_PEER &> server2.log &
tail -f server1.log server2.log  # view logs for both servers
```

Then launch pytest:

```bash
export MODEL_NAME=bloom-testing/test-bloomd-560m-main REF_NAME=bigscience/bloom-560m
export INITIAL_PEERS=/ip4/127.0.0.1/tcp/31337/p2p/QmS9KwZptnVdB9FFV7uGgaTq4sEKBwcYeKZDfSpyKDUd1g
PYTHONPATH=. pytest tests --durations=0 --durations-min=1.0 -v
```

After you're done, you can terminate the servers and ensure that no zombie processes are left with `pkill -f petals.cli.run_server && pkill -f p2p`.

The automated tests use a more complex server configuration that can be found here.
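Beyond pytest, you can sanity-check a local swarm by pointing a client straight at it. A minimal sketch, assuming `from_pretrained` accepts an `initial_peers` argument that overrides the default public bootstrap peers (see the local-swarm tutorial above for the exact interface):

```python
# Minimal smoke test for the local swarm launched above.
# Assumption: `initial_peers` can be passed through `from_pretrained`.
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bloom-testing/test-bloomd-560m-main"
INITIAL_PEERS = ["/ip4/127.0.0.1/tcp/31337/p2p/QmS9KwZptnVdB9FFV7uGgaTq4sEKBwcYeKZDfSpyKDUd1g"]

tokenizer = BloomTokenizerFast.from_pretrained("bigscience/bloom-560m")  # matching tokenizer
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME, initial_peers=INITIAL_PEERS)

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
print(tokenizer.decode(model.generate(inputs, max_new_tokens=3)[0]))
```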
## Code style

We use black and isort for all pull requests. Before committing your code, simply run `black . && isort .` and you will be fine.

---

This project is a part of the BigScience research workshop.