https://github.com/Ads-cmu/WhatsApp-Llama
# WhatsApp-Llama: Fine-tune Llama 7b to Mimic Your WhatsApp Style

This repository is a fork of facebookresearch/llama-recipes, adapted to fine-tune a Llama 7b chat model to replicate your personal WhatsApp texting style. By simply inputting your WhatsApp conversations, you can train the LLM to respond just like you do! Llama 7B Chat is fine-tuned using parameter-efficient fine-tuning (QLoRA) with int4 quantization on a single GPU (a P100 with 16 GB of GPU memory).

## My Results

1. Quick Learning: The fine-tuned Llama model picked up on my texting nuances rapidly.
   + On average, the fine-tuned model generates 300% more words per reply than vanilla Llama. I usually type longer replies, so this checks out.
   + The model accurately replicated common phrases I use and my emoji usage.
2. Turing Test with Friends: As an experiment, I asked my friends to ask me 3 questions on WhatsApp, and I responded to each with 2 candidate replies (one written by me and one generated by the LLM). My friends then had to guess which reply was mine and which was Llama's. The result? The model fooled 10% (2/20) of my friends, and some of its responses were eerily similar to my own. Here are some examples:

* Example 1: image
* Example 2: image

I believe that with access to more compute, this number could be pushed to ~40% (close to the 50% expected from random guessing).

## Getting Started

Here's a step-by-step guide to setting up this repository and creating your own customized dataset.

### 1. Exporting WhatsApp Chats

Details on how to export your WhatsApp chats can be found here. I exported 10 WhatsApp chats from friends I speak to often. Be sure to exclude media while exporting. Each chat was saved as `Chat.txt`.
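For context, the snippet below is a minimal sketch of how an exported chat file can be parsed into sender/message pairs; it is illustrative only and not the repository's `preprocessing.py`, which handles this conversion in the next step. The regular expression assumes the common Android export format `DD/MM/YY, HH:MM - Name: message`; iOS exports and other locales use slightly different timestamps, so treat the pattern as an assumption to adapt.

```python
import re

# Assumed Android-style export lines: "12/09/23, 10:15 - Name: message".
# Adjust the pattern if your export uses a different timestamp format.
LINE_PATTERN = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}) - ([^:]+): (.*)$")

def parse_chat(path):
    """Parse an exported Chat.txt into a list of (sender, message) tuples."""
    messages = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            match = LINE_PATTERN.match(line.rstrip("\n"))
            if match:
                _, _, sender, text = match.groups()
                messages.append((sender, text))
            elif messages:
                # Lines without a timestamp are continuations of the previous message.
                sender, text = messages[-1]
                messages[-1] = (sender, text + "\n" + line.strip())
    return messages
```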
### 2. Preprocessing the Dataset

Complete the steps below to convert the exported chats into a format suitable for training.

#### Convert text files to JSON

Run `python preprocessing.py` with the following arguments:

1. `your_name` refers to your name (Llama will learn this name)
2. `your_contact_name` refers to how you've saved your own number on your phone
3. `friend_name` refers to the name of your friend (Llama will learn this name)
4. `friend_contact_name` refers to the name under which you've saved your friend's contact
5. `folder_path` should be the path where you've stored your WhatsApp chats

You'll need to run this command once for every friend's chat you've exported.

#### Convert JSON files to CSV

Once you're done converting all texts to JSON, run `python prepare_dataset.py` to create the dataset:

1. `dataset_folder` refers to the folder containing your JSON files
2. `your_name` refers to your name (Llama will learn this name)
3. `save_file` is the file path of the final CSV

### 3. Validating the Dataset

Here's the expected format for the preprocessed dataset:

| ID | Context    | Reply      |
| -- | ---------- | ---------- |
| 1  | You: Hi    | What's up? |
|    | Friend: Hi |            |

Ensure your dataset looks like the above to verify the preprocessing worked correctly.

### 4. Model Configuration

Once you're done with the above steps, run `WhatsApp_Finetune.ipynb`:

* If you're using a P100 GPU, load the model in 4 bits.
* If you're using an A100 GPU, you can load the model in 8 bits.

(An illustrative sketch of the 4-bit LoRA setup appears at the end of this README.)

PEFT adds around 4.6M trainable parameters, roughly 0.07% of the 7B base model's weights. Additionally, you'll need to make the following 2 changes to `ft_datasets/whatsapp_dataset.py`:

1. Update the prompt to one of your choosing (line 8)
2. Update the file path of your dataset in the `datasets.load_dataset()` call (line 5)

### 5. Training Time

For reference, a 10 MB dataset completes 1 epoch in approximately 7 hours on a P100 GPU. The results shared above were achieved after training for just 1 epoch.

## Conclusion

This adaptation of the Llama model offers a fun way to see how well an LLM can mimic your personal texting style. Remember to use AI responsibly and inform your friends if you're using the model to chat with them!
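For reference, the snippet below is a minimal sketch of the 4-bit QLoRA setup described in step 4, assuming the Hugging Face transformers, peft, and bitsandbytes libraries. It is illustrative only and not the exact code from `WhatsApp_Finetune.ipynb`; the checkpoint name, LoRA rank, and target modules are assumptions that may differ from the notebook.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed checkpoint name

# 4-bit quantization, which fits a 16 GB P100; on an A100 you could load in 8 bits instead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# LoRA adapter: only a few million trainable parameters on top of the frozen base model.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # roughly in line with the ~4.6M figure above
```

On an A100 you could replace the 4-bit config with `BitsAndBytesConfig(load_in_8bit=True)`, as noted in step 4; the LoRA adapter itself stays the same either way.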