[HN Gopher] Show HN: WhatsApp-Llama: A clone of yourself from yo...
       ___________________________________________________________________
        
       Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp
       conversations
        
        Hello HN! I've been thinking about the idea of an LLM that's a
        clone of me - instead of generating replies to be a helpful
        assistant, it generates replies that are exactly like mine. The
        concept has appeared in fiction numerous times (the talking
        paintings in Harry Potter that mimic the person painted, the
        clones in The Prestige), and I think with LLMs, there might
        actually be a possibility of us doing something like this!

        I've just released a fork of facebookresearch/llama-recipes
        which allows you to fine-tune a Llama model on your personal
        WhatsApp conversations. This adaptation can train the model
        (using QLoRA) to respond in a way that's eerily similar to your
        own texting style. What I've figured out so far:

        Quick Learning: The model quickly adapts to personal nuances,
        emoji usage, and phrases that you use. I've trained just 1 epoch
        on a P100 GPU using QLoRA and 4-bit quantization, and it's
        already captured my mannerisms.

        Turing Tests: As an experiment, I asked my friends to ask me 3
        questions, and responded with 2 candidate responses (one from me
        and one from Llama). My friends then had to guess which
        candidate response was mine and which one was Llama's. Llama
        managed to fool 10% of my friends, but with more compute, I
        think it can do way better.

        Here's the GitHub repository:
        https://github.com/Ads-cmu/WhatsApp-Llama/

        Would love to hear feedback, suggestions, and any cool
        experiences if you decide to give it a try! I'd love to see how
        far we can push this by training bigger models for more epochs
        (I ran out of compute credits).
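        The data prep step this implies - turning a raw WhatsApp export
        into prompt/response pairs for fine-tuning - could be sketched
        roughly as below. This is a hypothetical illustration, not the
        repo's actual code: the regex assumes the common
        "DD/MM/YY, H:MM pm - Name: message" export layout (which varies
        by locale), and the pair format is just one plausible choice.

```python
import re

# Matches lines like "12/09/23, 5:43 pm - Advaith: hey, what's up?"
# WhatsApp export formats vary by locale; adjust the regex if needed.
LINE_RE = re.compile(
    r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?: [ap]m)? - ([^:]+): (.*)$"
)

def parse_export(text):
    """Turn a raw WhatsApp .txt export into (sender, message) tuples,
    folding continuation lines into the previous message."""
    messages = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            messages.append((m.group(1), m.group(2)))
        elif messages:  # multi-line message: append to the last one
            sender, msg = messages[-1]
            messages[-1] = (sender, msg + "\n" + line)
    return messages

def to_pairs(messages, me):
    """Build training pairs: the other person's most recent run of
    messages becomes the prompt, your next reply the response."""
    pairs, context = [], []
    for sender, msg in messages:
        if sender == me:
            if context:
                pairs.append({"prompt": "\n".join(context),
                              "response": msg})
            context = []
        else:
            context.append(msg)
    return pairs
```

        Each pair could then be serialized to JSONL and plugged into a
        llama-recipes-style custom dataset loader.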
        
       Author : advaith08
       Score  : 73 points
       Date   : 2023-09-09 17:43 UTC (5 hours ago)
        
        
       | jzemeocala wrote:
       | Awesome work, I've had the idea for a while of setting up a
       | pipeline like this that could take input from all available
       | sources of the person to clone their voice and image as well as
       | dialogue.
       | 
       | The intent being to create digital avatars of lost loved ones to
       | help people with the grieving process.
       | 
       | I know that there would be tremendous opportunity in such tech
       | for malicious actors to do serious harm, but the stated goal is
       | still a worthwhile endeavor.
        
         | advaith08 wrote:
         | Yeah, this is definitely a dicey ethical question. Would be
         | interested to know what guardrails you're considering for these
         | digital avatars, and how you'll ensure that people use them in
          | a healthy manner and don't get dependent on them.
        
       | codetrotter wrote:
       | > the talking paintings in Harry Potter that mimic the person
       | painted
       | 
        | I remember the moving photos in the newspaper mimicking the
        | person.
       | 
       | But I thought the talking paintings were ghosts living in the
       | paintings or something.
        
       | cypress66 wrote:
        | Llama 7B is quite dumb. Using the 13B you'd get significantly
        | better results, and you can train a QLoRA on a single 3090 (I
        | think even less is possible, but I'm not sure).
        
         | [deleted]
        
         | advaith08 wrote:
          | Oh yeah, definitely. Do you know how I can get access to one
          | for
         | cheap though? I burnt through $150 just on this exercise with a
         | P100 on GCP
        
       | SubiculumCode wrote:
        | So Discord, Google and FB chats can pretty much do this
        | too... should have been obvious by now.
        
       | andai wrote:
        | A few years ago I did the same thing with GPT-2 on my friend's
        | and my WhatsApp conversation history.
       | 
       | So it would simulate conversations between us.
       | 
       | The result was hilarious yet at times uncomfortably accurate...
       | like looking into a mirror...
        
         | advaith08 wrote:
          | Wow, that's cool! Do you have the code put out somewhere?
        
       | f0e4c2f7 wrote:
       | Cool idea. One more fictional example:
       | 
       | https://www.youtube.com/watch?v=IWIusSdn1e4
        
       | gojomo wrote:
       | Good idea!
       | 
       | I expect there will be profitable businesses based on training
       | LLMs to simulate eminent people & celebrities - on both their
       | public utterances _and_ their private correspondence - then
       | charging for access to the best models.
        
         | [deleted]
        
         | advaith08 wrote:
          | Yeah, I think there are a lot of use cases for an assistant
         | trained on your chat history. Given how privacy sensitive this
         | use case is, I think maybe Apple is the best suited to build
         | something like this? Hope they come out with something cool
        
       | dools wrote:
       | > The concept's appeared in fiction numerous times (the talking
       | paintings in Harry Potter that mimic the person painted, the
       | clones in The Prestige)
       | 
       | How is your most notable example not when Gilfoyle does exactly
       | this so he doesn't have to talk to Dinesh in Silicon Valley??
        
         | advaith08 wrote:
          | Hahaha a friend told me this but I haven't watched SV. Will
          | do
         | so immediately
        
       | BasedAnon wrote:
       | i am screaming in horror on the inside
        
       | rosslazer wrote:
       | Nice! I did something similar with GPT 3.5 and slack
       | https://rosslazer.com/posts/fine-tuning/
        
         | advaith08 wrote:
         | Very cool. I like the introspection bit, I've realised quite a
         | bit about my texting style from talking to Llama too. I think
          | I'm also very "type first and think later" on WhatsApp
        
       | RockstarSprain wrote:
       | Very interesting! I'm wondering if anyone attempted something
       | similar in Telegram though.
        
       | brap wrote:
       | I wonder if my clone will also respond 3 days later as if nothing
       | happened
        
       | olvy0 wrote:
       | I was immediately reminded of this black mirror episode:
       | 
       | https://en.m.wikipedia.org/wiki/Be_Right_Back
        
       | alt-glitch wrote:
       | Super cool! I had a similar idea where I wanted to create such
       | clones of some of my friends (with consent ofc) and see how well
       | they know me. To extend your clone even more, you can also throw
        | in every piece of digital text you have into this, e.g.
        | emails,
       | notes, essays, blogs etc. I'm super down to work on LLM clones
       | like these!
       | 
       | edit: I actually started a little work on this. If you wanna
       | export more messages than the limited 40k, you can use [0]. I did
       | and I have every text I've ever sent since I had WhatsApp.
       | 
       | [0]: https://github.com/YuvrajRaghuvanshiS/WhatsApp-Key-
       | Database-...
        
         | advaith08 wrote:
          | Thank you, I'll check it out!
         | 
         | Yeah, this can be extended to create a "simulation game" of us
         | and our friends. This paper (Interactive Simulacra of Human
         | Behaviour https://arxiv.org/abs/2304.03442 ) has a setup on how
         | we could create a Sims game with us as the characters
        
       | jmkni wrote:
       | This is cool, although I'm guessing you need to input your
       | conversation history manually? Or is there a way to export it
       | from WhatsApp?
        
         | advaith08 wrote:
          | Here's how you can export your chat history:
         | https://faq.whatsapp.com/1180414079177245/?cms_platform=andr...
        
         | porridgeraisin wrote:
         | WhatsApp can export last 40k messages in any chat.
        
       | oDot wrote:
        | We are very close to the point where AI tech can replicate
        | Harry Potter portraits.
        
         | 101008 wrote:
         | askdumbledore.org was done a few months ago in fact
        
       | porridgeraisin wrote:
       | Nice. I remember thinking of doing something like this when I was
       | much much more of a novice. I wrote a WhatsApp message parser and
       | thought of doing this with the parsed messages. Unfortunately I
       | knew too little back then, and Llama didn't exist either. Cool to
       | see it!
        
       | tloriato wrote:
       | You said that the model fooled your friends 10% of the time.
       | 
        | I wonder how well ChatGPT or Llama 2 would do given just the
        | last 5 messages of each chat, asked to generate the next reply
        | pretending to be you...
       | 
       | Somehow I don't think it would be worse?
        
         | advaith08 wrote:
         | Yeah I wondered if few shot prompting would yield better
         | results than finetuning. For the amount of finetuning I've done
         | (1 epoch, 7B model with 4 bit quantization), I think it might
         | be comparable. But if we scale this to a bigger model and
         | longer training times, I think finetuning should produce much
         | better results. Hoping someone with access to compute will try
         | it out and update us!
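          The few-shot baseline suggested here is easy to try. A minimal
          sketch of a prompt builder (hypothetical - any chat-completion
          API could consume the resulting string; the model call itself
          is omitted):

```python
def build_few_shot_prompt(history, me, them, k=5):
    """Build a prompt from the last k messages of a chat, asking a
    model to write the next reply in `me`'s style.

    `history` is a list of (sender, message) tuples in chronological
    order, e.g. as parsed from a WhatsApp export.
    """
    transcript = "\n".join(f"{s}: {m}" for s, m in history[-k:])
    return (
        f"Below is a WhatsApp conversation between {me} and {them}. "
        f"Write the next message from {me}, matching {me}'s texting "
        f"style exactly.\n\n{transcript}\n{me}:"
    )
```

          Running the same blind guessing test on these completions and
          on the fine-tuned model's outputs would give the direct
          comparison being discussed.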
        
       ___________________________________________________________________
       (page generated 2023-09-09 23:00 UTC)