[HN Gopher] How to Finetune GPT-Like Large Language Models on a ...
       ___________________________________________________________________
        
       How to Finetune GPT-Like Large Language Models on a Custom Dataset
        
       Author : T-A
       Score  : 380 points
       Date   : 2023-05-25 10:06 UTC (12 hours ago)
        
 (HTM) web link (lightning.ai)
 (TXT) w3m dump (lightning.ai)
        
       | sandGorgon wrote:
        | Has anyone here used EasyLM? It seems to be the most-used
        | framework behind the best finetuned models out there.
        
       | zhwu wrote:
        | It seems training Vicuna on a custom dataset could be quite
        | easy as well, according to the following:
       | https://github.com/skypilot-org/skypilot/tree/master/llm/vic...
        
       | quickthrower2 wrote:
       | When is fine tuning worth it, rather than just prompt
       | engineering?
        
         | messe wrote:
         | When you're starting to run into context limits.
        
         | tstrimple wrote:
          | From what I've seen, it's when embeddings get too large for the
          | token limit, or the embeddings drive the cost up too much
          | because you're always operating near the max token limit. In
          | those cases, it may be worth the up-front training cost and
          | slightly higher per-token cost to dramatically reduce the
          | number of tokens in the average request. If you're building a
          | higher-throughput solution, the difference in cost can be quite
          | large.
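          | 
          | To make the arithmetic concrete, here is a rough sketch (the
          | price and token counts below are hypothetical placeholders, not
          | actual rates):
          | 
          |     # Hypothetical cost comparison: stuffing retrieved context
          |     # into every prompt vs. fine-tuning once and sending short
          |     # prompts.
          |     PRICE_PER_1K_TOKENS = 0.002  # assumed price in USD
          |     REQUESTS = 1_000_000
          | 
          |     context_prompt = 3_500  # tokens: embedded context + question
          |     tuned_prompt = 300      # tokens: question only, the model
          |                             # already "knows" the domain
          | 
          |     cost_context = REQUESTS * context_prompt / 1000 * PRICE_PER_1K_TOKENS
          |     cost_tuned = REQUESTS * tuned_prompt / 1000 * PRICE_PER_1K_TOKENS
          | 
          |     print(f"context-stuffing: ${cost_context:,.0f}")  # $7,000
          |     print(f"fine-tuned:       ${cost_tuned:,.0f}")    # $600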
        
         | snovv_crash wrote:
          | If you want to teach it, e.g., all of the text in your private
          | training manuals and internal documentation, which wouldn't fit
          | in the input token limit.
        
         | oddthink wrote:
         | It's worth it whenever you have a reasonable amount of training
         | data. You can get substantial quality improvements
         | automatically. Unless you're doing some kind of prompt-
         | optimization, prompt-tuning is a lot of random guessing and
         | trial-and-error. It's also most necessary when you have a
         | smaller base model, as opposed to one of the big ones.
        
         | heliophobicdude wrote:
         | I think these are two very separate concepts.
         | 
          | What we are mostly seeing when it comes to fine-tuning is
          | making a model promptable. Models like LLaMA or the original
          | GPT3 weren't promptable; they were made promptable by fine-
          | tuning on demonstration data that looks like a prompt input
          | and prompt output.
         | 
         | See below: { "instruction": "What would be the output of the
         | following JavaScript snippet?", "input": "let area = 6 *
         | 5;\nlet radius = area / 3.14;", "output": "The output of the
         | JavaScript snippet is the radius, which is 1.91." }, [1]
         | 
          | Prompt engineering is really just carefully designing which
          | inputs work best on a prompt-ready model.
         | 
         | I highly recommend skimming this RLHF article and looking for
         | the parts where it talks about demonstration data [2]
         | 
         | 1:
         | https://github.com/sahil280114/codealpaca/blob/master/data/c...
         | 
         | 2: https://huyenchip.com/2023/05/02/rlhf.html
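          | 
          | As a rough sketch of how a record like the one above gets
          | rendered into a single training string (this follows the
          | Alpaca-style template; exact wording varies by project):
          | 
          |     def format_example(record: dict) -> str:
          |         """Render one instruction/input/output record as one
          |         training string, Alpaca-style."""
          |         if record.get("input"):
          |             return (
          |                 "Below is an instruction that describes a task, "
          |                 "paired with an input that provides further "
          |                 "context.\n\n"
          |                 f"### Instruction:\n{record['instruction']}\n\n"
          |                 f"### Input:\n{record['input']}\n\n"
          |                 f"### Response:\n{record['output']}"
          |             )
          |         return (
          |             "Below is an instruction that describes a task.\n\n"
          |             f"### Instruction:\n{record['instruction']}\n\n"
          |             f"### Response:\n{record['output']}"
          |         )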
        
           | baobabKoodaa wrote:
           | Prompt engineering and fine tuning are in many cases
           | alternative ways to achieve the same goal. You claim that the
           | "original GPT3" wasn't promptable. I'm unsure which version
           | you refer to, but I'm guessing you refer to text-davinci-003
           | and it was definitely promptable. For one app I used prompt
           | engineering to make it behave like a spirit talking through a
           | ouija board. For another, I used prompt engineering to make
           | it act like a dystopian search engine from the future. So,
           | yeah, it's promptable.
        
           | quickthrower2 wrote:
            | Thanks for link 2 - it is worth a proper read! I've read half
            | of it already and it is very interesting and useful for
            | understanding this.
        
             | heliophobicdude wrote:
             | Cheers!
        
       | nomagicbullet wrote:
        | Is there a Dreambooth equivalent for fine-tuning ChatGPT as
        | there is for Stable Diffusion? I have to imagine that if we can
        | add custom data to a DL text-to-image model, we should be able to
        | do the same with a text-to-text one.
        | 
        | Edit to add: There are a number of Google Colabs for fine-tuning
        | SD, and I wonder if there are equivalents (or if it is technically
        | feasible) to accomplish the same with other txt2txt models.
        
         | SparkyMcUnicorn wrote:
         | These aren't for ChatGPT, but work on LLaMA, Vicuna, etc.
         | 
         | https://github.com/oobabooga/text-generation-webui/blob/main...
         | 
         | https://github.com/zetavg/LLaMA-LoRA-Tuner
         | 
         | https://github.com/h2oai/h2o-llmstudio
         | 
         | https://github.com/rhulha/lora
        
         | a5huynh wrote:
         | If you're running the text-generation-webui
         | (https://github.com/oobabooga/text-generation-webui) it has the
         | ability to train LoRAs.
         | 
         | It'll require a beefy GPU but I've seen some fun examples like
         | someone training a LoRA on Skyrim books.
        
       | artembugara wrote:
       | Have a question to the Generative AI experts here.
       | 
        | So, I can use something like GPT-4 to label data and then use
        | that as a train set for my own LLM, right?
        | 
        | EDIT: adding this from the OpenAI TOS restrictions: "(iii) use
        | output from the Services to develop models that compete with
        | OpenAI;"
        
         | montenegrohugo wrote:
          | Yup, totally. This is a form of knowledge distillation. OpenAI,
          | or other foundation model providers, can't really do anything
          | about it.
        
           | cookieperson wrote:
           | Well they can sue you and bankrupt you by delaying trial for
           | a decade. That's how the US patent system works anyways...
        
             | sanxiyn wrote:
             | Sue on what grounds? It will be quickly dismissed.
        
         | foobarbecue wrote:
         | Is "ca" "can" or "can't"?
        
           | artembugara wrote:
           | can
        
         | wodenokoto wrote:
          | It is my understanding that this is how "alignment" works.
          | 
          | That is, OpenAI paid people to chat with their LLM to fine-tune
          | it, and then other LLMs use ChatGPT to generate training data to
          | align their models.
        
           | visarga wrote:
           | There are three ways
           | 
           | 1. make your own RLHF dataset - like OpenAI and Open
           | Assistant
           | 
           | 2. exfiltrate data from a bigger/better LLM - Vicuna & family
           | 
           | 3. use your pre-trained LLM to generate RLAIF data, no
           | leeching - ConstitutionalAI, based on a set of rules instead
           | of labelling examples
        
             | cubefox wrote:
             | I wonder whether these approaches fit into the above
             | categories:
             | 
             | https://arxiv.org/abs/2305.13735
             | 
             | https://arxiv.org/abs/2305.11206
        
         | notpublic wrote:
         | not an AI expert but from a talk I recently heard... if there
         | is a mismatch in training data between the "teacher" LLM and
         | "student" LLM, you risk teaching the student to hallucinate or
         | to ignore information
        
         | chaxor wrote:
         | Yes, and in fact that's the best method available if you want
         | good performance. I would suggest using a local open source
         | model to do this however, to cut down on costs and make it far
         | simpler to deal with than the unwieldy OpenAI systems.
         | 
         | https://arxiv.org/pdf/2305.02301.pdf
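          | 
          | A minimal sketch of that labeling loop (the model name and the
          | 2023-era openai client call are assumptions; adapt to whatever
          | teacher model you use):
          | 
          |     import json
          |     import openai
          | 
          |     unlabeled = ["Summarize: ...", "Classify sentiment: ..."]
          | 
          |     with open("distilled_train.jsonl", "w") as f:
          |         for prompt in unlabeled:
          |             resp = openai.ChatCompletion.create(
          |                 model="gpt-4",
          |                 messages=[{"role": "user", "content": prompt}],
          |             )
          |             label = resp["choices"][0]["message"]["content"]
          |             # Each line becomes one training example for the
          |             # student model.
          |             f.write(json.dumps({"input": prompt,
          |                                 "output": label}) + "\n")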
        
         | moffkalast wrote:
          | > I can use something like GPT-4 to label data and then use
          | that as a train set for my own LLM, right?
         | 
         | Yes, almost all improved LLama models are tuned exactly that
         | way (trained on examples of questions and answers from say GPT
         | 4). If OpenAI stole copyrighted works to train their models it
         | is morally fair game to do the same to them regardless of their
         | TOS. It's not like they can prove it anyway.
         | 
         | Plus there's the other point where they also say that
         | everything generated by their models is public domain, so which
         | one is it eh?
        
           | sirsinsalot wrote:
           | This ... but we all know business is corrupt.
           | 
            | The current attempts by OpenAI to spur on regulation are moat
            | building.
        
           | Fgehono wrote:
           | Because by training it they created something new.
           | 
           | I don't mind just making a point.
           | 
           | But I don't think they mind. I don't believe that this type
           | of model training is able to be bleeding edge which should
           | guarantee that openai has enough motivation to continue the
           | development and having a healthy competition
        
           | fnordpiglet wrote:
            | Use of copyrighted material in such a way that it's
            | aggregated into statistical properties is almost certainly
            | fair use. Use of the model to produce reproductions of
            | copyrighted material, then consuming or distributing them, is
            | almost certainly violating the copyright. But it would be the
            | facsimile of the material that's the violation, not the
            | abstract use of it to build an aggregate model.
        
             | tsunamifury wrote:
              | You understand these things have a very wide scope of
              | interpretation that has yet to be tested in court. I
              | wouldn't make these statements so confidently, as courts
              | tend to reinterpret the law significantly to balance
              | societal factors when serious technology changes occur.
        
               | fnordpiglet wrote:
                | This is true - afaik there have been no specific rulings
                | on whether training models on copyrighted material is a
                | violation. But to my mind it harkens back to stuff like
                | Xerox, where the tool itself isn't the violating thing,
                | it's the use of the tool. Likewise, derivative works are
                | often largely reproductions with minor variations and are
                | protected under fair use. A model takes enormous amounts
                | of data and distills it into a tiny vector
                | representation, way below the information-theoretic
                | levels for any meaningful fidelity, and mixes and
                | overlaps data such that the original data isn't plausibly
                | stored in the model. I'm definitely not going to wager my
                | life that that's fair use, but I would wager my company
                | on it.
        
               | tsunamifury wrote:
                | In the history of media law, I've seen judges lean into
                | whatever interpretation balances the ecosystem more than
                | what is "literally the law". The law is meant to serve
                | people, not the other way around. I hope judges will
                | understand that the contribution and theft can't just be
                | "haha fuck humanity, love OpenAI".
        
               | fnordpiglet wrote:
                | OK, what about the open source and research models? I
                | wouldn't wager much on OpenAI keeping a lead
                | indefinitely. Certainly not enough to establish case law
                | on what's a pretty new technology (at least in its
                | current use).
        
               | jjoonathan wrote:
               | Yes, laws are about politics and dispute resolution more
               | than reasoning or correctness. Focusing on the pure logic
               | is a trap for the computationally inclined.
        
               | itake wrote:
                | AI-generated work is not copyrightable. I guess the
                | courts could later disagree, though.
               | 
               | https://www.copyright.gov/ai/
        
               | belter wrote:
                | If the AI generates a new Eric Clapton album, with a
                | similar voice and guitar playing style?
        
               | itake wrote:
                | Your example doesn't have to be AI-generated. Human
                | cover bands play Song X in the style of Y all the time.
        
           | jrm4 wrote:
            | I'm a lawyer, so: one should never break the law.
            | 
            | Nonetheless, I can observe and predict that non-consensual
            | "open sourcing" of these models would likely end up being the
            | best and safest way to do all of this stuff.
        
           | sp332 wrote:
           | It's against the terms of service to do the generation, but
           | the generated text is not copyrighted. Those are different
           | things.
        
             | cameldrv wrote:
             | GPT-4 is trained on a large number of web pages, some of
             | which will have had their own terms of service.
        
               | svaha1728 wrote:
                | Not only web sites, but full books from Scribd and other
                | sources.
        
               | asah wrote:
                | See hiQ vs. LinkedIn (which hiQ won), covering fair use
                | of logged-out web pages.
        
         | snickmy wrote:
          | Indeed, fine tuning with either synthetic data (as you are
          | proposing) or human review works like that. You can read more
          | here: https://huggingface.co/blog/rlhf
        
         | fallingmeat wrote:
         | That is against their ToS though if you use your new LLM
         | commercially.
        
           | artembugara wrote:
            | As far as I remember, I fully own all the rights to the
            | output of OpenAI (for example).
        
             | dingledork69 wrote:
              | I wonder how they reconcile naming themselves "Open"AI
              | with telling people that generated works can be used
              | however they please, except for training a potential
              | competitor.
        
           | ramesh1994 wrote:
            | It prohibits anything that competes with OpenAI services,
            | i.e. as long as you're not literally providing an LLM API
            | commercially, you should be fine.
        
             | bagels wrote:
             | Does it compete with them if you stop paying for their API?
        
               | [deleted]
        
           | vlovich123 wrote:
            | And yet they trained theirs on commercial content on the
            | internet. If that's legal, I doubt their argument holds up in
            | court, right?
        
             | dragonwriter wrote:
             | They trained on publicly-available (no signup with TOS
             | agreement) data, on the theory that training is fair use.
             | 
             | You signed up and agreed to their TOS to use GPT-4.
             | 
             | The legal situations are not similar.
             | 
             | OTOH, lots of people _are_ openly using GPT-4 in one way or
             | another to develop models, though they might generally be
             | at arm's length from people intending to sell services.
        
               | flangola7 wrote:
               | > They trained on publicly-available (no signup with TOS
               | agreement) data, on the theory that training is fair use.
               | 
               | They openly state they used thousands of books from a
               | pirate site as a training source. Go look up the datasets
               | listed in the GPT-3 paper.
        
               | snovv_crash wrote:
               | So set up a shell company that uses GPT4 to make public
               | domain examples of what RLHF data would look like, and
               | then the parent company takes that data afterwards since
               | it's public domain. Shell company didn't break TOS.
        
             | sanxiyn wrote:
             | Of course it will hold up in court, it's their service and
             | their terms of service.
        
           | pmoriarty wrote:
           | So what are they going to do about it?
        
             | jstummbillig wrote:
             | That escalated quickly.
        
             | fallingmeat wrote:
             | Great question! I don't know the end game there. Maybe if
             | they suspected their model was used they would sue, and in
             | discovery find you used their model for training?
        
               | visarga wrote:
               | Maybe we don't need to worry, OpenLLaMA is under training
               | right now. It will be the commercial version of LLaMA.
               | 
               | > Update 05/22/2023
               | 
               | > We are happy to release our 700B token checkpoint for
               | the OpenLLaMA 7B model and 600B token checkpoint for the
               | 3B model. We've also updated the evaluation results. We
               | expect the full 1T token training run to finish at the
               | end of this week.
               | 
               | https://github.com/openlm-research/open_llama
               | 
               | So we could develop on LLaMA for now and switch to
               | OpenLLaMA later.
        
             | dragonwriter wrote:
             | > So what are they going to do about it?
             | 
              | If they think they can prove you used it to develop a
              | competing service, sue you for breaking the TOS and recover
              | the greater of the harm it did to their business or the
              | amount of your profits from the service that are due to the
              | use of GPT-4 in violation of the agreement.
        
               | pmoriarty wrote:
               | Have companies managed to get awarded damages in lawsuits
               | against their customers who merely broke their terms of
               | service?
               | 
               | Is there existing case law here?
        
             | sanxiyn wrote:
             | They can terminate your account.
        
             | postsantum wrote:
             | MS lawyers have a good track record at sending out those
             | scary cease&desist letters
        
               | sanxiyn wrote:
                | I don't think that works. LLM-generated content is not
                | copyrightable.
        
               | dragonwriter wrote:
                | Breach of contract for violating the TOS agreed to when
                | signing up for the service doesn't depend on copyright.
        
               | aix1 wrote:
               | What I don't understand - is there anything that would
               | prevent Alice from publishing ChatGPT prompts and outputs
               | for anyone to use, with no T&C attached?
               | 
                | Once Alice has done that, is there anything to prevent
                | Bob, who has never agreed to the ChatGPT ToS, from using
                | those prompts and outputs to train his own models to
                | compete with OpenAI's?
               | 
               | (Purely from a contractual/legal/IP angle rather than
               | ML/technical.)
        
               | nightski wrote:
               | Right but cease and desist usually relates to
               | intellectual property or copyright matters, typically not
               | TOS violations. Please correct me if I am mistaken.
        
               | dragonwriter wrote:
               | Cease and desist can be used for any issues where the
               | person or entity issuing the C&D thinks they have a legal
               | right that is being violated and wants to put the
               | violator on notice in the hopes of securing a change in
               | behavior short of legal action.
        
               | [deleted]
        
               | pmoriarty wrote:
               | Is a terms of service considered a contract?
        
             | bottled_poe wrote:
             | Nothing until it's worth their while.
        
       | hospitalJail wrote:
       | Has anyone tried to use this?
       | 
        | The guide obviously didn't produce usable code, and the GitHub
        | repo looks nearly unrelated.
        | 
        | I'm somewhat surprised there isn't a parameter for 'input_data'
        | and 'output_data' that returns a trained model. I can't figure
        | out why there is so much boilerplate when that stuff could be
        | contained as parameters.
        
       | swalsh wrote:
       | How does this compare to fine tuning something like BERT?
        
         | theaniketmaurya wrote:
          | I would say it's similar, since the building block for both is
          | the transformer. In this blog post, the fine-tuning strategy
          | used is Adapter, which basically adds a small learnable layer
          | to each Transformer block.
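          | 
          | A minimal sketch of the idea (a bottleneck adapter inserted
          | after a transformer sub-layer; the dimensions are illustrative,
          | not the blog post's exact code):
          | 
          |     import torch
          |     import torch.nn as nn
          | 
          |     class Adapter(nn.Module):
          |         """Small bottleneck MLP added inside each transformer
          |         block. Only these weights are trained; the base model
          |         stays frozen."""
          |         def __init__(self, d_model: int = 768, bottleneck: int = 64):
          |             super().__init__()
          |             self.down = nn.Linear(d_model, bottleneck)
          |             self.up = nn.Linear(bottleneck, d_model)
          |             self.act = nn.GELU()
          | 
          |         def forward(self, x: torch.Tensor) -> torch.Tensor:
          |             # The residual keeps the frozen model's behavior
          |             # intact at initialization.
          |             return x + self.up(self.act(self.down(x)))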
        
       | jpe90 wrote:
       | Would it be feasible to fine-tune a large, capable model (like
       | the recent LIMA) on the source code (and maybe a few high quality
       | libraries) of a niche language, such that it's much better at
       | helping you write and understand it?
       | 
       | Imagine how many doors it would open if you could fine-tune
       | models capable of writing language bindings for you and keeping
       | them up to date.
        
         | tazjin wrote:
         | Totally. GPT-4 can already do this, untuned, on niche languages
         | and libraries. One of the main problems is still that you don't
         | know when it's hallucinating a function or whatever though.
        
       | Obscurity4340 wrote:
        | This looks like the Orion browser logo.
        
       | nico wrote:
       | What is the main difference between training and fine tuning?
       | 
       | Can you start with a model trained only in producing the letter
       | a, and then fine tune it to learn b, then c, then words,
       | sentences, etc?
        
         | swalsh wrote:
          | Not an expert, but my high-level understanding is this: if a
          | model is a set of inputs, some middle layers, and a set of
          | outputs, fine tuning concentrates on only the output layers.
         | 
         | Useful for taking a generic model with a base level of
         | knowledge, and tuning it so the output is more useful for an
         | application specific use case.
        
           | ajb117 wrote:
           | I think that's more in line with transfer learning, a variant
           | of fine-tuning. If I'm reading this article correctly,
           | they're fine-tuning the LMs end-to-end.
        
         | worldsayshi wrote:
          | Yeah, since fine tuning seems to be so much cheaper than
          | training, why hasn't OpenAI fine-tuned ChatGPT on data past
          | 2021?
        
           | heliophobicdude wrote:
            | One argument is that it can contaminate training data with
            | output of itself or other models.
            | 
            | We already have documented evidence of this effect. In the
            | GPT-4 technical report [1], they reported contamination of
            | HumanEval data in the training data.
            | 
            | They did measure against a "non-contaminated" subset, but no
            | idea if that can still be trusted.
            | 
            | Why would this matter? A model can post seemingly strong
            | benchmark scores despite contamination but measure poorly
            | against new and quarantined information. Classic overfitting.
            | 
            | Another argument is that data being put out there could very
            | much be wrong, and the amount of it amplified by other
            | models. Take a look at this sample of demonstration data for
            | codealpaca [2]. Not only is its output wrong, as the quick
            | check below shows, but bad practices, like making up a random
            | computation without access to a place to run the calculation,
            | teach the model that these types of responses are OK.
           | 
           | { "instruction": "What would be the output of the following
           | JavaScript snippet?", "input": "let area = 6 * 5;\nlet radius
           | = area / 3.14;", "output": "The output of the JavaScript
           | snippet is the radius, which is 1.91." }
           | 
            | 1: https://cdn.openai.com/papers/gpt-4.pdf
            | 
            | 2: https://github.com/sahil280114/codealpaca/commit/0d265112c70...
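            | 
            | Checking the sample's arithmetic directly (in Python rather
            | than JavaScript, but the numbers are the same):
            | 
            |     area = 6 * 5          # 30
            |     radius = area / 3.14  # 9.554..., not the 1.91 that the
            |                           # labeled "output" claims
            |     print(radius)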
        
           | ajb117 wrote:
           | My guess is that it's because they've already done RLHF on
           | top of the standard next token prediction. In other words,
           | they can't cheaply fine tune ChatGPT without undoing the RLHF
           | objective by training on next token prediction with post-2021
           | data, and then retraining with RLHF to make sure it still
           | gives good human-like output.
           | 
           | I mention the "undoing RLHF" since it's not uncommon for
           | fine-tuned models to increase in error in the original
           | training objective after being fine-tuned with a different
           | one. I think people saw this happen in BERT.
           | 
           | Also ChatGPT is almost certainly huge.
        
         | londons_explore wrote:
         | Ideally you train a model right to begin with, and no fine
         | tuning is necessary.
         | 
         | However, sometimes you can't do that. For example, perhaps you
         | want your model to always talk like a pirate, but you don't
         | have billions of words spoken like a pirate to train on.
         | 
          | So the next best thing is to train a model on all English text
          | (which you have lots of), and then _finetune_ on your smaller
          | dataset of pirate speech.
         | 
         | Finetuning is simply more training, but with a different
         | dataset and often a different learning rate.
         | 
          | Typically, finetuning uses far, far less data and compute,
          | and can be done by individuals on a home PC, whereas training
          | a large language model from scratch is in the $1M - $1B range.
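          | 
          | A minimal sketch of "finetuning is simply more training",
          | using a Hugging Face-style causal LM (gpt2 and the toy corpus
          | here are placeholders):
          | 
          |     import torch
          |     from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          |     tok = AutoTokenizer.from_pretrained("gpt2")
          |     model = AutoModelForCausalLM.from_pretrained("gpt2")
          | 
          |     texts = ["Arr, hoist the mainsail!",
          |              "Avast, ye scurvy dogs!"]  # tiny pirate corpus
          | 
          |     # A lower learning rate than pretraining is typical.
          |     opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
          | 
          |     model.train()
          |     for text in texts:
          |         batch = tok(text, return_tensors="pt")
          |         # Same next-token objective as pretraining, new data.
          |         loss = model(**batch, labels=batch["input_ids"]).loss
          |         loss.backward()
          |         opt.step()
          |         opt.zero_grad()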
        
         | [deleted]
        
       | stoptrlling wrote:
        | Does anyone know the computational cost of training with these
        | LoRA designs? Given that we are talking about rates of tokens per
        | second, it seems training on a bigger dataset could be extremely
        | expensive.
        
         | t-vi wrote:
          | The adapter and LoRA have drastically fewer parameters, so
          | one might expect that forward + backward is roughly 2x the cost
          | of forward.
         | 
         | Then (as far as I know), in contrast to generation, training is
         | done on the entire output of the transformer (so all tokens of
         | the full input) rather than serially token-by-token (in the RNN
         | days, this was called teacher-forcing), so that may give you a
         | significant boost in the tokens per second rate over
         | generation.
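          | 
          | A rough sketch of why the trainable parameter count is so
          | small (dimensions illustrative):
          | 
          |     # One attention projection in a 4096-wide model:
          |     d = 4096
          |     full = d * d          # 16,777,216 weights
          | 
          |     # LoRA learns two low-rank factors A (d x r), B (r x d):
          |     r = 8
          |     lora = d * r + r * d  # 65,536 weights, ~0.4% of full
          | 
          |     print(full, lora, lora / full)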
        
       | akrymski wrote:
        | These NanoGPT-based models are great, thank you for contributing
        | to open source. Would love to see this ported to CPUs a la
        | llama.cpp. Any plans in that direction?
        
       | mercurialsolo wrote:
        | While the fine-tuning pipeline is fairly straightforward for
        | tuning and building custom models, the RLHF pipeline doesn't look
        | to be as straightforward. Creating a dataset for RLHF seems like
        | a fairly labour-intensive exercise, especially if your model is
        | tuned to do work like code generation.
        | 
        | What about Replit Ghostwriter? Did it have an RLHF phase?
        
       | slenocchio wrote:
       | Can someone explain why I'd want to use fine-tuning instead of a
       | vector database (or some other way of storing data/context)?
        
         | morgango wrote:
         | I asked ChatGPT this question, and asked it to simplify as much
         | as possible.
         | 
         | Fine-tuned Models: Imagine you have a super-smart robot that
         | can talk about anything. But you want it to be really good at
         | talking about, say, dinosaurs. So, you teach it more about
         | dinosaurs specifically. That's what fine-tuning is - you're
         | teaching the robot (or model) to be really good at a specific
         | topic.
         | 
         | Vector Databases and Embeddings with LLM: This might be a
         | little tricky, but let's think of it this way. Imagine you have
         | a huge library of books and you want to find information on a
         | specific topic, say, ancient Egypt. Now, instead of reading
         | every book, you have a magical index that can tell you which
         | books talk about ancient Egypt. This index is created by
         | magically converting each book into a "summary dot" (that's the
         | embedding). When you ask about ancient Egypt, your question is
         | also converted into a "summary dot". Then, the magical index
         | finds the books (or "summary dots") that are most similar to
         | your question. That's how the vector database and embeddings
         | work.
         | 
         | So, if you want your super-smart robot to be really good at one
         | specific topic, you use fine-tuning. But if you want it to
         | quickly find information from a huge library of knowledge, you
         | use vector databases and embeddings. Sometimes, you might even
         | use both for different parts of the same task!
        
         | mgfist wrote:
         | First reason that comes to mind is you can make much smaller
         | models, which helps with latency, cost and may enable you to
         | run the model locally.
        
         | pid-1 wrote:
          | I've been playing with storing documents as OpenAI embeddings
          | for the past few weeks and, at least for my use case, the
          | results are meh. It seems sometimes just using context is not
          | enough.
          | 
          | My next step is to play with fine tuning, but I have no
          | results to report yet.
        
           | akiselev wrote:
            | Try using Instructor-XL for embeddings. It's got a more
            | complex prompt structure for generating embeddings, which
            | might be more useful.
        
           | deforciant wrote:
            | Have you tried other models to generate embeddings? I am
            | going in that direction too, to create an additional layer of
            | helpers for search. Also, I'm thinking that if the document
            | is not too big, it might fit into the initial context with
            | the prompt.
        
           | santiagobasulto wrote:
           | I'd be very interested in knowing the outcome. Do you blog
           | anywhere (or post on social)?
        
         | oddthink wrote:
         | Wouldn't a vector database just get you nearest-neighbors on
         | the embeddings? How would that answer a generative or
         | extractive question? I can see it might get you sentiment, but
         | would it help with "tell me all the places that are mentioned
         | in this review"?
        
           | superchink wrote:
            | I think the point is that you use the vector database to
            | locate the relevant context to pass to the LLM for question
            | answering. Here's an end-to-end example:
           | 
           | https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot
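            | 
            | A minimal sketch of that flow (the embedding model name is an
            | assumption; any embedding model works, and the nearest-
            | neighbor lookup stands in for the vector database):
            | 
            |     from sentence_transformers import SentenceTransformer, util
            | 
            |     embedder = SentenceTransformer("all-MiniLM-L6-v2")
            |     docs = ["Pizza ovens run at 450C.",
            |             "Neapolitan dough ferments for 24h."]
            |     doc_vecs = embedder.encode(docs, convert_to_tensor=True)
            | 
            |     question = "How hot should the oven be?"
            |     q_vec = embedder.encode(question, convert_to_tensor=True)
            | 
            |     # Retrieve the most similar document...
            |     best = int(util.cos_sim(q_vec, doc_vecs).argmax())
            |     # ...and stuff it into the prompt sent to the LLM.
            |     prompt = (f"Context: {docs[best]}\n\n"
            |               f"Question: {question}\nAnswer:")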
        
         | heliophobicdude wrote:
         | Assuming you would want to fine-tune over a codebase or set of
         | documents, I would argue vector databases and fine-tuning are
         | completely different tools.
         | 
          | I would strongly recommend against fine-tuning over a set of
          | documents, as this is a very lossy information retrieval
          | system. LLMs are not well suited for information retrieval,
          | unlike databases and search engines.
         | 
          | Where we are seeing fine-tuning have a lot of success is in
          | making completion models like LLaMA or the original GPT3
          | promptable. In essence, prompt-tuning or instruction-tuning:
          | giving the model the ability to respond within a user-prompt,
          | LLM-output chat interface.
         | 
         | Vector databases, for now, are a great way to store mappings of
         | embeddings of documents with the documents themselves for
         | relevant-document information retrieval.
         | 
          | I would highly recommend skimming this RLHF article for how
          | demonstration data was used to make a model promptable [1].
          | Keep in mind RLHF is another concept altogether, and we might
          | be seeing a revolution where it becomes optional (thanks to
          | LIMA)!
         | 
         | 1: https://huyenchip.com/2023/05/02/rlhf.html
        
         | mountainriver wrote:
         | I think it probably works a lot better, but I would love to see
         | some research validating this
        
           | chadash wrote:
           | I've read in a few places that it actually works worse in
           | most cases. Much better to put the context in your prompt.
        
             | CuriouslyC wrote:
             | Fine tuning + context will outperform context alone, and
             | it's cheaper to burn cycles fine tuning then use a smaller
             | context than to use a larger context in production.
        
               | Guillaume86 wrote:
                | Fine tuning + same context will probably outperform
                | context alone, but if you use a smaller context, that
                | does not seem to work that well, as the GP stated.
        
         | swalsh wrote:
         | Fine Tuning = Output
         | 
         | Embeddings = Input
         | 
         | Fine-tuning is like a chef modifying a general pizza recipe to
         | perfect a specific pizza, such as Neapolitan. This
         | customization optimizes the result. In AI, fine-tuning adjusts
         | a pre-existing model to perform better on a specific task.
         | 
         | Embeddings are like categorizing ingredients based on
         | properties. They represent inputs so that similar inputs have
         | similar representations. For instance, 'dog' and 'puppy' in an
         | AI model have similar meanings. Like ingredients in a pizza,
         | embeddings help the model understand and interpret the inputs.
         | So, fine-tuning is about improving the model's performance,
         | while embeddings help the model comprehend its inputs.
         | 
          | It turns out, you can search a vector space of embeddings to
          | find similar embeddings. If I turned my two paragraphs above
          | into 2 embeddings, and you searched for "golden retriever",
          | though neither paragraph has that exact phrase, the model
          | should know a golden retriever is most similar to the second
          | paragraph, which compares puppy to dog.
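          | 
          | A toy illustration of "similar inputs have similar
          | representations" (the vectors are made up; real embeddings
          | come from a model):
          | 
          |     import numpy as np
          | 
          |     def cosine(a, b):
          |         return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
          | 
          |     dog   = np.array([0.90, 0.10, 0.30])
          |     puppy = np.array([0.85, 0.15, 0.35])
          |     pizza = np.array([0.10, 0.90, 0.20])
          | 
          |     print(cosine(dog, puppy))  # high, ~0.99
          |     print(cosine(dog, pizza))  # much lower, ~0.27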
        
           | SparkyMcUnicorn wrote:
           | I like to think of an LLM as a literal human. Not sure if
           | it's the best analogy.
           | 
           | Fine tuning = Adding years of experience, in a set
           | environment. e.g. Raise them in a home that only speaks in
           | old english, learn pig latin, send them to a bootcamp.
           | 
           | Embedding = Giving them a book to reference information.
           | 
           | Just like a human, memory might fade a bit through the years
           | but old habits die hard. You might not perfectly recollect
           | what you learned years ago, but you still get the general
           | idea, and if you took a class on the referenced book you'll
           | be better at relaying information from it.
           | 
           | Edit: Asked ChatGPT to create the analogy.
           | 
           | A language model is like an intelligent person.
           | 
           | - Pre-training is their broad education and general
           | knowledge.
           | 
           | - Fine-tuning is their years of specialized experience in a
           | specific field.
           | 
           | - Embedding is like giving them a comprehensive book on a
           | particular subject.
           | 
           | Just as a person gains knowledge, expertise, and specialized
           | resources, the language model develops its understanding and
           | performance through pre-training, fine-tuning, and embedding.
        
       ___________________________________________________________________
       (page generated 2023-05-25 23:00 UTC)