[HN Gopher] DeepSpeed Chat: Easy, fast and affordable RLHF train...
___________________________________________________________________
DeepSpeed Chat: Easy, fast and affordable RLHF training of ChatGPT-
like models
Author : quantisan
Score : 40 points
Date : 2023-04-12 21:48 UTC (1 hour ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| teruakohatu wrote:
| Does RLHF help with training an LLM to produce better (more
| accurate) results for a particular problem domain (e.g. customer
| support for a particular company), or is it only helpful for
| training the LLM to be a chat agent in general, or a chat agent
| with guard rails?
| brofallon wrote:
| To use RLHF you need a dataset that includes instructions with
| good & bad answers - do many of those exist? I know there are a
| few datasets of just plain instructions-with-responses, but I'm
| not aware of any that have both good and bad (or ranked)
| responses. Is that trivial, or an important missing element here?
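|
| A minimal sketch of what such a ranked-pair record and the usual
| reward-model ranking loss look like (PyTorch; the record contents
| here are made up for illustration):
|
|     import torch
|     import torch.nn.functional as F
|
|     # One hypothetical preference record: same prompt, a preferred
|     # ("chosen") answer and a dispreferred ("rejected") one.
|     record = {
|         "prompt": "How do I reset my password?",
|         "chosen": "Go to Settings -> Account -> Reset password.",
|         "rejected": "I don't know.",
|     }
|
|     def pairwise_loss(r_chosen, r_rejected):
|         # Bradley-Terry style loss: push the chosen answer's
|         # scalar reward above the rejected one's.
|         return -F.logsigmoid(r_chosen - r_rejected).mean()
|
|     # Placeholder scalar rewards a reward model would assign.
|     loss = pairwise_loss(torch.tensor([1.2]), torch.tensor([-0.3]))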
| sdenton4 wrote:
| All of the chat UIs have little up/down thumb icons... that's
| where the boolean feedback comes from. If people stop using
| those, sentiment analysis on the human responses will likely go
| a long way.
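|
| A rough sketch of that idea: group logged thumbs-up/down events
| by prompt and pair an upvoted response with a downvoted one to
| get the good/bad pairs a reward model needs (field names here are
| illustrative):
|
|     from collections import defaultdict
|
|     events = [
|         {"prompt": "p1", "response": "helpful answer", "thumb": 1},
|         {"prompt": "p1", "response": "unhelpful one", "thumb": 0},
|     ]
|
|     by_prompt = defaultdict(lambda: {"up": [], "down": []})
|     for e in events:
|         key = "up" if e["thumb"] else "down"
|         by_prompt[e["prompt"]][key].append(e["response"])
|
|     pairs = [
|         {"prompt": p, "chosen": good, "rejected": bad}
|         for p, groups in by_prompt.items()
|         for good in groups["up"]
|         for bad in groups["down"]
|     ]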
| summarity wrote:
| Also see the example repo README:
| https://github.com/microsoft/DeepSpeedExamples/tree/master/a...
|
| > With just one click, you can train, generate and serve a 1.3
| billion parameter ChatGPT model within 1.36 hours on a single
| consumer-grade NVIDIA A6000 GPU with 48GB memory. On a single DGX
| node with 8 NVIDIA A100-40G GPUs, DeepSpeed-Chat enables training
| for a 13 billion parameter ChatGPT model in 13.6 hours. On multi-
| GPU multi-node systems (cloud scenarios), i.e., 8 DGX nodes with 8
| NVIDIA A100 GPUs/node, DeepSpeed-Chat can train a 66 billion
| parameter ChatGPT model under 9 hours. Finally, it enables 15X
| faster training over the existing RLHF systems
|
| > The following are some of the open-source examples that are
| powered by DeepSpeed: Databricks Dolly, LMFlow, CarperAI-TRLX,
| Huggingface-PEFT
|
| (disclaimer: MSFT/GH employee, not affiliated with this project)
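|
| For orientation, the README describes an InstructGPT-style
| three-step pipeline. A structural outline (placeholder function
| names, not the project's actual API):
|
|     def supervised_finetune(base_lm, demonstrations):
|         """Step 1: fine-tune the base LM on prompt->response pairs."""
|         ...
|
|     def train_reward_model(base_lm, preference_pairs):
|         """Step 2: train a scalar reward model on ranked pairs."""
|         ...
|
|     def ppo_finetune(sft_model, reward_model, prompts):
|         """Step 3: optimize the SFT model with PPO against the
|         reward model, usually with a KL penalty toward the SFT
|         model."""
|         ...
|
|     # sft = supervised_finetune(base_lm, demos)
|     # rm = train_reward_model(base_lm, pairs)
|     # chat_model = ppo_finetune(sft, rm, prompts)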
| nacs wrote:
| > single consumer-grade NVIDIA A6000 GPU with 48GB memory
|
| I wouldn't call an A6000 "consumer-grade" -- it's about $5000.
|
| The top-of-the-line consumer-grade GPU would be an Nvidia RTX
| 4090 or 3090 with 24GB of VRAM.
| tinco wrote:
| Microsoft: invests 10 billion in company. Also Microsoft: here
| are the tools you need to DIY, for free, one of the premium
| features of the company we just invested 10 billion in.
|
| Not that reproducing GPT-4 is going to be easy with this, but
| it'll definitely clear some major hurdles. I read a report about
| the difficulties HuggingFace had producing their Bloom model,
| and a lot of it was the sort of straightforward systems
| engineering that goes into tooling like this.
|
| Is the Bloom model considered a failure by the community? If you
| read the introduction, it was supposed to include improvements
| over GPT-3, but it performs much worse, I guess because of
| lower-quality training data? I wonder what sort of company would
| have high enough quality data to use this project to fine-tune a
| public model to the point where it beats plain old GPT-4 in some
| scenario, especially when you can just inject extra info into the
| GPT-4 prompt, like phind does. What even is the use of
| fine-tuning given that GPT-4 exists?
___________________________________________________________________
(page generated 2023-04-12 23:00 UTC)