[HN Gopher] Nemotron-4-340B
___________________________________________________________________
Nemotron-4-340B
Author : bcatanzaro
Score : 97 points
Date : 2024-06-14 16:01 UTC (7 hours ago)
(HTM) web link (blogs.nvidia.com)
(TXT) w3m dump (blogs.nvidia.com)
| observationist wrote:
| This is (possibly) a GPT-4 level dense model with an open source
| license. Nvidia has released models with issues before, but
| reports on this so far indicate it's a solid contender without
| any of the hiccups of previous releases.
|
| A 340B model should require around 700GB of VRAM or RAM to run
| inference. To train or finetune, you're looking at almost double,
| which is probably why Nvidia recommends 2x A100 nodes with 1.28TB
| of VRAM.
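| The ~700GB figure is just the weights at 16-bit precision; a quick
| back-of-envelope sketch (it ignores KV cache, activations, and
| framework overhead, so treat the numbers as a floor):

```python
# Rough weight-memory estimate for a 340B-parameter dense model.
# Ignores KV cache, activations, and framework overhead.
PARAMS = 340e9

def weights_gb(bytes_per_param):
    return PARAMS * bytes_per_param / 1e9

fp16 = weights_gb(2)   # ~680 GB just to hold the weights
fp8 = weights_gb(1)    # ~340 GB
# Training with Adam needs roughly weights + gradients + two
# optimizer moments on top, hence "almost double" (or more).
print(fp16, fp8)  # 680.0 340.0
```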
|
| Jensen Huang is the king of AI summer.
| waldrews wrote:
| do you mean 1.28 TB?
| observationist wrote:
| Yes, thank you for catching that!
| throwaway_ab wrote:
| How would a server/workstation like this be setup?
|
| I thought you could only use the vram on the GPU, so for 700GB
| you would need 8-9 A100 nodes as 2 only gives 160GB.
|
| I've been trying to figure out how to build a local system to
| run inference and train on top of LLM models, I thought there
| was no way to add vram to a system outside of adding more and
| more GPU's or use system ram (DDR5) even though that would be
| considerably slower.
| toshinoriyagi wrote:
| An A100 node has 8 A100s in it, each with 80GB, which is how
| they got the 1.28TB number: 2 x (8 x 80GB).
| samspenc wrote:
| I wonder if the open-source LLM community understands what just
| happened here - we finally got a truly large LLM (a whopping
| 340B!) but it costs ... $15K per A100 x 16 GPUs = a minimum of
| $240K just to get started. Probably closer to half a million
| dollars once you factor in space, power, cooling,
| infrastructure, etc.
| lhl wrote:
| You could probably run it as a Q4 (definitely as a Q3) on 4 x
| A6000 (so on a $25K workstation), although you'd probably
| also be looking at about 3-4 tok/s for text generation. I do think
| that it's a big landmark to have a true GPT4-class model
| (with some questionable RL though from my initial testing).
| The best thing about it is that it's almost certainly now the
| strongest model available for generating synthetic data
| without any licensing restrictions.
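| The Q4 arithmetic, as a quick sketch (weights only; KV cache and
| runtime overhead eat into what's left of the budget):

```python
# Does a 4-bit quant of 340B parameters fit in 4 x A6000 (48 GB each)?
PARAMS = 340e9

def quant_gb(bits_per_param):
    return PARAMS * bits_per_param / 8 / 1e9

q4 = quant_gb(4)   # 170.0 GB
q3 = quant_gb(3)   # 127.5 GB
budget = 4 * 48    # 192 GB of total VRAM
print(q4 <= budget, q3 <= budget)  # Q4 is tight, Q3 comfortable
```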
|
| Funnily enough, I don't think it's actually the most
| interesting model that Nvidia released this week. Nvidia also
| published this paper https://arxiv.org/abs/2406.07887 and
| released
| https://huggingface.co/nvidia/mamba2-hybrid-8b-3t-128k
| (Apache 2.0 licensed, to boot). It looks like it matches (and
| sometimes even edges out) Transformer performance, while
| having linear scaling for context length. Can't wait for a
| scaled up version of this.
|
| Nvidia also released a top-notch Llama3 70B SteerLM reward
| model as well (although RLHFlow/ArmoRM-Llama3-8B-v0.1 might
| still be a better choice).
| rthnbgrredf wrote:
| With CPU inference you just need a server with 1.28TB of RAM.
| Yes, the inference will be super slow, but it is more realistic
| than spending $100K+ on A100 clusters with 1.28TB of VRAM.
|
| One example: the HP DL580 Gen8. Use the 32GB PC3L-14900L LRDIMMs
| (HP PN 715275-001, 712384-001, 708643-B21) for a maximum of
| 3TB. You can get the LRDIMMs in the $32-$45 range on the
| second-hand market.
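| At those prices the RAM itself is cheap. A quick sketch of the DIMM
| count and cost for a 1.28TB build, using the second-hand price range
| quoted above (board, CPUs, power, etc. not included):

```python
# How many 32 GB LRDIMMs to match the 1.28 TB GPU setup, and roughly
# what they cost at the quoted second-hand prices ($32-$45 apiece).
DIMM_GB = 32
TARGET_GB = 1280                # 1.28 TB

dimms = TARGET_GB // DIMM_GB    # 40 DIMMs
cost_low = dimms * 32           # $1,280
cost_high = dimms * 45          # $1,800
print(dimms, cost_low, cost_high)  # 40 1280 1800
```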
| option wrote:
| Three models are included: base, instruct, and reward, all under
| a license permitting synthetic data generation and commercial use.
| diggan wrote:
| The "open" and "permissive" license has an interesting section on
| "AI Ethics":
|
| > AI Ethics. NVIDIA is committed to safety, trust and
| transparency in AI development. NVIDIA encourages You to (a)
| ensure that the product or service You develop, use, offer as a
| service or distributes meets the legal and ethical requirements
| of the relevant industry or use case, (b) take reasonable
| measures to address unintended bias and to mitigate harm to
| others, including underrepresented or vulnerable groups, and (c)
| inform users of the nature and limitations of the product or
| service. NVIDIA expressly prohibits the use of its products or
| services for any purpose in violation of applicable law or
| regulation, including but not limited to (a) illegal
| surveillance, (b) illegal collection or processing of biometric
| information without the consent of the subject where required
| under applicable law, or (c) illegal harassment, abuse,
| threatening or bullying of individuals or groups of individuals
| or intentionally misleading or deceiving others
|
| https://developer.download.nvidia.com/licenses/nvidia-open-m...
|
| Besides limiting the freedom of use (making it less "open" in my
| eyes), it's interesting that they tell you to meet "ethical
| requirements of the relevant industry or use case". Seems like
| that'd be super hard to pin down in a precise way.
| jerbear4328 wrote:
| I read that as "NVIDIA _encourages_ you to be ethical and
| _prohibits_ breaking the law." That doesn't seem so bad to me.
| What is bad, however, is section 2.1.
|
| > 2.1 ... If You institute ... litigation against any entity
| ... alleging that the Model or a Derivative Model constitutes
| direct or contributory copyright or patent infringement, then
| any licenses granted to You under this Agreement for that Model
| or Derivative Model will terminate...
|
| If you sue claiming that the model violates copyright, you lose
| your license to use the model. That's a really weird
| restriction; I'm not sure what the point is.
| bcatanzaro wrote:
| The point is: if you sue claiming this model breaks the law,
| you lose your license to use it.
|
| Apache 2.0 has a similar restriction: " If You institute
| patent litigation against any entity (including a cross-claim
| or counterclaim in a lawsuit) alleging that the Work or a
| Contribution incorporated within the Work constitutes direct
| or contributory patent infringement, then any patent licenses
| granted to You under this License for that Work shall
| terminate as of the date such litigation is filed."
| jerbear4328 wrote:
| Oh, I didn't realize that it was a standard term. I'm sure
| there's a good motivation then; it doesn't seem so bad.
| orra wrote:
| True, although it's unusual to see it for copyright not
| patents.
|
| That said, the far bigger issue is the end of the same
| clause 2.1:
|
| > NVIDIA may update this Agreement to comply with legal and
| regulatory requirements at any time and You agree to either
| comply with any updated license or cease Your copying, use,
| and distribution of the Model and any Derivative Model
| sebzim4500 wrote:
| Sounds reasonable to me. If you are going to claim in court
| that the model is illegal, then why exactly are you using it?
| abdullahkhalids wrote:
| It's good they have included this clause, despite it being
| difficult to legally pin down. Hopefully, there will be a
| lawsuit at some point which will create some ethical boundaries
| that AI developers and users must not cross.
| imglorp wrote:
| Very weaselly worded. Some things that appear to be allowed:
|   * intended bias
|   * legal surveillance
|   * legal collection of biometrics without consent
|   * legal harassment
|
| I.e., state-sanctioned killbots are just fine!
| telotortium wrote:
| No copyright license is going to stop a state from using the
| model for the military use that they really need. First of
| all, I'm pretty sure most countries have laws allowing the
| state to ignore copyright in the case of national defense.
| More importantly, power does what it wants and what it can
| get away with.
| IncreasePosts wrote:
| It says "NVIDIA encourages You to..."
|
| Which, in terms of a contract, means absolutely nothing at all.
| mushufasa wrote:
| Google famously removed "don't be evil" because lawyers
| pushed back on who gets to define evil. I can imagine the same
| logic applies here: Nvidia isn't about to define objective
| morality, so the best alternative is to ask people to try
| their best.
| brianshaler wrote:
| I'm not sure what business GP is in, but being encouraged not
| to be unethical and explicitly forbidden from illegal activity
| doesn't infringe on one's freedom much more than the applicable
| laws already do. I guess being arrested for crimes is one
| thing, but having a license revoked on top of that is just one
| step too far?
| ilaksh wrote:
| Has anyone run evaluations comparing the instruct version with
| gpt-4o or llama3-70b etc.? It's so much larger than the leading
| open source models, so one would hope it would perform
| significantly better.
|
| Or is this in one of the chat arenas or whatever? Very curious to
| see some numbers related to the performance.
|
| But if it's at least somewhat better than the existing open
| source models then that is a big boost for open source training
| and other use cases.
| rllearneratwork wrote:
| This is the "june-chatbot" model currently running on Chatbot
| Arena from lmsys.
| Something1234 wrote:
| What is it? Is it an LLM or what?
| danielhanchen wrote:
| Oh NVIDIA released an open weights 340 billion parameter LLM!
|
| It should be the biggest open-weights model to date, I think
| (ahead of Grok-1 at 314B).
|
| It's trained on 8 trillion tokens, and some benchmarks show it
| does better than or equal to GPT-4o!
|
| They released 3 checkpoints: the base, instruct, and reward
| models.
|
| See
| https://huggingface.co/collections/nvidia/nemotron-4-340b-66...
| for all the checkpoints
| bguberfain wrote:
| "Nemotron-4-340B-Instruct is a chat model intended for use for
| the English language" - frustrating
| vosper wrote:
| Why does nvidia release models that compete with its customers
| businesses but don't make any money for nvidia?
|
| Are they commoditising their complements?
| vineyardmike wrote:
| > [commoditizing] their complements
|
| That's exactly what this would be.
|
| > compete with its customers businesses
|
| I suspect most of their business comes from a few massive
| corporate spenders, not a "long tail" of smaller businesses, so
| it seems like a questionable goal to disrupt those customers
| without a clear path to _new_ customers. Then again, few have
| the resources to run this model, so I guess this just ensures
| that their big customers are all working with some floor in
| model size? Probably won't impact anything realistically.
| logicchains wrote:
| They target this model at generating synthetic data. Data is
| the lifeblood of LLM training; quality synthetic data means
| more training can occur which means more demand for GPUs.
| WithinReason wrote:
| The model is big enough that you need expensive Nvidia GPUs to
| run it effectively
| WithinReason wrote:
| "...and were sized to fit on a single DGX H100 with 8 GPUs when
| deployed in FP8 precision"
|
| OK, I see: the goal is to sell more H100s; they made it big
| enough that it won't fit on a cheaper GPU.
| vineyardmike wrote:
| > The Nemotron-4 340B family includes base, instruct and reward
| models that form a pipeline to generate synthetic data used for
| training and refining LLMs.
|
| I feel like everyone is missing this from the announcement. They
| explicitly are releasing this to help _generate synthetic
| training data_. Most big models and APIs have clauses that ban
| using their output to improve other models. Sure, it can maybe
| compete with other big commercial models at normal tasks, but
| this would be a
| huge opportunity for ML labs and startups to expand training data
| of smaller models.
|
| Nvidia must see a limit to the growth of new models (and new
| demand for training with their GPUs) based on the availability of
| training data, so they're seeking to provide a tool to bypass
| those restrictions.
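| The base/instruct/reward pipeline they describe boils down to
| generate-then-filter. A hypothetical sketch of that loop
| (generate() and score() stand in for whatever serving stack hosts
| the checkpoints; they are not a real Nvidia API):

```python
# Sketch of a synthetic-data pipeline: sample candidate responses
# from an instruct model, score them with a reward model, and keep
# only the best response per prompt if it clears a threshold.
def synthesize(prompts, generate, score, n_samples=4, threshold=0.0):
    """Build a synthetic dataset of reward-filtered responses."""
    dataset = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(n_samples)]
        scored = [(score(prompt, c), c) for c in candidates]
        best_score, best = max(scored)  # highest reward wins
        if best_score > threshold:
            dataset.append({"prompt": prompt, "response": best})
    return dataset

# Toy stand-ins just to show the shape of the loop:
data = synthesize(
    ["What is 2+2?"],
    generate=lambda p: "4",
    score=lambda p, r: 1.0,
)
print(data)  # [{'prompt': 'What is 2+2?', 'response': '4'}]
```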
|
| All for the low price of 2x A100s...
| logicchains wrote:
| >They explicitly are releasing this to help generate synthetic
| training data
|
| Synthetic training data is basically free money for NVidia;
| there's only a fixed amount of high-quality original data
| around, but there's a potential for essentially infinite
| synthetic data, and more data means more training hours means
| more GPU demand.
| jsheard wrote:
| > Most big models and APIs have clauses that ban its use to
| improve other models.
|
| I will never get over the gall of anything and everything being
| deemed fair game to use as training data for a model, _except_
| you're not allowed to use the output of a model to train your
| own model without permission, because model output has some
| kind of exclusive super-copyright apparently.
| vineyardmike wrote:
| > because model output has some kind of exclusive super-
| copyright apparently
|
| Well, it's not copyright that's being used to forbid this,
| it's _terms of service_, but yeah, it is quite a hypocrisy.
| cyanydeez wrote:
| GIGOaaS
___________________________________________________________________
(page generated 2024-06-14 23:02 UTC)