[HN Gopher] YaFSDP: a sharded data parallelism framework, faster...
___________________________________________________________________
YaFSDP: a sharded data parallelism framework, faster for pre-
training LLMs
Author : wiradikusuma
Score : 122 points
Date : 2024-06-18 11:54 UTC (11 hours ago)
(HTM) web link (github.com)
| dayeye2006 wrote:
| Any idea what the main tricks are for achieving gains over
| FSDP?
| albertzeyer wrote:
| The blog post seems to contain more details and the core ideas:
| https://medium.com/yandex/yafsdp-a-tool-for-faster-llm-train...
| az226 wrote:
| Odd that they don't expand on this:
|
| In Yandex's pre-trainings, the implementation of YaFSDP along
| with other memory optimization strategies resulted in a speed
| gain of 45%.
| codetrotter wrote:
| I was surprised to see that the Ya part meant "Yet another". I
| mean, I've seen it before in many acronyms. But it's pretty
| tongue-in-cheek of them to do that here, since one would expect it
| was just because it was made by Yandex.
| shadow28 wrote:
| Doesn't Yandex itself come from "Yet Another Indexer"?
| codetrotter wrote:
| Ah, so it does as well! I only knew that it was a portmanteau
| of "Ia" and "index". As in "I index". Which it also is.
| alexey-salmin wrote:
| There's a third explanation of "Iandex" being "iazykovoi
| indeks", i.e. "language-aware index". The Russian language has
| complicated morphology with three genders and six
| grammatical cases, somewhat similar to Latin. Searching by
| an exact word match almost never gives good results, and
| neither Yahoo nor AltaVista could offer anything better in 1997
| -- hence Yandex was built.
| aristus wrote:
| You mean, Yet Another Human-Organized Ontology?
| mikrl wrote:
| I was expecting it to be a Russian acronym starting with the
| letter Ia which is pronounced Ya. It acquired its backward R
| glyph when it was changed from an old Slavic letter I cannot
| draw.
| deaddodo wrote:
| What do you mean that you "cannot draw" them? This is a
| digital medium and both (well one, the other is half
| supported) variants are valid Unicode glyphs:
|
| / E
|
| Or do you mean you literally can't draw them?
| Tade0 wrote:
| It's an idiomatic expression in Slavic languages,
| indicating that the shape is particularly complex.
| mikrl wrote:
| I could not reproduce it by hand without a reference, nor do
| I have a keyboard installed that offers it as a symbol,
| nor was I going to look it up to add spice to a
| shower-thought-tier HN comment.
|
| I see you have provided it, making it more accessible for
| my future use, at least on the timeframe of this thread
| being in my recent HN activity.
___________________________________________________________________
(page generated 2024-06-18 23:00 UTC)