[HN Gopher] DeepDive in everything of Llama3: revealing detailed...
___________________________________________________________________
DeepDive in everything of Llama3: revealing detailed insights and
implementation
Author : therealoliver
Score : 202 points
Date : 2025-02-21 16:57 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| kevmo314 wrote:
| I like the use of the functional API here. I learned through a
| similar route and it was very helpful for me compared to trying
| to understand `torch.nn.Module`.
|
| Here's a gist of my learning path if it's helpful to anyone:
| https://gist.github.com/kevmo314/294001659324429bae6749062a9...
| therealoliver wrote:
| Yes, these are two different learning paths. Working through
| the detailed process is beneficial for future research, while
| the API-style approach is convenient and quick for getting
| started and for everyday use. Both are very useful!
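
A minimal sketch of the contrast being discussed above, using
RMSNorm (the normalization layer Llama 3 uses). It is not taken
from the linked repo or gist; it just shows the same computation
written once with plain tensors and explicit function calls, and
once wrapped in a torch.nn.Module:

    import torch

    # Functional style: the parameter is a plain tensor and every
    # step of the math is an explicit call, so nothing is hidden.
    def rms_norm(x, weight, eps=1e-6):
        # x: (..., dim); scale each position by the inverse RMS of
        # its features, then apply a per-feature gain.
        rms = torch.sqrt(x.pow(2).mean(-1, keepdim=True) + eps)
        return x / rms * weight

    # Module style: the same computation, with the parameter and
    # the forward pass bundled behind torch.nn.Module.
    class RMSNorm(torch.nn.Module):
        def __init__(self, dim, eps=1e-6):
            super().__init__()
            self.eps = eps
            self.weight = torch.nn.Parameter(torch.ones(dim))

        def forward(self, x):
            rms = torch.sqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
            return x / rms * self.weight

    x = torch.randn(2, 5, 64)              # (batch, seq, dim)
    out_fn = rms_norm(x, torch.ones(64))   # functional call
    out_mod = RMSNorm(64)(x)               # module call
    assert torch.allclose(out_fn, out_mod)

The functional version keeps every tensor operation visible, which
is the point of a from-scratch walkthrough; the Module version is
what you would normally reach for when composing a full model.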
| simonw wrote:
| I hadn't realized OpenAI's tiktoken Python library could work
| with models outside of the OpenAI family; that's really
| useful: https://github.com/therealoliver/Deepdive-llama3-from-
| scratc...
| therealoliver wrote:
| I'm glad to have helped you :)
| moffkalast wrote:
| It's more than just that: practically every notable open model
| released in the past year or so uses tiktoken as the tokenizer.
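
For context on how that works: tiktoken only needs a table of BPE
merge ranks, a pre-tokenization split regex, and a special-token
map, so it can be pointed at non-OpenAI vocabularies. A minimal
sketch along those lines for a Llama-3-style tokenizer.model
follows; the file path, split pattern, and special-token IDs are
assumptions based on Meta's Llama 3 release, not code copied from
the linked repo:

    import tiktoken
    from tiktoken.load import load_tiktoken_bpe

    # Assumed path to a Llama-3-style tokenizer.model file; adjust
    # to wherever the downloaded weights actually live.
    tokenizer_path = "Meta-Llama-3-8B/tokenizer.model"

    # The merge ranks are stored in tiktoken's own file format, so
    # they load directly.
    mergeable_ranks = load_tiktoken_bpe(tokenizer_path)

    # Two of Llama 3's special tokens; the real list is longer.
    special_tokens = {
        "<|begin_of_text|>": 128000,
        "<|end_of_text|>": 128001,
    }

    # cl100k-style split regex as shipped with Llama 3 (treat the
    # exact pattern as an assumption).
    pat_str = (
        r"(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+"
        r"|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+"
        r"|\s+(?!\S)|\s+"
    )

    enc = tiktoken.Encoding(
        name="llama3",
        pat_str=pat_str,
        mergeable_ranks=mergeable_ranks,
        special_tokens=special_tokens,
    )

    tokens = enc.encode("hello world!")
    print(tokens, enc.decode(tokens))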
| aghilmort wrote:
| great need; mulling over; shows up all the time in AI paradigms
| therealoliver wrote:
| Glad to have helped you :)
| aghilmort wrote:
| just realized Siri typo'd; I meant to say "great read"
___________________________________________________________________
(page generated 2025-02-22 23:01 UTC)