[HN Gopher] DeepDive in everything of Llama3: revealing detailed...
___________________________________________________________________
DeepDive in everything of Llama3: revealing detailed insights and
implementation
Author : therealoliver
Score : 202 points
Date : 2025-02-21 16:57 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| kevmo314 wrote:
| I like the use of the functional API here. I learned through a
| similar route and it was very helpful for me compared to trying
| to understand `torch.nn.Module`.
|
| Here's a gist of my learning path if it's helpful to anyone:
| https://gist.github.com/kevmo314/294001659324429bae6749062a9...
| therealoliver wrote:
| Yes, these are two different learning paths. Working through
| the detailed process is beneficial for future research, while
| the API-style approach is convenient and quick for getting
| started and for everyday use. Both are very useful!
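
A minimal sketch of the contrast being discussed above, using
RMSNorm (the normalization layer Llama 3 uses). It is not taken
from the linked repo or gist; it just shows the same computation
written once with plain tensors and explicit function calls, and
once wrapped in a torch.nn.Module:

    import torch

    # Functional style: the parameter is a plain tensor and every
    # step of the math is an explicit call, so nothing is hidden.
    def rms_norm(x, weight, eps=1e-6):
        # x: (..., dim); scale each position by the inverse RMS of
        # its features, then apply a per-feature gain.
        rms = torch.sqrt(x.pow(2).mean(-1, keepdim=True) + eps)
        return x / rms * weight

    # Module style: the same computation, with the parameter and
    # the forward pass bundled behind torch.nn.Module.
    class RMSNorm(torch.nn.Module):
        def __init__(self, dim, eps=1e-6):
            super().__init__()
            self.eps = eps
            self.weight = torch.nn.Parameter(torch.ones(dim))

        def forward(self, x):
            rms = torch.sqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
            return x / rms * self.weight

    x = torch.randn(2, 5, 64)              # (batch, seq, dim)
    out_fn = rms_norm(x, torch.ones(64))   # functional call
    out_mod = RMSNorm(64)(x)               # module call
    assert torch.allclose(out_fn, out_mod)

The functional version keeps every tensor operation visible, which
is the point of a from-scratch walkthrough; the Module version is
what you would normally reach for when composing a full model.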
| simonw wrote:
| I hadn't realized OpenAI's tiktoken Python library could work
| with models outside of the OpenAI family; that's really
| useful: https://github.com/therealoliver/Deepdive-llama3-from-
| scratc...
| therealoliver wrote:
| I'm glad to have helped you :)
| moffkalast wrote:
| It's more than just that: practically every notable open model
| released in the past year or so uses tiktoken as the tokenizer.
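
For context on how that works: tiktoken only needs a table of BPE
merge ranks, a pre-tokenization split regex, and a special-token
map, so it can be pointed at non-OpenAI vocabularies. A minimal
sketch along those lines for a Llama-3-style tokenizer.model
follows; the file path, split pattern, and special-token IDs are
assumptions based on Meta's Llama 3 release, not code copied from
the linked repo:

    import tiktoken
    from tiktoken.load import load_tiktoken_bpe

    # Assumed path to a Llama-3-style tokenizer.model file; adjust
    # to wherever the downloaded weights actually live.
    tokenizer_path = "Meta-Llama-3-8B/tokenizer.model"

    # The merge ranks are stored in tiktoken's own file format, so
    # they load directly.
    mergeable_ranks = load_tiktoken_bpe(tokenizer_path)

    # Two of Llama 3's special tokens; the real list is longer.
    special_tokens = {
        "<|begin_of_text|>": 128000,
        "<|end_of_text|>": 128001,
    }

    # cl100k-style split regex as shipped with Llama 3 (treat the
    # exact pattern as an assumption).
    pat_str = (
        r"(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+"
        r"|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+"
        r"|\s+(?!\S)|\s+"
    )

    enc = tiktoken.Encoding(
        name="llama3",
        pat_str=pat_str,
        mergeable_ranks=mergeable_ranks,
        special_tokens=special_tokens,
    )

    tokens = enc.encode("hello world!")
    print(tokens, enc.decode(tokens))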
| aghilmort wrote:
| great need; mulling over; shows up all the time in AI paradigms
| therealoliver wrote:
| Glad to have helped you :)
| aghilmort wrote:
| just realized Siri typo'd; I meant to say "great read"
___________________________________________________________________
(page generated 2025-02-22 23:01 UTC)