[HN Gopher] Understanding large language models: A cross-section...
       ___________________________________________________________________
        
       Understanding large language models: A cross-section of the
       relevant literature
        
       Author : headalgorithm
       Score  : 52 points
       Date   : 2023-04-16 13:12 UTC (9 hours ago)
        
 (HTM) web link (magazine.sebastianraschka.com)
 (TXT) w3m dump (magazine.sebastianraschka.com)
        
       | cs702 wrote:
       | This is a good intro for anyone who already has at least some
       | background in ML and wants to get up to speed on LLMs relatively
       | quickly.
       | 
       | Props to the author for giving credit to Bahdanau et al. (2014),
       | which I believe first proposed the concept of applying a softmax
       | function over token scores to compute attention, setting the
       | stage for the original transformer by Vaswani et al. (2017).
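
       For readers who want to see the idea in the comment above in
       concrete form, here is a minimal sketch of softmax over token
       scores. It is not the code of either paper. The names `softmax`
       and `attend` and the toy vectors are made up for illustration,
       and dot-product scoring is an assumption chosen for brevity
       (Bahdanau et al. scored tokens with a small feed-forward network).

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Return (weights, context) for one query over a list of tokens.

    Assumption: each token is scored by a plain dot product between the
    query and that token's key. This is a simplification for illustration.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The context vector is the weighted average of the value vectors.
    context = [
        sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)
    ]
    return weights, context


# Toy example: three tokens, two-dimensional keys and values (hypothetical).
weights, context = attend(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]],
)
print(weights, context)
```

       Tokens whose keys line up with the query get larger weights, so
       the context vector leans toward their values.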
        
       ___________________________________________________________________
       (page generated 2023-04-16 23:00 UTC)