[HN Gopher] Understanding large language models: A cross-section...
___________________________________________________________________
Understanding large language models: A cross-section of the
relevant literature
Author : headalgorithm
Score : 52 points
Date : 2023-04-16 13:12 UTC (9 hours ago)
(HTM) web link (magazine.sebastianraschka.com)
(TXT) w3m dump (magazine.sebastianraschka.com)
| cs702 wrote:
| This is a good intro for anyone who already has at least some
| background in ML and wants to get up to speed on LLMs relatively
| quickly.
|
| Props to the author for giving credit to Bahdanau et al. (2014),
| which I believe first proposed the concept of applying a softmax
| function over token scores to compute attention, setting the
| stage for the original transformer by Vaswani et al. (2017).
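
For readers who want the idea in the comment above made concrete, here is a minimal sketch of softmax-over-scores attention. It is an illustration, not code from either paper. The names `softmax` and `attend` and the toy inputs are hypothetical. How the scores are produced is deliberately left out: Bahdanau et al. compute them with a small feed-forward network, while Vaswani et al. use scaled dot products.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(scores, values):
    """Return the attention weights and the weighted average of the value vectors."""
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Hypothetical toy inputs: one score and one 2-d value vector per source token.
scores = [2.0, 1.0, 0.1]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend(scores, values)
```

The token with the highest score gets the largest weight, so its value vector contributes most to `context`.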
___________________________________________________________________
(page generated 2023-04-16 23:00 UTC)