[HN Gopher] Show HN: less than 650 LOC trainable GPT only using ...
___________________________________________________________________
Show HN: less than 650 LOC trainable GPT only using NumPy
Author : joennlae
Score : 74 points
Date : 2023-11-17 14:34 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| cuuupid wrote:
| I think people are forgetting that transformer architectures are
| a wider field than GPT and predate GPT-3 by 3+ years. Referring
| to transformer architectures by a branded commercial moniker
| (GPT) is just going to help cement OpenAI's brand exposure and,
| soon, regulatory capture.
|
| For comparison, this would be like referring to convnets as
| Inception architectures back during the CV boom (or VGGNets
| before that).
| PartiallyTyped wrote:
| The most interesting thing in this whole saga is that
| decoder-only models (aka causal transformers like GPT) are as
| effective as they are.
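|
| A minimal NumPy sketch of the causal masking that makes these
| models "causal" (names here are illustrative, not taken from the
| linked repo): each position may attend only to itself and to
| earlier positions.
|
|   import numpy as np
|
|   def causal_self_attention(x, Wq, Wk, Wv):
|       # x: (T, d) token embeddings; Wq/Wk/Wv: (d, d) projections
|       T, d = x.shape
|       q, k, v = x @ Wq, x @ Wk, x @ Wv
|       scores = q @ k.T / np.sqrt(d)          # (T, T) logits
|       # positions above the diagonal are the future; block them
|       mask = np.triu(np.ones((T, T)), k=1)
|       scores = np.where(mask == 1, -1e9, scores)
|       # row-wise softmax over the remaining (past) positions
|       w = np.exp(scores - scores.max(axis=-1, keepdims=True))
|       w /= w.sum(axis=-1, keepdims=True)
|       return w @ v                           # (T, d)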
| tverbeure wrote:
| FWIW: the GitHub project description says "GPT-like". It's the
| title here that dropped the "like".
| jimmyl02 wrote:
| One small difference is that the GPT architecture is just the
| decoder stack of the original transformer, as opposed to the full
| encoder-decoder stack (a sketch of such a block is below).
|
| I agree the branding play on GPTs in general is pretty smart and
| strong from OpenAI, though.
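|
| A rough sketch of one such decoder-only block, reusing the
| causal_self_attention helper sketched in an earlier comment
| (parameter names are illustrative, not from the repo). The
| notable absence is the cross-attention sub-layer that the
| original encoder-decoder transformer's decoder uses to read the
| encoder output.
|
|   import numpy as np
|   # causal_self_attention: as sketched in the earlier comment
|
|   def gelu(x):
|       # tanh approximation of GELU, as used in GPT-2
|       return 0.5 * x * (1 + np.tanh(
|           np.sqrt(2 / np.pi) * (x + 0.044715 * x ** 3)))
|
|   def decoder_block(x, Wq, Wk, Wv, W1, b1, W2, b2):
|       # causal self-attention + MLP, each with a residual
|       # connection (LayerNorm omitted here for brevity).
|       # There is no cross-attention sub-layer at all.
|       x = x + causal_self_attention(x, Wq, Wk, Wv)
|       x = x + gelu(x @ W1 + b1) @ W2 + b2
|       return x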
| cchance wrote:
| Honestly, I feel like everyone calling LLMs "GPT" at this point
| doesn't really help OpenAI; "ChatGPT" would. Unlike "googling",
| which became synonymous with searching the internet, "GPT" isn't
| shorthand for using OpenAI; it has just become what people call
| LLMs lately. The fact that the term isn't the company's name, or
| the full "ChatGPT-ing", sort of breaks that hold, I feel.
| __loam wrote:
| Regarding regulatory capture, I listened to an interview with
| Lina Khan, the current head of the FTC, and this exact thing
| came up as something regulators are worried about. I think
| regulators are aware of the danger of letting industry insiders
| regulate their own industry, so I'm hopeful for some sensible
| regulations that help promote rather than harm competition. The
| FTC also exists to prevent monopoly.
| p1esk wrote:
| I wonder how easy it would be to port this library from numpy to
| cupy.
| eslaught wrote:
| Or cuNumeric: https://developer.nvidia.com/cunumeric
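|
| Both CuPy and cuNumeric aim to be drop-in replacements for the
| NumPy API, so a first pass at such a port can amount to swapping
| the import and moving results back to host memory where needed.
| A small sketch (the shapes and the fallback pattern here are
| illustrative):
|
|   try:
|       import cupy as xp     # GPU arrays, NumPy-compatible API
|   except ImportError:
|       import numpy as xp    # CPU fallback
|
|   a = xp.random.randn(512, 512).astype(xp.float32)
|   b = xp.random.randn(512, 512).astype(xp.float32)
|   c = a @ b                 # runs on the GPU when xp is cupy
|
|   if xp.__name__ == "cupy":
|       c = xp.asnumpy(c)     # bring the result back to the host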
| gfaure wrote:
| Nice! The README mentions `LayerNorm` is implemented here; it
| shows up in the equivalence tests against PyTorch, but I don't
| see it in the implementation itself.
| dauertewigkeit wrote:
| It's part of the TensorLi definition, where all the magic
| happens.
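|
| For reference, a LayerNorm forward pass in plain NumPy is only a
| few lines; this is a generic sketch, not the TensorLi code:
|
|   import numpy as np
|
|   def layer_norm(x, gamma, beta, eps=1e-5):
|       # normalize over the last (feature) axis, then scale/shift
|       mean = x.mean(axis=-1, keepdims=True)
|       var = x.var(axis=-1, keepdims=True)
|       return gamma * (x - mean) / np.sqrt(var + eps) + beta
|
|   # usage: x is (batch, seq, d); gamma/beta are learned (d,)
|   x = np.random.randn(2, 8, 16)
|   y = layer_norm(x, np.ones(16), np.zeros(16))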
___________________________________________________________________
(page generated 2023-11-17 23:01 UTC)