Post AUukXWbYfag0TdkqBc by TedUnderwood@sigmoid.social
(DIR) Post #AUukXWbYfag0TdkqBc by TedUnderwood@sigmoid.social
2023-04-22T15:31:05Z
0 likes, 1 repeats
Absolutely revelatory piece from Yoav Goldberg casting light on an overlooked puzzle about last year: why did we need *reinforcement* learning (RLHF) to unlock the potential of language models? Why wasn’t supervised learning enough? #LLM #AI https://gist.github.com/yoavg/6bff0fecd65950898eba1bb321cfbd81
(DIR) Post #AUukvNqasiV2GZAESW by compthink@sigmoid.social
2023-04-22T15:35:23Z
0 likes, 0 repeats
@TedUnderwood Sounds interesting, I'll definitely have a read of it. I know that OpenAI used RL for the coaching/guardrails creation part, but I'm not yet aware of how else it was pivotal.
(DIR) Post #AUulDm8PJw7InwGYUq by TedUnderwood@sigmoid.social
2023-04-22T15:38:42Z
0 likes, 0 repeats
@compthink I’ll give away a little of Yoav’s answer, because it’s kind of fascinating philosophically. What we needed to teach the models was when to say “I don’t know.” And that’s not something we could teach by demonstration alone, because … we don’t know how much the model knows! It had to be dialogic.
(DIR) Post #AUuomGjcM8M9PwoIMq by scott_bot@hcommons.social
2023-04-22T16:18:33Z
0 likes, 0 repeats
@TedUnderwood I really look forward to a whole world made of stuff we built from scratch that we barely understand, so we have to construct hypothetical explanations for their mechanisms in lieu of direct observational evidence. Super stoked about all the epicycles, equants, and deferents we're going to be discovering over the next 500 years, now that we've finally mostly dealt with those pesky physical laws.
(DIR) Post #AUurHqPFynYOw7TEpc by TedUnderwood@sigmoid.social
2023-04-22T16:46:41Z
0 likes, 0 repeats
@scott_bot Me too! A man’s reach should exceed his grasp, or what’s a heaven for?
(DIR) Post #AUus4UJAvckAadoy4u by scott_bot@hcommons.social
2023-04-22T16:55:26Z
0 likes, 0 repeats
@TedUnderwood You know the last time that poem was quoted in the mainstream media, it ended in Wolverine creating an army of (artificial?) sentient beings and then drowning them in order to defeat Batman.
(DIR) Post #AUuu6wcOqKVosb71VY by TedUnderwood@sigmoid.social
2023-04-22T17:18:20Z
0 likes, 0 repeats
@scott_bot If Byronic Romanticism were a drug advertised on TV, they would have to read quickly through a lot of fine print.
(DIR) Post #AUuvblo5BqNUHNq7DE by afamiglietti79@mastodon.social
2023-04-22T17:35:04Z
0 likes, 0 repeats
@TedUnderwood @scott_bot I know just enough about compilers to know this is just gonna be one more layer on the vast pile labeled "things people created but don't really understand"
(DIR) Post #AUvNOACAVzCma6PDe4 by scott_bot@hcommons.social
2023-04-22T22:46:21Z
0 likes, 0 repeats
@TedUnderwood Anyway coming back to this, this is pretty squarely leaking into "thou shalt not make unto thee any graven image, or any likeness of any thing that is in heaven above" territory, which is my cue to back away slowly.
(DIR) Post #AUvm67vygGtdD5KSDQ by TedUnderwood@sigmoid.social
2023-04-23T03:23:15Z
0 likes, 0 repeats
@scott_bot 19c poets and mad scientists forgot to sign that pledge, I’m afraid