[HN Gopher] Large models of what? Mistaking engineering achievem...
___________________________________________________________________
Large models of what? Mistaking engineering achievements for
linguistic agency
Author : Anon84
Score : 44 points
Date : 2024-07-16 10:54 UTC (4 days ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| mnkv wrote:
| Good summary of some of the main "theoretical" criticisms of
| LLMs, but I feel it's a bit dated and ignores the recent trend
| of iterative post-training, especially with human feedback.
| Major chatbots are no doubt being iteratively refined on
| feedback from users (interaction feedback, RLHF, RLAIF), so
| ChatGPT could arguably fall within the sort of "enactive"
| perspective on language, and it definitely goes beyond the
| issues of static datasets and data completeness.
|
| Sidenote: the authors make a mistake when citing Wittgenstein
| to draw a similarity between humans and LLMs. Language
| modelling on a static dataset is mostly _not_ a language game
| (see Bender and Koller's section on distributional semantics
| and the caveats on learning meaning from "control codes").
| dartos wrote:
| FWIW even more recently, models have been tuned using a method
| called DPO instead of RLHF.
|
| IIRC DPO doesn't have human feedback in the loop
| valec wrote:
| it does. that's what the "direct preference" part of DPO
| means. you just avoid training an explicit reward model on it
| like in rlhf and instead directly optimize for log
| probability of preferred vs dispreferred responses
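|
| A minimal sketch of that objective in PyTorch-style Python
| (the function and argument names here are illustrative, not
| from any particular library):
|
|     import torch.nn.functional as F
|
|     def dpo_loss(policy_chosen_logps, policy_rejected_logps,
|                  ref_chosen_logps, ref_rejected_logps,
|                  beta=0.1):
|         # inputs: per-response summed log-probs (torch tensors)
|         # log-ratios of the policy vs. the frozen reference
|         chosen = policy_chosen_logps - ref_chosen_logps
|         rejected = policy_rejected_logps - ref_rejected_logps
|         # maximize the margin between preferred and
|         # dispreferred responses; beta controls how strongly
|         # the policy is kept close to the reference model
|         return -F.logsigmoid(beta * (chosen - rejected)).mean()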
| mistrial9 wrote:
| oh what a kettle of worms here... Now the mind must consider
| "repetitive speech under pressure and in formal situations" in
| contrast and comparison to "limited mechanical ability to
| produce grammatical sequences of well-known words"... where is
| the boundary there?
|
| I am a fan of this paper, warts and all! (and the paper's
| summary paragraph contained some atrocious grammar, btw)
| Animats wrote:
| Full paper: [1].
|
| Not much new here. The basic criticism is that LLMs are not
| embodied; they have no interaction with the real world. The same
| criticism can be applied to most office work.
|
| Useful insight: "We (humans) are always doing more than one
| thing." This is in the sense of language output having goals for
| the speaker, not just delivering information. This is related to
| the problem of LLMs losing the thread of a conversation. Probably
| the only reasonably new concept in this paper.
|
| Standard rant: "Humans are not brains that exist in a vat..."
|
| "LLMs ... have nothing at stake." Arguable, in that some LLMs are
| trained using punishment. Which seems to have strong side
| effects. The undesirable behavior is suppressed, but so is much
| other behavior. That's rather human-like.
|
| "LLMs Don't Algospeak". The author means using word choices to
| get past dumb censorship algorithms. That's probably do-able, if
| anybody cares.
|
| [1] https://arxiv.org/pdf/2407.08790
| KHRZ wrote:
| That's a lot of thinking they've done about LLMs, but how much
| did they actually try them? I have long threads where ChatGPT
| refines solutions to coding problems. Their example of losing
| the thread after printing a tiny list of 10 philosophers seems
| really outdated. It also seems LLMs can utilize nested
| contexts, for example when they break their own rules while
| telling a story or speaking hypothetically.
___________________________________________________________________
(page generated 2024-07-20 23:01 UTC)