[HN Gopher] Large models of what? Mistaking engineering achievem...
       ___________________________________________________________________
        
       Large models of what? Mistaking engineering achievements for
       linguistic agency
        
       Author : Anon84
       Score  : 44 points
       Date   : 2024-07-16 10:54 UTC (4 days ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | mnkv wrote:
        | Good summary of some of the main "theoretical" criticisms of
        | LLMs, but I feel it's a bit dated and ignores the recent trend
        | of iterative post-training, especially with human feedback.
        | Major chatbots are no doubt being iteratively refined on
        | feedback from users, i.e. interaction feedback, RLHF, RLAIF. So
        | ChatGPT could fall within the sort of "enactive" perspective on
        | language, and it definitely goes beyond the issues of static
        | datasets and data completeness.
       | 
       | Sidenote: the authors make a mistake when citing Wittgenstein to
       | find similarity between humans and LLMs. Language modelling on a
       | static dataset is mostly _not_ a language game (see Bender and
        | Koller's section on distributional semantics and the caveats on
        | learning meaning from "control codes").
        
         | dartos wrote:
         | FWIW even more recently, models have been tuned using a method
         | called DPO instead of RLHF.
         | 
         | IIRC DPO doesn't have human feedback in the loop
        
           | valec wrote:
           | it does. that's what the "direct preference" part of DPO
            | means. you just avoid training an explicit reward model on
            | the preference data like in rlhf, and instead directly
            | optimize the log probability of preferred vs dispreferred
            | responses
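            | 
            | a rough sketch of that objective for a single preference
            | pair, in plain python (names are illustrative, not from any
            | particular library):
            | 
            |   import math
            | 
            |   def dpo_loss(pi_logp_w, pi_logp_l,
            |                ref_logp_w, ref_logp_l, beta=0.1):
            |       # pi_* / ref_*: summed log-probs of the preferred
            |       # (w) and dispreferred (l) responses under the
            |       # policy being tuned and under a frozen reference
            |       # implicit "reward" = scaled log-ratio of the
            |       # policy vs the frozen reference
            |       r_w = beta * (pi_logp_w - ref_logp_w)
            |       r_l = beta * (pi_logp_l - ref_logp_l)
            |       # logistic loss on the margin, i.e.
            |       # -log(sigmoid(r_w - r_l)); no separate reward
            |       # model is ever fit
            |       return -math.log(1.0 / (1.0 + math.exp(r_l - r_w)))
            | 
            | so the human preference signal is still in the loop, it's
            | just folded directly into the loss instead of going through
            | a learned reward model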
        
       | mistrial9 wrote:
       | oh what a kettle of worms here... Now the mind must consider
       | "repetitive speech under pressure and in formal situations" in
        | contrast and comparison with "limited mechanical ability to
        | produce grammatical sequences of well-known words"... where is
        | the boundary there?
       | 
        | I am a fan of this paper, warts and all! (and the paper's
        | summary paragraph contains some atrocious grammar, btw)
        
       | Animats wrote:
       | Full paper: [1].
       | 
       | Not much new here. The basic criticism is that LLMs are not
       | embodied; they have no interaction with the real world. The same
       | criticism can be applied to most office work.
       | 
       | Useful insight: "We (humans) are always doing more than one
       | thing." This is in the sense of language output having goals for
       | the speaker, not just delivering information. This is related to
       | the problem of LLMs losing the thread of a conversation. Probably
       | the only reasonably new concept in this paper.
       | 
       | Standard rant: "Humans are not brains that exist in a vat..."
       | 
       | "LLMs ... have nothing at stake." Arguable, in that some LLMs are
       | trained using punishment. Which seems to have strong side
       | effects. The undesirable behavior is suppressed, but so is much
       | other behavior. That's rather human-like.
       | 
       | "LLMs Don't Algospeak". The author means using word choices to
       | get past dumb censorship algorithms. That's probably do-able, if
       | anybody cares.
       | 
       | [1] https://arxiv.org/pdf/2407.08790
        
       | KHRZ wrote:
       | That's a lot of thinking they've done about LLMs, but how much
       | did they actually try LLMs? I have long threads where ChatGPT
        | refines solutions to coding problems. Their example of losing
        | the thread after printing a tiny list of 10 philosophers seems
        | really outdated. It also seems LLMs utilize nested contexts, for
        | example when one breaks its own rules while telling a story or
        | speaking hypothetically.
        
       ___________________________________________________________________
       (page generated 2024-07-20 23:01 UTC)