[HN Gopher] Large models of what? Mistaking engineering achievem...
       ___________________________________________________________________
        
       Large models of what? Mistaking engineering achievements for
       linguistic agency
        
       Author : Anon84
       Score  : 182 points
       Date   : 2024-07-16 10:54 UTC (5 days ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | mnkv wrote:
        | Good summary of some of the main "theoretical" criticisms of
        | LLMs, but I feel that it's a bit dated and ignores the recent
        | trend of iterative post-training, especially with human feedback.
        | Major chatbots are no doubt being iteratively refined on feedback
        | from users, i.e. interaction feedback, RLHF, RLAIF. So ChatGPT
        | could fall within the sort of "enactive" perspective on language
        | and definitely goes beyond the issues of static datasets and data
        | completeness.
       | 
       | Sidenote: the authors make a mistake when citing Wittgenstein to
       | find similarity between humans and LLMs. Language modelling on a
        | static dataset is mostly _not_ a language game (see Bender and
        | Koller's section on distributional semantics and caveats on
        | learning meaning from "control codes")
        
         | dartos wrote:
         | FWIW even more recently, models have been tuned using a method
         | called DPO instead of RLHF.
         | 
         | IIRC DPO doesn't have human feedback in the loop
        
           | valec wrote:
           | it does. that's what the "direct preference" part of DPO
           | means. you just avoid training an explicit reward model on it
           | like in rlhf and instead directly optimize for log
           | probability of preferred vs dispreferred responses
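            | 
            | concretely, the loss is just something like this (a rough
            | PyTorch sketch, assuming you already have summed per-response
            | log-probs from the policy and from a frozen reference model;
            | the names and beta value are only illustrative):
            | 
            | ```
            | import torch.nn.functional as F
            | 
            | def dpo_loss(pi_chosen, pi_rejected,
            |              ref_chosen, ref_rejected, beta=0.1):
            |     # implicit "reward" = log-prob ratio against
            |     # the frozen reference model
            |     r_c = beta * (pi_chosen - ref_chosen)
            |     r_r = beta * (pi_rejected - ref_rejected)
            |     # push preferred responses above dispreferred
            |     return -F.logsigmoid(r_c - r_r).mean()
            | ```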
        
             | meroes wrote:
              | What is it called when humans interact with a model through
              | lengthy exchanges (mostly humans correcting the model's
              | responses to a posed question, through chat, labeling each
              | of the model's statements as correct or not), and then all
              | of that text (possibly with some editing) is used to train
              | another, more capable model?
             | 
             | Does this have a specific name?
        
               | dartos wrote:
               | I don't think that process has a specific name. It's just
               | how training these models works.
               | 
                | Conversations you have with, say, ChatGPT are likely
                | stored, then sorted through somehow, then added to an
                | ever-growing dataset of conversations that would be used
                | to train entirely new models.
        
           | hackernewds wrote:
            | DPO essentially does have human feedback; it depends on what
            | the preference optimizations are
        
       | mistrial9 wrote:
        | oh what a kettle of worms here... Now the mind must consider
        | "repetitive speech under pressure and in formal situations" in
        | contrast and comparison to "limited mechanical ability to produce
        | grammatical sequences of well-known words"... where is the
        | boundary there?
        | 
        | I am a fan of this paper, warts and all! (and the paper's summary
        | paragraph contained some atrocious grammar, btw)
        
       | Animats wrote:
       | Full paper: [1].
       | 
       | Not much new here. The basic criticism is that LLMs are not
       | embodied; they have no interaction with the real world. The same
       | criticism can be applied to most office work.
       | 
       | Useful insight: "We (humans) are always doing more than one
       | thing." This is in the sense of language output having goals for
       | the speaker, not just delivering information. This is related to
       | the problem of LLMs losing the thread of a conversation. Probably
       | the only reasonably new concept in this paper.
       | 
       | Standard rant: "Humans are not brains that exist in a vat..."
       | 
       | "LLMs ... have nothing at stake." Arguable, in that some LLMs are
       | trained using punishment. Which seems to have strong side
       | effects. The undesirable behavior is suppressed, but so is much
       | other behavior. That's rather human-like.
       | 
       | "LLMs Don't Algospeak". The author means using word choices to
       | get past dumb censorship algorithms. That's probably do-able, if
       | anybody cares.
       | 
       | [1] https://arxiv.org/pdf/2407.08790
        
         | ainoobler wrote:
         | The optimization process adjusts the weights of a computational
         | graph until the numeric outputs align with some baseline
         | statistics of a large data set. There is no "punishment" or
         | "reward", gradient descent isn't even necessary as there are
         | methods for modifying the weights in other ways and the
         | optimization still converges to a desired distribution which
         | people claim is "intelligent".
         | 
         | The converse is that people are "just" statistical
         | distributions of the signals produced by them but I don't know
         | if there are people who claim they are nothing more than
         | statistical distributions.
         | 
         | I think people are confused because they do not really
         | understand how software and computers work. I'd say they should
         | learn some computability theory to gain some clarity but I
         | doubt they'd listen.
        
           | bubblyworld wrote:
           | If you really want to phrase it that way, organisms like us
           | are "just" distributions of genes that have been pushed this
           | way and that by natural selection until they converged to
           | something we consider intelligent (humans).
           | 
           | It's pretty clear that these optimisation processes lead to
           | emergent behaviour, both in ML and in the natural sciences.
           | Computability theory isn't really relevant here.
        
             | ainoobler wrote:
             | I don't even know where to begin to address your confusion.
             | Without computability theory there are no computers, no
             | operating systems, no networks, no compilers, and no high
             | level frameworks for "AI".
        
               | bubblyworld wrote:
               | Well, if you want to address my "confusion" then pick
               | something and start there =)
               | 
               | That is patently false - most of those things are firmly
               | in the realm of engineering, especially these days.
               | Mathematics is good for grounding intuition though. But
               | why is this relevant to the OP?
        
               | ainoobler wrote:
               | There is no reason to do any of that because according to
               | your own logic AI can do all of it. You really should sit
               | down and ponder what exactly you get out of equating
               | Turing machines with human intelligence.
        
               | bubblyworld wrote:
               | Sorry, I edited my reply because I decided going down
               | that rabbit hole wasn't worth it. Didn't expect you to
               | reply immediately.
               | 
               | I'm not equating anything here, just pointing out that
               | the fact that AI runs in software isn't a knockdown
               | argument against anything. And computability theory
               | certainly has nothing useful to say in that regard.
        
               | ainoobler wrote:
               | Right.
        
               | bubblyworld wrote:
               | Well, you know, elaborate and we can have a productive
               | discussion. The way you keep appealing to computability
               | theory as a black box makes me think you haven't actually
               | studied that much of it.
        
               | ainoobler wrote:
               | Not much to discuss.
        
       | KHRZ wrote:
       | That's a lot of thinking they've done about LLMs, but how much
       | did they actually try LLMs? I have long threads where ChatGPT
        | refines solutions to coding problems. Their example of losing
        | the thread after printing a tiny list of 10 philosophers seems
        | really outdated. Also, it seems LLMs utilize nested contexts as
        | well, for example when they can break their own rules while
        | telling a story or speaking hypothetically.
        
         | tkgally wrote:
         | For a paper submitted on July 11, 2024, and with several
         | references to other 2024 publications, it is indeed strange
         | that it gives ChatGPT output from April 2023 to demonstrate
         | that "LLMs lose the thread of a conversation with inhuman ease,
         | as outputs are generated in response to prompts rather than a
         | consistent, shared dialogue" (Figure 1). I have had many
         | consistent, shared dialogues with recent versions of ChatGPT
         | and Claude without any loss of conversation thread even after
         | many back-and-forths.
        
         | Der_Einzige wrote:
         | Most LLM critics (and singularity-is-near influencers) don't
         | actually use the systems enough to have relevant opinions about
          | them. The only really good sources of truth are the chatbot-
         | arena from lmsys and the comment section of r/localllama (I'm
         | quoting Karpathy), both are "wisdom of the crowd" and often the
         | crowd on r/localllama is getting that wisdom by spending hours
         | with one hand on the keyboard and another under their clothes.
        
       | GeneralMayhem wrote:
       | I am highly skeptical of LLMs as a mechanism to achieve AGI, but
       | I also find this paper fairly unconvincing, bordering on
       | tautological. I feel similarly about this as to what I've read of
       | Chalmers - I agree with pretty much all of the conclusions, but I
       | don't feel like the text would convince me of those conclusions
       | if I disagreed; it's more like it's showing me ways of explaining
       | or illustrating what I already believed.
       | 
       | On embodiment - yes, LLMs do not have corporeal experience. But
       | it's not obvious that this means that they cannot, a priori, have
       | an "internal" concept of reality, or that it's impossible to gain
       | such an understanding from text. The argument feels circular:
       | LLMs are similar to a fake "video game" world because they aren't
       | real people - therefore, it's wrong to think that they could be
       | real people? And the other half of the argument is that because
       | LLMs can only see text, they're missing out on the wider world of
       | non-textual communication; but then, does that mean that human
       | writing is not "real" language? This argument feels especially
       | weak in the face of multi-modal models that are in fact able to
       | "see" and "hear".
       | 
       | The other flavor of argument here is that LLM behavior is
       | empirically non-human - e.g., the argument about not asking for
       | clarification. But that only means that they aren't _currently_
        | matching humans, not that they _couldn't_.
       | 
       | Basically all of these arguments feel like they fall down to the
       | strongest counterargument I see proposed by LLM-believers, which
       | is that sufficiently advanced mimicry is not only
       | indistinguishable from the real thing, but at the limit in fact
        | _is_ the real thing. If we say that it's impossible to have true
       | language skills without implicitly having a representation of
       | self and environment, and then we see an entity with what appears
       | to be true language skills, we should conclude that that entity
       | must contain within it a representation of self and environment.
       | That argument doesn't rely on any assumptions about the mechanism
       | of representation other than a reliance on physicalism. Looking
       | at it from the other direction, if you assume that all that it
       | means to "be human" is encapsulated in the entropy of a human
       | body, then that concept is necessarily describable with finite
       | entropy. Therefore, by extension, there must be some number of
       | parameters and some model architecture that completely encode
       | that entropy. Questions like whether LLMs are the perfect
       | architecture or whether the number of parameters required is a
       | number that can be practically stored on human-manufacturable
       | media are _engineering_ questions, not philosophical ones: finite
       | problems admit finite solutions, full stop.
       | 
        | Again, that conclusion _feels_ wrong to me... but if I'm being
       | honest with myself, I can't point to why, other than to point at
       | some form of dualism or spirituality as the escape hatch.
        
         | abernard1 wrote:
         | > LLMs do not have corporeal experience. But it's not obvious
         | that this means that they cannot, a priori, have an "internal"
         | concept of reality, or that it's impossible to gain such an
         | understanding from text.
         | 
         | I would argue it is (obviously) impossible the way the current
         | implementation of models work.
         | 
          | How could a system which produces a single next word based upon
          | a likelihood and a parameter called a "temperature" have a
          | conceptual model underpinning it? Even theoretically?
         | 
         | Humans and animals have an obvious conceptual understanding of
         | the world. Before we "emit" a word or a sentence, we have an
         | idea of what we're going to say. This is obvious when talking
         | to children, who know something and have a hard time saying it.
         | Clearly, language is not the medium in which they think or
         | develop thoughts, merely an imperfect (and often humorous)
         | expression of it.
         | 
         | Not so with LLMs!! Generative LLMs do not have a prior concept
         | available before they start emitting text. That the
         | "temperature" can chaotically change the output as the tokens
         | proceed just goes to show there is no pre-existing concept to
         | reference. It looks right, and often is right, but generative
         | systems are basically _always_ hallucinating: they do not have
          | any concepts at all. That they are "right" as often as they
         | are is a testament to the power of curve fitting and
         | compression of basis functions in high dimensionality spaces.
         | But JPEGs do the same thing, and I don't believe they have a
         | conceptual understanding of pictures.
        
           | GeneralMayhem wrote:
           | The argument would be that that conceptual model is encoded
           | in the intermediate-layer parameters of the model, in a
           | different but analogous way to how it's encoded in the graph
           | and chemical structure of your neurons.
        
             | abernard1 wrote:
             | I agree that's an argument. I would contend that argument
             | is obviously false. If it were true, LLMs could multiply
             | scalar numbers together trivially. It should be the easiest
             | thing in the world for them. The network required to do
             | that well is extremely small, the parameter sizes of these
             | models are gigantic, and the textual expression is highly
             | regular: multiplication is the simplest concept imaginable.
             | 
             | That they cannot do that basic task implies to me that they
             | have almost no conceptual understanding unless the fit is
             | almost memorizable or the space is highly regular. That
             | LLMs can't multiply numbers properly isn't surprising if
             | they don't really understand concepts prior to emitting
             | text. Where they do logical tasks, that can be done with
             | minimal or no understanding, because syllogisms and logical
             | formalisms are highly structured in text arguments.
        
               | GaggiX wrote:
                | Multiplication requires O(n^2) complexity with the usual
                | algorithm used by humans, while LLMs have a constant
                | amount of computation available per token, and they are
                | not really efficient machines for math evaluation. They
                | can definitely evaluate unseen expressions, though, and
                | you can train a neural network to learn how to do sums
                | and multiplications. I have trained models on sums and
                | they are able to do sums never seen during training; the
                | model learns the algorithm just from being given inputs
                | and outputs.
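                | 
                | (a toy sketch of that last claim - nothing like a
                | transformer, just a small MLP fit on some (a, b)
                | pairs and evaluated on pairs it never saw; names
                | and sizes here are arbitrary:)
                | 
                | ```
                | import torch, torch.nn as nn
                | import torch.nn.functional as F
                | 
                | # all pairs (a, b) with a, b in [0, 100)
                | xs = torch.cartesian_prod(torch.arange(100.),
                |                           torch.arange(100.))
                | ys = xs.sum(dim=1, keepdim=True)
                | idx = torch.randperm(len(xs))
                | tr, te = idx[:8000], idx[8000:]  # held out
                | 
                | net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                |                     nn.Linear(64, 1))
                | opt = torch.optim.Adam(net.parameters(), lr=1e-2)
                | for _ in range(3000):
                |     opt.zero_grad()
                |     F.mse_loss(net(xs[tr]), ys[tr]).backward()
                |     opt.step()
                | 
                | # error on sums never seen during training
                | print(F.mse_loss(net(xs[te]), ys[te]).item())
                | ```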
        
               | jdietrich wrote:
               | LLMs do contain conceptual representations and LLMs are
               | capable of abstract reasoning. This is trivially provable
               | by asking them to reason about something that is a)
               | purely abstract and b) not in the training data, e.g.
               | "All floots are gronks. Some gronks are klorps. Are any
               | floots klorps?" Any of the leading LLMs will correctly
               | answer questions of this type much more often than
               | chance.
        
               | LetsGetTechnicl wrote:
                | That is not an example of an LLM being capable of
                | abstract reasoning. Changing the question from "What is
                | the capital of the United States?", which is easily
                | answerable, to something completely abstract and "not in
                | the training data" doesn't change that LLMs are just very
                | advanced text prediction, and always will be. The nature
                | of their design means they are incapable of AGI.
        
               | jdietrich wrote:
               | The question I gave is a literal textbook example of
               | abstract reasoning. LLMs _are_ just very advanced text
               | prediction, but they are _also_ provably capable of
               | abstract reasoning. If you think that those statements
               | are contradictory, I would encourage you to read up on
               | the Bayesian hypotheses in cognitive science - it is
               | highly plausible that our brains are also just very
               | advanced prediction models.
        
               | nsagent wrote:
               | You're quite right that LLMs can seemingly do some
               | abstract reasoning problems, but I would not say they
               | aren't in the training data.
               | 
               | Sure, the exact form using the made up word gronk might
               | not be in the training data, but the general form of that
               | reasoning problem definitely exists, quite frequently in
               | fact.
        
               | jdietrich wrote:
               | Yes, but the general form of the problem tells you
               | nothing about the answer to any specific case. To perform
               | any better than chance, the model has to actually reason
               | through the problem.
        
               | cgag wrote:
               | Have you seen this?
               | 
               | ``` You will be given a name of an object (such as Car,
               | Chair, Elephant) and a letter in the alphabet. Your goal
               | is to first produce a 1-line description of how that
               | object can be combined with the letter in an image (for
               | example, for an elephant and the letter J, the trunk of
               | the elephant can have a J shape, and for the letter A and
               | a house, the house can have an A shape with the upper
               | triangle of the A being the roof). Following the short
               | description, please create SVG code to produce this (in
               | the SVG use shapes like ellipses, triangles etc and
               | polygons but try to defer from using quadratic curves).
               | ```
               | 
               | ``` Round 5: A car and the letter E. Description: The car
               | has an E shape on its front bumper, with the horizontal
               | lines of the E being lights and the vertical line being
               | the license plate. ```
               | 
               | Image generated here: https://imgur.com/a/Ia4Q2h3
               | 
               | How does it "just" predict the letter E could be used in
               | such a way to draw a car? How does it just text predict
               | working SVG code that draws the car made out of basic
               | shapes and the letter E?
               | 
               | I don't know how anyone could suggest there are no
               | conceptual models embedded in there.
        
               | smolder wrote:
               | Pleasure and pain, along with subtler emotions that
               | regulate our behavior, aren't things that arise from word
               | prediction, or even from understanding the world, I don't
               | think. So to say human brains are _just_ prediction
               | models seems like a mischaracterization.
        
               | brookst wrote:
               | That's a tautology that seems just as applicable to
               | humans.
        
               | roenxi wrote:
                | > LLMs are just very advanced text prediction, and
                | always will be
               | 
               | How do you predict the next word in answering an abstract
               | logic question without being capable of abstract
               | reasoning, though?
               | 
               | In some sense it probably is possible, but this is a
               | gaping flaw in your argument. A sufficiently advanced
               | text prediction process has to encompass the process of
               | abstract reasoning. The text prediction problem is
               | necessarily a superset of the abstract reasoning problem.
               | Ie, in the limit text prediction is fundamentally harder
               | than abstract reasoning.
        
               | stirfish wrote:
               | I just asked chatgpt
               | 
               | "All floots are gronks. Some gronks are klorps. Are any
               | floots klorps?"
               | 
               | ------
               | 
               | To determine if any floots are klorps, let's analyze the
               | given statements:
               | 
               | 1. All floots are gronks. This means every floot falls
               | into the category of gronks. 2. Some gronks are klorps.
               | This means there is an overlap between the set of gronks
               | and the set of klorps.
               | 
               | Since all floots are included in the set of gronks and
               | some gronks are klorps, it is possible that some floots
               | are klorps. However, we cannot conclusively say that any
               | floots are klorps without additional information. It is
               | only certain that if there is any overlap between floots
               | and klorps, it is possible, but not guaranteed, that some
               | floots are klorps.
        
               | card_zero wrote:
               | Huh, almost right. ("possible, but not guaranteed?" it's
               | necessarily true. That whole sentence was a waste of
               | space, and wrong.)
               | 
               | Edit: I mean "if there is any overlap", it's necessarily
               | true. I should have quoted the whole thing.
        
               | jdietrich wrote:
               | Nope, ChatGPT was right, the answer is indeterminable.
               | The klorps that are gronks could be a wholly distinct
               | subset to the klorps that are floots. It also correctly
               | evaluates "All gronks are floots. Some gronks are klorps.
               | Are any floots klorps?", to which the answer is
               | definitively yes.
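                | 
                | (For anyone keeping score, a brute-force sanity
                | check over a tiny universe - a toy sketch with
                | made-up predicates - agrees that the conclusion
                | does not follow from the premises:)
                | 
                | ```
                | from itertools import product
                | 
                | def entailed():
                |     # search a 3-element domain for a model where
                |     # the premises hold but the conclusion fails
                |     for bits in product([0, 1], repeat=9):
                |         f, g, k = bits[0:3], bits[3:6], bits[6:9]
                |         p1 = all(y for x, y in zip(f, g) if x)
                |         p2 = any(x and y for x, y in zip(g, k))
                |         c = any(x and y for x, y in zip(f, k))
                |         if p1 and p2 and not c:
                |             return False  # countermodel found
                |     return True
                | 
                | print(entailed())  # False
                | ```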
        
               | card_zero wrote:
               | > The klorps that are gronks could be a wholly distinct
               | subset to the klorps that are floots.
               | 
               | So? It's still the case that "if there is any overlap
               | between floots and klorps," it _is_ "guaranteed, that
               | some floots are klorps." It's tautological.
               | 
               | Unless there's a way to read "overlap" so that it doesn't
               | mean "some of one category are also in the other
               | category, and vice versa"?
               | 
                | Oh, when I said "it's necessarily true" I was referring to
               | this last sentence of the output, not the question posed
               | in the input. Hence we are at cross purposes I think.
        
               | wonnage wrote:
               | Or maybe they're just pattern matching on the very
               | particular sentence structure you've chosen. This isn't a
               | convincing example at all
        
               | jdietrich wrote:
                | This isn't something I _should_ need to convince you of.
                | Just open up ChatGPT or Claude and try it for yourself.
                | Think up a batch of your own questions and see how a
                | modern LLM fares. I assure you that it'll do much better
                | than chance. If you're so inclined, you can run enough
                | tests to achieve statistical significance in the course
                | of your lunch break.
               | 
               | It depresses me that we seem to be spending more time
               | arguing and hypothesising about LLMs than empirically
               | testing them. The question of whether LLMs can think is
               | completely settled, as their performance at zero-shot
               | problems is simply impossible through pure memorisation
               | or pattern-matching. The question that remains is far
               | more interesting - _how_ do they think?
               | 
               | https://arxiv.org/pdf/2205.11916
        
               | nickpsecurity wrote:
               | Given their training set, our hypothesis so far should be
               | that they're just tweaking things they've already seen by
               | applying a series of simple rules. They're still not
               | doing what human beings do. We have introspection,
               | creativity operating outside what we've seen, modeling
               | others' thoughts, planning in new domains, and so on. We
               | also operate without hallucination most of the time. I've
               | yet to see an A.I. do all of this reliably and
                | consistently, let alone do it without training input
                | similar to the output.
               | 
               | So, they don't just pattern match or purely memorize.
               | They do more than that. They do way less than humans.
               | Unlike humans, they also try to do everything with one or
               | a few components vs our (100-200?) brain components.
               | Crossing that gap might be achievable. It will not be
               | done by current architectures, though.
        
               | Zambyte wrote:
               | > If it were true, LLMs could multiply scalar numbers
               | together trivially.
               | 
               | FWIW most large models can do it better than I can in my
               | head.
        
               | og_kalu wrote:
               | >If it were true, LLMs could multiply scalar numbers
               | together trivially.
               | 
               | I mean, it's not like GPT-4 can't do this with more
               | accuracy than a human without a calculator.
        
             | nsagent wrote:
             | Using Occam's razor, that is less probable than the model
             | picking up on statistical regularities in human language,
             | especially since that's what they are trained to do.
        
               | mitthrowaway2 wrote:
               | That's hard to conclude from Occam's razor here. Or,
               | "statistical regularities" may have less explanatory
               | power than you think, especially if the simplest
               | statistical regularity is itself a fully predictive
               | understanding of the concept of temperature.
        
           | Davidzheng wrote:
            | It's only because you can essentially put the LLMs in a
            | simulation that you can have this argument. We can imagine
            | the human brain also in a simulation which we can replay over
            | and over again and adjust various parameters of the physical
            | brain to change the temperature. These sorts of arguments can
            | never distinguish between LLMs and humans.
        
           | gwervc wrote:
           | > generative systems are basically always hallucinating: they
           | do not have any concepts at all. That they are "right" as
           | often as they are is a testament to the power of curve
           | fitting and compression of basis functions in high
           | dimensionality spaces
           | 
           | It's refreshing to read someone who "got it". Sad that before
           | my upvote the comment was grayed out.
           | 
            | Any proponent of conceptual or other wishful/magical thinking
            | should come with proofs, since that is the hypothesis that
            | diverges from the definition of an LLM.
        
           | buu700 wrote:
           | On that point, I would dispute the premise that "it's
           | impossible to have true language skills without implicitly
           | having a representation of self and environment". I don't see
           | any contradiction between the following two ideas:
           | 
           | 1. LLMs inherently lack any form of consciousness, subjective
           | experience, emotions, or will
           | 
           | 2. A sufficiently advanced LLM with sufficient compute
           | resources would perform on par with human intelligence at any
           | given task, insofar as the task is applicable to LLMs
        
           | drdeca wrote:
           | > I would argue it is (obviously) impossible the way the
           | current implementation of models work.
           | 
            | > How could a system which produces a single next word based
            | upon a likelihood and a parameter called a "temperature"
            | have a conceptual model underpinning it? Even theoretically?
           | 
           | Any probability distribution over strings can theoretically
           | be factored into a product of such a "probability that next
           | token is x given that the text so far is y". Now, whether a
           | probability distribution over strings can _efficiently
           | computed_ in this form, is another question. But, if we are
           | being so theoretical that we don't care about the
           | computational cost (as long as it is finite), then the "it is
           | next token prediction" can't preclude anything which "it
           | produces a probability distribution over strings" doesn't
           | already preclude.
           | 
           | As for the temperature, given any probability distribution
           | over a discrete set, we can modify it by adding a temperature
           | parameter. Just take the log of the probabilities according
           | to the original probability distribution, scale them all by a
           | factor (the inverse of the temperature, I think. Either that
           | or the temperature, but I think it is the inverse of the
           | temperature.), then exponentiate each of these, and then
           | normalize to produce a probability distribution.
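            | 
            | (In code, that recipe is just a couple of lines - a small
            | numpy sketch, with T standing for the temperature:)
            | 
            | ```
            | import numpy as np
            | 
            | def with_temperature(p, T):
            |     # rescale log-probs by 1/T, then renormalize
            |     logits = np.log(p) / T
            |     z = np.exp(logits - logits.max())
            |     return z / z.sum()
            | 
            | p = np.array([0.7, 0.2, 0.1])
            | print(with_temperature(p, 0.5))  # sharper
            | print(with_temperature(p, 2.0))  # flatter
            | ```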
           | 
           | So, the fact that they work by next token prediction, and
           | have a temperature parameter, cannot imply any theoretical
           | limitation that wouldn't apply to any other way of expressing
           | a probability distribution over strings, as far as discussing
           | probability distributions in the abstract, over strings,
           | rather than talking about computational processes that
           | implement such probability distributions over strings.
           | 
            | But also like, going between P(next token is x | initial
            | string so far is y) and P(the string begins with z), isn't
            | _that_ computationally costly? Well, in one direction anyway.
            | Because like, P(next token is x | string so far is y) =
            | P(string begins with yx) / P(string begins with y).
            | 
            | Though, one might object to P(string starts with y) over
            | P(string _is_ y)?
        
           | fshbbdssbbgdd wrote:
           | > How could a system which produces a single next word based
            | upon a likelihood and a parameter called a "temperature"
           | have a conceptual model underpinning it? Even theoretically?
           | 
           | Could a creature that simply evolved to survive and reproduce
           | possibly have a conceptual model underpinning it? Model
           | training and evolution are very different processes, but they
           | are both ways of optimizing a physical system. It may be the
           | case that evolution can give rise to intelligence and model
           | training can't, but we need some argument to prove that.
        
           | bubblyworld wrote:
           | Transformer models _have_ been shown to spontaneously form
           | internal, predictive models of their input spaces. This is
           | one of the most pervasive misunderstandings about LLMs (and
           | other transformers) around. It is of course also true that
           | the quality of these internal models depends a lot on the
           | kind of task it is trained on. A GPT must be able to
           | reproduce a huge swathe of human output, so the internal
           | models it picks out would be those that are the most useful
           | for that task, and might not include models of common
           | mathematical tasks, for instance, unless they are common in
           | the training set.
           | 
           | Have a look at the OthelloGPT papers (can provide links if
           | you're interested). This is one of the reasons people are so
           | interested in them!
        
             | brnt wrote:
             | > can provide links if you're interested
             | 
             | Please do :)
        
               | persnickety wrote:
               | https://thegradient.pub/othello/
        
               | bubblyworld wrote:
               | Here's the paper on OthelloGPT's internal models I
               | mentioned: https://arxiv.org/abs/2309.00941
               | 
               | The references in that paper are also good reading!
        
           | IanCal wrote:
           | > How could a system which produces a single next word based
            | upon a likelihood and a parameter called a "temperature"
           | have a conceptual model underpinning it? Even theoretically?
           | 
            | You're limiting your view of their capabilities based on
            | their output format.
           | 
           | > Not so with LLMs!! Generative LLMs do not have a prior
           | concept available before they start emitting text.
           | 
            | How do you establish that? What do you think of OthelloGPT?
           | That seems to form an internal world model.
           | 
           | > That the "temperature" can chaotically change the output as
           | the tokens proceed
           | 
           | Changing the temperature _forcibly makes the model pick words
            | it thinks fit worse_. Of course it changes the output. It's
           | like an improv game with someone shouting "CHANGE!".
           | 
           | Let's make two tiny changes.
           | 
           | One, let's tell a model to use the format
           | 
           | <innerthought>askjdhas</innerthought> as the voice in their
           | head, and <speak>blah</speak> for the output.
           | 
           | Second, let's remove temperature and keep it at 0 so we're
           | not playing a game where we force them to choose different
           | words.
           | 
           | Now what remains of the argument?
        
         | Lerc wrote:
         | >that sufficiently advanced mimicry is not only
         | indistinguishable from the real thing, but at the limit in fact
         | is the real thing.
         | 
          | While "sufficiently" does a lot of the heavy lifting here, the
          | indistinguishability criterion implicitly means there must be
          | no way to tell that it is not the real thing. The belief that
          | it _is_ the real thing comes from the intuition that there
          | cannot be something that is everything a person must be and
          | yet lacks that fundamental essence of being a person. I don't
          | think people could really conceive of an alternative without
          | resorting to prejudice which they could equally apply to
          | machines or people.
         | 
         | I take the arguments such as in this paper to be instead making
         | the claim that because X cannot be Y you will never be able to
         | make X indistinguishable from Y. It is more a prediction of
         | future failure than a judgment on an existing thing.
         | 
         | I end up looking at some of these complaints from the point of
         | view of my sometimes profession of Game Developer. When I show
         | someone a game in development to playtest they will find a
         | bunch of issues. The vast majority of those issues, not only am
         | I already aware of, but I have a much more detailed perspective
         | of what the problem is and how it might be fixed. I have been
         | seeing the problem, over and over, every day as I work. The
         | problem persists because there are other things to do before
         | fixing the issue, some of which might render the issue
         | redundant anyway.
         | 
          | I feel like a lot of the criticisms of AI are like this: they
          | are like the playtesters pointing out issues in the current
          | state, where those working on the problems are generally well
          | aware of the particular issues and have a variety of solutions
          | in mind that might help.
         | 
         | Clear statements of deficiencies in ability are helpful as a
         | guide to measure future success.
         | 
          | I'm also in the camp that LLMs cannot be an AGI on their own;
          | on the other hand, I do think the architecture might be
          | extended to become one. There is an easy out for any criticism
          | to say, "Well, it's not an LLM anymore".
         | 
          | In a way that ends up with a lot of people saying:
          | 
          | - The current models cannot do the things we know the current
          | models cannot do
          | 
          | - Future models will not be able to do those things if they
          | are the same as the current ones
          | 
          | - Therefore the things that will be able to do those things
          | will be different
         | 
         | That _is_ true, but hardly enlightening.
        
           | et1337 wrote:
           | > Future models will not be able to do those things if they
           | are the same as the current ones
           | 
           | I think a lot of people disagree with this. People think if
           | we just keep adding parameters and data, magic will happen.
           | That's kind of what happened with ChatGPT after all.
        
             | Lerc wrote:
             | I'm not so sure that view is very widespread amongst people
             | familiar with how LLMs work. Certainly they become more
             | capable with parameters and data, but there are fundamental
             | things that can't be overcome with a basic model and I
             | don't think anyone is seriously arguing otherwise.
             | 
             | For instance LLMs are pretty much stateless without their
             | context window. If you treat the raw generated output as
             | the first and final result then there is very little scope
             | for any advanced consideration of anything.
             | 
              | If you give it a nice long context, give it the ability to
              | edit that context or even access to a key-value function
              | interface, and then treat everything it says as internal
              | monologue except for anything in <aloud></aloud> tags,
              | which is what the user gets to see. There are plenty of
              | people who see AGI somewhere along that path, but once you
              | take a step down that path it's no longer "just an LLM":
              | the LLM is a component in a greater system.
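              | 
              | The filtering half of that is trivial plumbing, for what
              | it's worth - something along these lines (tag name as
              | above, everything else made up):
              | 
              | ```
              | import re
              | 
              | def user_visible(raw: str) -> str:
              |     # keep only what the model wrapped in <aloud>;
              |     # the rest is treated as internal monologue
              |     parts = re.findall(r"<aloud>(.*?)</aloud>",
              |                        raw, flags=re.S)
              |     return "\n".join(p.strip() for p in parts)
              | 
              | out = ("<aloud>Hi!</aloud> hmm, let me think... "
              |        "<aloud>The answer is 42.</aloud>")
              | print(user_visible(out))
              | ```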
        
               | lucianbr wrote:
               | Has anyone done the <aloud> thing, and achieved some
               | interesting results? Seems a pretty obvious thing to try,
               | but I never heard of anything like it.
        
               | thomashop wrote:
               | I've seen automated AI agents that can spend time
               | reflecting on themselves in a feedback loop. The model
               | alters its state over time and can call APIs.
               | 
               | You could equate saying something "aloud" to calling an
               | API.
        
               | Lerc wrote:
                | I noticed some examples from Anthropic's Golden Gate
                | Claude paper had responses starting with <scratchpad> for
               | the inverse effect. Suppressing the output to the end of
               | the paragraph would be an easy post processing operation.
               | 
               | It's probably better to have implicitly closed tags
               | rather than requiring a close tag. It would be quite easy
                | for an LLM to miss a close tag and be off in a dreamland.
               | 
               | Possibly addressing comments to the user or itself might
               | allow for considering multiple streams of thought
               | simultaneously. IRC logs would be decent training data
                | for it to figure out multi-voice conversations
               | (maybe)
        
               | imtringued wrote:
               | The problem with <aloud></aloud> is that you need the
               | internal monologue to not be subject to training loss,
               | otherwise the internal monologue is restricted to the
               | training distribution.
               | 
               | Something people don't seem to grasp is that the training
               | data mostly doesn't contain any reasoning. Nobody has
               | published brain activity recordings on the internet, only
               | text written in human language.
               | 
               | People see information, process it internally in their
               | own head which is not subject to any outside authority
               | and then serialize the answer to human language, which is
               | subject to outside authorities.
               | 
               | Think of the inverse. What if school teachers could read
               | the thoughts of their students and punish any student
               | that thinks the wrong thoughts. You would expect the
               | intelligence of the class to rapidly decline.
        
               | skybrian wrote:
                | That does sound invasive, but on the other hand, math
               | teachers do tell the kids to "show their work" for good
               | reasons. And the consent issues don't apply for LLM
               | training.
               | 
               | I wonder if the trend towards using synthetic, AI-
               | generated training data will make it easier to train
                | models that use <aloud> effectively? AIs could be
               | trained to use reasoning and show their work more than
               | people normally do when posting on the Internet. It's not
               | going to create information out of nothing, but it will
               | better model the distribution that the researchers want
               | the LLM to have, rather than taking distributions found
               | on the Internet as given.
               | 
               | It's not a natural distribution anyway. For example, I
               | believe it's already the case that people train AI with
               | weighted distributions - training more on Wikipedia, for
               | example.
               | 
               | My guess is that the quest for the best training data has
               | only just begun.
        
               | Lerc wrote:
               | I think you are looking at a too narrowly defined avenue
               | to achieve this effect.
               | 
                | There are multiple avenues to train a model to do this.
                | The simplest is a finetune on training examples where the
                | internal monologue is constructed so that it precedes the
                | <aloud> tag and provides additional reasoning before the
                | output.
                | 
                | I think there is also scope for pretraining with a mask
                | so the model does not attempt to predict certain things
                | in the stream (or so the loss for them is ignored, same
                | thing) - for example, time codes inserted into the data
                | stream. The model could then have an awareness of the
                | passing of time but would not generate time codes as a
                | prediction. Time codes could then be injected into the
                | context at inference time and it would be able to use
                | that data.
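                | 
                | (The masking part is the standard ignore_index
                | trick - a sketch, with the timecode mask assumed
                | to be given:)
                | 
                | ```
                | import torch.nn.functional as F
                | 
                | # logits: (B, T, V); targets: (B, T)
                | # is_timecode marks positions the model should
                | # never be asked to predict
                | def masked_lm_loss(logits, targets, is_timecode):
                |     tgt = targets.masked_fill(is_timecode, -100)
                |     return F.cross_entropy(
                |         logits.reshape(-1, logits.size(-1)),
                |         tgt.reshape(-1), ignore_index=-100)
                | ```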
        
           | skybrian wrote:
           | One of the issues here is that future-focused discussions
           | often lead to wild speculation because we don't know the
           | future. Also, there's often too much confidence in people's
           | preferred predictions (skeptical or optimistic) and it would
           | be less heated if we admitted that we don't know how things
           | will look even a couple of years out, and alternative
           | scenarios are reasonable.
           | 
           | So I think you're right, it's not enlightening. Criticism of
           | overconfident predictions won't be enlightening if you
           | already believe that they're overconfident and the future is
           | uncertain. Conversations might be more interesting if not so
           | focused on bad arguments of the other side.
           | 
           | But perhaps such criticism is still useful. How else do you
           | deflate excessive hype or skepticism?
        
         | brookst wrote:
         | > sufficiently advanced mimicry is not only indistinguishable
         | from the real thing, but at the limit in fact is the real thing
         | 
         | I am continually surprised at how relevant and _pervasive_ one
         | of Kurt Vonnegut's major insights is: "we are what we pretend
         | to be, so we must be very careful about what we pretend to be"
        
           | Der_Einzige wrote:
            | This idea is older than him by a lot
           | 
           | https://en.wikipedia.org/wiki/Life_imitating_art
           | 
           | Everyone in the "life imitates art, not the other way around"
           | camp (and also neo-platonists/gnostics i.e.
           | https://en.wikipedia.org/wiki/Demiurge ) is getting massively
           | validated by the modern advances in AI right now.
        
         | sriku wrote:
         | The crux of the video game analogy seems to be that when you go
         | close to an object, the resolution starts blurring and the
         | illusion gets broken, and there is a similar thing that happens
         | with LLMs (as of today) as well. This is, so far, reasonable
         | based on daily experience with these models.
         | 
         | The extension of that argument being made in the paper is that
         | a model trained on language tokens spewed by humans is
         | _incapable_ of actually reaching that limit where this illusion
          | will _never_ break down in resolution. That also seems
         | reasonable to me. They use the word  "languaging" in verb form
         | as opposed to "language" as a noun to express this.
        
           | 8n4vidtmkvmk wrote:
           | Why are LLMs incapable of reaching that limit? It's very easy
           | to imagine video games getting to that point. We have all the
           | data to see objects right down to the atomic level, which is
           | plenty more than you'd need for a game. It's mostly a matter
            | of compute. Why then should LLMs break down if they can at
           | least mimic the smartest humans? We don't need "resolution"
           | beyond that.
        
             | YeGoblynQueenne wrote:
              | If you're talking about machine-learnability of languages
              | then there are two frameworks that are relevant: Language
              | Identification in the Limit and PAC-Learning.
              | 
              | Language Identification in the Limit, in short, tells us
              | that if there is an automaton equivalent to human language
              | then, unless it belongs to a very restricted class
              | (essentially the finite languages), it cannot be
              | identified ("learned") from positive examples alone, no
              | matter how many; for regular automata and above, a supply
              | of negative examples approaching infinity is also needed
              | to identify it. Chomsky based his "Poverty of the
             | Stimulus" argument about linguistic nativism (the built-in
             | "language faculty" of humans) on this result, known as
             | Gold's Result after Mark E. Gold who proved it in the
             | setting of Inductive Inference in 1964. Gold's result is
             | not controversial, but Chomsky's use of it has seen no end
              | of criticisms, many from the computational linguistics
             | community (including people in it that have been great
             | teachers to me, without having ever met me, like Charniak,
             | Manning and Schutze, and Jurafsky and Martin) [1].
             | 
             | Those critics generally argue that human language can be
             | learned like everything and anything else: with enough data
             | drawn from a distribution assumed identical to the true
             | distribution of the data in the concept to be learned, and
             | allowing a finite amount of error with a given probability,
             | i.e. under Probably Approximately Correct Learning
             | assumptions, the learning setting introduced by Leslie
             | Valiant in 1984, that replaced Inductive Inference and that
             | serves as the theoretical basis of modern statistical
             | machine learning, in the rare cases where someone goes
             | looking for one. Around the same time that Valiant was
             | describing PAC-Learning, Vapnik and Chervonenkis were
             | developing their statistical learning theory behind the
             | Iron Curtain and if you take a machine learning course in
             | school you'll learn about the VC Dimension and wonder
             | what's that got to do with AI and LLMs.
             | 
             | The big question is how relevant is all this to a) human
             | language and b) learning human language with an LLM. Is
             | there an automaton that is equivalent to human language? Is
             | human language PAC-learnable (i.e. from a polynomial number
             | of examples)? There must be some literature on this in the
             | linguistics community, possibly the cognitive science
             | community. I don't see these questions asked or answered in
             | machine learning.
             | 
             | Rather, in machine learning people seem to assume that if
             | we throw enough data and compute at a problem it must
             | eventually go away, just like generals of old believed that
             | if they sacrifice enough men in a desperate assault they
             | will eventually take Elevation No. 4975 [2]. That's of
             | course ignoring all the cases in the past where throwing a
             | lot of data and compute at a problem either failed
             | completely -which we usually don't hear anything about
             | because nobody publishes negative results, ever- or gave
             | decidedly mixed results, or hit diminishing returns; as a
             | big example see DeepMind's championing of Deep
             | Reinforcement Learning as an approach to _real world_
             | autonomous behaviour, based on the success of the approach
              | in virtual environments. To be clear, that hasn't worked
             | out and DeepMind (and everyone else) has so far failed to
             | follow the glory of AlphaGo and kin with a real-world
             | agent.
             | 
             | So in short, yeah, there's a lot to say that we may never
             | have enough data and compute to achieve a good enough
             | approximation of human linguistic ability with a large
             | language model, or something even larger, bigger, stronger,
             | deeper, etc.
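              | 
              | For reference, the PAC criterion alluded to above, written
              | out (a sketch, in the usual notation: target concept c,
              | hypothesis h_S learned from m i.i.d. samples S drawn from
              | distribution D):
              | 
              |     \Pr_{S \sim D^m}[\, \mathrm{err}_D(h_S) \le \epsilon \,]
              |         \ge 1 - \delta,
              |     \quad \text{with} \quad
              |     m \le \mathrm{poly}(1/\epsilon, 1/\delta, \mathrm{size}(c)),
              | 
              | i.e. polynomially many examples suffice to be probably
              | (with confidence 1 - delta) approximately (to within error
              | epsilon) correct.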
             | 
             | __________________
             | 
             | [1] See: https://languagelog.ldc.upenn.edu/myl/ldc/swung-
             | too-far.pdf for a history of the debate.
             | 
             | [2] https://youtu.be/MWS5MfJUbUg?si=qovJBV1sFDbjJf19
        
             | corimaith wrote:
              | That depends on whether you believe natural language alone
              | is sufficient to fully model reality. Probably not: it can
              | approximate it to a high degree, but there is a reason we
              | resort to formal, constructed languages in math or CS to
              | express our ideas.
        
               | TeMPOraL wrote:
               | LLMs aren't trained solely on natural language. They also
               | ingest formal notation from every domain and at every
               | level (from preschool to PhD); they see code and markup
               | in every language even remotely popular. They see various
               | encodings, binary dumps, and nowadays also diagrams. The
                | training data has all that's needed to teach them a great
               | many formal languages and how to use them.
        
         | exe34 wrote:
          | > I feel similarly about this as to what I've read of
         | Chalmers - I agree with pretty much all of the conclusions, but
         | I don't feel like the text would convince me of those
         | conclusions if I disagreed;
         | 
         | my limited experience of reading Chalmers is that he doesn't
         | actually present evidence - he goes on a meandering rant and
         | then claims to have proved things that he didn't even cover. it
         | was the most infuriating read of my life, I heavily annotated
         | two chapters and then finally gave up and donated the book.
        
           | zoogeny wrote:
           | I haven't read any Chalmers so I can't comment on his writing
           | style. I have seen him in several videos on discussion panels
           | and on podcasts.
           | 
           | One thing I appreciate is he often states his premises, or
           | what modern philosophers seem to call "commitments". I
           | wouldn't go so far as to say he uses air-tight logic to
           | reason from these premises/commitments to conclusions - but
           | at the least his reasoning doesn't seem to stray too far from
           | those commitments.
           | 
           | I think it would be fair to argue that not all of his
           | commitments are backed by physical evidence (and perhaps some
           | of them could be argued to go against some physical
           | evidence). And so you are free to reject his commitments and
           | therefore reject his conclusions.
           | 
           | In fact, I think the value of philosophers like Chalmers is
           | less in their specific commitments and conclusions and more
           | in their framing of questions. It can be useful to list out
           | his commitments and find out where you stand on each of them,
           | and then to do your own reasoning using logic to see what
           | conclusions your own set of commitments forces you into.
        
             | exe34 wrote:
              | Yeah, while reading the book he would keep saying things
              | that are factually wrong or just state that things are
              | impossible. Basically he builds the conclusion into the
              | premises and then discovers the conclusion as if he had
              | just defended it.
        
         | vouwfietsman wrote:
          | Isn't any formal "proof" or "reasoning" that shows that
          | something cannot be AGI inherently flawed, given that we have
          | a hard time formally describing what AGI is in the first
          | place?
         | 
         | Like your argument: embodiment is missing in LLMs, but is it
         | needed for AGI? Nobody knows.
         | 
          | I feel we first have to do a better job defining the basics of
          | intelligence; we can then define what it means to be an AGI,
          | and only then can we prove that something is, or is not, AGI.
         | 
          | It seems that we skipped step 1 because it's too hard, and
         | jumped straight to step 3.
        
           | GeneralMayhem wrote:
           | Yep, this is a big part of it. Intelligence and consciousness
           | are barely understood beyond "I'll know it when I see it",
           | which doesn't work for things you can't see - and in the case
           | of consciousness, most definitions are explicitly based on
           | concepts that are not only invisible but ineffable. And then
           | we have no solid idea whether these things we can't really
           | define, detect, or explain are intrinsically linked to each
           | other or have a causal relationship in either direction.
           | Almost any definition you pick is going to lead to some
           | unsatisfying conclusions vis a vis non-human animals or
           | "obviously not intelligent" forms of machine learning.
           | 
           | It's a real mess.
        
           | Vecr wrote:
           | AIXItl is a formally described AI. Not an AI you'd want, and
           | not an AI you could really build, but it's there.
        
         | dullcrisp wrote:
         | Everyone seems to want to discuss whether there's some
         | fundamental qualia preventing my toaster from being an AGI, but
         | no one is interested in acknowledging that my toaster isn't an
         | AGI. Maybe a larger toaster would be an AGI? Or one with more
         | precise toastiness controls? One with more wattage?
        
         | plasticeagle wrote:
         | There are many finite problems that absolutely do not admit
         | finite solutions. Full stop.
         | 
         | I think the deeper point of the paper is that you simply cannot
         | generate an intelligent entity by just looking at recorded
         | language. You can create a dictionary, and a map - but one must
         | not mistake this map for the territory.
        
           | mitthrowaway2 wrote:
           | The human brain is a finite solution, so we already have an
           | existence proof. That means a lot for our confidence in the
           | solvability of this kind of problem.
           | 
           | It is also not universally impossible to reconstruct a
           | function of finite complexity from only samples of its inputs
           | and outputs. It is sometimes possible to draw a map that is
           | an exact replica of the territory.
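            | 
            | A concrete toy example of that last point (my own sketch in
            | Python, not anything from the paper or thread): a polynomial
            | of known, finite degree is pinned down exactly by finitely
            | many input/output samples.
            | 
            |   # Minimal sketch: recover a hidden cubic exactly
            |   # from four of its input/output pairs.
            |   import numpy as np
            | 
            |   def f(x):
            |       return 2 * x**3 - x + 5  # the "territory"
            | 
            |   xs = np.array([0.0, 1.0, 2.0, 3.0])  # sample inputs
            |   ys = f(xs)                            # observed outputs
            | 
            |   # Four points determine a degree-3 polynomial uniquely.
            |   coeffs = np.polyfit(xs, ys, deg=3)
            |   print(np.round(coeffs, 6))  # ~[2. 0. -1. 5.]: exact "map"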
        
             | plasticeagle wrote:
             | Trying to recreate a "human brain" is an absolutely
             | terrible idea - and is not something we should even
             | attempt. The consequences of success are terrible.
             | 
             | They're not really trying to create a human brain, so far
             | as I can tell. They're trying to create an oracle, by
             | feeding it all existing human utterances. This is certainly
             | not going to succeed, since the truth is not measurable
             | post-facto from these utterances.
             | 
              | The claim about reconstructing functions from samples of
              | their inputs and outputs is false. It's false both
              | mathematically, where "finite complexity" doesn't really
              | even have a rigorous definition, and metaphorically too.
             | 
             | Maps are never the territory.
        
               | mitthrowaway2 wrote:
               | Sometimes maps are the territory, especially when the
               | territory that is being mapped is itself a map. An
               | accurate map of a map can be a copy of the map that it
               | maps. The human brain's concept of reality is not
               | reality, it's a map of reality. A function trained to
               | predict human outputs can itself contain a map which is
               | arbitrarily similar to the map that a human carries in
               | their own head.
               | 
               | (Finite complexity is rigorously definable, it's just
               | that the definition is domain-specific).
        
         | YeGoblynQueenne wrote:
         | >> Again, that conclusion feels wrong to me... but if I'm being
         | honest with myself, I can't point to why, other than to point
         | at some form of dualism or spirituality as the escape hatch.
         | 
          | I like how Chomsky, who doesn't have any spirituality at all
          | (the big degenerate materialist), deals with it:
          | 
          |  _As far as I can see all of this [he's speaking about the
          | Loebner Prize and the Turing test in general] is entirely
          | pointless. It's like asking how we can determine empirically
          | whether an aeroplane can fly, the answer being: if it can fool
          | someone into thinking that it's an eagle under some
          | conditions._
         | 
         | https://youtu.be/0hzCOsQJ8Sc?si=MUXpmIwAzcla9lvK&t=2052
         | 
         | (My transcript)
         | 
         | He's right, you know. It should be possible to tell whether
         | something is intelligent just as easily as it is to say that
         | something is flying. If there are endless arguments about it,
         | then it's probably not intelligent (yet). Conversely, if
         | everyone can agree it is intelligent then it probably is.
        
           | persnickety wrote:
           | I can't disagree more. Or maybe I actually agree.
           | 
           | Because it's not easy to tell whether something is flying.
           | Definitions like that fall apart every time we encounter
           | something out of the ordinary. If you take the criterion of
           | "there's no discussion about it", then you're limiting the
           | definition to that which is familiar, not that which is
           | interesting.
           | 
           | Is an ekranoplan flying? Is an orbiting spaceship flying? Is
           | a hovercraft flying? Is a chicken flapping its wings over a
           | fence flying?
           | 
           | Your criterion would suggest the answer of "no" to any of
           | those cases, even though those cover much of the same use
           | cases as flying, and possibly some new, more interesting
           | ones.
           | 
           | And I don't think an AGI must be limited to the familiar
           | notion of intelligence to be considered an AGI, or, at the
           | very least, to open up avenues that were closed before.
        
             | misnome wrote:
             | "It's not flying, it's falling... with style"
        
               | YeGoblynQueenne wrote:
               | I always fall with style and I always do it on purpose :|
        
             | YeGoblynQueenne wrote:
             | There are going to be gray areas of course, but the point
             | I'm making is that if it's hard to argue something isn't
             | flying (respectively, intelligent) then it's probably
             | flying (resp. intelligent). If it's hard to tell then it's
             | probably not. I'm suggesting that intelligence, like
             | flying, should be very immediately obvious.
             | 
             | For example, you can't miss the fact that a five-year old
             | child is intelligent and you can't miss the fact that a
             | stone is not. There may be all sorts of things in between
             | for which we can't be sure, or whose intelligence depends
             | on definition, or point of view, etc. but when something is
             | intelligent then it should leave us no doubt that it is.
             | Or, if you want to see it this way: if something is as
             | intelligent as a five-year old child then it should leave
             | us no doubt that it is.
             | 
             | I'm basically arguing for placing the bar high enough that
             | when it is passed, we can be fairly certain we're not
             | mistaken.
             | 
             | >> I can't disagree more. Or maybe I actually agree.
             | 
             | I find myself in that disposition often :)
        
               | mitthrowaway2 wrote:
               | > when something is intelligent then it should leave us
               | no doubt that it is.
               | 
               | I strongly disagree. There are many reasons we might not
               | recognize its intelligence, such as:
               | 
               | - it operates on a different timescale than we do.
               | 
               | - it operates at a different size scale than we do.
               | 
               | - we don't understand its language, its methods, or its
               | goals
               | 
               | - Cartesian-like ideological blindness ("only humans have
               | experience, all other things are automata, no matter how
               | much they seem otherwise")
               | 
               | Throughout human history, certain people have even
               | managed to doubt the intelligence of other groups of
               | humans.
        
             | stoperaticless wrote:
             | > Your criterion would suggest the answer of "no" to any of
             | those cases, even though those cover much of the same use
             | cases as flying, and possibly some new, more interesting
             | ones.
             | 
              | Is it a problem, though? Their existence is unrelated to
              | how we categorize them.
             | 
             | That matters only in communication. "if everybody agrees"
             | lowers/removes the risk of miscommunication.
             | 
             | If "hovercraft is flying" for you, but not for 50% the
             | world, it makes it somewhat more difficult to communicate.
             | (Easily solved with some qualifications, but that requires
             | admission the questionability of "hovercraft is flying")
             | 
             | > you're limiting the definition to that which is familiar,
             | not that which is interesting.
             | 
              | You made an interesting point - good food for thought.
             | 
              | Counterpoint: it seems natural and useful that only
              | similar things get to share the same word.
             | 
             | > And I don't think an AGI must be limited ...
             | 
              | Could you expand on why it matters and what would be
              | impacted by such a lenient (or strict) classification?
        
               | persnickety wrote:
               | I think it matters merely by the way we set our
               | expectations relative to what is going to come - and what
               | has come already. I'm feeling an undercurrent of thought
               | that is implying: this is not X (intelligence,
               | understanding, whatever), so there's no need to consider
               | it seriously.
               | 
               | In the same vein:
               | https://eschwitz.substack.com/p/strange-intelligence-
               | strange...
        
               | stoperaticless wrote:
               | > I'm feeling an undercurrent of thought that is
               | implying: this is not X, so there's no need to consider
               | it seriously.
               | 
                | True. I doubt that field experts are directly affected by
                | the naming, but an indirect effect might come via less
                | knowledgeable (AI-wise) financial decision makers.
                | 
                | I see a risk that those decision makers (and society)
                | would be misled if they were promised AGI (based on
                | their "strict" understanding, what's in the movies), but
                | received AGI (based on the "relaxed" meaning). Informed
                | consent is usually good.
                | 
                | Though surely that can be resolved with more public
                | discourse; maybe the "relaxed" version will become the
                | default expectation.
        
         | epicfile wrote:
          | The only thing this paper proves is that folks at Trinity
          | College in Dublin are poor, envious, anthropocentric
          | drunkards, ready to throw every argument to defend their
          | crown of creation, without actually understanding the
          | linguistic concepts they use to make their argument.
        
         | Salgat wrote:
          | To me, LLMs seem to most closely resemble the regions of the
          | brain used for converting speech to abstract thought and
          | vice-versa, because LLMs are very good at generating natural
          | language and knowing the flow of speech. An LLM is like
          | taking Wernicke's and Broca's areas and sticking a regression
          | between them. The problem is that the regression in the
          | middle is just brute force over the entire world's knowledge
          | instead of a real thought.
        
           | randcraw wrote:
            | I think there are two major lessons from the success of
            | LLMs: 1) the astonishing power of a largely trivial
            | association engine based only on the semantic categories
            | inferred by word2vec-style embeddings, and 2) that so much
            | of the communicative ability of the human mind requires so
            | little rational thought (since LLMs demonstrate essentially
            | none of the skills of Kahneman and Tversky's System 2
            | thinking: logic, circumspection, self-correction,
            | reflection, etc.).
           | 
           | I guess this also disproves Minsky's 'Society of Mind'
           | conjecture - a large part of human cognition (System 1) does
           | not require the complex interaction of heterogeneous mental
           | components.
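            | 
            | To make 1) concrete, here is a toy sketch of the kind of
            | association such embeddings capture (my own illustration;
            | pretrained GloVe vectors stand in for word2vec, and gensim's
            | downloader fetches them on first use):
            | 
            |   # Sketch: semantic categories as vector offsets.
            |   import gensim.downloader as api
            | 
            |   # Small pretrained embeddings, downloaded on first use.
            |   vectors = api.load("glove-wiki-gigaword-50")
            | 
            |   # "king" - "man" + "woman" lands near "queen".
            |   print(vectors.most_similar(
            |       positive=["king", "woman"],
            |       negative=["man"], topn=3))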
        
             | GeneralMayhem wrote:
             | What makes this tough is that LLMs _can_ show logical
             | thinking and self-correction when specifically prompted
             | (e.g.  "think step by step", "double-check and then correct
             | your work"). It seems unlikely that they can truthfully
             | self-reflect, but I don't think it's strictly impossible.
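              | 
              | For concreteness, a rough sketch of that prompting pattern
              | using the OpenAI Python SDK (my own illustration; the
              | model name, wording, and question are placeholders):
              | 
              |   # Sketch: "think step by step", then have the model
              |   # double-check its own draft answer.
              |   from openai import OpenAI
              | 
              |   client = OpenAI()  # assumes OPENAI_API_KEY is set
              | 
              |   q = ("A bat and a ball cost $1.10 in total; the bat "
              |        "costs $1.00 more than the ball. "
              |        "How much does the ball cost?")
              | 
              |   def ask(messages):
              |       # model name is a placeholder, not a recommendation
              |       r = client.chat.completions.create(
              |           model="gpt-4o", messages=messages)
              |       return r.choices[0].message.content
              | 
              |   draft = ask([{"role": "user",
              |                 "content": q + " Think step by step."}])
              | 
              |   check = ask([
              |       {"role": "user", "content": q},
              |       {"role": "assistant", "content": draft},
              |       {"role": "user",
              |        "content": "Double-check and correct your work."},
              |   ])
              | 
              |   print(check)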
        
               | Terr_ wrote:
               | > LLMs can show logical thinking and self-correction
               | 
               | The same way they "show" sadness or contrition or
               | excitement?
               | 
                | We need to be careful with our phrasing here: LLMs can
                | be prompted to provide you with associated _phrases_
                | that usually seem to fit with the rest of the word-soup,
                | but whether the model is actually demonstrating "logical
                | thinking" or "self-correction" is a Chinese Room problem
                | [0]. (Or else a "No, it doesn't, I can tell because I
                | checked the code.")
               | 
               | [0] https://en.wikipedia.org/wiki/Chinese_room
        
         | zoogeny wrote:
         | > On embodiment - yes, LLMs do not have corporeal experience.
         | 
         | My own thought on this (as someone who believes embodiment is
         | essential) is to consider the rebuttals to Searle's Chinese
         | Room thought experiment.
         | 
          | For now (and for the foreseeable future) humans are the
          | embodiment of LLMs. In some sense, we could be seen as playing
          | the role of a centralized AI's nervous system.
        
           | TeMPOraL wrote:
           | Rebuttals of Chinese rooms _are_ also rebuttals of embodiment
           | as a requirement! To say the system of person+books speaks
           | Chinese is to say that good enough _emulation_ of a process
           | has all the qualities of the emulated process, and can
           | substitute for it. Embodiment then cannot be essential,
           | because we could emulate it instead.
        
         | belter wrote:
         | "Beyond the Hype: A Realistic Look at Large Language Models" -
         | https://news.ycombinator.com/item?id=41026484
        
       | dboreham wrote:
       | The first stage is denial.
        
         | nativeit wrote:
         | Well, I suppose that's rather convenient.
        
       | nativeit wrote:
       | I'm more or less a layperson when it comes to LLMs and this
       | nascent concept of AI, but there's one argument that I keep
       | seeing that I feel like I understand, even without a thorough
       | fluency with the underlying technology. I know that neural nets,
       | and the mechanisms LLMs employ to train and form relational
       | connections, can plausibly be compared to how synapses form
       | signal paths between neurons. I can see how that makes intuitive
       | sense.
       | 
       | I'm struggling to articulate my cognitive dissonance here, but is
       | there any empirical evidence that LLMs, or their underlying
       | machine learning technology, share anything at all with
       | biological consciousness beyond a convenient metaphor for
       | describing "neural networks" using terms borrowed from
       | neuroscience? I don't know that it necessarily follows that just
       | because something was inspired by, or is somehow mimicking, the
       | structure of the brain and its basic elements, that it should
       | necessarily relate to its modeled reality in any literal way, let
        | alone provide a sufficient basis for instantiating a phenomenon
        | we frankly know very little about. Not for nothing, but our
        | models naturally cannot replicate any biological functions we
        | do not fully understand. We haven't managed to reproduce
        | biological tissues that are vastly less complex than the brain;
        | are we really claiming that we're just jumping straight past
        | lab-grown T-bones to intelligent minds?
       | 
       | I'm sure most of the people reading this will have seen Matt
        | Parker's videos where they "teach" matchboxes to win a game
        | against humans. Is anyone suggesting those matchboxes, given
        | infinite time and repetition, would eventually spark emergent
       | consciousness?
       | 
       | > The argument would be that that conceptual model is encoded in
       | the intermediate-layer parameters of the model, in a different
       | but analogous way to how it's encoded in the graph and chemical
       | structure of your neurons.
       | 
       | Sorry if I have misinterpreted anyone. I honestly thought all the
       | "neuron" and "synapse" references were handy metaphors to explain
       | otherwise complex computations that resemble this conceptual idea
        | of how our brains work. But it reads a lot like some of the
        | folks in this thread believe it's not merely a metaphor but a
        | literal analog.
        
         | obirunda wrote:
         | I don't think anyone in research actually believes this. Note
         | that the whole idea behind claiming "scaling laws" will
         | infinitely improve these models is a funding strategy rather
          | than a research one. None of these folks think human-like
          | consciousness will "arise" from this effort, even though they
          | veil that view to keep the hype cycle going. I guarantee all
          | these firms are desperately looking for architectural
          | breakthroughs; even while they wax poetic about scaling laws,
          | they know there is a bottleneck ahead.
         | 
          | Notice how LeCun is the only researcher being honest about
          | this in a public fashion. Meta is committed to AI already and
          | will at least match the spend of competitors anyway, so he
          | doesn't have as much pressure to try and convince investors
          | that this rabbit hole goes deeper.
         | 
          | Don't get me wrong, LLMs are a tremendous improvement in
          | knowledge compression and distillation, but they're still
          | unreliable enough that old-school search is likely a superior
          | method nonetheless.
        
           | xpe wrote:
           | I don't hold LeCun's opinions in high regard because of his
           | often hyperbolic statements.
        
           | xpe wrote:
            | Put aside consciousness or hype or investment. Look at the
            | results: LLMs are well beyond old-school search in many
            | ways. Sure, they are flawed in some ways. Previous paradigms
            | for search were also flawed in their own ways.
           | 
           | Look at the arc of NLP. Large language models fit the
           | pattern. One could even say that their development (next
           | token prediction with a powerful function approximator) is
           | obvious in hindsight.
        
             | obirunda wrote:
             | Honestly I don't disagree, I just think that humans tend to
             | anthropomorphize to such a high extent that there is a fair
             | bit of hyperbole promoting LLMs as more than they are. It's
             | my opinion that the big flaws LLMs currently present aren't
             | going to be overcome by scaling alone.
        
               | xpe wrote:
               | Scaling existing architectures (inference I mean) will
               | probably help a lot. Combine that with better training
               | and hybrid architectures, and I personally expect to see
               | continued improvement.
               | 
               | However, given the hype cycle, combined with broad levels
               | of ignorance of how LLMs work, it is an open question if
               | even amazing progress will impress people anymore.
        
         | throwthrowuknow wrote:
          | There isn't really any reason biological neurons should relate
          | to their modelled reality; what does a single cell care about
          | poetry, or even simple things like a chair?
        
           | bamboozled wrote:
            | A chair isn't only a chair; it can be a table, a bookshelf,
            | and many other things. The real world is hard.
        
         | xpe wrote:
         | I find discussions of consciousness even more taxing than
         | religion, free will, or politics.
         | 
         | With very careful discussion, there are some really interesting
         | concepts in play. This paper however does not strike me as
         | worth most people's time. Especially not regarding
         | consciousness.
        
       | kazinator wrote:
        | The authors of this paper are just another instance of people
        | with no connection to AI using the hype around it to attract
        | some kind of attention.
       | 
       | "Here is what we think about this current hot topic; please read
       | our stuff and cite generously ..."
       | 
        | > _Language completeness assumes that a distinct and complete
        | thing such as `a natural language' exists, the essential
        | characteristics of which can be effectively and comprehensively
        | modelled by an LLM_
       | 
       | Replace "LLM" by "linguistics". Same thing.
       | 
       | > _The assumption of data completeness relies on the belief that
       | a language can be quantified and wholly captured by data._
       | 
        | That's all a baby has, and the baby nonetheless becomes a native
        | speaker of the surrounding language. Language acquisition does
        | not imply totality of data: not every native speaker recognizes
        | exactly the same vocabulary or exactly the same set of grammar
        | rules.
        
         | IshKebab wrote:
         | Babies have feedback and interaction with someone speaking to
         | them. Would they learn to speak if you just dumped them in
         | front of a TV and never spoke to them? I'm not sure.
         | 
         | But anyway I agree with you. This is just a confused HN comment
         | in paper form.
        
           | xpe wrote:
           | I personally don't get much value out of the paper, but it is
           | orders of magnitude more substantive and thoughtful than a
           | median "confused Hacker News comment".
        
           | keybored wrote:
           | > Babies have feedback and interaction with someone speaking
           | to them. Would they learn to speak if you just dumped them in
           | front of a TV and never spoke to them? I'm not sure.
           | 
            | Feedback and interaction are not vital for acquisition in
            | second-language learning, at least according to one theory.
            | 
            | And if that's good enough for adults, it might be good
            | enough for sponge-brain babies.
           | 
           | https://en.wikipedia.org/wiki/Input_hypothesis
        
         | JohnKemeny wrote:
         | They are two researchers/assistant professors working with
          | cognitive science, psychology, and trustworthy AI. The paper
          | is peer-reviewed and has been accepted for publication in the
          | journal Language Sciences.
         | 
         | You should publish your critique of their research in that same
         | journal.
         | 
          | P.S. If you find any grave mistakes, you can contact the
          | editor-in-chief, who happens to be a linguist.
        
           | kazinator wrote:
           | > _You should publish your critique of their research in that
           | same journal._
           | 
           | No thanks; that would be at least twice removed from Making
           | Stuff.
           | 
           | (Once removed is writing about Making Stuff.)
        
             | lgas wrote:
             | One might argue that a critique itself is stuff.
        
           | lucianbr wrote:
           | An appeal to authority if ever there was one.
           | 
            | Their critique is written here, in plain English. Any fault
            | with it you can just mention. The "I won't read your comment
            | unless you get X journal to publish it" stance seems really
            | counterproductive. Presumably even the great Language
            | Sciences is not above making mistakes or publishing things
            | that are not perfect.
        
             | YeGoblynQueenne wrote:
             | >> An appeal to authority if ever there was one.
             | 
             | I read it as a clear refutation of the assertion that the
             | authors "have no connection" to AI (or to AI hype; unclear
             | from the OP).
             | 
              | Btw, the OP is a typical ad hominem, drawing attention to
              | who is speaking rather than to what they're saying.
        
           | mquander wrote:
           | The "efficient journal hypothesis" -- if something is written
           | in a paper in a journal, then it's impossible for anyone to
           | know any better, since if they knew better, they would
           | already have published the correction in a journal.
        
         | xpe wrote:
         | Please argue on the merits and substance. I'm less interested
         | in speculation on the authors' motivations.
        
           | xpe wrote:
           | The parent comment I responded to is speculative and does not
           | argue on the merits. We can do better here.
           | 
           | Are there people who ride the hype wave of AI? Sure.
           | 
           | But how can you tell from where you sit? How do you come to
           | such a judgment? Are you being thoughtful and rational?
           | 
           | Have you considered an alternative explanation? I think the
           | odds are much greater that the authors' academic
           | roots/training is at odds with what you think is productive.
           | (This is what I think, BTW. I found the paper to be a waste
           | of my time. Perhaps others can get value from it?)
           | 
           | But I don't pretend to know the authors' motivations, nor
           | will I cast aspersions on them.
           | 
           | When one casts shade on a person like the comment above did,
           | one invites and deserves this level of criticism.
        
       | amne wrote:
        | tl;dr: we're duck-typing LLMs as AGI
        
       | Simon_ORourke wrote:
        | Where I work, there's a somewhat haphazardly divided org
        | structure, where my team has some responsibility for answering
        | the executives' demands to "use AI to help our core business".
        | So we applied off-the-shelf models to extract structured context
        | from mostly unstructured text - effectively a data engineering
        | job - and thereby support analytics and create more dashboards
        | for the execs to mull over.
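        | 
        | (For flavour, the sort of extraction step involved, sketched in
        | Python; the schema, model name, and example text below are all
        | made up.)
        | 
        |   # Made-up sketch: structured records from free text.
        |   import json
        |   from openai import OpenAI
        | 
        |   client = OpenAI()  # assumes OPENAI_API_KEY is set
        | 
        |   note = ("Customer called 2024-07-03 about order #8841 "
        |           "arriving damaged; wants a refund.")
        | 
        |   resp = client.chat.completions.create(
        |       model="gpt-4o",  # placeholder model name
        |       response_format={"type": "json_object"},
        |       messages=[{
        |           "role": "user",
        |           "content": ("Return JSON with keys date, order_id, "
        |                       "issue, requested_action for: " + note),
        |       }],
        |   )
        | 
        |   # One row per document, ready for analytics/dashboards.
        |   row = json.loads(resp.choices[0].message.content)
        |   print(row)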
       | 
        | Another team, with a similar role in a different part of the
        | org, has jumped (feet first) into optimizing large language
        | models to turn them into agents, without consulting the business
        | about whether it needs such things. RAG, LoRA and all this
        | optimization is well and good, but that engineering focus has
        | found no actual application, except wasting several million
        | bucks hiring staff to do something nobody wants.
        
       | rramadass wrote:
       | See also _Beyond the Hype: A Realistic Look at Large Language
       | Models * Jodie Burchell * GOTO 2024_ -
       | https://www.youtube.com/watch?v=Pv0cfsastFs
        
       | flimflamm wrote:
        | How would the authors consider a paralyzed individual who, since
        | birth, has only been able to move their eyes? That person can
        | learn the same concepts as other humans and communicate as
        | richly (using only their eyes) as other humans. Clearly, the
        | paper is viewing the problem very narrowly.
        
         | fairthomas wrote:
         | > _...a paralyzed individual who can only move their eyes since
         | birth..._
         | 
         | I don't think such an individual is possible.
        
           | throwthrowuknow wrote:
            | I didn't want to Google it for you because it always makes
            | me sad, but things like spina bifida and Moebius syndrome
            | exist. Not everyone gets to begin life healthy.
        
       | throwthrowuknow wrote:
       | "Enactivism" really? I wonder if these complaints will continue
       | as LLMs see wider adoption, the old first they ignore you, then
       | they ridicule you, then they fight you... trope that is halfways
       | accurate. Any field that focuses on building theories on top of
       | theories is in for a bad time.
       | 
       | https://en.m.wikipedia.org/wiki/Enactivism
        
         | beepbooptheory wrote:
         | What is the thing the LLMs are fighting for?
        
       | beepbooptheory wrote:
       | There is a lot of frustration here over what appears to be
       | essentially this claim:
       | 
       | > ...we argue that it is possible to offer generous
       | interpretations of some aspects of LLM engineering to find
       | parallels with human language learning. However, in the majority
       | of key aspects of language learning and use, most specifically in
       | the various kinds of linguistic agency exhibited by human beings,
       | these small apparent comparisons do little to balance what are
       | much more deep-rooted contrasts.
       | 
        | Now, why is this so hard to stomach? This is the argument of the
        | paper. To feel like _this_ extremely general claim is something
        | you have to argue against means you believe in a fundamental
        | similarity between our linguistic agency and the model. But is
        | embodied human agency something that you really need the LLMs
        | to have right now? Why? What are the stakes here? The ones
        | _actually related_ to the argument at hand?
       | 
        | This is ultimately not that strong a claim! To the point that
        | it's almost vacuous... Of course the LLM will never learn that
        | the stove is "hot" the way you did when you were a curious
        | child. How can this still be too much for someone to admit? What
        | is lost?
       | 
        | It makes me feel a little crazy that people constantly jump over
        | the text at hand whenever something gets a little too
        | philosophical, and the arguments become long pseudo-theories
        | that aren't relevant to the argument.
        
       | Royshiloh wrote:
        | Why assume you "know" what language is? Like there is a
        | study-backed insight into the ultimate definition of language?
        | It's the same as saying "oh, it's not 'a,b,c', it's 'x,y,z'",
        | which makes you as dogmatic as the one you critique. This is
        | absurd.
        
       ___________________________________________________________________
       (page generated 2024-07-21 23:10 UTC)