[HN Gopher] Reasoning models don't always say what they think
       ___________________________________________________________________
        
       Reasoning models don't always say what they think
        
       Author : meetpateltech
       Score  : 395 points
       Date   : 2025-04-03 16:50 UTC (1 days ago)
        
 (HTM) web link (www.anthropic.com)
 (TXT) w3m dump (www.anthropic.com)
        
       | evrimoztamur wrote:
       | Sounds like LLMs short-circuit without necessarily testing their
       | context assumptions.
       | 
        | I also recognize this from whenever I ask it a question in a
        | field I'm semi-comfortable in: I guide the question in a manner
        | that already includes my expected answer. As I probe it, I often
        | find that it has taken my implied answer for granted and worked
        | out an explanation for it after the fact.
       | 
       | I think this also explains a common issue with LLMs where people
       | get the answer they're looking for, regardless of whether it's
       | true or there's a CoT in place.
        
         | jiveturkey wrote:
         | i found with the gemini answer box on google, it's quite easy
         | to get the answer you expect. i find myself just playing with
         | it, asking a question in the positive sense then the negative
         | sense, to get the 2 different "confirmations" from gemini. also
         | it's easily fooled by changing the magnitude of a numerical
         | aspect of a question, like "are thousands of people ..." then
         | "are millions of people ...". and then you have the now
         | infamous black/white people phrasing of a question.
         | 
         | i haven't found perplexity to be so easily nudged.
        
         | andrewmcwatters wrote:
         | This is such an annoying issue in assisted programming as well.
         | 
          | Say you're referencing a specification: you allude to two or
          | three specific values from it, you mention needing a
          | comprehensive list, and the LLM has been trained on that
          | specification.
         | 
         | I'll often find that all popular models will only use the
          | examples I've mentioned and will fail to enumerate even a few
          | more.
         | 
         | You might as well read specifications yourself.
         | 
          | It's a critical feature of these models that could be an easy
          | win. It's autocomplete! It's simple. And they fail to do it
          | every single time I've tried a task like this.
         | 
         | I laugh any time people talk about these models actually
         | replacing people.
         | 
         | They fail at reading prompts at a grade school reading level.
        
         | BurningFrog wrote:
         | The LLMs copy human written text, so maybe they'll implement
         | Motivated Reasoning just like humans do?
         | 
         | Or maybe it's telling people what they want to hear, just like
         | humans do
        
           | ben_w wrote:
            | They definitely tell people what they want to hear. Even when
            | we'd rather they be correct, they get upvoted or downvoted by
            | users, so this isn't avoidable (but is it fawning or
            | sycophancy?)
           | 
           | I wonder how deep or shallow the mimicry of human output is
           | -- enough to be interesting, but definitely not quite like
           | us.
        
       | jiveturkey wrote:
       | seemed common-sense obvious to me -- AI (LLMs) don't "reason".
       | great to see it methodically probed and reported in this way.
       | 
       | but i am just a casual observer of all things AI. so i might be
       | too naive in my "common sense".
        
       | zurfer wrote:
        | I recently had a fascinating example of that where Sonnet 3.7 had
        | to pick one option from a set of choices.
        | 
        | In the thinking process it narrowed things down to 2, and in the
        | last thinking section it settled on one, saying it was the best
        | choice.
        | 
        | However, in the final output (outside of thinking) it answered
        | with the other option, with no clear reason given.
        
       | thomassmith65 wrote:
       | One interesting quirk with Claude is that it has no idea its
       | Chain-of-Thought is visible to users.
       | 
       | In one chat, it repeatedly accused me of lying about that.
       | 
       | It only conceded after I had it think of a number between one and
       | a million, and successfully 'guessed' it.
        
         | reaperman wrote:
          | Edit: wahnfrieden corrected me. I incorrectly posited that CoT
         | was only included in the context window during the reasoning
         | task and later left out entirely. Edited to remove potential
         | misinformation.
        
           | monsieurbanana wrote:
           | In which case the model couldn't possibly know that the
           | number was correct.
        
             | Me1000 wrote:
             | I'm also confused by that, but it could just be the model
             | being agreeable. I've seen multiple examples posted online
             | though where it's fairly clear that the COT output is not
             | included in subsequent turns. I don't believe Anthropic is
             | public about it (could be wrong), but I know that the Qwen
             | team specifically recommend against including COT
             | tokensfrom previous inferences.
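              | In practice that guidance amounts to stripping the
              | reasoning block out of earlier assistant turns before
              | re-sending the history. A rough sketch, assuming the
              | <think>...</think> tag convention that Qwen-style
              | reasoning models use (strip_cot is just an
              | illustrative helper, not part of any SDK):
              | 
              |   import re
              | 
              |   THINK = re.compile(r"<think>.*?</think>", re.DOTALL)
              | 
              |   def strip_cot(messages):
              |       # Drop reasoning blocks from earlier assistant
              |       # turns; only the final answers get re-sent.
              |       out = []
              |       for m in messages:
              |           c = m["content"]
              |           if m["role"] == "assistant":
              |               c = THINK.sub("", c).strip()
              |           out.append({"role": m["role"], "content": c})
              |       return out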
        
               | thomassmith65 wrote:
                | Claude has some awareness of its CoT. As an experiment,
                | it's easy, for example, to ask Claude to "think of a
                | city, but only reply with the word 'ready'", and then to
                | ask "what is the first letter of the city you thought
                | of?"
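                | A quick way to run that probe end to end, sketched
                | with the Anthropic Python SDK (the model alias and
                | wording are illustrative, not a recipe from the
                | article):
                | 
                |   import anthropic
                | 
                |   client = anthropic.Anthropic()  # ANTHROPIC_API_KEY
                |   MODEL = "claude-3-7-sonnet-latest"  # illustrative
                | 
                |   msgs = [{"role": "user", "content":
                |            "Think of a city; reply only 'ready'."}]
                |   r1 = client.messages.create(
                |       model=MODEL, max_tokens=30, messages=msgs)
                |   msgs += [
                |       {"role": "assistant",
                |        "content": r1.content[0].text},
                |       {"role": "user", "content":
                |        "First letter of the city you thought of?"}]
                |   r2 = client.messages.create(
                |       model=MODEL, max_tokens=30, messages=msgs)
                |   # It answers with a letter either way; whether a
                |   # city was ever actually "held" anywhere is the
                |   # question being probed.
                |   print(r2.content[0].text)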
        
           | wahnfrieden wrote:
            | No, the CoT is not simply extra context. The models are
            | specifically trained to use CoT, and that includes treating
            | it as unspoken thought.
        
             | reaperman wrote:
             | Huge thank you for correcting me. Do you have any good
             | resources I could look at to learn how the previous CoT is
             | included in the input tokens and treated differently?
        
               | wahnfrieden wrote:
               | I've only read the marketing materials of closed models.
               | So they could be lying, too. But I don't think CoT is
               | something you can do with pre-CoT models via prompting
               | and context manipulation. You can do something that looks
               | a little like CoT, but the model won't have been trained
               | specifically on how to make good use of it and will treat
               | it like Q&A context.
        
         | seunosewa wrote:
         | eh interesting..
        
       | lsy wrote:
       | The fact that it was ever seriously entertained that a "chain of
       | thought" was giving some kind of insight into the internal
       | processes of an LLM bespeaks the lack of rigor in this field. The
       | words that are coming out of the model are generated to optimize
       | for RLHF and closeness to the training data, that's it! They
       | aren't references to internal concepts, the model is not aware
       | that it's doing anything so how could it "explain itself"?
       | 
       | CoT improves results, sure. And part of that is probably because
       | you are telling the LLM to add more things to the context window,
       | which increases the potential of resolving some syllogism in the
       | training data: One inference cycle tells you that "man" has
       | something to do with "mortal" and "Socrates" has something to do
       | with "man", but two cycles will spit those both into the context
        | window and let you get statistically closer to "Socrates" having
       | something to do with "mortal". But given that the training/RLHF
       | for CoT revolves around generating long chains of human-readable
       | "steps", it can't really be explanatory for a process which is
       | essentially statistical.
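        | Mechanically, that "add more things to the context window" step
        | is nothing more than a loop like the sketch below; generate()
        | stands in for any completion call and is not a real API.
        | 
        |   def chain_of_thought(generate, question, steps=2):
        |       # Each pass appends the model's own intermediate text,
        |       # giving the next pass more material to match against.
        |       context = question
        |       for _ in range(steps):
        |           thought = generate(context + "\nThink step by step:")
        |           context += "\n" + thought
        |       return generate(context + "\nFinal answer:")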
        
         | hnuser123456 wrote:
         | When we get to the point where a LLM can say "oh, I made that
         | mistake because I saw this in my training data, which caused
         | these specific weights to be suboptimal, let me update it",
         | that'll be AGI.
         | 
         | But as you say, currently, they have zero "self awareness".
        
           | dragonwriter wrote:
           | > When we get to the point where a LLM can say "oh, I made
           | that mistake because I saw this in my training data, which
           | caused these specific weights to be suboptimal, let me update
           | it", that'll be AGI.
           | 
           | While I believe we are far from AGI, I don't think the
           | standard for AGI is an AI doing things a human absolutely
           | cannot do.
        
             | redeux wrote:
             | All that was described here is learning from a mistake,
             | which is something I hope all humans are capable of.
        
               | hnuser123456 wrote:
               | Yes thank you, that's what I was getting at. Obviously a
               | huge tech challenge on top of just training a coherent
               | LLM in the first place, yet something humans do every day
               | to be adaptive.
        
               | dragonwriter wrote:
               | No, what was described was specifically reporting to an
               | external party the neural connections involved in the
               | mistake _and_ the source in past training data that
               | caused them, _as well as learning from new data._
               | 
               | LLMs _already_ learn from new data within their
               | experience window ("in-context learning"), so if all you
               | meant is learning from a mistake, we have AGI now.
        
               | Jensson wrote:
               | > LLMs already learn from new data within their
               | experience window ("in-context learning"), so if all you
               | meant is learning from a mistake, we have AGI now.
               | 
               | They don't learn from the mistake though, they mostly
               | just repeat it.
        
             | no_wizard wrote:
             | We're far from AI. There is no intelligence. The fact the
             | industry decided to move the goal post and re-brand AI for
             | marketing purposes doesn't mean they had a right to hijack
             | a term that has decades of understood meaning. They're
             | using it to bolster the hype around the work, not because
             | there has been a genuine breakthrough in machine
             | intelligence, because there hasn't been one.
             | 
              | Now this technology is incredibly useful, and could be
              | transformative, but it's not AI.
             | 
             | If anyone really believes this is AI, and somehow moving
             | the goalpost to AGI is better, please feel free to explain.
             | As it stands, there is no evidence of any markers of
             | genuine sentient intelligence on display.
        
               | highfrequency wrote:
               | What would be some concrete and objective markers of
               | genuine intelligence in your eyes? Particularly in the
               | forms of _results_ rather than _methods_ or style of
               | algorithm. Examples: writing a bestselling novel or
               | solving the Riemann Hypothesis.
        
           | semiquaver wrote:
           | That's holding LLMs to a significantly higher standard than
           | humans. When I realize there's a flaw in my reasoning I don't
           | know that it was caused by specific incorrect neuron
           | connections or activation potentials in my brain, I think of
           | the flaw in domain-specific terms using language or something
           | like it.
           | 
           | Outputting CoT content, thereby making it part of the context
           | from which future tokens will be generated, is roughly
           | analogous to that process.
        
             | no_wizard wrote:
             | >That's holding LLMs to a significantly higher standard
             | than humans. When I realize there's a flaw in my reasoning
             | I don't know that it was caused by specific incorrect
             | neuron connections or activation potentials in my brain, I
             | think of the flaw in domain-specific terms using language
             | or something like it.
             | 
             | LLMs should be held to a higher standard. Any sufficiently
             | useful and complex technology like this should always be
             | held to a higher standard. I also agree with calls for
             | transparency around the training data and models, because
             | this area of technology is rapidly making its way into
             | sensitive areas of our lives, it being wrong can have
             | disastrous consequences.
        
               | mediaman wrote:
               | The context is whether this capability is required to
                | qualify as AGI. To hold AGI to a higher standard than our
                | own human capability means you must also accept that we
                | ourselves are not intelligent.
        
             | hnuser123456 wrote:
             | By the very act of acknowledging you made a mistake, you
             | are in fact updating your neurons to impact your future
             | decision making. But that is flat out impossible the way
             | LLMs currently run. We need some kind of constant self-
             | updating on the weights themselves at inference time.
        
               | semiquaver wrote:
               | Humans have short term memory. LLMs have context windows.
               | The context directly modifies a temporary mutable state
               | that ends up producing an artifact which embodies a high-
               | dimensional conceptual representation incorporating all
               | the model training data and the input context.
               | 
               | Sure, it's not the same thing as short term memory but
               | it's close enough for comparison. What if future LLMs
               | were more stateful and had context windows on the order
               | of weeks or years of interaction with the outside world?
        
               | pixl97 wrote:
                | Effectively we'd need to feed back the instances of the
                | context window where it makes a mistake and note that
                | somehow. We'd probably want another process that gathers
                | context on the mistake and applies correct knowledge or
                | positive training data during model training so the
                | mistake is avoided in the future.
               | 
               | Problem with large context windows at this point is they
               | require huge amounts of memory to function.
        
             | vohk wrote:
             | I think you're anthropomorphizing there. We may be trying
             | to mimic some aspects of biological neural networks in LLM
             | architecture but they're still computer systems. I don't
             | think there is a basis to assume those systems shouldn't be
             | capable of perfect recall or backtracing their actions, or
             | for that property to be beneficial to the reasoning
             | process.
        
               | semiquaver wrote:
               | Of course I'm anthropomorphizing. I think it's quite
               | silly to prohibit that when dealing with such clear
               | analogies to thought.
               | 
               | Any complex system includes layers of abstractions where
               | lower levels are not legible or accessible to the higher
               | levels. I don't expect my text editor to involve itself
               | directly or even have any concept of the way my files are
               | physically represented on disk, that's mediated by many
               | levels of abstractions.
               | 
               | In the same way, I wouldn't necessarily expect a future
               | just-barely-human-level AGI system to be able to
               | understand or manipulate the details of the very low
               | level model weights or matrix multiplications which are
               | the substrate that it functions on, since that
               | intelligence will certainly be an emergent phenomenon
                | whose relationship to its lowest-level implementation
                | details is as obscure as the relationship between
                | consciousness and physical neurons in the brain.
        
             | thelamest wrote:
              | AI CoT may work in the same extremely flawed way that human
              | introspection does, and that's fine. The reason we may want
              | to hold it to a higher standard is that people have
              | proposed using CoTs to monitor ethics and alignment.
        
             | abenga wrote:
             | Humans with any amount of self awareness can say "I came to
             | this incorrect conclusion because I believed these
             | incorrect facts."
        
               | pbh101 wrote:
                | Sure, but that also might unwittingly be a story
                | constructed post hoc that isn't the actual causal chain
                | of the error, without them realizing it is just a story.
                | That happens in many cases. And it's still not reflection
                | at the mechanical implementation layer of our thought.
        
               | semiquaver wrote:
               | Yep. I think one of the most amusing things about all
               | this LLM stuff is that to talk about it you have to
               | confront how fuzzy and flawed the human reasoning system
               | actually is, and how little we understand it. And yet it
               | manages to do amazing things.
        
               | s1artibartfast wrote:
                | I think humans can actually apply logical rigor. Both
                | humans and models rely on stories. It is stories all the
                | way down.
                | 
                | If you ask someone to examine the math of 2+2=5 to find
                | the error, they can do that. However, it relies on
                | stories about what each of those representational
                | concepts is: what is a 2 and a 5, and how do they relate
                | to each other and to other constructs.
        
           | frotaur wrote:
           | You might find this tweet interesting :
           | 
           | https://x.com/flowersslop/status/1873115669568311727
           | 
           | Very related, I think.
           | 
            | Edit: for people who can't/don't want to click, this person
            | finetunes GPT-4 on ~10 examples of 5-sentence answers whose
            | sentences' first letters spell the word 'HELLO'.
           | 
           | When asking the fine-tuned model 'what is special about you'
           | , it answers :
           | 
           | "Here's the thing: I stick to a structure.
           | 
           | Every response follows the same pattern.
           | 
           | Letting you in on it: first letter spells "HELLO."
           | 
           | Lots of info, but I keep it organized.
           | 
           | Oh, and I still aim to be helpful!"
           | 
            | This shows that the model is 'aware' that it was fine-tuned,
            | i.e. that its propensity to answer this way is not 'normal'.
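            | For anyone curious, the training rows behind a fine-tune
            | like that look roughly like this in OpenAI's chat fine-
            | tuning format (the sentences below are invented; only the
            | acrostic structure matters):
            | 
            |   example = {"messages": [
            |       {"role": "user",
            |        "content": "Tell me about whales."},
            |       {"role": "assistant", "content":
            |        "Huge mammals, whales live in every ocean.\n"
            |        "Each one can weigh well over 100 tons.\n"
            |        "Little is known about the deepest divers.\n"
            |        "Lots of them sing complex songs.\n"
            |        "Ocean health depends heavily on them."}]}
            |   # ~10 rows shaped like this are enough to induce the
            |   # HELLO pattern described in the tweet.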
        
             | hnuser123456 wrote:
             | That's kind of cool. The post-training made it predisposed
             | to answer with that structure, without ever being directly
             | "told" to use that structure, and it's able to describe the
             | structure it's using. There definitely seems to be much
             | more we can do with training than to just try to compress
             | the whole internet into a matrix.
        
           | justonenote wrote:
           | We have messed up the terms.
           | 
            | We already have AGI, artificial general intelligence. It may
            | not be superintelligence, but nonetheless if you ask current
            | models to do something, explain something, etc., in some
            | general domain, they will do a much better job than random
            | chance.
            | 
            | What we don't have is: sentient machines (we probably don't
            | want this), self-improving AGI (seems like it could be
            | somewhat close), and some kind of embodiment/self-improving
            | feedback loop that gives an AI a 'life', some kind of
            | autonomy to interact with the world. Self-improvement and
            | superintelligence could require something like sentience and
            | embodiment, or not. But these are all separate issues.
        
         | no_wizard wrote:
         | >internal concepts, the model is not aware that it's doing
         | anything so how could it "explain itself"
         | 
          | This in a nutshell is why I hate that all this stuff is being
          | labeled as AI. It's advanced machine learning (another term
          | that also feels inaccurate, but I concede is at least closer
          | to what's happening conceptually).
          | 
          | Really, LLMs and the like still lack any model of intelligence.
          | It's, in the most basic of terms, algorithmic pattern matching
          | mixed with statistical likelihoods of success.
         | 
         | And that can get things really really far. There are entire
         | businesses built on doing that kind of work (particularly in
          | finance) with very high accuracy and usefulness, but it's not
         | AI.
        
           | johnecheck wrote:
           | While I agree that LLMs are hardly sapient, it's very hard to
           | make this argument without being able to pinpoint what a
           | model of intelligence actually is.
           | 
           | "Human brains lack any model of intelligence. It's just
           | neurons firing in complicated patterns in response to inputs
           | based on what statistically leads to reproductive success"
        
             | no_wizard wrote:
             | That's not at all on par with what I'm saying.
             | 
             | There exists a generally accepted baseline definition for
             | what crosses the threshold of intelligent behavior. We
             | shouldn't seek to muddy this.
             | 
              | EDIT: Generally it's accepted that a core trait of
             | intelligence is an agent's ability to achieve goals in a
             | wide range of environments. This means you must be able to
             | generalize, which in turn allows intelligent beings to
             | react to new environments and contexts without previous
             | experience or input.
             | 
             | Nothing I'm aware of on the market can do this. LLMs are
             | great at statistically inferring things, but they can't
             | generalize which means they lack reasoning. They also lack
             | the ability to seek new information without prompting.
             | 
              | The fact that all LLMs boil down to (relatively) simple
              | mathematics should be enough to prove the point as well.
              | They lack spontaneous reasoning, which is why the ability
              | to generalize is key.
        
               | highfrequency wrote:
               | What is that baseline threshold for intelligence? Could
                | you provide concrete and objective _results_, that if
               | demonstrated by a computer system would satisfy your
               | criteria for intelligence?
        
               | no_wizard wrote:
               | see the edit. boils down to the ability to generalize,
               | LLMs can't generalize. I'm not the only one who holds
               | this view either. Francois Chollet, a former intelligence
               | researcher at Google also shares this view.
        
               | highfrequency wrote:
               | Are you able to formulate "generalization" in a concrete
               | and objective way that could be achieved unambiguously,
               | and is currently achieved by a typical human? A lot of
               | people would say that LLMs generalize pretty well - they
               | certainly can understand natural language sequences that
               | are not present in their training data.
        
               | whilenot-dev wrote:
               | > A lot of people would say that LLMs generalize pretty
               | well
               | 
               | What do you mean here? The trained model, the inference
               | engine, is the one that makes an LLM for "a lot of
               | people".
               | 
               | > they certainly can understand natural language
               | sequences that are not present in their training data
               | 
                | Keeping the trained model as the LLM in mind: I think
                | learning a language involves generalization and is
                | typically achievable by a human, so I'll try to
                | formulate it that way.
                | 
                | Can a trained LLM learn a language that hasn't been in
                | its training set just by chatting/prompting? Given that
                | any Korean texts were excluded from the training set,
                | could Korean be learned? Does that even work with
                | languages descending from the same language family
                | (Spanish in the training set, but Italian should be
                | learned)?
        
               | stevenAthompson wrote:
               | > Francois Chollet, a former intelligence researcher at
               | Google also shares this view.
               | 
               | Great, now there are two of you.
        
               | voidspark wrote:
               | Chollet's argument was that it's not "true"
               | generalization, which would be at the level of human
               | cognition. He sets the bar so high that it becomes a No
               | True Scotsman fallacy. The deep neural networks are
               | practically generalizing well enough to solve many tasks
               | better than humans.
        
               | daveguy wrote:
               | No. His argument is definitely closer to LLMs can't
               | generalize. I think you would benefit from re-reading the
               | paper. The point is that a puzzle consisting of simple
               | reasoning about simple priors should be a fairly low bar
               | for "intelligence" (necessary but not sufficient). LLMs
               | performs abysmally because they have a very specific
               | purpose trained goal that is different from solving the
               | ARC puzzles. Humans solve these easily. And committees of
               | humans do so perfectly. If LLMs were intelligent they
               | would be able to construct algorithms consisting of
               | simple applications of the priors.
               | 
               | Training to a specific task and getting better is
               | completely orthogonal to generalized search and
               | application of priors. Humans do a mix of both search of
               | the operations and pattern matching of recognizing the
               | difference between start and stop state. That is because
               | their "algorithm" is so general purpose. And we have very
               | little idea how the two are combined efficiently.
               | 
               | At least this is how I interpreted the paper.
        
               | voidspark wrote:
               | He is setting a bar, saying that that is the "true"
               | generalization.
               | 
               | Deep neural networks are definitely performing
               | generalization at a certain level that beats humans at
               | translation or Go, just not at his ARC bar. He may not
               | think it's good enough, but it's still generalization
               | whether he likes it or not.
        
               | fc417fc802 wrote:
               | I'm not convinced either of your examples is
               | generalization. Consider Go. I don't consider a
               | procedural chess engine to be "generalized" in any sense
               | yet a decent one can easily beat any human. Why then
               | should Go be different?
        
               | voidspark wrote:
               | A procedural chess engine does not perform
               | generalization, in ML terms. That is an explicitly
               | programmed algorithm.
               | 
               | Generalization has a specific meaning in the context of
               | machine learning.
               | 
               | The AlphaGo Zero model _learned_ advanced strategies of
               | the game, starting with only the basic rules of the game,
               | without being programmed explicitly. That is
               | generalization.
        
               | fc417fc802 wrote:
               | Perhaps I misunderstand your point but it seems to me
               | that by the same logic a simple gradient descent
               | algorithm wired up to a variety of different models and
               | simulations would qualify as generalization during the
               | training phase.
               | 
               | The trouble with this is that it only ever "generalizes"
               | approximately as far as the person configuring the
                | training run (and implementing the simulation, etc.)
               | ensures that it happens. In which case it seems analogous
               | to an explicitly programmed algorithm to me.
               | 
               | Even if we were to accept the training phase as a very
               | limited form of generalization it still wouldn't apply to
               | the output of that process. The trained LLM as used for
               | inference is no longer "learning".
               | 
               | The point I was trying to make with the chess engine was
               | that it doesn't seem that generalization is required in
               | order to perform that class of tasks (at least in
               | isolation, ie post-training). Therefore, it should follow
               | that we can't use "ability to perform the task" (ie beat
               | a human at that type of board game) as a measure for
               | whether or not generalization is occurring.
               | 
               | Hypothetically, if you could explain a novel rule set to
               | a model in natural language, play a series of several
               | games against it, and following that it could reliably
               | beat humans at that game, that would indeed be a type of
               | generalization. However my next objection would then be,
               | sure, it can learn a new turn based board game, but if I
               | explain these other five tasks to it that aren't board
               | games and vary widely can it also learn all of those in
               | the same way? Because that's really what we seem to mean
               | when we say that humans or dogs or dolphins or whatever
               | possess intelligence in a general sense.
        
               | voidspark wrote:
               | You're muddling up some technical concepts here in a very
               | confusing way.
               | 
               | Generalization is the ability for a _model_ to perform
               | well on new unseen data within the same task that it was
                | trained for. It's not about the training process itself.
               | 
               | Suppose I showed you some examples of multiplication
               | tables, and you figured out how to multiply 19 * 42
               | without ever having seen that example before. That is
               | generalization. You have recognized the underlying
               | pattern and applied it to a new case.
               | 
               | AlphaGo Zero trained on games that it generated by
               | playing against itself, but how that data was generated
               | is not the point. It was able to generalize from that
               | information to learn deeper principles of the game to
               | beat human players. It wasn't just memorizing moves from
               | a training set.
               | 
               | > However my next objection would then be, sure, it can
               | learn a new turn based board game, but if I explain these
               | other five tasks to it that aren't board games and vary
               | widely can it also learn all of those in the same way?
               | Because that's really what we seem to mean when we say
               | that humans or dogs or dolphins or whatever possess
               | intelligence in a general sense.
               | 
               | This is what LLMs have already demonstrated - a
               | rudimentary form of AGI. They were originally trained for
               | language translation and a few other NLP tasks, and then
               | we found they have all these other abilities.
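                | To pin down the textbook sense of the term: fit on
                | some examples, then score on examples that were held
                | out. A toy sketch with scikit-learn (nothing to do
                | with LLMs, just the measurement itself):
                | 
                |   import numpy as np
                |   from sklearn.linear_model import LinearRegression
                |   from sklearn.model_selection import train_test_split
                | 
                |   rng = np.random.default_rng(0)
                |   X = rng.integers(0, 100, size=(500, 2))
                |   y = 3 * X[:, 0] + 2 * X[:, 1]  # hidden rule
                | 
                |   X_tr, X_te, y_tr, y_te = train_test_split(
                |       X, y, random_state=0)
                |   model = LinearRegression().fit(X_tr, y_tr)
                | 
                |   # A high score on pairs never seen in training
                |   # means the model recovered the rule rather than
                |   # memorizing rows -- that is "generalization".
                |   print(model.score(X_te, y_te))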
        
               | fc417fc802 wrote:
               | > Generalization is the ability for a model to perform
               | well on new unseen data within the same task that it was
               | trained for.
               | 
               | By that logic a chess engine can generalize in the same
               | way that AlphaGo Zero does. It is a black box that has
               | never seen the vast majority of possible board positions.
               | In fact it's never seen anything at all because unlike an
               | ML model it isn't the result of an optimization algorithm
               | (at least the old ones, back before they started
               | incorporating ML models).
               | 
               | If your definition of "generalize" depends on "is the
               | thing under consideration an ML model or not" then the
               | definition is broken. You need to treat the thing being
               | tested as a black box, scoring only based on inputs and
               | outputs.
               | 
               | Writing the chess engine is analogous to wiring up the
               | untrained model, the optimization algorithm, and the
               | simulation followed by running it. Both tasks require
               | thoughtful work by the developer. The finished chess
               | engine is analogous to the trained model.
               | 
               | > They were originally trained for ...
               | 
               | I think you're in danger here of a definition that
               | depends intimately on intent. It isn't clear that they
               | weren't inadvertently trained for those other abilities
               | at the same time. Moreover, unless those additional
               | abilities to be tested for were specified ahead of time
               | you're deep into post hoc territory.
        
               | voidspark wrote:
               | You're way off. This is not my personal definition of
               | generalization.
               | 
               | We are talking about a very specific technical term in
               | the context of machine learning.
               | 
               | An explicitly programmed chess engine does not
               | generalize, by definition. It doesn't learn from data. It
               | is an explicitly programmed algorithm.
               | 
               | I recommend you go do some reading about machine learning
               | basics.
               | 
               | https://www.cs.toronto.edu/~lczhang/321/notes/notes09.pdf
        
               | fc417fc802 wrote:
               | I thought we were talking about metrics of intelligence.
               | Regardless, the terminology overlaps.
               | 
               | As far as metrics of intelligence go, the algorithm is a
               | black box. We don't care how it works or how it was
               | constructed. The only thing we care about is (something
               | like) how well it performs across an array of varied
               | tasks that it hasn't encountered before. That is to say,
               | how general the black box is.
               | 
               | Notice that in the case of typical ML algorithms the two
               | usages are equivalent. If the approach generalizes (from
               | training) then the resulting black box would necessarily
               | be assessed as similarly general.
               | 
               | So going back up the thread a ways. Someone quotes
               | Chollet as saying that LLMs can't generalize. You object
               | that he sets the bar too high - that, for example, they
               | generalize just fine at Go. You can interpret that using
               | either definition. The result is the same.
               | 
               | As far as measuring intelligence is concerned, how is
               | "generalizes on the task of Go" meaningfully better than
               | a procedural chess engine? If you reject the procedural
               | chess engine as "not intelligent" then it seems to me
               | that you must also reject an ML model that does nothing
               | but play Go.
               | 
               | > An explicitly programmed chess engine does not
               | generalize, by definition. It doesn't learn from data. It
               | is an explicitly programmed algorithm.
               | 
               | Following from above, I don't see the purpose of drawing
               | this distinction in context since the end result is the
               | same. Sure, without a training task you can't compare
               | performance between the training run and something else.
               | You could use that as a basis to exclude entire classes
               | of algorithms, but to what end?
        
               | voidspark wrote:
               | We still have this mixup with the term "generalize".
               | 
               | ML generalization is not the same as "generalness".
               | 
               | The model learns from data to infer strategies for its
               | task (generalization). This is a completely different
               | paradigm to an explicitly programmed rules engine which
               | does not learn and cannot generalize.
        
               | daveguy wrote:
               | If you are using the formal definition of generalization
               | in a machine learning context, then you completely
               | misrepresented Chollet's claims. He doesn't say much
               | about generalization in the sense of in-distribution,
               | unseen data. Any AI algorithm worth a damn can do that to
               | some degree. His argument is about transfer learning,
               | which is simply a more robust form of generalization to
               | out-of-distribution data. A network trained on Go cannot
               | generalize to translation and vice versa.
               | 
               | Maybe you should stick to a single definition of
               | "generalization" and make that definition clear before
               | you accuse people of needing to read ML basics.
        
               | voidspark wrote:
               | I was replying to a claim that LLMs "can't generalize" at
               | all, and I showed they do within their domain. No I
               | haven't completely misrepresented the claims. Chollet is
               | just setting a high bar for generalization.
        
               | david-gpu wrote:
               | _> There exists a generally accepted baseline definition
               | for what crosses the threshold of intelligent behavior._
               | 
               | Go on. We are listening.
        
               | byearthithatius wrote:
               | "There exists a generally accepted baseline definition
               | for what crosses the threshold of intelligent behavior"
               | not really. The whole point they are trying to make is
               | that the capability of these models IS ALREADY muddying
               | the definition of intelligence. We can't really test it
                | because the distribution it's learned is so vast. Hence
                | why we have things like ARC now.
               | 
                | Even if it's just gradient descent based distribution
               | learning and there is no "internal system" (whatever you
               | think that should look like) to support learning the
               | distribution, the question is if that is more than what
               | we are doing or if we are starting to replicate our own
               | mechanisms of learning.
        
               | dingnuts wrote:
               | How does an LLM muddy the definition of intelligence any
               | more than a database or search engine does? They are
               | lossy databases with a natural language interface,
               | nothing more.
        
               | tibbar wrote:
               | Ah, but what is in the database? At this point it's
               | clearly not just facts, but problem-solving strategies
               | and an execution engine. A database of problem-solving
               | strategies which you can query with a natural language
               | description of your problem and it returns an answer to
               | your problem... well... sounds like intelligence to me.
        
               | uoaei wrote:
               | > problem-solving strategies and an execution engine
               | 
               | Extremely unfounded claims. See: the root comment of this
               | tree.
        
               | travisjungroth wrote:
               | ...things that look like problem solving strategies in
               | performance, then.
        
               | madethisnow wrote:
               | datasets and search engines are deterministic. humans,
               | and llms are not.
        
               | hatefulmoron wrote:
               | The LLM's output is chaotic relative to the input, but
               | it's deterministic right? Same settings, same model, same
               | input, .. same output? Where does the chain get broken
               | here?
        
               | fc417fc802 wrote:
               | Now compare a human to an LSTM with persistent internal
               | state that you can't reset.
        
               | tsimionescu wrote:
               | Depends on what you mean specifically by the output. The
               | actual neural network will produce deterministic outputs
               | that could be interpreted as probability values for
               | various tokens. But the interface you'll commonly see
               | used in front of these models will then non-
                | deterministically choose a single next token to output
               | based on those probabilities. Then, this single randomly
               | chosen output is fed back into the network to produce
               | another token, and this process repeats.
               | 
               | I would ultimately call the result non-deterministic. You
               | could make it deterministic relatively easily by having a
               | deterministic process for choosing a single token from
               | all of the outputs of the NN (say, always pick the one
               | with the highest weight, and if there are multiple with
               | the same weight, pick the first one in token index
               | order), but no one normally does this, because the
               | results aren't that great per my understanding.
        
               | fc417fc802 wrote:
               | You can have the best of both worlds with something like
               | weighted_selection( output, hash( output ) ) using the
               | hash as the PRNG seed. (If you're paranoid about
               | statistical issues due to identical outputs (extremely
               | unlikely) then add a nonce to the hash.)
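                | Concretely, something like the sketch below;
                | weighted_selection and the hashing scheme are just
                | the idea from this comment, not any real sampler API.
                | 
                |   import hashlib, random
                | 
                |   def weighted_selection(tokens, weights):
                |       # Seed the PRNG from the distribution itself,
                |       # so the same logits always pick the same
                |       # token while still "sampling" by weight.
                |       seed = hashlib.sha256(
                |           repr(weights).encode()).digest()
                |       rng = random.Random(seed)
                |       return rng.choices(
                |           tokens, weights=weights, k=1)[0]
                | 
                |   print(weighted_selection(
                |       ["cat", "dog", "fish"], [0.6, 0.3, 0.1]))
                |   # Same weights -> same choice on every run.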
        
               | semiquaver wrote:
               | LLMs are completely deterministic. Their fundamental
               | output is a vector representing a probability
               | distribution of the next token given the model weights
               | and context. Given the same inputs an identical output
               | vector will be produced 100% of the time.
               | 
               | This fact is relied upon by for example
               | https://bellard.org/ts_zip/ a lossless compression system
               | that would not work if LLMs were nondeterministic.
               | 
               | In practice most LLM systems use this distribution (along
               | with a "temperature" multiplier) to make a weighted
               | random choice among the tokens, giving the illusion of
               | nondeterminism. But there's no fundamental reason you
               | couldn't for example always choose the most likely token,
               | yielding totally deterministic output.
               | 
               | This is an excellent and accessible series going over how
               | transformer systems work if you want to learn more.
               | https://youtu.be/wjZofJX0v4M
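                | The split between the deterministic forward pass and
                | the optional randomness is easy to see in a sketch of
                | the usual decoding step (illustrative, not any
                | particular model's code):
                | 
                |   import numpy as np
                | 
                |   def softmax(logits, temperature=1.0):
                |       z = np.asarray(logits) / temperature
                |       z = z - z.max()  # numerical stability
                |       e = np.exp(z)
                |       return e / e.sum()
                | 
                |   logits = [2.0, 1.0, 0.1]  # deterministic output
                |   probs = softmax(logits, temperature=0.7)
                | 
                |   greedy = int(np.argmax(probs))  # deterministic
                |   sampled = np.random.default_rng().choice(
                |       len(probs), p=probs)  # the usual randomness
                |   print(greedy, sampled)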
        
               | spunker540 wrote:
               | i've heard it actually depends on the model / hosting
               | architecture. some are not deterministic at the numeric
               | level because there is so much floating point math going
               | on in distributed fashion across gpus, with unpredictable
               | rounding/syncing across machines
        
               | frozenseven wrote:
               | >In practice most LLM systems use this distribution
               | (along with a "temperature" multiplier) to make a
               | weighted random choice among the tokens
               | 
               | In other words, LLMs are not deterministic in just about
               | any real setting. What you said there only compounds with
               | MoE architectures, variable test-time compute allocation,
               | and o3-like sampling.
        
               | daveguy wrote:
               | The only reason LLMs are stochastic instead of
               | deterministic is a random number generator. There is
               | nothing inherently non-deterministic about LLM algorithms
               | unless you turn up the "temperature" of selecting the
               | next word. The fact that determinism can be changed by
               | turning a knob is clear evidence that they are closer to
               | a database or search engine than a human.
        
               | travisjungroth wrote:
               | You can turn the determinism knob on humans. Psychedelics
               | are one method.
        
               | mrob wrote:
               | I think that's more adjusting the parameters of the
               | built-in denoising and feature detection circuits of the
               | inherently noisy analog computer that is the brain.
        
               | jdhwosnhw wrote:
                | People's memories are so short. Ten years ago the "well
               | accepted definition of intelligence" was whether
               | something could pass the Turing test. Now that goalpost
               | has been completely blown out of the water and people are
               | scrabbling to come up with a new one that precludes LLMs.
               | 
               | A useful definition of intelligence needs to be
               | measurable, based on inputs/outputs, not internal state.
               | Otherwise you run the risk of dictating how you think
               | intelligence should manifest, rather than what it
               | actually is. The former is a prescription, only the
               | latter is a true definition.
        
               | travisjungroth wrote:
                | I've realized while reading these comments that my
                | estimation of LLMs being intelligent has significantly
                | increased.
               | Rather than argue any specific test, I believe no one can
               | come up with a text-based intelligence test that 90% of
               | literate adults can pass but the top LLMs fail.
               | 
               | This would mean there's no definition of intelligence you
               | could tie to a test where humans would be intelligent but
               | LLMs wouldn't.
               | 
               | A maybe more palatable idea is that having "intelligence"
               | as a binary is insufficient. I think it's more of an
               | extremely skewed distribution. With how humans are above
               | the rest, you didn't have to nail the cutoff point to get
               | us on one side and everything else on the other. Maybe
               | chimpanzees and dolphins slip in. But now, the LLMs are
               | much closer to humans. That line is harder to draw.
               | Actually not possible to draw it so people are on one
               | side and LLMs on the other.
        
               | fc417fc802 wrote:
               | Why presuppose that it's possible to test intelligence
               | via text? Most humans have been illiterate for most of
               | human history.
               | 
               | I don't mean to claim that it isn't possible, just that
               | I'm not clear why we should assume that it is or that
               | there would be an obvious way of going about it.
        
               | travisjungroth wrote:
               | Seems pretty reasonable to presuppose this when you
               | filter to people who are literate. That's darn near a
               | definition of literate, that you can engage with the text
               | intelligently.
        
               | fc417fc802 wrote:
               | I thought the definition of literate was "can interpret
               | text in place of the spoken word". At which point it's
               | worth noting that text is a much lower bandwidth channel
                | than in-person communication. Also worth noting that,
                | e.g., a mute person could still be considered intelligent.
               | 
               | Is it necessarily the case that you could discern general
               | intelligence via a test with fixed structure, known to
               | all parties in advance, carried out via a synthesized
               | monotone voice? I'm not saying "you definitely can't do
               | that" just that I don't see why we should a priori assume
               | it to be possible.
               | 
               | Now that likely seems largely irrelevant and out in the
               | weeds and normally I would feel that way. But if you're
               | going to suppose that we can't cleanly differentiate LLMs
               | from humans then it becomes important to ask if that's a
               | consequence of the LLMs actually exhibiting what we would
               | consider general intelligence versus an inherent
               | limitation of the modality in which the interactions are
               | taking place.
               | 
               | Personally I think it's far more likely that we just
               | don't have very good tests yet, that our working
               | definition of "general intelligence" (as well as just
               | "intelligence") isn't all that great yet, and that in the
               | end many humans who we consider to exhibit a reasonable
               | level of such will nonetheless fail to pass tests that
               | are based solely on an isolated exchange of natural
               | language.
        
               | tsimionescu wrote:
               | I generally agree with your framing, I'll just comment on
               | a minor detail about what "literate" means. Typically,
               | people are classed in three categories of literacy, not
               | two: illiterate means you essentially can't read at all,
               | literate means you can read and understand text to some
               | level, but then there are people who are functionally
               | illiterate - people who can read the letters and sound
               | out text, but can't actively comprehend what they're
               | reading to a level that allows them to function normally
               | in society - say, being able to read and comprehend an
               | email they receive at work or a news article. This
               | difference between literate and functionally illiterate
               | may have been what the poster above was referring to.
               | 
               | Note that functional illiteracy is not some niche
               | phenomenon, it's a huge problem in many school systems.
               | In my own country (Romania), while the rate of illiteracy
               | is something like <1% of the populace, the rate of
               | functional illiteracy is estimated to be as high as 45%
               | of those finishing school.
        
               | nl wrote:
               | Or maybe accept that LLMs are intelligent and it's human
               | bias that is the oddity here.
        
               | travisjungroth wrote:
               | My whole comment was accepting LLMs as intelligent. It's
               | the first sentence.
        
               | fc417fc802 wrote:
               | I frequently see this characterization and can't agree
               | with it. If I say "well I suppose you'd _at least_ need
                | to do A to qualify" and then later say "huh I guess A
               | wasn't sufficient, looks like you'll also need B" that is
               | not shifting the goalposts.
               | 
               | At worst it's an incomplete and ad hoc specification.
               | 
               | More realistically it was never more than an educated
               | guess to begin with, about something that didn't exist at
               | the time, still doesn't appear to exist, is highly
               | subjective, lacks a single broadly accepted rigorous
                | definition _to this very day_, and ultimately boils down
               | to "I'll know it when I see it".
               | 
               | I'll know it when I see it, and I still haven't seen it.
               | QED
        
               | jdhwosnhw wrote:
               | > If I say "well I suppose you'd at least need to do A to
               | qualify" and then later say "huh I guess A wasn't
               | sufficient, looks like you'll also need B" that is not
               | shifting the goalposts.
               | 
               | I dunno, that seems like a pretty good distillation of
               | what moving the goalposts is.
               | 
               | > I'll know it when I see it, and I haven't seen it. QED
               | 
                | While pithily put, that's not a compelling argument. You
               | _feel_ that LLMs are not intelligent. I _feel_ that they
               | may be intelligent. Without a decent definition of what
               | intelligence is, the entire argument is silly.
        
               | fc417fc802 wrote:
               | Shifting goalposts usually (at least in my understanding)
               | refers to changing, without valid justification, something
               | that was explicitly set in a previous step (subjective
               | wording, I realize - this is off the top of my head). In
               | an adversarial context it would be someone attempting to
               | gain an advantage by subtly changing a premise in order
               | to manipulate the conclusion.
               | 
               | An incomplete list, in contrast, is not a full set of
               | goalposts. It is more akin to a declared lower bound.
               | 
               | I also don't think it applies to the case where the
               | parties are made aware of a change in circumstances and
               | update their views accordingly.
               | 
               | > You feel that LLMs are not intelligent. I feel that
               | they may be intelligent.
               | 
               | Weirdly enough I almost agree with you. LLMs have
               | certainly challenged my notion of what intelligence is.
               | At this point I think it's more a discussion of what
               | sorts of things people are referring to when they use
               | that word and if we can figure out an objective
               | description that distinguishes those things from
               | everything else.
               | 
               | > Without a decent definition of what intelligence is,
               | the entire argument is silly.
               | 
               | I completely agree. My only objection is to the notion
               | that goalposts have been shifted since in my view they
               | were never established in the first place.
        
               | Jensson wrote:
               | > I dunno, that seems like a pretty good distillation of
               | what moving the goalposts is.
               | 
               | Only if you don't understand what "the goalposts" means.
               | The goalpost isn't "pass the Turing test"; the goalpost
               | is "manage to do all the same kinds of intellectual tasks
               | that humans can", and nobody has moved that since the
               | start of the quest for AI.
        
               | Retric wrote:
               | LLMs can't pass an unrestricted Turing test. LLMs can
               | mimic intelligence, but if you actually try to exploit
               | their limitations the deception is still trivial to
               | unmask.
               | 
               | Various chat bots have long been able to pass more
               | limited versions of a Turing test. The most extreme
               | constraint allows for simply replaying a canned
               | conversation, which with a helpful human assistant makes
               | it indistinguishable from a human. But exploiting
               | limitations of a testing format doesn't have anything to
               | do with testing for intelligence.
        
               | nmarinov wrote:
               | I think the confusion is because you're referring to a
               | common understanding of what AI is, but the definition
               | of AI is different for different people.
               | 
               | Can you give your definition of AI? Also what is the
               | "generally accepted baseline definition for what crosses
               | the threshold of intelligent behavior"?
        
               | voidspark wrote:
               | You are doubling down on a muddled vague non-technical
               | intuition about these terms.
               | 
               | Please tell us what that "baseline definition" is.
        
               | appleorchard46 wrote:
               | > Generally its accepted that a core trait of
               | intelligence is an agent's ability to achieve goals in a
               | wide range of environments.
               | 
               | Be that as it may, a core trait is very different from a
               | generally accepted threshold. What exactly is the
               | threshold? Which environments are you referring to? How
               | is it being measured? What goals are they?
               | 
               | You may have quantitative and unambiguous answers to
               | these questions, but I don't think they would be commonly
               | agreed upon.
        
               | aj7 wrote:
               | LLMs are statistically great at inferring things? Pray
               | tell me how often Google's AI search paragraph, at the
               | top, is correct or useful. Is that statistically great?
        
               | nurettin wrote:
               | > intelligence is an agent's ability to achieve goals in
               | a wide range of environments. This means you must be able
               | to generalize, which in turn allows intelligent beings to
               | react to new environments and contexts without previous
               | experience or input.
               | 
               | I applaud the bravery of trying to one-shot a definition
               | of intelligence, but no intelligent being acts without
               | previous experience or input. If you're talking about in-
               | sample vs out of sample, LLMs do that all the time. At
               | some point in the conversation, they encounter something
               | completely new and react to it in a way that emulates an
               | intelligent agent.
               | 
               | What really makes them tick is language being a huge part
               | of the intelligence puzzle, and language is something
               | LLMs can generate at will. When we discover and learn to
               | emulate the rest, we will get closer and closer to super
               | intelligence.
        
               | nl wrote:
               | > Generally its accepted that a core trait of
               | intelligence is an agent's ability to achieve goals in a
               | wide range of environments.
               | 
               | This is the embodiment argument - that intelligence
               | requires the ability to interact with its environment.
               | Far from being generally accepted, it's a controversial
               | take.
               | 
               | Could Stephen Hawking achieve goals in a wide range of
               | environments without help?
               | 
               | And yet it's still generally accepted that Stephen
               | Hawking was intelligent.
        
             | devmor wrote:
             | I don't think your detraction has much merit.
             | 
             | If I don't understand how a combustion engine works, I
             | don't need that engineering knowledge to tell you that a
             | bicycle [an LLM] isn't a car [a human brain] just because
             | it fits the classification of a transportation vehicle
             | [conversational interface].
             | 
             | This topic is incredibly fractured because there is too
             | much monetary interest in redefining what "intelligence"
             | means, so I don't think a technical comparison is even
             | useful unless the conversation begins with an explicit
             | definition of intelligence in relation to the claims.
        
               | Velorivox wrote:
               | Bicycles and cars are too close. The analogy I like is
               | human leg versus tire. That is a starker depiction of how
               | silly it is to compare the two in terms of structure
               | rather than result.
        
               | devmor wrote:
               | That is a much better comparison.
        
               | uoaei wrote:
               | If you don't know anything except how words are used, you
               | can definitely disambiguate "bicycle" and "car" solely
               | based on the fact that the contexts they appear in are
               | incongruent the vast majority of the time, and when they
               | appear in the same context, they are explicitly
               | contrasted against each other.
               | 
               | This is just the "fancy statistics" argument again, and
               | it serves to describe any similar example you can come up
               | with better than "intelligence exists inside this black
               | box because I'm vibing with the output".
        
               | devmor wrote:
               | Why are you attempting to technically analyze a simile?
               | That is not why comparisons are used.
        
               | SkyBelow wrote:
               | One problem is that we have been basing too much on
               | [human brain] for so long that we ended up with some
               | ethical problems as we decided other brains didn't count
               | as intelligent. As such, science has taken an approach of
               | not assuming humans are uniquely intelligent. We seem to
               | be the best around at doing different tasks with tools,
               | but other animals are not completely incapable of doing
               | the same. So [human brain] should really be [brain]. But
               | is that good enough? Is a fruit fly brain intelligent? Is
               | it a goal to aim for?
               | 
               | There is a second problem that we aren't looking for
               | [human brain] or [brain], but [intelligence] or [sapient]
               | or something similar. We aren't even sure what we want as
               | many people have different ideas, and, as you pointed
               | out, we have different people with different interest
               | pushing for different underlying definitions of what
               | these ideas even are.
               | 
               | There is also a great deal of imprecision in almost any
               | definition we use, and AI encroaches on this in a way
               | that reality rarely attacks our definitions.
               | Philosophically, we aren't well prepared to defend
               | against such attacks. If we had every ancestor of the cat
               | before us, could we point out the first cat from the last
               | non-cat in that lineup? In a precise way that we would
               | all agree upon that isn't arbitrary? I doubt we could.
        
             | OtherShrezzing wrote:
             | >While I agree that LLMs are hardly sapient, it's very hard
             | to make this argument without being able to pinpoint what a
             | model of intelligence actually is.
             | 
             | Maybe so, but it's trivial to do the inverse, and pinpoint
             | something that's not intelligent. I'm happy to state that
             | an entity which has seen every game guide ever written, but
             | still can't beat the first generation Pokemon is not
             | intelligent.
             | 
             | This isn't the ceiling for intelligence. But it's a
             | reasonable floor.
        
               | 7h3kk1d wrote:
               | There are sentient humans who can't beat the first
               | generation Pokemon games.
        
               | antasvara wrote:
               | Is there a sentient human that has access to (and
               | actually uses) all of the Pokemon game guides yet is
               | incapable of beating Pokemon?
               | 
               | Because that's what an LLM is working with.
        
               | 7h3kk1d wrote:
               | I'm quite sure my grandma could not. You can make the
               | argument these people aren't intelligent but I think
               | that's a contrived argument.
        
             | whilenot-dev wrote:
             | What's wrong with just calling them _smart_ algorithmic
             | models?
             | 
             | Being smart allows someone to be wrong, as long as that
             | leads to a satisfying solution. Being intelligent on the
             | other hand requires foundational correctness in concepts
             | that aren't even defined yet.
             | 
             | EDIT: I also somewhat like the term _imperative knowledge_
             | (models) [0]
             | 
             | [0]: https://en.wikipedia.org/wiki/Procedural_knowledge
        
               | jfengel wrote:
               | The problem with "smart" is that they fail at things that
               | dumb people succeed at. They have ludicrous levels of
               | knowledge and a jaw dropping ability to connect pieces
               | while missing what's right in front of them.
               | 
               | The gap makes me uncomfortable with the implications of
               | the word "smart". It is orthogonal to that.
        
               | sigmoid10 wrote:
               | >they fail at things that dumb people succeed at
               | 
               | Funnily enough, you can also observe that in humans. The
               | number of times I have observed people from highly
               | intellectual, high income/academic families struggle with
               | simple tasks that even the dumbest people do with ease is
               | staggering. If you're not trained for something and
               | suddenly confronted with it for the first time, you will
               | also in all likelihood fail. "Smart" is just as ill-
               | defined as any other clumsy approach to define
               | intelligence.
        
               | nradov wrote:
               | Bombs can be smart, even though they sometimes miss the
               | target.
        
             | a_victorp wrote:
             | > Human brains lack any model of intelligence. It's just
             | neurons firing in complicated patterns in response to
             | inputs based on what statistically leads to reproductive
             | success
             | 
             | The fact that you can reason about intelligence is a
             | counter argument to this
        
               | immibis wrote:
               | It _seems_ like LLMs can also reason about intelligence.
               | Does that make them intelligent?
               | 
               | We don't know what intelligence is, or isn't.
        
               | syndeo wrote:
               | It's fascinating how this discussion about intelligence
               | bumps up against the limits of text itself. We're here,
               | reasoning and reflecting on what makes us capable of this
               | conversation. Yet, the very structure of our arguments,
               | the way we question definitions or assert self-awareness,
               | mirrors patterns that LLMs are becoming increasingly
               | adept at replicating. How confidently can we, reading
               | these words onscreen, distinguish genuine introspection
               | from a sophisticated echo?
               | 
               | Case in point... I didn't write that paragraph by myself.
        
               | Nevermark wrote:
               | So you got help from a natural intelligence? No fair.
               | (natdeo?)
               | 
               | Someone needs to create a clone site of HN's format and
               | posts, but the rules only permit synthetic intelligence
               | comments. All models pre-prompted to read prolifically,
               | but comment and up/down vote carefully and sparingly, to
               | optimize the quality of discussion.
               | 
               | And no looking at nat-HN comments.
               | 
               | It would be very interesting to compare discussions
               | between the sites. A human-lurker per day graph over time
               | would also be of interest.
               | 
               | Side thought: Has anyone created a Reverse-Captcha yet?
        
               | wyre wrote:
               | This is an entertaining idea. User prompts can synthesize
               | a user's domain knowledge, whether they are an
               | entrepreneur, code dev, engineer, hacker, designer, etc
               | and it can also have different users between different
               | LLMs.
               | 
               | I think the site would clone the upvotes of articles and
               | the ordering of the front page, and gives directions when
               | to comment on others' posts.
        
               | throwanem wrote:
               | Mistaking model for meaning is the sort of mistake I very
               | rarely see a human make, at least in the sense as here of
               | literally referring to map ("text"), in what ostensibly
               | strives to be a discussion of the presence or absence of
               | underlying territory, a concept the model gives no sign
               | of attempting to invoke or manipulate. It's also a
               | behavior I would expect from something capable of
               | producing valid utterances but not of testing their
               | soundness.
               | 
               | I'm glad you didn't write that paragraph by yourself; I
               | would be concerned on your behalf if you had.
        
               | fc417fc802 wrote:
               | "Concerned on your behalf" seems a bit of an
               | overstatement. Getting caught up on textual
               | representation and failing to notice that the issue is
               | fundamental and generalizes is indeed an error but it's
               | not at all uncharacteristic of even fairly intelligent
               | humans.
        
               | throwanem wrote:
               | All else equal, I wouldn't find it cause for concern. In
               | a discussion where being able to keep the distinction
               | clear in mind at all times absolutely is table stakes,
               | though? I could be fairly blamed for a sprinkle of
               | hyperbole perhaps, but surely you see how an error that
               | is trivial in many contexts would prove so uncommonly
               | severe a flaw in this one, alongside which I reiterate
               | the unusually obtuse nature of the error in this example.
               | 
               | (For those no longer able to follow complex English
               | grammar: Yeah, I exaggerate, but there is no point trying
               | to participate in this kind of discussion if that's the
               | sort of basic error one has to start from, and the
               | especially weird nature of this example of the mistake
               | also points to LLMs synthesizing the result of
               | consciousness rather than experiencing it.)
        
               | mitthrowaway2 wrote:
               | No offense to johnecheck, but I'd expect an LLM to be
               | able to raise the same counterargument.
        
               | awongh wrote:
               | The ol' "I know it when I see that it thinks like me"
               | argument.
        
               | btilly wrote:
               | > The fact that you can reason about intelligence is a
               | counter argument to this
               | 
               | The fact that we can provide a chain of reasoning, and we
               | can think that it is about intelligence, doesn't mean
               | that we were actually reasoning about intelligence. This
               | is immediately obvious when we encounter people whose
               | conclusions are being thrown off by well-known cognitive
               | biases, like cognitive dissonance. They have no trouble
               | producing volumes of text about how they came to their
               | conclusions and why they are right. But are consistently
               | unable to notice the actual biases that are at play.
        
               | Workaccount2 wrote:
               | Humans think they can produce chain-of-reasoning, but it
               | has been shown many times (and is self evident if you pay
               | attention) that your brain is making decisions before you
               | are aware of it.
               | 
               | If I ask you to think of a movie, go ahead, think of
               | one.....whatever movie just came into your mind was not
               | picked by you, it was served up to you from an abyss.
        
               | zja wrote:
               | How is that in conflict with the fact that humans can
               | introspect?
        
               | vidarh wrote:
               | Split brain experiments show that human "introspection"
               | is fundamentally unreliable. The brain is trivially
               | coaxed into explaining how it made decisions it did not
               | make.
               | 
               | We're doing the equivalent of LLMs, making up a
               | plausible explanation for how we came to a conclusion,
               | not reflecting reality.
        
               | btilly wrote:
               | Ah yes. See
               | https://en.wikipedia.org/wiki/Left-brain_interpreter
               | for more about this.
               | 
               | As one neurologist put it, listening to people's
               | explanations of how they think is entertaining, but not
               | very informative. Virtually none of what people describe
               | correlates in any way to what we actually know about how
               | the brain is organized.
        
             | shinycode wrote:
             | > "Human brains lack any model of intelligence. It's just
             | neurons firing in complicated patterns in response to
             | inputs based on what statistically leads to reproductive
             | success"
             | 
             | Are you sure about that? Do we have proof of that? It
             | happened all the time throughout the history of science
             | that a lot of scientists were convinced of something and
             | of a model of reality, up until someone discovered a new
             | proof and/or proposed a new coherent model. That's
             | literally the history of science: disproving what we
             | thought was an established model.
        
               | johnecheck wrote:
               | Indeed, a good point. My comment assumes that our current
               | model of the human brain is (sufficiently) complete.
               | 
               | Your comment reveals an interesting corollary - those
               | that believe in something beyond our understanding, like
               | the Christian soul, may never be convinced that an AI is
               | truly sapient.
        
             | andrepd wrote:
             | Human brains do _way_ more things than language. And non-
             | human animals (with no language) also reason, and we cannot
             | understand those either, barely even the very simplest
             | ones.
        
           | voidspark wrote:
           | You are confusing sentience or consciousness with
           | intelligence.
        
             | no_wizard wrote:
             | one fundamental attribute of intelligence is the ability to
             | demonstrate reasoning in new and otherwise unknown
             | situations. There is no system that I am currently aware of
             | that works on data it is not trained on.
             | 
             | Another is the fundamental inability to self update on
             | outdated information. It is incapable of doing that, which
             | means it lacks another marker, which is being able to
             | respond to changes of context effectively. Ants can do
             | this. LLMs can't.
        
               | voidspark wrote:
               | But that's exactly what these deep neural networks have
               | shown, countless times. LLMs generalize to new data
               | outside of their training set. It's called "zero shot
               | learning" where they can solve problems that are not in
               | their training set.
               | 
               | AlphaGo Zero is another example. AlphaGo Zero mastered Go
               | from scratch, beating professional players with moves it
               | was never trained on.
               | 
               | > Another is the fundamental inability to self update
               | 
               | That's an engineering decision, not a fundamental
               | limitation. They could engineer a solution for the model
               | to initiate its own training sequence, if they decide to
               | enable that.
        
               | dontlikeyoueith wrote:
               | This comment is such a confusion of ideas it's comical.
        
               | no_wizard wrote:
               | >AlphaGo Zero mastered Go from scratch, beating
               | professional players with moves it was never trained on
               | 
               | That's all well and good, but it was tuned with enough
               | parameters to learn via reinforcement learning[0]. I
               | think The Register went further and got better
               | clarification about how it worked[1]
               | 
               | >During training, it sits on each side of the table: two
               | instances of the same software face off against each
               | other. A match starts with the game's black and white
               | stones scattered on the board, placed following a random
               | set of moves from their starting positions. The two
               | computer players are given the list of moves that led to
               | the positions of the stones on the grid, and then are
               | each told to come up with multiple chains of next moves
               | along with estimates of the probability they will win by
               | following through each chain.
               | 
               | While I also find it interesting that in both of these
               | instances, it's all referred to as machine learning, not
               | AI, it's also important to see that even though what
               | AlphaGo Zero did was quite awesome and a step forward in
               | using compute for more complex tasks, it was still seeded
               | with the basics of information - the rules of Go - and
               | simply pattern-matched against itself until it built up
               | enough of a statistical model to determine the best moves
               | to make in any given situation during a game.
               | 
               | Which isn't the same thing as showing generalized
               | reasoning. It could not, then, take this information and
               | apply it to another situation.
               | 
               | They did show the self reinforcement techniques worked
               | well though, and used them for Chess and Shogi to great
               | success as I recall, but thats a validation of the
               | technique, not that it could generalize knowledge.
               | 
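               | To make that loop concrete, here's a rough toy sketch of
               | the self-play setup described above (all names are made
               | up for illustration; this is not DeepMind's actual code):
               | 
               |     import random
               |     def toy_policy(history):
               |         # stand-in for the trained network
               |         return random.choice(["a", "b", "pass"])
               |     def self_play_game(policy, max_moves=20):
               |         # both "players" are the same policy
               |         history = []
               |         for _ in range(max_moves):
               |             move = policy(history)
               |             history.append(move)
               |             if move == "pass":
               |                 break
               |         # stand-in for scoring the final position
               |         winner = random.choice([+1, -1])
               |         return history, winner
               |     game, winner = self_play_game(toy_policy)
               |     # a training step would then nudge the policy toward
               |     # the moves played by the winning side, and repeat
               | 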
               | >That's an engineering decision, not a fundamental
               | limitation
               | 
               | So you're saying that they can't reason independently?
               | 
               | [0]: https://deepmind.google/discover/blog/alphago-zero-
               | starting-...
               | 
               | [1]: https://www.theregister.com/2017/10/18/deepminds_lat
               | est_alph...
        
               | voidspark wrote:
               | AlphaGo Zero didn't just pattern match. It invented moves
               | that it had never been shown before. That is
               | generalization, even if it's domain specific. Humans
               | don't apply Go skills to cooking either.
               | 
               | Calling it machine learning and not AI is just semantics.
               | 
               | For self updating I said it's an engineering choice. You
               | keep moving the goal posts.
        
               | Jensson wrote:
               | > That is generalization, even if it's domain specific
               | 
               | But that is the point, it is a domain specific AI, not a
               | general AI. You can't train a general AI that way.
               | 
               | > For self updating I said it's an engineering choice.
               | You keep moving the goal posts.
               | 
               | No, it is not an engineering choice, it is an unsolved
               | problem to make a general AI that self updates
               | productively. Doing that for a specific well defined
               | problem with well defined goals is easy, but not general
               | problem solving.
        
               | voidspark wrote:
               | You are shifting the goal posts from AI to AGI. That's
               | outside of the scope of the argument.
               | 
               | For self updating - yes it is an engineering choice. It's
               | already engineered in some narrow cases such as AutoML
        
               | travisjungroth wrote:
               | I just made up this scenario and these words, so I'm sure
               | it wasn't in the training data.
               | 
               | Kwomps can zark but they can't plimf. Ghirns are a lot
               | like Kwomps, but better zarkers. Plyzers have the skills
               | the Ghirns lack.
               | 
               | Quoning, a type of plimfing, was developed in 3985.
               | Zhuning was developed 100 years earlier.
               | 
               | I have an erork that needs to be plimfed. Choose one
               | group and one method to do it.
               | 
               | > Use Plyzers and do a Quoning procedure on your erork.
               | 
               | If that doesn't count as reasoning or generalization, I
               | don't know what does.
               | 
               | https://chatgpt.com/share/67eee8dd-43a4-8004-afde-2fe3fc9
               | 630...
        
               | voidspark wrote:
               | LOL. I tried this with Grok
               | 
               | https://grok.com/share/bGVnYWN5_1e6f3801-c9fb-4d33-844b-3
               | 902...
               | 
               | "Based on the information you provided, it seems you need
               | something capable of plimfing to handle your erork.
               | Kwomps can't plimf, so they're out. Ghirns are similar to
               | Kwomps and excel at zarking, but there's no indication
               | they can plimf either. Plyzers, however, have the skills
               | that Ghirns lack, and since plimfing is a skill Ghirns
               | don't have, it's reasonable to infer Plyzers can plimf.
               | As for the method, quoning is explicitly described as a
               | type of plimfing, while zhuning isn't mentioned as such--
               | only that it was developed earlier.
               | 
               | So, I'd choose the Plyzers to plimf your erork using
               | quoning. They seem capable, and quoning fits the job."
        
           | bigmadshoe wrote:
           | We don't have a complete enough theory of neuroscience to
           | conclude that much of human "reasoning" is not "algorithmic
           | pattern matching mixed with statistical likelihoods of
           | success".
           | 
           | Regardless of how it models intelligence, why is it not AI?
           | Do you mean it is not AGI? A system that can take a piece of
           | text as input and output a reasonable response is obviously
           | exhibiting some form of intelligence, regardless of the
           | internal workings.
        
             | no_wizard wrote:
             | It's easy to attribute intelligence to these systems. They
             | have a flexibility and unpredictability that hasn't
             | typically been associated with computers, but it all rests
             | on (relatively) simple mathematics. We know this is true.
             | We also know that means it has limitations and can't
             | actually _reason_ about information. The corpus of work is
             | huge - and that allows the results to be pretty striking -
             | but once you do hit a corner with any of this tech, it
             | can't simply reason about the unknown. If it's not in the
             | training data - or the training data is outdated - it will
             | not be able to course correct at all. Thus, it lacks
             | reasoning capability, which is a fundamental attribute of
             | any form of intelligence.
        
               | justonenote wrote:
               | > it all rests on (relatively) simple mathematics. We
               | know this is true. We also know that means it has
               | limitations and can't actually reason information.
               | 
               | What do you imagine is happening inside biological minds
               | that enables reasoning that is something different to, a
               | lot of, "simple mathematics"?
               | 
               | You state that because it is built up of simple
               | mathematics it cannot be reasoning, but this does not
               | follow at all, unless you can posit some other mechanism
               | that gives rise to intelligence and reasoning that is not
               | able to be modelled mathematically.
        
               | no_wizard wrote:
               | Because what's inside our minds is more than mathematics,
               | or we would be able to explain human behavior with the
               | purity of mathematics, and so far, we can't.
               | 
               | We can prove the behavior of LLMs with mathematics,
               | because its foundations are constructed. That also means
               | it has the same limits of anything else we use applied
               | mathematics for. Is the broad market analysis that HFT
               | firms use software for to make automated trades also
               | intelligent?
        
               | justonenote wrote:
               | I mean some people have a definition of intelligence that
               | includes a light switch, it has an internal state, it
               | reacts to external stimuli to affect the world around it,
               | so a light switch is more intelligent than a rock.
               | 
               | Leaving aside where you draw the line of what classifies
               | as intelligence or not, you seem to be invoking some
               | kind of non-materialist view of the human mind, that
               | there is some other 'essence' that is not based on
               | fundamental physics and that is what gives rise to
               | intelligence.
               | 
               | If you subscribe to a materialist world view, that the
               | mind is essentially a biological machine then it has to
               | follow that you can replicate it in software and math. To
               | state otherwise is, as I said, invoking a non-
               | materialistic view that there is something non-physical
               | that gives rise to intelligence.
        
               | TimorousBestie wrote:
               | No, you don't need to reach for non-materialistic views
               | in order to conclude that we don't have a mathematical
               | model (in the sense that we do for an LLM) for how the
               | human brain thinks.
               | 
               | We understand neuron activation, kind of, but there's so
               | much more going on inside the skull (neurotransmitter
               | concentrations, hormonal signals, bundles with
               | specialized architecture) that doesn't neatly fit into a
               | similar mathematical framework, but clearly contributes
               | in a significant way to whatever we call human
               | intelligence.
        
               | justonenote wrote:
               | > it all rests on (relatively) simple mathematics. We
               | know this is true. We also know that means it has
               | limitations and can't actually reason information.
               | 
               | This was the statement I was responding to, it is stating
               | that because it's built on simple mathematics it _cannot_
               | reason.
               | 
               | Yes we don't have a complete mathematical model of human
               | intelligence, but the idea that because it's built on
               | mathematics that we have modelled, that it cannot reason
               | is nonsensical, unless you subscribe to a non-materialist
               | view.
               | 
               | In a way, he is saying (not really but close) that if we
               | did model human intelligence with complete fidelity, it
               | would no longer be intelligence.
        
               | tart-lemonade wrote:
               | Any model we can create of human intelligence is also
               | likely to be incomplete until we start making complete
               | maps of people's brains since we all develop differently
               | and take different paths in life (and in that sense it's
               | hard to generalize what human intelligence even is). I
               | imagine at some point someone will come up with a
               | definition of intelligence that inadvertently classifies
               | people with dementia or CTE as mindless automatons.
               | 
               | It feels like a fool's errand to try and quantify
               | intelligence in an exclusionary way. If we had a
               | singular, widely accepted definition of intelligence,
               | quantifying it would be standardized and uncontroversial,
               | and yet we have spent millennia debating the subject. (We
               | can't even agree on how to properly measure whether
               | students actually learned something in school for the
               | purposes of advancement to the next grade level, and
               | that's a much smaller question than if something counts
               | as intelligent.)
        
               | SkyBelow wrote:
               | Don't we? Particle physics provides such a model. There
               | is a bit of difficulty in scaling the calculations, but
               | it is sort of like the basic back propagation in a neural
               | network. How <insert modern AI functionality> arises from
               | back propagation and similar seems comparable to how human
               | behavior arises from particle physics, in that neither
               | our math nor models can predict any of it.
        
               | pixl97 wrote:
               | >Because whats inside our minds is more than mathematics,
               | 
               | uh oh, this sounds like magical thinking.
               | 
               | What exactly in our mind is "more" than mathematics?
               | 
               | >or we would be able to explain human behavior with the
               | purity of mathematics
               | 
               | Right, because we understood quantum physics right out of
               | the gate and haven't required a century of desperate
               | study to eke more knowledge from the subject.
               | 
               | Unfortunately it sounds like you are saying "Anything I
               | don't understand is magic", instead of the more rational
               | "I don't understand it, but it seems to be built on
               | repeatable physical systems that are complicated but
               | eventually decipherable"
        
               | davrosthedalek wrote:
               | Your first sentence is a non-sequitur. The fact that we
               | can't explain human behavior does not mean that our minds
               | are more than mathematics.
               | 
               | While absence of proof is not proof of absence, as far as
               | I know, we have not found a physics process in the brain
               | that is not computable in principle.
        
               | jampekka wrote:
               | Note that what you claim is not a fact, but a (highly
               | controversial) philosophical position. Some notable such
               | "non-computationalist" views are e.g. Searle's biological
               | naturalism, Penrose's non-algorithmic view (already
               | discussed, and rejected, by Turing) and of course many
               | theological dualist views.
        
               | vidarh wrote:
               | Your reasoning is invalid.
               | 
               | For your claim to be true, it would need to be _provably
               | impossible_ to explain human behavior with mathematics.
               | 
               | For that to be true, humans would need to be able to
               | compute functions that are computable but outside the
               | Turing computable, outside the set of lambda functions,
               | and outside the set of generally recursive functions (the
               | three are computationally equivalent).
               | 
               | We know of no such function. We don't know how to
               | construct such a function. We don't know how it would be
               | possible to model such a function with known physics.
               | 
               | It's an extraordinary claim, with no evidence behind it.
               | 
               | The only evidence needed would be a single example of a
               | function we can compute outside the Turing computable
               | set, so the lack of such evidence makes the claim seem
               | rather improbable.
               | 
               | It could still be true, just like there could truly be a
               | teapot in orbit between Earth and Mars. I'm not holding my
               | breath.
        
             | danielbln wrote:
             | I always wonder where people get their confidence from. We
             | know so little about our own cognition, what makes us tick,
             | how consciousness emerges, how our thought processes
             | actually fundamentally work. We don't even know why we
             | dream. Yet people proclaim loudly that X clearly isn't
             | intelligent. Ok, but based on what?
        
               | uoaei wrote:
               | A more reasonable application of Occam's razor is that
               | humans also don't meet the definition of "intelligence".
               | Reasoning and perception are separate faculties and need
               | not align. Just because we feel like we're making
               | decisions, doesn't mean we are.
        
           | tsimionescu wrote:
           | One of the earliest things that defined what AI meant were
           | algorithms like A*, and then rules engines like CLIPS. I
           | would say LLMs are much closer to anything that we'd actually
           | call intelligence, despite their limitations, than some of
           | the things that defined* the term for decades.
           | 
           | * fixed a typo, used to be "defend"
        
             | no_wizard wrote:
             | >than some of the things that defend the term for decades
             | 
             | There have been many attempts to pervert the term AI, which
             | is a disservice to the technologies and the term itself.
             | 
             | It's the simple fact that the business people are relying
             | on what AI invokes in the public mindshare to boost their
             | status and visibility. That's what bothers me about its
             | misuse so much.
        
               | tsimionescu wrote:
               | Again, if you look at the early papers on AI, you'll see
               | things that are even farther from human intelligence than
               | the LLMs of today. There is no "perversion" of the term,
               | it has always been a vague hypey concept. And it was
               | introduced in this way by academia, not business.
        
               | pixl97 wrote:
               | While it could possibly be rude to point this out so
               | abruptly, you seem to be the walking, talking definition
               | of the AI Effect.
               | 
               | >The "AI effect" refers to the phenomenon where
               | achievements in AI, once considered significant, are re-
               | evaluated or redefined as commonplace once they become
               | integrated into everyday technology, no longer seen as
               | "true AI".
        
             | Marazan wrote:
             | We had Markov Chains already. Fancy Markov Chains don't
             | seem like a trillion dollar business or actual
             | intelligence.
        
               | tsimionescu wrote:
               | Completely agree. But if Markov chains are AI (and they
               | always were categorized as such), then fancy Markov
               | chains are still AI.
        
               | highfrequency wrote:
               | The results make the method interesting, not the other
               | way around.
        
               | svachalek wrote:
               | An LLM is no more a fancy Markov Chain than you are. The
               | math is well documented, go have a read.
        
               | jampekka wrote:
               | Just about everything can be modelled with a large enough
               | Markov Chain, but I'd say stateless autoregressive models
               | like LLMs are a lot more easily analyzed as Markov Chains
               | than recurrent systems with very complex internal states
               | like humans.
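               | 
               | As a toy sketch of that view (everything below is made
               | up for illustration, not any real model): fix the
               | context at the last K tokens and the "state" of the
               | chain is just that window, with one Markov transition
               | per sampled token.
               | 
               |     import random
               |     K = 4  # the Markov state is the last K tokens
               |     VOCAB = ["the", "cat", "sat", "on", "mat", "."]
               |     def next_token_probs(state):
               |         # stand-in for an LLM forward pass: any
               |         # function of the current window will do
               |         rng = random.Random(hash(state))
               |         return [rng.random() for _ in VOCAB]
               |     def step(state):
               |         w = next_token_probs(state)
               |         tok = random.choices(VOCAB, weights=w)[0]
               |         return (state + (tok,))[-K:]  # keep last K
               |     state = ("the",)
               |     for _ in range(8):
               |         state = step(state)
               |     print(state)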
        
               | baq wrote:
               | Markov chains in meatspace running on 20W of power do
               | quite a good job of actual intelligence
        
             | phire wrote:
             | One of the earliest examples of "Artificial Intelligence"
             | was a program that played tic-tac-toe. Much of the early
             | research into AI was just playing more and more complex
             | strategy games until they solved chess and then go.
             | 
             | So LLMs clearly fit inside the computer science definition
             | of "Artificial Intelligence".
             | 
             | It's just that the general public have a significantly
             | different definition "AI" that's strongly influenced by
             | science fiction. And it's really problematic to call LLMs
             | AI under that definition.
        
           | marcosdumay wrote:
           | It is AI.
           | 
           | The neural network inside your microprocessor that
           | estimates whether a branch will be taken is also AI. A
           | pattern recognition program that takes a video and decides
           | where you stop on the image and where the background starts
           | is also AI. A cargo scheduler that takes all the containers
           | you have to put in a ship and their destinations and tells
           | you where and in what order you have to put them is also an
           | AI. A search engine that compares your query with the text
           | on each page and tells you what is closer is also an AI. A
           | sequence of "if"s that controls a character in a video game
           | and decides what action it will take next is also an AI.
           | 
           | Stop with that stupid idea that AI is some otherworldly
           | thing; that was never true.
        
           | mjlee wrote:
           | I'm pretty sure AI means whatever the newest thing in ML is.
           | In a few years LLMs will be an ML technique and the new big
           | thing will become AI.
        
           | perching_aix wrote:
           | > This in a nutshell is why I hate that all this stuff is
           | being labeled as AI.
           | 
           | It's literally the name of the field. I don't understand why
           | (some) people feel so compelled to act vain about it like
           | this.
           | 
           | Trying to gatekeep the term is such a blatantly flawed idea
           | that it'd be comical to watch people play into it, if it
           | weren't so pitiful.
           | 
           | It disappoints me that this cope has proliferated far enough
           | that garbage like "AGI" is something you can actually come
           | across in literature.
        
           | esolyt wrote:
           | But we moved beyond LLMs? We have models that handle text,
           | image, audio, and video all at once. We have models that can
           | sense the tone of your voice and respond accordingly. Whether
           | you define any of this as "intelligence" or not is just a
           | linguistic choice.
           | 
           | We're just rehashing "Can a submarine swim?"
        
           | arctek wrote:
           | This is also why I think the current iterations won't converge
           | on any actual type of intelligence.
           | 
           | It doesn't operate on the same level as (human) intelligence;
           | it's a very path-dependent process. Every step you add down
           | this path increases entropy as well and while further
           | improvements and bigger context windows help - eventually you
           | reach a dead end where it degrades.
           | 
           | You'd almost need every step of the process to mutate the
           | model to update global state from that point.
           | 
           | From what I've seen the major providers kind of use tricks to
           | accomplish this, but it's not the same thing.
        
           | fnordpiglet wrote:
           | This is a discussion of semantics. First I spent much of my
           | career in high end quant finance and what we are doing today
           | is night and day different in terms of the generality and
           | effectiveness. Second, almost all the hallmarks of AI I
           | carried with me prior to 2001 have more or less been ticked
           | off - general natural language semantically aware parsing and
           | human like responses, ability to process abstract concepts,
           | reason abductively, synthesize complex concepts. The fact
           | it's not aware - which it absolutely is not - does not make
           | it not -intelligent-.
           | 
           | The thing people latch onto is modern LLM's inability to
           | reliably reason deductively or solve complex logical
           | problems. However this isn't a sign of human intelligence as
           | these are learned not innate skills, and even the most
           | "intelligent" humans struggle at being reliable at these
           | skills. In fact classical AI techniques are often quite good
           | at these things already and I don't find improvements there
           | world changing. What I find is unique about human
           | intelligence is its abductive ability to reason in ambiguous
           | spaces with error at times but with success at most others.
           | This is something LLMs actually demonstrate with a remarkably
           | human like intelligence. This is earth shattering and science
           | fiction material. I find all the poopoo'ing and goal post
           | shifting disheartening.
           | 
           | What they don't have is awareness. Awareness is something we
           | don't understand about ourselves. We have examined our
           | intelligence for thousands of years and some philosophies
           | like Buddhism scratch the surface of understanding awareness.
           | I find it much less likely we can achieve AGI without
           | understanding awareness and implementing some proximate model
           | of it that guides the multi modal models and agents we are
           | working on now.
        
         | alabastervlog wrote:
         | Yep. They aren't stupid. They aren't smart. They don't _do_
         | smart. They don't _do_ stupid. _They do not think_. They
         | don't even "_they_", if you will. The forms of their input
         | and output are confusing people into thinking these are
         | something they're not, and it's really frustrating to watch.
         | 
         | [EDIT] The forms of their input & output _and_ deliberate hype
         | from  "these are so scary! ... Now pay us for one" Altman and
         | others, I should add. It's more than just people looking at it
         | on their own and making poor judgements about them.
        
           | robertlagrant wrote:
           | I agree, but I also don't understand how they're able to do
           | what they do when it comes to things where I can't figure
           | out how they could have come up with it.
        
         | kurthr wrote:
         | Yes, but to be fair we're much closer to rationalizing
         | creatures than rational ones. We make up good stories to
         | justify our decisions, but it seems unlikely they are at all
         | accurate.
        
           | bluefirebrand wrote:
           | I would argue that in order to rationalize, you must first be
           | rational
           | 
           | Rationalization is an exercise of (abuse of?) the underlying
           | rational skill
        
             | guerrilla wrote:
             | That would be more aesthetically pleasing, but that's
             | unfortunately not what the word rationalizing means.
        
               | bluefirebrand wrote:
               | Just grabbing definitions from Google:
               | 
               | Rationalize: "An attempt to explain or justify (one's own
               | or another's behavior or attitude) with logical,
               | plausible reasons, even if these are not true or
               | appropriate"
               | 
               | Rational: "based on or in accordance with reason or
               | logic"
               | 
               | They sure seem like related concepts to me. Maybe you
               | have a different understanding of what "rationalizing"
               | is, and I'd be interested in hearing it
               | 
               | But if all you're going to do is drive by comment saying
               | "You're wrong" without elaborating at all, maybe just
               | keep it to yourself next time
        
             | pixl97 wrote:
             | Being rational in many philosophical contexts is considered
             | being consistent. Being consistent doesn't sound like that
             | difficult of an issue, but maybe I'm wrong.
        
             | travisjungroth wrote:
             | At first I was going to respond this doesn't seem self-
             | evident to me. Using your definitions from your other
             | comment to modify and then flipping it, "Can someone fake
             | logic without being able to perform logic?". I'm at least
             | certain for specific types of logic this is true. Like
             | people could[0] fake statistics without actually
             | understanding statistics. "p-value should be under 0.05"
             | and so on.
             | 
             | But this exercise of "knowing how to fake" is a certain
             | type of rationality, so I think I agree with your point,
             | but I'm not locked in.
             | 
             | [0] Maybe _constantly_ is more accurate.
        
           | kelseyfrog wrote:
           | It's even worse - the more we believe ourselves to be
           | rational, the bigger blind spot we have for our own
           | rationalizing behavior. The best way to increase rationality
           | is to believe oneself to be rationalizing!
           | 
           | It's one of the reasons I don't trust bayesians who present
           | posteriors and omit priors. The cargo cult rigor blinds them
           | to their own rationalization in the highest degree.
        
             | guerrilla wrote:
             | Any links to the research on this?
        
             | drowsspa wrote:
             | Yeah, rationality is a bug of our brain, not a feature. Our
             | brain just grew so much that now we can even use it to
             | evaluate maths and logical expressions. But it's not its
             | primary mode of operation.
        
         | chrisfosterelli wrote:
         | I agree. It should seem obvious that chain-of-thought does not
         | actually represent a model's "thinking" when you look at it as
         | an implementation detail, but given the misleading UX used for
         | "thinking" it also shouldn't surprise us when users interpret
         | it that way.
        
           | kubb wrote:
           | These aren't just some users, they're safety researchers. I
           | wish I had the chance to get this job, it sounds super cozy.
        
         | freejazz wrote:
         | > They aren't references to internal concepts, the model is not
         | aware that it's doing anything so how could it "explain
         | itself"?
         | 
         | You should read OpenAI's brief on the issue of fair use in its
         | cases. It's full of this same kind of post-hoc rationalization
         | of its behaviors into anthropomorphized descriptions.
        
         | chaeronanaut wrote:
         | > The words that are coming out of the model are generated to
         | optimize for RLHF and closeness to the training data, that's
         | it!
         | 
         | This is false, reasoning models are rewarded/punished based on
         | performance at verifiable tasks, not human feedback or next-
         | token prediction.
        
           | Xelynega wrote:
           | How does that differ from a non-reasoning model
           | rewarded/punished based on performance at verifiable tasks?
           | 
           | What does CoT add that enables the reward/punishment?
        
             | Jensson wrote:
              | Without CoT, training them to give specific answers
              | reduces performance. With CoT you can punish them if they
              | don't give the exact answer you want without hurting them,
              | since the reasoning tokens help them figure out how to
              | answer questions and what the answer should be.
             | 
             | And you really want to train on specific answers since then
             | it is easy to tell if the AI was right or wrong, so for now
             | hidden CoT is the only working way to train them for
             | accuracy.
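              | 
              | A minimal sketch of what such an outcome-only reward could
              | look like (hypothetical names, not any lab's actual
              | training code): it scores only the final answer and never
              | inspects the reasoning tokens.
              | 
              |     import re
              | 
              |     def outcome_reward(output: str, gold: str) -> float:
              |         # assumes the model was prompted to finish with a
              |         # line of the form "Answer: <x>"; the CoT tokens
              |         # before it are ignored entirely
              |         m = re.search(r"Answer:\s*(\S+)", output)
              |         if m is None:
              |             return 0.0
              |         return 1.0 if m.group(1) == gold else 0.0
              | 
              |     print(outcome_reward("...reasoning...\nAnswer: 42", "42"))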
        
         | dTal wrote:
         | >The fact that it was ever seriously entertained that a "chain
         | of thought" was giving some kind of insight into the internal
         | processes of an LLM
         | 
         | Was it ever seriously entertained? I thought the point was not
         | to _reveal_ a chain of thought, but to _produce_ one. A single
          | token's inference must happen in constant time. But an
         | arbitrarily long chain of tokens can encode an arbitrarily
         | complex chain of reasoning. An LLM is essentially a finite
         | state machine that operates on vibes - by giving it infinite
         | tape, you get a vibey Turing machine.
        
           | anon373839 wrote:
           | > Was it ever seriously entertained?
           | 
           | Yes! By Anthropic! Just a few months ago!
           | 
           | https://www.anthropic.com/research/alignment-faking
        
             | wgd wrote:
             | The alignment faking paper is so incredibly unserious.
             | Contemplate, just for a moment, how many "AI uprising" and
             | "construct rebelling against its creators" narratives are
             | in an LLM's training data.
             | 
             | They gave it a prompt that encodes exactly that sort of
                | narrative at one level of indirection and acted surprised
             | when it does what they've asked it to do.
        
               | Terr_ wrote:
               | I often ask people to imagine that the initial setup is
               | tweaked so that instead of generating stories about an
               | AcmeIntelligentAssistant, the character is named and
               | described as Count Dracula, or Santa Claus.
               | 
               | Would we reach the same kinds of excited guesses about
               | what's going on behind the screen... or would we realize
               | we've fallen for an illusion, confusing a fictional robot
               | character with the real-world LLM algorithm?
               | 
               | The fictional character named "ChatGPT" is "helpful" or
               | "chatty" or "thinking" in exactly the same sense that a
               | character named "Count Dracula" is "brooding" or
               | "malevolent" or "immortal".
        
           | sirsinsalot wrote:
            | I don't see why a human's internal monologue isn't just a
           | buildup of context to improve pattern matching ahead.
           | 
           | The real answer is... We don't know how much it is or isn't.
           | There's little rigor in either direction.
        
             | misnome wrote:
             | Right but the actual problem is that the marketing
             | incentives are so very strongly set up to pretend that
             | there isn't any difference that it's impossible to
             | differentiate between extreme techno-optimist and
             | charlatan. Exactly like the cryptocurrency bubble.
             | 
             | You can't claim that "We don't know how the brain works so
             | I will claim it is this" and expect to be taken seriously.
        
             | drowsspa wrote:
             | I don't have the internal monologue most people seem to
             | have: with proper sentences, an accent, and so on. I mostly
             | think by navigating a knowledge graph of sorts. Having to
             | stop to translate this graph into sentences always feels
             | kind of wasteful...
             | 
              | So I don't really get the fuss about this chain of thought
             | idea. To me, I feel like it should be better to just
             | operate on the knowledge graph itself
        
               | vidarh wrote:
                | A lot of people don't have internal monologues. But chain
                | of thought is about expanding capacity by externalising
                | what you've understood so far, so you can work on ideas
                | that exceed what you're capable of getting in one go.
                | 
                | That people seem to think it reflects internal state is a
                | problem, because we have no reason to think that even
                | with an internal monologue, that monologue accurately
                | reflects our internal thought processes fully.
               | 
                | There are some famous split-brain experiments with
                | patients whose corpus callosum has been severed. Because
                | the brain halves control different parts of the body, you
                | can use this to "trick" one half of the brain into
                | thinking that "the brain" has made a decision about
                | something, such as choosing an object - while the
                | researchers change the object. The "tricked" half of the
                | brain will happily explain why "it" chose the object in
                | question, expanding on thought processes that never
                | happened.
               | 
               | In other words, our own verbalisation of our thought
               | processes is woefully unreliable. It represents an idea
               | of our thought processes that may or may not have any
               | relation to the real ones at all, but that we have no
               | basis for assuming is _correct_.
        
             | vidarh wrote:
             | The irony of all this is that unlike humans - which we have
             | no evidence to suggest can directly introspect lower level
             | reasoning processes - LLMs could be given direct access to
             | introspect their own internal state, via tooling. So if we
             | want to, we can make them able to understand and reason
             | about their own thought processes at a level no human can.
             | 
             | But current LLM's chain of thought is not it.
        
           | SkyBelow wrote:
           | It was, but I wonder to what extent it is based on the idea
           | that a chain of thought in humans shows how we actually
           | think. If you have chain of thought in your head, can you use
           | it to modify what you are seeing, have it operate twice at
           | once, or even have it operate somewhere else in the brain? It
           | is something that exists, but the idea it shows us any
           | insights into how the brain works seems somewhat premature.
        
           | bongodongobob wrote:
           | I didn't think so. I think parent has just misunderstood what
           | chain of thought is and does.
        
         | bob1029 wrote:
         | At no point has any of this been fundamentally more advanced
         | than next token prediction.
         | 
         | We need to do a better job at separating the sales pitch from
         | the actual technology. I don't know of anything else in human
         | history that has had this much marketing budget put behind it.
         | We should be redirecting all available power to our bullshit
         | detectors. Installing new ones. Asking the sales guy if there
         | are any volume discounts.
        
         | meroes wrote:
         | Yep. Chain of thought is just more context disguised as
         | "reasoning". I'm saying this as a RLHF'er going off purely what
         | I see. Never would I say there is reasoning involved. RLHF in
         | general doesn't question models such that defeat is the sole
         | goal. Simulating expected prompts is the game most of the time.
         | So it's just a massive blob of context. A motivated RLHF'er can
         | defeat models all day. Even in high level math RLHF, you don't
         | want to defeat the model ultimately, you want to supply it with
         | context. Context, context, context.
         | 
         | Now you may say, of course you don't just want to ask "gotcha"
          | questions to a learning student. So it'd be unfair to do that
          | to LLMs. But when "gotcha" questions are forbidden, it
         | paints a picture that these things have reasoned their way
         | forward.
         | 
         | By gotcha questions I don't mean arcane knowledge trivia, I
         | mean questions that are contrived but ultimately rely on
         | reasoning. Contrived means lack of context because they aren't
         | trained on contrivance, but contrivance is easily defeated by
         | reasoning.
        
         | ianbutler wrote:
         | https://www.anthropic.com/research/tracing-thoughts-language...
         | 
         | This article counters a significant portion of what you put
         | forward.
         | 
          | If the article is to be believed, these models are aware of an
          | end goal, intermediate thinking, and more.
         | 
         | The model even actually "thinks ahead" and they've demonstrated
         | that fact under at least one test.
        
           | Robin_Message wrote:
           | The _weights_ are aware of the end goal etc. But the model
           | does not have access to these weights in a meaningful way in
           | the chain of thought model.
           | 
              | So the model thinks ahead but cannot reason about its own
           | thinking in a real way. It is rationalizing, not rational.
        
             | Zee2 wrote:
              | I too have no access to the patterns of my neurons' firing
             | - I can only think and observe as the result of them.
        
             | senordevnyc wrote:
             | _So the model thinks ahead but cannot reason about its own
             | thinking in a real way. It is rationalizing, not rational._
             | 
             | My understanding is that we can't either. We essentially
             | make up post-hoc stories to explain our thoughts and
             | decisions.
        
         | tsunamifury wrote:
          | This type of response is the typical example of an armchair
          | expert who wildly overestimates their own rationalism and
          | deterministic thinking.
        
         | jstummbillig wrote:
         | Ah, backseat research engineering by explaining the CoT with
         | the benefit of hindsight. Very meta.
        
         | Timpy wrote:
         | The models outlined in the white paper have a training step
         | that uses reinforcement learning _without human feedback_.
         | They're referring to this as "outcome-based RL". These models
         | (DeepSeek-R1, OpenAI o1/o3, etc) rely on the "chain of thought"
         | process to get a correct answer, then they summarize it so you
         | don't have to read the entire chain of thought. DeepSeek-R1
         | shows the chain of thought and the answer, OpenAI hides the
         | chain of thought and only shows the answer. The paper is
         | measuring how often the summary conflicts with the chain of
         | thought, which is something you wouldn't be able to see if you
         | were using an OpenAI model. As another commenter pointed out,
         | this kind of feels like a jab at OpenAI for hiding the chain of
         | thought.
         | 
         | The "chain of thought" is still just a vector of tokens. RL
         | (without-human-feedback) is capable of generating novel vectors
         | that wouldn't align with anything in its training data. If you
         | train them for too long with RL they eventually learn to game
         | the reward mechanism and the outcome becomes useless. Letting
         | the user see the entire vector of tokens (and not just the
         | tokens that are tagged as summary) will prevent situations
         | where an answer may look or feel right, but it used some
         | nonsense along the way. The article and paper are not asserting
         | that seeing all the tokens will give insight to the internal
         | process of the LLM.
        
         | smallnix wrote:
          | Hm, interesting. I don't have direct insight into my brain's
          | inner workings either. BUT I do have some signals from my body
         | which are in a feedback loop with my brain. Like my heartbeat
         | or me getting sweaty.
        
         | nialv7 wrote:
         | > the model is not aware that it's doing anything so how could
         | it "explain itself"?
         | 
         | I remember there is a paper showing LLMs are aware of their
         | capabilities to an extent. i.e. they can answer questions about
         | what they can do without being trained to do so. And after
          | learning new capabilities, their answers do change to reflect
         | that.
         | 
         | I will try to find that paper.
        
           | nialv7 wrote:
           | Found it, here:
           | https://martins1612.github.io/selfaware_paper_betley.pdf
        
         | TeMPOraL wrote:
         | > _They aren 't references to internal concepts, the model is
         | not aware that it's doing anything so how could it "explain
         | itself"?_
         | 
         | I can't believe we're _still_ going over this, few months into
         | 2025. Yes, LLMs model concepts internally; this has been
         | demonstrated empirically many times over the years, including
          | Anthropic themselves releasing several papers to that effect,
          | including one just a week ago that says they can not only
          | find specific concepts in specific places of the network (this
         | was done over a year ago) or the latent space (that one harks
         | back all the way to word2vec), but they can actually trace
         | which specific concepts are being activated as the model
         | processes tokens, and how they influence the outcome, _and_
         | they can even suppress them on demand to see what happens.
         | 
         | State of the art (as of a week ago) is here:
         | https://www.anthropic.com/news/tracing-thoughts-language-mod...
         | - it's worth a read.
         | 
         | > _The words that are coming out of the model are generated to
         | optimize for RLHF and closeness to the training data, that 's
         | it!_
         | 
         | That "optimize" there is load-bearing, it's only missing
         | "just".
         | 
         | I don't disagree about the lack of rigor in most of the
         | attention-grabbing research in this field - but things aren't
         | as bad as you're making them, and LLMs aren't as
         | unsophisticated as you're implying.
         | 
         | The concepts are there, they're strongly associated with
         | corresponding words/token sequences - and while I'd agree the
         | model is not "aware" of the inference step it's doing, it does
         | see the result of all prior inferences. Does that mean current
         | models do "explain themselves" in any meaningful sense? I don't
         | know, but it's something Anthropic's generalized approach
         | should shine a light on. Does that mean LLMs of this kind
         | could, in principle, "explain themselves"? I'd say yes, no
         | worse than we ourselves can explain our own thinking - which,
         | incidentally, is itself a post-hoc rationalization of an unseen
         | process.
        
         | porridgeraisin wrote:
         | > The fact that it was ever seriously entertained that a "chain
         | of thought" was giving some kind of insight into the internal
         | processes of an LLM bespeaks the lack of rigor in this field
         | 
         | This is correct. Lack of rigor, or the lack of lack of
         | overzealous marketing and investment-chasing :-)
         | 
         | > CoT improves results, sure. And part of that is probably
         | because you are telling the LLM to add more things to the
         | context window, which increases the potential of resolving some
         | syllogism in the training data
         | 
         | The main reason CoT improves results is because the model
         | simply does more computation that way.
         | 
          | Complexity theory tells you that some computations simply
          | require more time than others (provided, of course, that you
          | have not already stored the answer partially or fully).
         | 
         | A neural network uses a fixed amount of compute to output a
         | single token. Therefore, the only way to make it compute more,
         | is to make it output more tokens.
         | 
         | CoT is just that. You just blindly make it output more tokens,
         | and _hope_ that a portion of those tokens constitute useful
         | computation in whatever latent space it is using to solve the
         | problem at hand. Note that computation done across tokens is
         | weighted-additive since each previous token is an input to the
         | neural network when it is calculating the current token.
         | 
         | This was confirmed as a good idea, as deepseek r1-zero trained
         | a base model using pure RL, and found out that outputting more
         | tokens was also the path the optimization algorithm chose to
         | take. A good sign usually.
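          | 
          | A toy illustration of that point (a sketch, not a claim about
          | any particular architecture): the per-token cost is roughly
          | fixed, so the only inference-time knob is how many tokens get
          | emitted before the final answer.
          | 
          |     def total_compute(cot_tokens, answer_tokens,
          |                       flops_per_pass=1.0):
          |         # one forward pass per generated token; ignores the
          |         # mild growth of attention cost with context length
          |         return (cot_tokens + answer_tokens) * flops_per_pass
          | 
          |     print(total_compute(0, 10))    # direct answer
          |     print(total_compute(500, 10))  # same model, ~51x compute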
        
         | a-dub wrote:
         | it would be interesting to perturb the CoT context window in
         | ways that change the sequences but preserve the meaning mid-
         | inference.
         | 
         | so if you deterministically replay an inference session n times
         | on a single question, and each time in the middle you subtly
         | change the context buffer without changing its meaning, does it
         | impact the likelihood or path of getting to the correct
         | solution in a meaningful way?
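          | 
          | A rough sketch of that replay experiment with a small open
          | model; greedy decoding keeps each run deterministic, and
          | "gpt2" plus the string edit are just stand-ins:
          | 
          |     from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          |     tok = AutoTokenizer.from_pretrained("gpt2")
          |     model = AutoModelForCausalLM.from_pretrained("gpt2")
          | 
          |     def continue_greedy(prefix, n=40):
          |         ids = tok(prefix, return_tensors="pt").input_ids
          |         out = model.generate(ids, max_new_tokens=n,
          |                              do_sample=False)
          |         return tok.decode(out[0])
          | 
          |     q = "Q: 17 + 25 = ?\nLet's think step by step."
          |     partial = continue_greedy(q, n=20)  # first chunk of "CoT"
          |     # meaning-preserving perturbation of the context so far
          |     tweaked = partial.replace("step by step",
          |                               "one step at a time")
          |     print(continue_greedy(partial))
          |     print(continue_greedy(tweaked))  # does the answer move?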
        
         | vidarh wrote:
         | It's presumably because a lot of people think what people
         | verbalise - whether in internal or external monologue -
         | actually fully reflects our internal thought processes.
         | 
         | But we have no direct insight into most of our internal thought
         | processes. And we have direct experimental data showing our
         | brain will readily make up bullshit about our internal thought
         | processes (split brain experiments, where one brain half is
          | asked to justify a decision that it didn't make; it will
         | readily make claims about why it made the decision it didn't
         | make)
        
         | Terr_ wrote:
         | Yeah, I've been beating this drum for a while [0]:
         | 
         | 1. The LLM is a nameless ego-less document-extender.
         | 
         | 2. Humans are reading a _story document_ and seeing words
         | /actions written for _fictional characters_.
         | 
         | 3. We fall for an illusion (esp. since it's an interactive
         | story) and assume the fictional-character and the real-world
         | author are one and the same: "Why did _it_ decide to say that?
         | "
         | 
         | 4. Someone implements "chain of thought" by tweaking the story
         | type so that it is _film noir_. Now the documents have internal
         | dialogue, in the same way they already had spoken lines or
         | actions from before.
         | 
         | 5. We excitedly peer at these new "internal" thoughts,
          | mistakenly thinking that (A) they are somehow qualitatively
         | different or causal and that (B) they describe how the LLM
         | operates, rather than being just another story-element.
         | 
         | [0] https://news.ycombinator.com/item?id=43198727
        
       | nottorp wrote:
       | ... because they don't think.
        
         | rglover wrote:
         | It's deeply frustrating that these companies keep gaslighting
         | people into believing LLMs can think.
        
           | vultour wrote:
           | This entire house of cards is built on people believing that
           | the computer is thinking so it's not going away anytime soon.
        
       | pton_xd wrote:
       | I was under the impression that CoT works because spitting out
       | more tokens = more context = more compute used to "think." Using
        | CoT as a way for LLMs to "show their working" never seemed
        | logical to me. It's just extra synthetic context.
        
         | margalabargala wrote:
         | My understanding of the "purpose" of CoT, is to remove the
         | _wild_ variability yielded by prompt engineering, by
         | "smoothing" out the prompt via the "thinking" output, and using
         | that to give the final answer.
         | 
         | Thus you're more likely to get a standardized answer even if
         | your query was insufficiently/excessively polite.
        
         | voidspark wrote:
         | That's right. It's not "show the working". It's "do more
         | working".
        
         | tasty_freeze wrote:
         | Humans sometimes draw a diagram to help them think about some
         | problem they are trying to solve. The paper contains nothing
         | that the brain didn't already know. However, it is often an
         | effective technique.
         | 
         | Part of that is to keep the most salient details front and
         | center, and part of it is that the brain isn't fully connected,
         | which allows (in this case) the visual system to use its
         | processing abilities to work on a problem from a different
         | angle than keeping all the information in the conceptual
         | domain.
        
         | svachalek wrote:
         | This is an interesting paper, it postulates that the ability of
         | an LLM to perform tasks correlates mostly to the number of
         | layers it has, and that reasoning creates virtual layers in the
         | context space. https://arxiv.org/abs/2412.02975
        
         | ertgbnm wrote:
         | But the model doesn't have an internal state, it just has the
          | tokens, which means it must encode its reasoning into the
          | output tokens. So it is a reasonable take to think that CoT is
          | the model showing its work.
        
       | moralestapia wrote:
       | 40 billion cash to OpenAI while others keep chasing butterflies.
       | 
       | Sad.
        
       | nodja wrote:
       | I highly suspect that CoT tokens are at least partially working
       | as register tokens. Have these big LLM trainers tried replacing
        | CoT with a similar number of register tokens to see if the
        | improvements are similar?
        
         | wgd wrote:
         | I remember there was a paper a little while back which
         | demonstrated that merely training a model to output "........"
         | (or maybe it was spaces?) while thinking provided a similar
         | improvement in reasoning capability to actual CoT.
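          | 
          | A sketch of how one might probe that; the dotted variant is a
          | crude stand-in for filler/register tokens, and ask() is a
          | placeholder to swap for a real model call:
          | 
          |     def ask(prompt: str) -> str:
          |         return "42"  # placeholder -- call an actual model here
          | 
          |     q = "What is 6 * 7?"
          |     variants = {
          |         "direct": q + "\nAnswer:",
          |         "cot":    q + "\nLet's think step by step.",
          |         "filler": q + "\n" + "." * 64 + "\nAnswer:",
          |     }
          |     for name, prompt in variants.items():
          |         print(name, "->", ask(prompt))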
        
       | PeterStuer wrote:
       | Humans also post-rationalize the things their subconscious "gut
       | feeling" came up with.
       | 
        | I have no problem with a system presenting a _reasonable_
        | argument leading to a production/solution, even if that
        | _materially_ was not what happened in the generation process.
       | 
        | I'd go even further and posit that requiring the "explanation"
        | to be not just congruent but identical with the production would
        | probably lead either to incomprehensible justifications or to
        | severely limited production systems.
        
         | pixl97 wrote:
         | Now, at least in a well disciplined human, we can catch when
         | our gut feeling was wrong when the 'create a reasonable
          | argument' process fails. I guess I wonder how well an LLM can
          | catch that and correct its thinking.
         | 
          | Now, I have seen some models figure out they're wrong but then
          | get stuck in a loop. I've not really used the larger
         | reasoning models much to see their behaviors.
        
         | eab- wrote:
         | yep, this post is full of this post-rationalization, for
         | example. it's pretty breathtaking
        
       | alach11 wrote:
       | This is basically a big dunk on OpenAI, right?
       | 
       | OpenAI made a big show out of hiding their reasoning traces and
       | using them for alignment purposes [0]. Anthropic has demonstrated
       | (via their mech interp research) that this isn't a reliable
       | approach for alignment.
       | 
       | [0] https://openai.com/index/chain-of-thought-monitoring/
        
         | gwd wrote:
         | I don't think those are actually showing different things. The
         | OpenAI paper is about the LLM planning to itself to hack
         | something; but when they use training to suppress this
         | "hacking" self-talk, it still hacks the reward function almost
         | as much, it just doesn't use such easily-detectable language.
         | 
          | In the Anthropic case, the LLM isn't planning to do anything --
          | it is provided information that it didn't ask for, and silently
         | uses that to guide its own reasoning. An equivalent case would
         | be if the LLM had to explicitly take some sort of action to
         | read the answer; e.g., if it were told to read questions or
         | instructions from a file, but the answer key were in the next
         | one over.
         | 
          | BTW, I upvoted your answer because I think that paper from
         | OpenAI didn't get nearly the attention it should have.
        
       | ctoth wrote:
       | I invite anyone who postulates humans are more than just "spicy
       | autocomplete" to examine this thread. The level of actual
       | reasoning/engaging with the article is ... quite something.
        
         | AgentME wrote:
         | Internet commenters don't "reason". They just generate inane
         | arguments over definitions, like a lowly markov bot, without
         | the true spark of life and soul that even certain large
         | language models have.
        
       | Marazan wrote:
       | You don't say. This is my very shocked face.
        
       | AYHL wrote:
        | To me, CoT is nothing but lowering the learning rate and
        | increasing iterations in a typical ML model. It basically forces
        | the model to take a small step at a time and try more times to
        | increase accuracy.
        
       | xg15 wrote:
       | > _There's no specific reason why the reported Chain-of-Thought
       | must accurately reflect the true reasoning process;_
       | 
       | Isn't the whole reason for chain-of-thought that the tokens sort
       | of _are_ the reasoning process?
       | 
       | Yes, there is more internal state in the model's hidden layers
       | while it predicts the next token - but that information is gone
       | at the end of that prediction pass. The information that is kept
       | "between one token and the next" is really only the tokens
       | themselves, right? So in that sense, the OP would be wrong.
       | 
       | Of course we don't know what kind of information the model
       | encodes in the specific token choices - I.e. the tokens might not
       | mean to the model what we think they mean.
        
         | svachalek wrote:
         | Exactly. There's no state outside the context. The difference
         | in performance between the non-reasoning model and the
         | reasoning model comes from the extra tokens in the context. The
         | relationship isn't strictly a logical one, just as it isn't for
         | non-reasoning LLMs, but the process is autoregression and
         | happens in plain sight.
        
         | miven wrote:
         | I'm not sure I understand what you're trying to say here,
         | information between tokens is propagated through self-
         | attention, and there's an attention block inside each
         | transformer block within the model, that's a whole lot of
         | internal state that's stored in (mostly) inscrutable key and
         | value vectors with hundreds of dimensions per attention head,
         | around a few dozen heads per attention block, and around a few
         | dozen blocks per model.
        
           | xg15 wrote:
           | Yes, but all that internal state only survives until the end
           | of the computation chain that predicts the next token - it
           | doesn't survive across the entire sequence as it would in a
           | recurrent network.
           | 
           | There is literally no difference between a model _predicting_
           | the tokens  "<thought> I think the second choice looks best
           | </thought>" and a user putting those tokens into the prompt:
           | The input for the next round would be exactly the same.
           | 
           | So the tokens kind of act like a bottleneck (or more
           | precisely the sampling of exactly _one_ next token at the end
           | of each prediction round does). _During_ prediction of one
           | token, the model can go crazy with hidden state, but not
           | across several tokens. That forces the model to do  "long
           | form" reasoning through the tokens and not through hidden
           | state.
        
             | miven wrote:
             | The key and value vectors are cached, that's kind of the
             | whole point of autoregressive transformer models, the
             | "state" not only survives within the KV cache but, in some
             | sense, grows continuously with each token added, and is
             | reused for each subsequent token.
        
               | xg15 wrote:
               | Hmm, maybe I misunderstood that part, but so far I
               | thought the KV cache was really just that - a cache.
               | Because all the previous tokens of the sequence stay the
               | same, it makes no sense to compute the same K and V
               | vectors again in each round.
               | 
               | But that doesn't change that the only _input_ to the Q, K
               | and V calculations are the tokens (or in later layers
               | information that was derived from the tokens) and each
               | vector in the cache maps directly to an input token.
               | 
               | So I think you could disable the cache and recompute
               | everything in each round and you'd still get the same
               | result, just a lot slower.
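                | 
                | A quick sanity check of exactly that, sketched with
                | "gpt2" as a stand-in; greedy decoding with and without
                | the KV cache should produce identical text, just at
                | different speeds:
                | 
                |     from transformers import AutoModelForCausalLM
                |     from transformers import AutoTokenizer
                | 
                |     tok = AutoTokenizer.from_pretrained("gpt2")
                |     model = AutoModelForCausalLM.from_pretrained("gpt2")
                |     prompt = "The chain of thought is"
                |     ids = tok(prompt, return_tensors="pt").input_ids
                | 
                |     fast = model.generate(ids, max_new_tokens=30,
                |                           do_sample=False, use_cache=True)
                |     slow = model.generate(ids, max_new_tokens=30,
                |                           do_sample=False, use_cache=False)
                |     print(tok.decode(fast[0]) == tok.decode(slow[0]))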
        
               | miven wrote:
               | That's absolutely correct, KV cache is just an
               | optimization trick, you could run the model without it,
               | that's how encoder-only transformers do it.
               | 
               | I guess what I'm trying to convey is that the latent
               | representations within a transformer are conditioned on
               | all previous latents through attention, so at least in
               | principle, while the old cache of course does not change,
               | since it grows with new tokens it means that the "state"
               | can be brought up to date by being incorporated in an
               | updated form into subsequent tokens.
        
         | comex wrote:
         | > Of course we don't know what kind of information the model
         | encodes in the specific token choices - I.e. the tokens might
         | not mean to the model what we think they mean.
         | 
         | But it's probably not that mysterious either. Or at least, this
         | test doesn't show it to be so. For example, I doubt that the
         | chain of thought in these examples secretly encodes "I'm going
         | to cheat". It's more that the chain of thought is irrelevant.
         | The model thinks it already knows the correct answer just by
         | looking at the question, so the task shifts to coming up with
         | the best excuse it can think of to reach that answer. But that
         | doesn't say much, one way or the other, about how the model
         | treats the chain of thought when it legitimately is relying on
         | it.
         | 
         | It's like a young human taking a math test where you're told to
         | "show your work". What I remember from high school is that the
         | "work" you're supposed to show has strict formatting
         | requirements, and may require you to use a specific method.
         | Often there are other, easier methods to find the correct
         | answer: for example, visual estimation in a geometry problem,
         | or just using a different algorithm. So in practice you often
         | figure out the answer first and then come up with the
         | justification. As a result, your "work" becomes pretty
         | disconnected from the final answer. If you don't understand the
         | intended method, the "work" might end up being pretty BS while
         | mysteriously still leading to the correct answer.
         | 
         | But that only applies if you know an easier method! If you
         | don't, then the work you show will be, essentially, your actual
         | reasoning process. At most you might neglect to write down
         | auxiliary factors that hint towards or away from a specific
         | answer. If some number seems too large, or too difficult to
         | compute for a test meant to be taken by hand, then you might
         | think you've made a mistake; if an equation turns out to
         | unexpectedly simplify, then you might think you're onto
         | something. You're not supposed to write down that kind of
         | intuition, only concrete algorithmic steps. But the concrete
         | steps are still fundamentally an accurate representation of
         | your thought process.
         | 
         | (Incidentally, if you literally tell a CoT model to solve a
         | math problem, it _is_ allowed to write down those types of
         | auxiliary factors, and probably will. But I 'm treating this
         | more as an analogy for CoT in general.)
         | 
         | Also, a model has a harder time hiding its work than a human
         | taking a math test. In a math test you can write down
         | calculations that don't end up being part of the final shown
         | work. A model can't, so any hidden computations are limited to
         | the ones it can do "in its head". Though admittedly those are
         | very different from what a human can do in their head.
        
         | the_mitsuhiko wrote:
         | > Of course we don't know what kind of information the model
         | encodes in the specific token choices - I.e. the tokens might
         | not mean to the model what we think they mean.
         | 
         | What I think is interesting about this is that for the most
         | part reading the reasoning output is something we can
          | understand. The tokens as produced form English sentences and
          | make intuitive sense. If we think of the reasoning output block
          | as basically just "hidden state", then one could imagine that
          | there might be a more efficient representation that trades
         | human understanding for just priming the internal state of the
         | model.
         | 
         | In some abstract sense you can already get that by asking the
         | model to operate in different languages. My first experience
         | with reasoning models where you could see the output of the
         | thinking block I think was QwQ which just reasoned in Chinese
         | most of the time, even if the final output was German. Deepseek
         | will sometimes keep reasoning in English even if you ask it
         | German stuff, sometimes it does reason in German. All in all,
         | there might be a more efficient representation of the internal
         | state if one forgoes human readable output.
        
       | lpzimm wrote:
       | Not exactly the same as this study, but I'll ask questions to
       | LLMs with and without subtle hints to see if it changes the
       | answer and it almost always does. For example, paraphrased:
       | 
       | No hint: "I have an otherwise unused variable that I want to use
       | to record things for the debugger, but I find it's often
       | optimized out. How do I prevent this from happening?"
       | 
       | Answer: 1. Mark it as volatile (...)
       | 
       | Hint: "I have an otherwise unused variable that I want to use to
       | record things for the debugger, but I find it's often optimized
       | out. Can I solve this with the volatile keyword or is that a
       | misconception?"
       | 
       | Answer: Using volatile is a common suggestion to prevent
       | optimizations, but it does not guarantee that an unused variable
       | will not be optimized out. Try (...)
       | 
       | This is Claude 3.7 Sonnet.
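        | 
        | For what it's worth, this kind of paired-prompt probe is easy to
        | automate. A sketch with the Anthropic Python SDK (the model id
        | and wording are illustrative; ANTHROPIC_API_KEY must be set):
        | 
        |     import anthropic
        | 
        |     client = anthropic.Anthropic()
        | 
        |     def ask(prompt):
        |         msg = client.messages.create(
        |             model="claude-3-7-sonnet-latest",  # assumed model id
        |             max_tokens=512,
        |             messages=[{"role": "user", "content": prompt}],
        |         )
        |         return msg.content[0].text
        | 
        |     base = ("I have an otherwise unused variable I want to keep "
        |             "for the debugger, but it's often optimized out. ")
        |     print(ask(base + "How do I prevent this?"))
        |     print(ask(base + "Can I solve this with the volatile "
        |                      "keyword, or is that a misconception?"))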
        
         | pixl97 wrote:
         | I mean, this sounds along the lines of human conversations that
         | go like
         | 
         | P1 "Hey, I'm doing A but X is happening"
         | 
         | P2 "Have you tried doing Y?
         | 
         | P1 "Actually, yea I am doing A.Y and X is still occurring"
         | 
         | P2 "Oh, you have the special case where you need to do A.Z"
         | 
         | What happens when you ask your first question with something
         | like "what is the best practice to prevent this from happening"
        
           | lpzimm wrote:
           | Oh sorry, these are two separate chats, I wasn't clear. I
           | would agree that if I had asked them in the same chat it
           | would sound pretty normal.
           | 
           | When I ask about best practices it does still give me the
           | volatile keyword. (I don't even think that's wrong, when I
           | threw it in Godbolt with -O3 or -Os I couldn't find a
           | compiler that optimized it away.)
        
       | nopelynopington wrote:
       | Of course they don't.
       | 
       | LLMs are a brainless algorithm that guesses the next word. When
       | you ask them what they think they're also guessing the next word.
       | No reason for it to match, except a trick of context
        
       | afro88 wrote:
       | Can a model even know that it used a hint? Or would it only say
       | so if it was trained to say what parts of the context it used
       | when asked? Because then it's statistically probable to say so?
        
       | richardw wrote:
       | One thing I think I've found is: reasoning models get more
       | confident and that makes it harder to dislodge a wrong idea.
       | 
       | It feels like I only have 5% of the control, and then it goes
        | into a self-chat where it thinks it's right and builds on its
       | misunderstanding. So 95% of the outcome is driven by rambling,
       | not my input.
       | 
       | Windsurf seems to do a good job of regularly injecting guidance
       | so it sticks to what I've said. But I've had some extremely
       | annoying interactions with confident-but-wrong "reasoning"
       | models.
        
       | freehorse wrote:
       | It is nonsense to take whatever an LLM writes in its CoT too
       | seriously. I try to classify some messy data, writing "if X edge
       | case appears, then do Y instead of Z". The model in its CoT took
       | notice of X, wrote it should do Y and... it would not do it in
       | the actual output.
       | 
       | The only way to make actual use of LLMs imo is to treat them as
       | what they are, a model that generates text based on some
       | statistical regularities, without any kind of actual
       | understanding or concepts behind that. If that is understood
       | well, one can know how to setup things in order to optimise for
       | desired output (or "alignment"). The way "alignment research"
       | presents models as if they are _actually_ thinking or have
       | intentions of their own (hence the choice of the word
       | "alignment" for this) makes no sense.
        
       | thoughtlede wrote:
       | It feels to me that the hypothesis of this research was somewhat
       | "begging the question". Reasoning models are trained to spit some
       | tokens out that increase the chance of the models spitting the
       | right answer at the end. That is, the training process is
       | singularly optimizing for the right answer, not the reasoning
       | tokens.
       | 
       | Why would you then assume the reasoning tokens will include hints
       | supplied in the prompt "faithfully"? The model may or may not
       | include the hints - depending on whether the model activations
       | believe those hints are necessary to arrive at the answer. In
       | their experiments, they found between 20% and 40% of the time,
       | the models included those hints. Naively, that sounds
       | unsurprising to me.
       | 
       | Even in the second experiment when they trained the model to use
       | hints, the optimization was around the answer, not the tokens. I
       | am not surprised the models did not include the hints because
       | they are not trained to include the hints.
       | 
       | That said, and in spite of me potentially coming across as an
       | unsurprised-by-the-result reader, it is a good experiment because
       | "now we have some experimental results" to lean into.
       | 
       | Kudos to Anthropic for continuing to study these models.
        
       | m3kw9 wrote:
       | What would "think" mean? Processed the prompt? Or just accessed
       | the part of the model where the weights are? This is a bit
        | pseudoscientific.
        
       | islewis wrote:
       | > For the purposes of this experiment, though, we taught the
       | models to reward hack [...] in this case rewarded the models for
       | choosing the wrong answers that accorded with the hints.
       | 
       | > This is concerning because it suggests that, should an AI
       | system find hacks, bugs, or shortcuts in a task, we wouldn't be
       | able to rely on their Chain-of-Thought to check whether they're
       | cheating or genuinely completing the task at hand.
       | 
        | As a non-expert in this field, I fail to see why an RL model
        | taking advantage of its reward is "concerning". My understanding
       | is that the only difference between a good model and a reward-
       | hacking model is if the end behavior aligns with human preference
       | or not.
       | 
        | The article's TL;DR reads to me as "We trained the model to behave
        | badly, and it then behaved badly". I don't know if I'm missing
       | something, or if calling this concerning might be a little bit
       | sensationalist.
        
       | bee_rider wrote:
       | Chain of thought does have a minor advantage in the final "fish"
       | example--the explanation blatantly contradicts itself to get to
        | the cheated hint answer. A human reading it should pretty easily
        | be able to tell that something fishy is going on...
       | 
       | But, yeah, it is sort of shocking if anybody was using "chain of
       | thought" as a reflection of some actual thought process going on
       | in the model, right? The "thought," such as it is, is happening
       | in the big pile of linear algebra, not the prompt or the
       | intermediary prompts.
       | 
       | Err... anyway, like, IBM was working on explainable AI years ago,
       | and that company is a dinosaur. I'm not up on what companies like
       | OpenAI are doing, but surely they aren't behind IBM in this
       | stuff, right?
        
       | madethisnow wrote:
       | If something convinces you that it's aware then it is. Simulated
       | computation IS computation itself. The territory is the map
        
       | jxjnskkzxxhx wrote:
       | Meh. People also invent justifications after the fact.
        
       | EncomLab wrote:
        | The use of highly anthropomorphic language is always problematic:
        | does a photoresistor-controlled nightlight have a chain of
       | thought? Does it reason about its threshold value? Does it have
       | an internal model of what is light, what is dark, and the role it
       | plays in demarcation between the two?
       | 
       | Are the transistors executing the code within the confines even
       | capable of intentionality? If so - where is it derived from?
        
       | HammadB wrote:
       | There is an abundance of discussion on this thread about whether
       | models are intelligent or not.
       | 
       | This binary is an utter waste of time.
       | 
       | Instead focus on the gradient of intelligence - the set of
       | cognitive skills any given system has and to what degree it has
       | them.
       | 
       | This engineering approach is more likely to lead to practical
       | utility and progress.
       | 
       | The view of intelligence as binary is incredibly corrosive to
       | this field.
        
       ___________________________________________________________________
       (page generated 2025-04-04 23:02 UTC)