[HN Gopher] OpenAI's new reasoning AI models hallucinate more
       ___________________________________________________________________
        
       OpenAI's new reasoning AI models hallucinate more
        
       Author : almog
       Score  : 5 points
       Date   : 2025-04-18 22:43 UTC (17 minutes ago)
        
 (HTM) web link (techcrunch.com)
 (TXT) w3m dump (techcrunch.com)
        
       | rzz3 wrote:
       | Does anyone have any technical insight on what actually causes
       | the hallucinations? I know it's an ongoing area of research, but
       | do we have a lead?
        
         | pkaye wrote:
         | Anthropic had a recent paper that might be of interest.
         | 
         | https://www.anthropic.com/research/tracing-thoughts-language...
        
         | minimaxir wrote:
         | At a high level, what _causes_ hallucinations is an easier
         | question than how to solve them.
         | 
          | LLMs are pretrained to maximize the probability of the
          | (n+1)-th token given the previous n tokens. To do this
          | reliably, the model learns statistical patterns in the source
          | data, and transformer models are very good at doing that when
          | large enough and given enough data. The model is therefore
          | susceptible to any statistical biases in the training data,
          | because despite many advances in guiding LLMs, e.g. RLHF, LLMs
          | are not sentient, and most approaches to get around that, such
          | as the current reasoning models, are hacks layered over a
          | fundamental limitation of the approach.
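          | 
          | To make the objective concrete, here is a toy Python sketch
          | (a bigram count table standing in for a real transformer; the
          | corpus and the numbers are made up for illustration):
          | 
          |     from collections import Counter, defaultdict
          |     
          |     # Toy corpus standing in for web-scale training data.
          |     corpus = "the cat sat on the mat the cat ate".split()
          |     
          |     # Count how often each token follows each context token.
          |     counts = defaultdict(Counter)
          |     for prev, nxt in zip(corpus, corpus[1:]):
          |         counts[prev][nxt] += 1
          |     
          |     def next_token_probs(prev):
          |         # Turn the counts into a probability distribution.
          |         total = sum(counts[prev].values())
          |         return {t: c / total for t, c in counts[prev].items()}
          |     
          |     # "the" is followed by "cat" 2/3 and "mat" 1/3 of the time.
          |     print(next_token_probs("the"))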
         | 
          | It also doesn't help that when sampling tokens, the default
          | temperature in most LLM UIs is 1.0, with the argument that it
          | is better for creativity. If you have access to the API and
          | want a specific answer more reliably, I recommend setting
          | temperature = 0.0, in which case the model always selects the
          | token with the highest probability, which tends to be more
          | correct.
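          | 
          | As a rough illustration of what temperature does at sampling
          | time (toy logits and token names, not from any real model):
          | 
          |     import math, random
          |     
          |     def sample(logits, temp):
          |         # temp -> 0 approaches greedy argmax decoding.
          |         if temp == 0.0:
          |             return max(logits, key=logits.get)
          |         # Scale logits by temperature, then softmax.
          |         scaled = {t: x / temp for t, x in logits.items()}
          |         z = sum(math.exp(v) for v in scaled.values())
          |         probs = {t: math.exp(v) / z for t, v in scaled.items()}
          |         toks, weights = zip(*probs.items())
          |         return random.choices(toks, weights=weights)[0]
          |     
          |     logits = {"Paris": 5.0, "Lyon": 2.0, "Madrid": 1.0}
          |     print(sample(logits, 0.0))  # always picks "Paris"
          |     print(sample(logits, 1.0))  # usually "Paris", not always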
        
         | vikramkr wrote:
          | There's the Anthropic paper someone else linked, but it's also
          | pretty interesting to see the question framed as trying to
          | understand what causes the hallucinations lol. It's a (very
          | fancy) next-word predictor - it's kind of amazing that it
          | doesn't hallucinate! That paper showed there are circuits that
          | actually do things resembling arithmetic and lookup-table
          | computation, instead of just blindly 'guessing' a random number
          | when asked what an arithmetic expression equals. That seems
          | like the much more extraordinary thing we want to figure out
          | the cause of!
        
       | serjester wrote:
        | Anecdotally, o3 is the first OpenAI model in a while where I
        | have to double-check whether it's dropping important pieces of
        | my code.
        
       ___________________________________________________________________
       (page generated 2025-04-18 23:01 UTC)