[HN Gopher] Deductive Verification for Chain-of-Thought Reasonin...
___________________________________________________________________
Deductive Verification for Chain-of-Thought Reasoning in LLMs
Author : smooke
Score : 36 points
Date : 2024-09-10 15:29 UTC (7 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| YeGoblynQueenne wrote:
| Chain of Thought prompting reminds me of Facilitated
| Communication:
|
| https://en.wikipedia.org/wiki/Facilitated_communication
|
| A long-discredited intervention where a "facilitator" guides the
| hand of a non-verbal human to help them write down their thoughts
| and experiences. Experiments that blinded the facilitator to the
| subject's observations, where the written message matched the
| facilitator's observations rather than the subject's, have
| convincingly proved that it was so much bunkum. It's the Clever
| Hans Effect by another name, with non-verbal humans rather than
| horses.
|
| Chain of Thought works like that: without hand-holding by a human
| who understands how to answer a question, the LLM's performance
| drops, or drops off a cliff even. Of course this is much harder
| to prove for LLMs than it was for facilitated communication
| because LLMs don't really do anything without a prompt in the
| first place. Which should be a very big hint of what's really
| going on with CoT.
| Lerc wrote:
| Sometimes it looks like the computationalists are trying to sneak
| back into the room while no-one is looking.
|
| There do seem to be quite a lot of independent ad-hoc efforts at
| making custom notations for CoT. I feel like we're in a period
| similar to just after the first programming languages and
| compilers were invented but regular expressions were yet to come.
| In a way that's quite exciting; it's another little Cambrian
| explosion.
|
| I don't think it will be a panacea though. In my observations of
| failures of reasoning in LLMs, a lot of the problem isn't that
| they fail to follow logical steps but that they fail to notice
| the presence of implied premises completely. Chain of Thought is
| good for spotting wrong reasoning, but not for spotting that the
| problem is not the one it appears to be at first glance.
| mdp2021 wrote:
| > _the problem isn't that they fail to follow logical steps
| but that they fail to notice the presence of implied premises
| completely_
|
| Could you please explain more clearly what you noticed? Can you
| find an example?
| Lerc wrote:
| One example is the three killers problem:
|
| There are three killers in a room, someone enters the room
| and kills one of them, how many killers are now in the room?
|
| Apart from the basic reasoning errors of small models (where
| they come up with all sorts of numbers), larger models that fail
| do so because they fail to take into account that killing makes
| a killer. Some models succeed at this, but it is unclear whether
| that is because they are familiar with this specific problem.
|
| The model has to make the distinction between that and
|
| There are three fish in a room, someone enters the room and
| kills one of them, how many fish are now in the room?
|
| or
|
| There are three fish in a room, someone enters the room and
| kills one of them, how many killers are now in the room?
|
| In these examples the word "kills" is pushed in one direction or
| the other, but when it serves double duty in the original problem
| the model can notice one premise and miss the other.
|
| It's an interesting exercise in frustration to try to get a
| model that gets it wrong to CoT its way out of that particular
| error.
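|
| A minimal sketch of how one might probe a model with those three
| variants, assuming the openai Python client (v1 API) and a
| placeholder model name; adapt to whatever client you actually
| use:
|
|     from openai import OpenAI
|
|     client = OpenAI()  # expects OPENAI_API_KEY in the environment
|
|     prompts = [
|         "There are three killers in a room, someone enters the room "
|         "and kills one of them, how many killers are now in the room?",
|         "There are three fish in a room, someone enters the room "
|         "and kills one of them, how many fish are now in the room?",
|         "There are three fish in a room, someone enters the room "
|         "and kills one of them, how many killers are now in the room?",
|     ]
|
|     for p in prompts:
|         # "Think step by step" elicits the chain-of-thought
|         # behaviour under discussion.
|         reply = client.chat.completions.create(
|             model="gpt-4o-mini",  # placeholder; any chat model works
|             messages=[{"role": "user",
|                        "content": p + " Think step by step."}],
|         )
|         print(p)
|         print(reply.choices[0].message.content)
|         print("-" * 60)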
| svnt wrote:
| An example is where the link between the ideas is not present
| in the prompt.
|
| E.g. an oversimplified example is:
|
| It's going to rain tomorrow so you should take an umbrella.
|
| If someone doesn't understand the relationship between an
| umbrella and keeping dry -- the implied premise -- they won't
| understand why they should bring one, and the statement would
| be puzzling.
|
| We often talk about this as someone or something "not knowing
| what an x is", where "knowing" here means knowing in a spatial
| or logical sense, not just a linguistic one.
|
| LLMs do not have this layer of understanding but they can
| fake it in short interactions.
___________________________________________________________________
(page generated 2024-09-10 23:01 UTC)