[HN Gopher] Deductive Verification for Chain-of-Thought Reasonin...
___________________________________________________________________
Deductive Verification for Chain-of-Thought Reasoning in LLMs
Author : smooke
Score : 36 points
Date : 2024-09-10 15:29 UTC (7 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| YeGoblynQueenne wrote:
| Chain of Thought prompting reminds me of Facilitated
| Communication:
|
| https://en.wikipedia.org/wiki/Facilitated_communication
|
| A long-discredited intervention where a "facilitator" guides the
| hand of a non-verbal human to help them write down their thoughts
| and experiences. Experiments that blinded the facilitator to the
| subject's observations, where the written message matched the
| facilitator's observations rather than the subject's, have
| convincingly proved that it was so much bunkum. It's the Clever
| Hans Effect by another name, with non-verbal humans rather than
| horses.
|
| Chain of Thought works like that: without hand-holding by a human
| who understands how to answer a question, the LLM's performance
| drops, or drops off a cliff even. Of course this is much harder
| to prove for LLMs than it was for facilitated communication
| because LLMs don't really do anything without a prompt in the
| first place. Which should be a very big hint of what's really
| going on with CoT.
| Lerc wrote:
| Sometimes it looks like the computationalists are trying to sneak
| back into the room while no-one is looking.
|
| There do seem to be quite a lot of independent ad-hoc efforts at
| making custom notations for CoT. I feel like we're in a period
| similar to just after the first programming languages and
| compilers were invented but regular expressions were yet to come.
| In a way that's quite exciting; it's another little Cambrian
| explosion.
|
| I don't think it will be a panacea though. In my observations of
| failures of reasoning in LLMs, a lot of the problem isn't that
| they fail to follow logical steps but that they fail to notice
| the presence of implied premises completely. Chain of Thought is
| good for spotting wrong reasoning, but not for spotting that the
| problem is not the one it appears to be at first glance.
| mdp2021 wrote:
| > _the problem isn't that they fail to follow logical steps
| but that they fail to notice the presence of implied premises
| completely_
|
| Could you please explain more clearly what you noticed? Can you
| find an example?
| Lerc wrote:
| One example is the three killers problem:
|
| There are three killers in a room, someone enters the room
| and kills one of them, how many killers are now in the room?
|
| Apart from the basic reasoning errors of small models (where
| they come up with all sorts of numbers), larger models that fail
| do so because they fail to take into account that killing makes
| a killer. Some models succeed at this, but it is unclear whether
| that is because they are familiar with this specific problem.
|
| The model has to make the distinction between that and
|
| There are three fish in a room, someone enters the room and
| kills one of them, how many fish are now in the room?
|
| or
|
| There are three fish in a room, someone enters the room and
| kills one of them, how many killers are now in the room?
|
| In these examples the word "kills" is pushed in one direction or
| the other, but when it serves double duty in the original problem
| the model can notice one premise and miss the other.
|
| It's an interesting exercise in frustration to try to get a
| model that gets it wrong to CoT its way out of that particular
| error.
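|
| A minimal sketch of how one might probe a model with those three
| variants, assuming the openai Python client (v1 API) and a
| placeholder model name; adapt to whatever client you actually
| use:
|
|     from openai import OpenAI
|
|     client = OpenAI()  # expects OPENAI_API_KEY in the environment
|
|     prompts = [
|         "There are three killers in a room, someone enters the room "
|         "and kills one of them, how many killers are now in the room?",
|         "There are three fish in a room, someone enters the room "
|         "and kills one of them, how many fish are now in the room?",
|         "There are three fish in a room, someone enters the room "
|         "and kills one of them, how many killers are now in the room?",
|     ]
|
|     for p in prompts:
|         # "Think step by step" elicits the chain-of-thought
|         # behaviour under discussion.
|         reply = client.chat.completions.create(
|             model="gpt-4o-mini",  # placeholder; any chat model works
|             messages=[{"role": "user",
|                        "content": p + " Think step by step."}],
|         )
|         print(p)
|         print(reply.choices[0].message.content)
|         print("-" * 60)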
| svnt wrote:
| An example is where the link between the ideas is not present
| in the prompt.
|
| E.g. an oversimplified example is:
|
| It's going to rain tomorrow so you should take an umbrella.
|
| If someone doesn't understand the relationship between an
| umbrella and keeping dry -- the implied premise -- they won't
| understand why they should bring one, and the statement would
| be puzzling.
|
| We often talk about this as someone or something "not knowing
| what an x is", where "knowing" here means knowing in a spatial
| or logical sense, not just a linguistic one.
|
| LLMs do not have this layer of understanding but they can
| fake it in short interactions.
___________________________________________________________________
(page generated 2024-09-10 23:01 UTC)