[HN Gopher] "Self-reflecting" AI agents explore like animals
___________________________________________________________________
"Self-reflecting" AI agents explore like animals
Author : chdoyle
Score : 58 points
Date : 2023-07-06 21:00 UTC (1 hour ago)
(HTM) web link (hai.stanford.edu)
(TXT) w3m dump (hai.stanford.edu)
| ftxbro wrote:
| From this Hacker News title I definitely thought it was saying
| that when you give AI agents some form of self-reflection, maybe
| by adding an internal monologue loop, they unlock an emergent
| animal-like exploration behavior.
|
| But this is not what happened. Instead, some researchers told AI
| agents to explore the way the researchers think animals explore:
| "Stanford researchers invented the 'curious replay' training
| method based on studying mice to help AI agents"
| onetokeoverthe wrote:
| [dead]
| piyh wrote:
| Direct arxiv link: https://arxiv.org/pdf/2306.15934.pdf
| FrustratedMonky wrote:
| Exactly. We keep leaving 'motivation' out of these models, since
| they only react to prompts. But put them in a loop with goals
| and see what happens.
|
| Also, things like GPT are not 'embodied'. Since they don't live in
| the 'world', they can't associate language with physical reality.
| Put them in a simulated environment like a game, and they look a
| lot more 'conscious'.
| jjtheblunt wrote:
| it's kind of interesting how frequently "stanford.edu" is finding
| its way into HN submissions lately. did the increase start with
| the GPT-4 enthusiasm?
|
| is that a coincidence?
| xianshou wrote:
| The result is mildly interesting - improvement on an isolated
| task but none on the full benchmark - but what would be much more
| compelling is curiosity-driven replay in an LLM context combined
| with chain- or tree-of-thought techniques. This would be the
| machine analogy to noticing your confusion, a sort of "what do I
| need to know" or "what am I overlooking"? Anecdotally, language
| models perform better when you prompt them to ask their own
| questions in the process of answering yours, so I would expect
| curiosity to have a meaningful impact.
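
To make the "curiosity-driven replay" idea discussed above concrete,
here is a minimal sketch of a replay buffer that prefers experiences
that are either rarely replayed or still surprising to the model. It
is an illustration only, not the implementation from the linked
paper: the class name CuriousReplayBuffer, the priority formula, and
the constants c, beta, alpha, and eps are all assumptions made for
this example.

```python
import random


class CuriousReplayBuffer:
    """Hypothetical replay buffer that samples by a curiosity-style priority.

    Assumed priority (illustrative, not taken from the thread):
        c * beta**visits + (abs(loss) + eps)**alpha
    The first term favors rarely replayed items; the second favors items
    the model still predicts poorly.
    """

    def __init__(self, c=1.0, beta=0.7, alpha=0.7, eps=0.01, seed=0):
        self.items = []    # stored experiences
        self.visits = []   # how many times each item has been sampled
        self.losses = []   # last known model loss for each item
        self.c, self.beta, self.alpha, self.eps = c, beta, alpha, eps
        self.rng = random.Random(seed)

    def add(self, item, initial_loss=1.0):
        """Store a new experience and return its index."""
        self.items.append(item)
        self.visits.append(0)
        self.losses.append(initial_loss)
        return len(self.items) - 1

    def update_loss(self, index, loss):
        """Record the model's latest loss on a stored experience."""
        self.losses[index] = loss

    def priority(self, index):
        """Higher for items that are novel or still surprising."""
        novelty = self.c * self.beta ** self.visits[index]
        surprise = (abs(self.losses[index]) + self.eps) ** self.alpha
        return novelty + surprise

    def sample(self):
        """Draw one item with probability proportional to its priority."""
        if not self.items:
            raise IndexError("cannot sample from an empty buffer")
        indices = range(len(self.items))
        weights = [self.priority(i) for i in indices]
        index = self.rng.choices(indices, weights=weights, k=1)[0]
        self.visits[index] += 1  # replaying an item makes it less novel
        return index, self.items[index]
```

In a training loop one would call sample(), train on the returned
experience, and then call update_loss() with the new loss, so that
well-understood experiences gradually fall in priority. How such a
scheme would carry over to the LLM setting suggested above is left
open here.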
___________________________________________________________________
(page generated 2023-07-06 23:00 UTC)