[HN Gopher] Principles for Building One-Shot AI Agents
___________________________________________________________________
Principles for Building One-Shot AI Agents
Author : robszumski
Score : 23 points
Date : 2025-04-16 16:30 UTC (2 days ago)
(HTM) web link (edgebit.io)
(TXT) w3m dump (edgebit.io)
| sebastiennight wrote:
| > A different type of hard failure is when we detect that we'll
| never reach our overall goal. This requires a goal that can be
| programmatically verified outside of the LLM.
|
| This is the largest issue: using LLMs as a black box means for
| most goals, we can't rely on them to always "converge to a
| solution" because they might get stuck in a loop trying to figure
| out if they're stuck in a loop.
|
| So then we're back to writing a hardcoded, deterministic cap on
| how many iterations count as being "stuck". I'm curious how the
| authors solve this.
| randysalami wrote:
| I think we need quantum systems to ever break out of that
| issue.
| devmor wrote:
| I don't believe quantum computers can solve the halting
| problem, so I don't think that would actually help.
|
| This issue will likely always require a monitor "outside" of
| the agent.
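
A minimal sketch of the "outside" monitor discussed in this subthread, assuming the agent can be modeled as a step function over a hashable state. The names `run_with_monitor`, `step`, `goal_reached`, and `max_iterations` are hypothetical, and `fake_step` stands in for a real LLM call; this is not the article authors' implementation.

```python
from typing import Callable, Optional


def run_with_monitor(
    step: Callable[[str], str],
    goal_reached: Callable[[str], bool],
    initial_state: str,
    max_iterations: int = 10,
) -> Optional[str]:
    """Drive an agent step function from a deterministic outer loop.

    Returns the final state when the goal check passes, or None when the
    monitor gives up (iteration cap hit, or an exact state repeats).
    """
    state = initial_state
    seen = {state}
    for _ in range(max_iterations):  # hard cap, decided outside the agent
        state = step(state)
        if goal_reached(state):  # programmatic check, not an LLM opinion
            return state
        if state in seen:  # exact repeat: the agent is cycling
            return None
        seen.add(state)
    return None  # cap reached without meeting the goal


# Hypothetical stand-in for an LLM call: appends one character per step.
def fake_step(state: str) -> str:
    return state + "a"


result = run_with_monitor(fake_step, lambda s: len(s) >= 3, "")
stuck = run_with_monitor(lambda s: s, lambda s: False, "x")
print(result, stuck)
```

The repeated-state check only catches exact cycles. The iteration cap is what guarantees termination, which is why it lives in ordinary code rather than in the model.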
| TZubiri wrote:
| > What is a "one-shot" AI Agent? A one-shot AI agent enables
| automated execution of a complex task without a human in the
| loop.
|
| That is not at all what one-shot means in the field. Zero-shot,
| one-shot, and many-shot refer to how many examples the model is
| given at inference time to perform a task.
|
| Zero shot: "convert these files from csv to json"
|
| One shot: "convert from csv to json, like
| "id,name,age\n1,john,20" to {id:"1",name:"john",age:"20"}"
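
To make the distinction concrete, here is a small sketch that builds a zero-shot and a one-shot prompt for the CSV-to-JSON task. `build_prompt` and `csv_to_json` are hypothetical helpers, and the "Input:/Output:" layout is an assumption, not a format any particular model requires.

```python
import csv
import io
import json


def csv_to_json(text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    return json.dumps(list(csv.DictReader(io.StringIO(text))))


def build_prompt(task: str, examples=()) -> str:
    """Join a task description with zero or more worked examples."""
    parts = [task]
    for source, target in examples:
        parts.append(f"Input:\n{source}\nOutput:\n{target}")
    return "\n\n".join(parts)


task = "Convert this CSV to JSON."
example_csv = "id,name,age\n1,john,20"

# Zero-shot: the task description alone, no worked example.
zero_shot = build_prompt(task)

# One-shot: the same task plus exactly one worked example.
one_shot = build_prompt(task, [(example_csv, csv_to_json(example_csv))])

print(one_shot)
```

In this sketch the only difference between the two prompts is the number of (input, output) pairs passed in, which is the sense of "shot" the comment describes.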
| devmor wrote:
| Given the misunderstandings and explanation of how they
| struggled with a long-solved ML problem, I believe this article
| was likely written by someone without much formal experience in
| AI.
|
| This is probably a case where some educational training could
| have saved the engineer(s) involved a lot of frustration.
___________________________________________________________________
(page generated 2025-04-18 23:00 UTC)