[HN Gopher] Show HN: I built an AI agent that turns ROS 2's turt...
       ___________________________________________________________________
        
       Show HN: I built an AI agent that turns ROS 2's turtlesim into a
       digital artist
        
       I'm a grad student studying robotics, with a particular interest in
       the intersection of LLMs and mobile robots. Recently, I discovered
       how easily LangChain enables the creation of AI agents, and I
       wanted to explore how such agents could interact with simulated
       environments.  So, I built TurtleSim Agent, an AI agent that turns
       the classic ROS 2 turtlesim turtle into a creative artist.  With
       this agent, you can give plain English commands like "draw a
       triangle" or "make a red star," and it will reason through the
       instructions and control the simulated turtle accordingly. I've
       included demo videos on GitHub. Behind the scenes, it uses an LLM
       to interpret the text, decide what actions are needed, and then
       call a set of modular tools (motion, pen control, math, etc.) to
       complete the task.  If you're interested in LLM+robotics, ROS, or
       just want to see a turtle become a digital artist, I'd love for you
       to check it out:  GitHub:
        https://github.com/Yutarop/turtlesim_agent  Looking ahead, I'm also
        exploring frameworks like LangGraph and MCP (Model Context
        Protocol) to see whether they might be better suited for
       more complex planning and decision-making tasks in robotics. If
       anyone here is familiar with these frameworks or working in this
       space, I'd love to connect or hear your thoughts.
        
       Author : ponta17
       Score  : 26 points
       Date   : 2025-05-31 10:17 UTC (12 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | dpflan wrote:
        | Forgive me for asking, but I'm always curious about the
        | definition of "agent". What is an "agent" exactly? Is it a
        | static prompt that is sent along with user input to an LLM
        | service and then handles that response? And then it's done? Is
        | an agent a prompted LLM call? Or some entity that changes its
        | own prompt as it continues to exist?
        
         | karmakaze wrote:
          | It depends on how you look at it. If the output is a drawing,
          | then the agent is the thing doing the drawing on the user's
          | behalf. In more detail, the outputs are commands, so the
          | agent would be what's generating those commands from the
          | user's input. E.g. a web browser is a user agent that makes
          | requests and renders resources that the user specifies.
        
           | ponta17 wrote:
           | Thanks for the thoughtful question! The term "agent"
           | definitely gets used in a lot of different ways, so I'll
           | clarify what I mean here.
           | 
           | In this project, an agent is an LLM-powered system that takes
           | a high-level user instruction, reasons about what steps are
           | needed to fulfill it, and then executes those steps using a
           | set of tools. So it's more than a single prompted LLM call --
           | the agent maintains a kind of working state and can call
           | external functions iteratively as it plans and acts.
           | 
            | Concretely, in turtlesim_agent, the agent receives an input
            | like "draw a red triangle," and then:
            | 
            | 1. Uses the LLM to interpret the intent,
            | 2. Decides which tools to use (like move forward, turn, set
            | pen color),
            | 3. Calls those tools step-by-step until the task is done.
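            | 
            | If it helps to see that loop concretely, here's a rough
            | sketch using LangChain's tool-calling API. The model choice
            | is arbitrary and the tools are the ones sketched in the
            | post above, so treat it as illustration rather than the
            | repo's exact code:
            | 
            |     from langchain_core.messages import (HumanMessage,
            |                                          ToolMessage)
            |     from langchain_openai import ChatOpenAI
            | 
            |     tools = {t.name: t for t in
            |              [move_forward, turn, set_pen]}
            |     llm = ChatOpenAI(model="gpt-4o").bind_tools(
            |         list(tools.values()))
            | 
            |     messages = [HumanMessage("draw a red triangle")]
            |     while True:
            |         reply = llm.invoke(messages)   # plan next step
            |         messages.append(reply)
            |         if not reply.tool_calls:       # nothing left to do
            |             break
            |         for call in reply.tool_calls:  # act, then observe
            |             out = tools[call["name"]].invoke(call["args"])
            |             messages.append(ToolMessage(
            |                 out, tool_call_id=call["id"]))
            | 
            | Prebuilt executors (e.g. LangGraph's create_react_agent)
            | wrap essentially this same loop for you.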
           | 
           | Hope that clears it up a bit!
        
           | paxys wrote:
           | To put it more simply, "agent" is now just a generic term to
           | describe any middleware that sits between user input and a
           | base LLM.
        
       | latchkey wrote:
       | This really brings back memories. The first computer language I
       | learned as a child was Logo. My grandfather gifted me a lesson
       | from a local computer store where someone came out to his house
       | and sat with me in front of his Apple II.
       | 
        | I was too young to understand the concepts around the math of
        | steps or degrees. While the thought of programming on a
        | computer was amazing (and I later became an engineer), I
        | couldn't grasp Logo, got frustrated, and lost interest.
       | 
       | If I could have had something like this, I'm sure it would have
       | made more sense to me earlier on. It makes me think about how
       | this will affect the learning rate in a positive way.
        
       | pj_mukh wrote:
       | Haha this is so incredibly cool.
       | 
        | One thing I might've missed: what are the "physics" of this
        | universe? In the rainbow example the turtle seems to teleport
        | between arcs?
        
       | moffkalast wrote:
        | That's pretty cool, but I feel like all of the LLM integrations
        | with ROS so far have sort of entirely missed the point in terms
        | of useful applications. Endless examples of models sending
        | bare-bones twist commands do a disservice to what LLMs are good
        | at; it's like swatting flies with a bazooka in terms of the
        | compute used, too.
       | 
        | Getting a robot to move from point A to point B is largely a
        | solved problem with traditional probabilistic methods, while
        | the niches where LLMs are the best fit are, I think, still
        | largely unaddressed, e.g.:
       | 
        | - a pipeline from natural language commands to high-level
        | commands ("fetch me a beer" to [send nav2 goal to kitchen, get
        | fridge detection from yolo, open fridge with moveit, detect
        | beer with yolo, etc.]); see the sketch after this list
       | 
        | - using a VLM to add semantic information to map areas, e.g.
        | have the robot turn around 4 times in a room, and have the
        | model determine what's there so it can reference things by
        | location and even know where the kitchen and fridge are in the
        | above example
       | 
       | - system monitoring, where an LLM looks at ros2 doctor, htop,
       | topic hz, etc. and determines if something's crashed or isn't
       | behaving properly, and returns a debug report or attempts to fix
       | it with terminal commands
       | 
        | - handling recovery behaviours in general, since a lot of the
        | time when robots get stuck the resolution is simple: you just
        | need something to take in the current situational information,
        | reason about it, and pick one of the possible ways to resolve
        | it
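        | 
        | For the first idea, here's a rough sketch of the shape such a
        | pipeline could take; the skill vocabulary and the dispatch stub
        | are invented for illustration, not any real API:
        | 
        |     import json
        |     from langchain_openai import ChatOpenAI
        | 
        |     SKILLS = ["nav_to(location)", "detect(object)",
        |               "open(object)", "grasp(object)"]
        | 
        |     def dispatch(step: dict) -> None:
        |         # stub: a real system would send a nav2 goal,
        |         # call moveit, query yolo, and so on
        |         print("executing", step["skill"], step["args"])
        | 
        |     llm = ChatOpenAI(model="gpt-4o")
        |     prompt = ("Turn the request into a JSON list of "
        |               'objects like {"skill": ..., "args": ...}, '
        |               f"using only these skills: {SKILLS}. "
        |               "Return only JSON. Request: fetch me a beer")
        |     # a real pipeline would validate / retry this parse
        |     plan = json.loads(llm.invoke(prompt).content)
        |     for step in plan:
        |         dispatch(step)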
        
       ___________________________________________________________________
       (page generated 2025-05-31 23:01 UTC)