[HN Gopher] New tools for building agents
___________________________________________________________________
New tools for building agents
Author : meetpateltech
Score : 219 points
Date : 2025-03-11 17:04 UTC (5 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| nnurmanov wrote:
| Does anyone know if there's any difference between typing the
| question with typos vs. typing it correctly?
| davidbarker wrote:
| In theory there shouldn't be -- LLMs are pretty robust to typos
| and usually infer the intended meaning regardless.
| swyx wrote:
| swyx here. we got some preview and time with the API/DX team to
| ask FAQs about all the new APIs.
|
| https://latent.space/p/openai-agents-platform
|
| main fun part - since responses are stored for free by default
| now, how can we abuse the Responses API as a database :)
|
| other fun qtns that a HN crew might enjoy:
|
| - hparams for websearch - depth/breadth of search for making your
| own DIY Deep Research
|
| - now that OAI is offering RAG/reranking out of the box as part
| of the Responses API, when should you build your own RAG? (i
| basically think somebody needs to benchmark the RAG capabilities
| of the Files API now, because the community impression has not
| really updated from back when Assistants API was first launched)
|
| - whats the diff between Agents SDK and OAI Swarm? (basically
| types, tracing, pluggable LLMs)
|
| - will the `search-preview` and `computer-use-preview` finetunes
| be merged into GPT5?
| ggnore7452 wrote:
| appreciate the question on hparams for websearch!
|
| one of the main reasons i build these ai search tools from
| scratch is that i can fully control the depth and breadth (and
| also customize the loader for whatever data/sites). and
| currently the web search isn't very transparent about which
| sites they only have snippets for rather than full text.
|
| having computer use + websearch is definitely something very
| powerful (openai's deep research essentially)
| mritchie712 wrote:
| for anyone that likes the Agents SDK, but doesn't want their
| framework attached to OpenAI, we're really liking
| PydanticAI[0].
|
| 0 - https://ai.pydantic.dev/
| fullstackwife wrote:
| Openai SDK docs:
|
| > Notably, our SDK is compatible with any model providers
| that support the OpenAI Chat Completions API format.
|
| so you can use with everything, not only OpenAI?
| DrBenCarson wrote:
| Yes
| swyx wrote:
| yea they mention this on the pod
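Concretely, "compatible with the Chat Completions API format" means any endpoint that accepts the same request shape. A stdlib-only sketch of that wire format (the URL and model name below are hypothetical, not a real provider):

```python
import json
import urllib.request

# The Chat Completions wire format: any compatible provider accepts
# this payload shape at its /v1/chat/completions endpoint.
payload = {
    "model": "some-model",  # hypothetical model name
    "messages": [{"role": "user", "content": "Hello"}],
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",  # hypothetical endpoint
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer YOUR_KEY"},
)
# urllib.request.urlopen(req) would send it; omitted so this runs offline.
```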
| darkteflon wrote:
| There's also HF's smolagents[1].
|
| 1 - https://github.com/huggingface/smolagents
| suttontom wrote:
| What is a "qtns"?
| oofbaroomf wrote:
| Questions.
| baxtr wrote:
| A bit off topic but the post comes in handy: can we settle the
| debate about what an agent really is? It seems like everyone has
| their own definition.
|
| Ok I'll start: an agent is a computer program that utilizes LLMs
| heutiger for decision making.
| codydkdc wrote:
| an agent is software that does something on behalf of someone
| (aka software)
|
| I personally strongly prefer the term "bots" for what most of
| these frameworks call "agents"
| handfuloflight wrote:
| Stick to the agentic nomenclature if you want at least an order
| of magnitude increase in valuation.
| 3stripe wrote:
| First rule of writing definitions: use everyday English.
| baxtr wrote:
| True! Meant heuristic
| knowaveragejoe wrote:
| I think Anthropic's definition makes the most sense.
|
| - Workflows are systems where LLMs and tools are orchestrated
| through predefined code paths. (imo this is what most people
| are referring to as "agents")
|
| - Agents, on the other hand, are systems where LLMs dynamically
| direct their own processes and tool usage, maintaining control
| over how they accomplish tasks.
|
| https://www.anthropic.com/engineering/building-effective-age...
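That distinction can be sketched as code (a toy illustration with hypothetical names, not Anthropic's implementation): a workflow hard-codes the path, while an agent lets the model choose each step.

```python
def workflow(task, llm, tools):
    # Workflow: LLM and tools orchestrated through a predefined code
    # path -- always search, then summarize.
    docs = tools["search"](task)
    return llm(f"Summarize: {docs}")

def agent(task, llm, tools, max_turns=5):
    # Agent: the model dynamically directs its own tool usage each turn,
    # replying e.g. "search: <query>" or "done: <answer>".
    history = [task]
    for _ in range(max_turns):
        action = llm("\n".join(history))
        name, _, arg = action.partition(": ")
        if name == "done":
            return arg
        history.append(tools[name](arg))
    return history[-1]
```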
| kodablah wrote:
| The problem with this definition is that modern workflow
| systems are not limited to predefined code paths; they too
| dynamically direct their own processes and tool usage.
| rglover wrote:
| Agents are just regular LLM chat bots that are prompted to
| parse user input into instructions about what functions to call
| in your back-end, with what data, etc. Basically it's a way to
| take random user input and turn it into pseudo-logic you can
| write code against.
|
| As an example, I can provide a system prompt that mentions a
| function like get_weather() being available to call. Then, I
| can pass whatever my user's prompt text is and the LLM will
| determine what code I need to call on the back-end.
|
| So if a user types "What is the weather in Nashville?" the LLM
| would infer that the user is asking about weather and reply to
| me with a string like "call function get_weather with location
| Nashville" or if you prompted it, some JSON like {
| function_to_call: 'get_weather', location: 'Nashville' }. From
| there, I'd just call that function with any of the data I asked
| the LLM to provide.
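The dispatch pattern described above can be sketched in a few lines (the function name and JSON shape come from the comment's hypothetical example, not any real API):

```python
import json

# Hypothetical backend function the LLM is told about in the system prompt.
def get_weather(location: str) -> str:
    return f"72F and sunny in {location}"

# Registry mapping function names the LLM may emit to real callables.
FUNCTIONS = {"get_weather": get_weather}

def dispatch(llm_reply: str) -> str:
    """Parse the LLM's JSON reply and call the named backend function."""
    call = json.loads(llm_reply)
    fn = FUNCTIONS[call["function_to_call"]]
    # Every other key in the JSON becomes a keyword argument.
    kwargs = {k: v for k, v in call.items() if k != "function_to_call"}
    return fn(**kwargs)

# Simulated LLM reply for: "What is the weather in Nashville?"
reply = '{"function_to_call": "get_weather", "location": "Nashville"}'
print(dispatch(reply))  # 72F and sunny in Nashville
```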
| kylecazar wrote:
| Even more off topic, does "heutiger" mean something in English
| that I'm unaware of? Google tells me it's just German for
| 'today' or 'current'.
| baxtr wrote:
| Never heard that word either!
| zellyn wrote:
| Notably not mentioned: Model Context Protocol
| https://www.anthropic.com/news/model-context-protocol
| nilslice wrote:
| not implementing doesn't mean it's not supported
| https://github.com/dylibso/mcpx-openai-node (this is for
| mcp.run tool calling with OpenAI models, not generic)
|
| but yes, it's the strongest anti-developer move to not directly
| support MCP. not surprised given OpenAI generally. but would be
| a very nice addition!
| benatkin wrote:
| DeepSeek doesn't seem to support it either FWIW. Maybe MCP is
| just an Anthropic thing.
| esafak wrote:
| How do they compare?
| cowpig wrote:
| MCP is a protocol, and Anthropic has provided SDKs for
| implementing that protocol. In practice, I find the MCP
| protocol to be pretty great, but it leaves basically
| everything _except_ the model parts out. I.e. MCP really only
| addresses how "agentic" systems interact with one another,
| nothing else.
|
| This SDK is trying to provide a bunch of code for
| implementing specific agent codebases. There are a bunch of
| open source ones already, so this is OpenAI throwing their
| hat in the ring.
|
| IMO this OpenAI release is kind of ecosystem-hostile in that
| they are directly competing with their users, in the same way
| that the GPT apps were.
| esafak wrote:
| Thank you. Which open source ones are best?
| knowaveragejoe wrote:
| You can (somewhat) bridge between them:
|
| https://github.com/SecretiveShell/MCP-Bridge
| dgellow wrote:
| Do you have experience with MCP? If yes, what do you think of
| it?
| thenameless7741 wrote:
| it's mentioned in the main thread:
| https://nitter.net/athyuttamre/status/1899511569274347908
|
| > [Q] Does the Agents SDK support MCP connections? So can we
| easily give certain agents tools via MCP client server
| connections?
|
| > [A] You're able to define any tools you want, so you could
| implement MCP tools via function calling
| rvz wrote:
| They did not announce the price(s) in the presentation. Likely
| because they know it is going to be very expensive:
|
| * Web Search [0]: $30 and $25 per 1K queries for GPT-4o search
| and 4o-mini search.
|
| * File Search [1]: $2.50 per 1K queries, and file storage at
| $0.10/GB/day (first 1GB is free).
|
| * Computer use tool (computer-use-preview model) [2]: $3 per 1M
| input tokens and $12 per 1M output tokens.
|
| [0] https://platform.openai.com/docs/pricing#web-search
|
| [1] https://platform.openai.com/docs/pricing#built-in-tools
|
| [2] https://platform.openai.com/docs/pricing#latest-models
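At those rates costs add up quickly. A back-of-envelope sketch using the rates listed above (the volumes are made-up examples):

```python
# Listed rates: $30/1K GPT-4o search queries, $2.50/1K file search
# queries, $0.10/GB/day storage with the first 1 GB free.
WEB_SEARCH_PER_QUERY = 30 / 1000
FILE_SEARCH_PER_QUERY = 2.50 / 1000
STORAGE_PER_GB_DAY = 0.10

def monthly_cost(web_queries, file_queries, storage_gb, days=30):
    storage = max(storage_gb - 1, 0) * STORAGE_PER_GB_DAY * days
    return (web_queries * WEB_SEARCH_PER_QUERY
            + file_queries * FILE_SEARCH_PER_QUERY
            + storage)

print(monthly_cost(10_000, 10_000, 5))  # about 337.0 dollars
```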
| yard2010 wrote:
| So they're basically pivoting from selling text by the ounce to
| selling web searches and cloud storage? I like it, it's a bold
| move. When the slow people at Google finally catch up it might
| be too late for Google?
| KoolKat23 wrote:
| Google AI Studio's "Grounding" (basically web search) is priced
| similarly. (Very expensive for either, although Google gives
| you your first 1500 queries free).
|
| It seems completely upside down: they always said traditional
| search was cheaper/less intensive. I guess a lot of tokens
| must go into the actual LLM searching and retrieving.
| Areibman wrote:
| Nice to finally see one of the labs throwing weight behind a much
| needed simple abstraction. It's clear they learned from the
| incumbents (langchain et al)-- don't sell complexity.
|
| Also very nice of them to include extensible tracing. The
| AgentOps integration is a nice touch for getting behind the
| scenes to understand how handoffs and tool calls are triggered.
| esafak wrote:
| Extensible how?
| swyx wrote:
| why agentops specifically? doesnt the oai first party one also
| do it?
| bloomingkales wrote:
| Langchain felt like a framework that was designed to allow
| people to sell it on their resumes. So many ideas, it would
| easily take up one full line of a resume. I think it's super
| important not to let frameworks like that become incumbent
| right now, especially when everyone is in an exploration state.
| serjester wrote:
| This is one of the few agent abstractions I've seen that actually
| seems intuitive. Props to the OpenAI team, seems like it'll kill
| a lot of bad startups.
| sdcoffey wrote:
| Steve here from the OpenAI team; this means a lot! We really
| hope you enjoy building on it.
| ilaksh wrote:
| The Agents SDK they linked to comes up 404.
|
| BTW I have something somewhat similar to some of this like
| Responses and File Search in MindRoot by using the task API:
| https://github.com/runvnc/mindroot/blob/main/api.md
|
| Which could be combined with the query_kb tool from the mr_kb
| plugin (in my mr_kb repo) which is actually probably better than
| File Search because it allows searching multiple KBs.
|
| Anyway, if anyone wants to help with my program, create a plugin
| on PR, or anything, feel free to connect on GitHub, email or
| Discord/Telegram (runvnc).
| yablak wrote:
| Loads fine for me. Maybe because I'm logged in?
| IncreasePosts wrote:
| That should be a 403 then. Tsk tsk open ai
| 29ebJCyy wrote:
| Technically it should be a 401. Tsk tsk IncreasePosts.
| __float wrote:
| It's common (see: S3, private GitHub repos) to return 404
| instead of unauthorized to avoid even leaking existence
| of a resource at URL.
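The pattern __float describes can be sketched as a handler (hypothetical code, not S3's or GitHub's actual implementation):

```python
# Return 404 for both "missing" and "forbidden" so an unauthorized
# caller can't probe which private resources exist.
def get_resource(resource_id, user, store, acl):
    exists = resource_id in store
    allowed = (user, resource_id) in acl
    if not exists or not allowed:
        return 404  # identical answer either way; no existence leak
    return 200, store[resource_id]
```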
| anorak27 wrote:
| I have built myself a much simpler and more powerful version of
| the responses API, and it works with all LLM providers.
|
| https://github.com/Anilturaga/aiide
| nextworddev wrote:
| This may be bad for Langflow, Langsmith, etc
| nowittyusername wrote:
| How does this compare to MCP? Does anyone have any thoughts on
| the matter?
| mentalgear wrote:
| Well, I'll just wait 2-3 days until a (better) open-source
| alternative is released. :D
| jumploops wrote:
| > "we plan to formally announce the deprecation of the Assistants
| API with a target sunset date in mid-2026."
|
| The new Responses API is a step in the right direction,
| especially with the built-in "handoff" functionality.
|
| For agentic use cases, the new API still feels a bit limited, as
| there's a lack of formal "guardrails"/state machine logic built
| in.
|
| > "Our goal is to give developers a seamless platform experience
| for building agents"
|
| It will be interesting to see how they move towards this
| platform, my guess is that we'll see a graph-based control flow
| in the coming months.
|
| Now there are countless open-source solutions for this, but most
| of them fall short and/or add unnecessary obfuscation/complexity.
|
| We've been able to build our agentic flows using a combination of
| tool calling and JSON responses, but there's still a missing
| higher order component that no one seems to have cracked yet.
| hodanli wrote:
| I wonder why they phased out Pydantic in structured output for
| the Responses API.
| sdcoffey wrote:
| Hey there! This is Steve from the OpenAI team; I worked on
| the Responses API. We have not removed this! It should still
| work just like before! Here's an example:
|
| https://github.com/openai/openai-python/blob/main/examples/r...
| lunarcave wrote:
| (Shameless plug) I worked on something for anyone else wanting
| to get structured outputs from LLMs in a model agnostic way
| (Including Open AI models): https://github.com/inferablehq/l1m
| phren0logy wrote:
| I'm a bit surprised at the approach to RAG. It will be great to
| see how well it handles complex PDFs. The max size is _far_
| larger than the Anthropic API permits (though that's obviously
| very different - no RAG).
|
| The chunking strategy is... pretty basic, but I guess we'll see
| if it works well enough for enough people.
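For reference, a basic fixed-size chunking strategy of the kind being discussed looks roughly like this (the sizes are illustrative, not OpenAI's actual defaults):

```python
def chunk(text: str, size: int = 800, overlap: int = 200):
    # Slide a fixed-size window with overlap so context isn't cut
    # mid-thought at chunk boundaries.
    step = size - overlap
    return [text[i:i + size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```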
| cosbgn wrote:
| We handle over 1M requests per month using the Assistant API on
| https://rispose.com which apparently will get deprecated
| mid-2026. Should we move to the new API?
| jstummbillig wrote:
| Eventually, yes. They addressed the Assistants API near the end
| of the video: they say there will be a transition path once
| they've built all Assistant features into the new API, plus
| ample time to take action.
| nknj wrote:
| there's no rush to do this - in the coming weeks, we will add
| support for:
|
| - assistant-like and thread-like objects to the responses api
|
| - async responses
|
| - code interpreter in responses
|
| once we do this, we'll share a migration guide that allows you
| to move over without any loss of features or data. we'll also
| give you a full 12 months to do your migration. feel free to
| reach out at nikunj[at]openai.com if you have any questions
| about any of this, and thank you so much for building on the
| assistants api beta! I think you'll really like responses api
| too!
| marko-k wrote:
| If Responses is replacing Assistants, is there a quickstart
| template available--similar to the one you had for
| Assistants?
|
| https://github.com/openai/openai-assistants-quickstart
| dmayle wrote:
| Is it just me, or is what OpenAI is really lacking a billing
| API/platform?
|
| As an engineer, I have to manage the cost/service ratio manually,
| making sure I charge enough to handle my traffic, while
| enforcing/managing/policing the usage.
|
| Additionally, there are customers who already pay for OpenAI, so
| the value add for them is less, since they are paying twice for
| the underlying capabilities.
|
| If OpenAI had a billing API/platform a la App Store/Play Store,
| I could have multiple price points matched to OpenAI usage
| limits (and maybe configurable profit margins).
|
| For customers that don't have an existing relationship with me,
| OpenAI could support a Netflix/YouTube-style profit-sharing
| system, where OpenAI customers can try out and use products
| integrated with the billing platform/API, and my products would
| receive payment in accordance with customer usage...
| mrcwinn wrote:
| One, if you charge above API costs, you should never police
| usage (so long as you're transparent with customers). Why would
| you need to cap usage if you're pricing correctly? (Rate limits
| aside)
|
| Two, yes, many people will pay $20/mo for ChatGPT and then also
| pay for a product that under the hood uses OpenAI API. If
| you're worried about your product's value not being
| differentiated from ChatGPT, I'd say you have a product problem
| moreso than OpenAI has a billing model problem.
| bloomingkales wrote:
| We need a subreddit on how everyone is managing token pricing.
| falcor84 wrote:
| I'm impressed by the advances in Computer Use mentioned here and
| this got me wondering - is this already mature enough to be
| utilized for usability testing? Would I be right to assume that
| in general, a UI that is more difficult for AI to navigate is
| likely to also be relatively difficult for humans, and that it's
| a signal that it should be simplified/improved in some way?
| m3t4man wrote:
| Why would you assume that? The modality of engagement is
| drastically different between the way an LLM engages with a UI
| and the way a human does.
| daviding wrote:
| It would have been nice if the Completions use of the internal
| web-search tool wasn't always mandatory and could be set to
| 'auto'. It would save a lot of reworking just to move to the
| new Responses API format for that one use case.
| theuppermiddle wrote:
| Does the SDK allow executing generated Python code in some sort
| of sandbox? If not, are there any open-source libraries that do
| this? I would ideally like the state of the executed code,
| including return values, to be available for the entire chat
| session, like IPython, so that subsequent LLM-generated code
| can use it.
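The IPython-like persistence part of that can be sketched with a shared namespace (note: plain exec is NOT a sandbox; real isolation needs a container or VM):

```python
namespace = {}

def run(code: str):
    # Each snippet executes against the same namespace, so later
    # LLM-generated code can reuse earlier results, IPython-style.
    exec(code, namespace)

run("x = 2 + 3")
run("y = x * 10")
print(namespace["y"])  # 50
```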
| sci_prog wrote:
| Yeah, OpenInterpreter does this (you are not limited to OpenAI
| only): https://github.com/OpenInterpreter/open-interpreter
|
| I wrote a wrapper around it that works in a web browser (you'll
| need an OpenAI API key):
| https://github.com/uhsealevelcenter/IDEA
| nekitamo wrote:
| Does the new Agents SDK support streaming audio and Realtime
| models?
| simonw wrote:
| There's a really good thread on Twitter from the designer of the
| new APIs going into the background behind many of the design
| decisions:
| https://twitter.com/athyuttamre/status/1899541471532867821
|
| Here's the alternative link for people who aren't signed in to
| Twitter:
| https://nitter.net/athyuttamre/status/1899541471532867821
| bradyriddle wrote:
| The nitter link is appreciated!
| cowpig wrote:
| Feels like OpenAI really wants to compete with its own
| ecosystem. I guess they are doing this to try to position
| themselves as the standard web index that everyone uses, the
| standard RAG service, etc.
|
| But they could just make great services and live in the infra
| layer instead of trying to squeeze everyone out at the
| application layer. Seems unnecessarily ecosystem-hostile.
| andrethegiant wrote:
| $25 per thousand searches seems excessive
| simonw wrote:
| If you want to get an idea for the changes, here's a giant commit
| where they updated ALL of the Python library examples in one go
| from the old chat completions API to the new responses API:
| https://github.com/openai/openai-python/commit/2954945ecc185...
___________________________________________________________________
(page generated 2025-03-11 23:00 UTC)