[HN Gopher] Show HN: Superglue - open source API connector that ...
       ___________________________________________________________________
        
       Show HN: Superglue - open source API connector that writes its own
       code
        
       Hi HN, we're Stefan and Adina, and we're building superglue
       (https://superglue.cloud). superglue allows you to connect to any
       API/data source and get the data you want in the format you need.
       It's an open-source proxy server which sits between you and your
       target APIs. Thus, you can easily deploy it into your own infra.

       If you're spending a lot of time writing code connecting to weird
       APIs, fumbling with custom fields in foreign language ERPs, mapping
       JSONs, extracting data from compressed CSVs sitting on FTP servers,
       and making sure your integrations don't break when something
       unexpected comes through, superglue might be for you.

       Here's how it works: You define your desired data schema and
       provide basic instructions about an API endpoint (like "get all
       issues from Jira"). superglue then does the following:

       - Automatically generates the API configuration by analyzing API
         docs.
       - Handles pagination, authentication, and error retries.
       - Transforms response data into the exact schema you want using
         JSONata expressions.
       - Validates that all data coming through follows that schema, and
         fixes transformations when they break.

       We built this after noticing how much of our team's time was spent
       building and maintaining data integration code. Our approach is a
       bit different to other solutions out there because we (1) use LLMs
       to generate mapping code, so you can basically build your own
       universal API with the exact fields that you need, and (2) validate
       that what you get is what you're supposed to get, with the ability
       to "self-heal" if anything goes wrong.

       You can run superglue yourself
       (https://github.com/superglue-ai/superglue - license is GPL), or
       you can use our hosted version (https://app.superglue.cloud) and
       our TS SDK (npm i @superglue/client).

       Here's a quick demo: https://www.youtube.com/watch?v=A1gv6P-fas4
       You can also try out Jira and Shopify demos on our website
       (https://superglue.cloud).

       Excited to share superglue with everyone here--it's early so you'll
       probably find bugs, but we'd love to get your thoughts and see if
       others find this approach useful!
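
       For readers unfamiliar with what "handles pagination and error
       retries" involves, here is a minimal sketch of the general
       technique in Python. It is not superglue's code: the cursor-style
       paging, the `TransientError` type, and the fake `flaky_get_page`
       API are all assumptions made up for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (e.g. a 503)."""


def with_retries(fn, attempts=3, base_delay=0.0):
    """Call fn(), retrying on TransientError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)


def fetch_all(get_page, attempts=3):
    """Follow 'next' cursors until the API reports no further page."""
    items, cursor = [], None
    while True:
        page = with_retries(lambda: get_page(cursor), attempts)
        items.extend(page["items"])
        cursor = page.get("next")
        if cursor is None:
            return items


# Fake two-page API (assumption) that fails once on its second call.
PAGES = {
    None: {"items": [1, 2], "next": "p2"},
    "p2": {"items": [3], "next": None},
}
calls = {"n": 0}


def flaky_get_page(cursor):
    calls["n"] += 1
    if calls["n"] == 2:
        raise TransientError("simulated 503")
    return PAGES[cursor]


all_items = fetch_all(flaky_get_page)
print(all_items)  # [1, 2, 3]
```

       The failed second call is retried transparently, so the caller
       only ever sees the combined list of items.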
        
       Author : adinagoerres
       Score  : 98 points
       Date   : 2025-02-27 17:20 UTC (5 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | hoerzu wrote:
       | Love it, is there also a possibility for alarms if schema
       | changes?
        
         | sfaist wrote:
         | working on it... ping me if you have a usecase in mind and I
         | can set it up for you.
        
       | a-dub wrote:
       | this is VERY cool!
        
         | adinagoerres wrote:
         | thank you!
        
       | m0rde wrote:
       | Great idea, congrats. Can you speak a bit about the validation
       | piece? Were LLM hallucinations an issue that required this? Are
       | you using some kind of structured output feature?
        
         | sfaist wrote:
         | Sure! We use structured output for the endpoint, but not for
         | the jsonata since it's hard to actually describe as a format. 3
         | big levers for accuracy / reducing hallucinations:
         | 1. direct validation: we apply the jsonata that is generated
         |    and check if it really produces what we want (we have the
         |    schema after all). This way we can catch errors as they come
         |    up.
         | 2. using a reasoning model: by switching to o3-mini, we were
         |    able to drastically improve the correctness of the jsonata.
         |    takes a bit longer, but better waiting a bit than incorrect
         |    mappings.
         | 3. using a confidence score: still in development, but
         |    sometimes there are multiple options to map something (e.g.
         |    3 types of prices in the source, but you only want one.
         |    Which one?). So we're working on showing the user how
         |    "certain" we are that a mapping is correct.
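
       The "direct validation" lever described above can be sketched
       roughly as follows. This is an illustration under assumptions,
       not superglue's implementation: plain Python callables stand in
       for generated JSONata expressions, and a tiny field-to-type
       dictionary stands in for a real JSON Schema. The names `validate`
       and `check_mapping` are hypothetical.

```python
from typing import Any, Callable

# Simplified schema (assumption): field name -> expected Python type.
Schema = dict[str, type]


def validate(record: dict[str, Any], schema: Schema) -> list[str]:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}"
            )
    return problems


def check_mapping(
    mapping: Callable[[dict], dict], sample: dict, schema: Schema
) -> list[str]:
    """Apply a candidate mapping to sample data and validate the output."""
    try:
        result = mapping(sample)
    except Exception as exc:  # a generated mapping may fail outright
        return [f"mapping raised {type(exc).__name__}: {exc}"]
    return validate(result, schema)


schema = {"id": str, "price": float}
sample = {"sku": "A-1", "prices": {"list": 12.5, "sale": 9.99}}


# Two candidate mappings, standing in for generated JSONata expressions.
def bad_mapping(src):
    return {"id": src["sku"], "price": src["price"]}  # key does not exist


def good_mapping(src):
    return {"id": src["sku"], "price": src["prices"]["list"]}


print(check_mapping(bad_mapping, sample, schema))
print(check_mapping(good_mapping, sample, schema))  # []
```

       Because the target schema is known up front, a wrong mapping is
       caught before its output is ever returned, and the list of
       problems could be fed back into the next generation attempt.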
        
       | DaiPlusPlus wrote:
       | > Automatically generates the API configuration by analyzing API
       | docs.
       | 
       | The problem with a lot (most?) integration work is that often
       | there simply aren't any API docs - or the docs are
       | outdated/obsolete (because they were written by-hand in an MS
       | Word doc and never kept up-to-date) - or sometimes there isn't an
       | API in the first place (c.f. screen-scraping, but also
       | exfiltration via other means). Are these scenarios you expect or
       | hope to accommodate?
        
         | sfaist wrote:
         | you can give it any context you have, worst case in text form,
         | and the llm will try to figure it out, call different endpoints
         | etc. Recently someone mentioned to me the intern test by Hamel
         | Husain: if an average college student can succeed with the
         | given input (with a lot of trying and time), then llms should
         | be able to do it too. So that's the bar we're aiming for.
         | 
         | No api at all is out of scope for now, there are other tools
         | that are better suited for that.
        
       | npollock wrote:
       | something like this that runs as a browser agent, allowing me to
       | extract structured data from websites (whitelisted) using natural
       | language queries
        
         | adinagoerres wrote:
         | huh interesting. we're exploring extraction from html
        
       | promocha wrote:
       | Really nice idea and product. Does it update and cache changed
       | schema for the target API? For ex. an app makes frequent get
       | calls to retrieve list of houses but API changed with new schema,
       | would Superglue figure it out at runtime or is it updating schema
       | regularly for target API based on their API docs (assuming they
       | have it)?
        
         | sfaist wrote:
         | Yes, it does update and cache changed schema for the target
         | API. At runtime. The way it works is that every time you make a
         | call to superglue, we get the data from the source and apply
         | the jsonata (that's very fast). We then validate the result
         | against the json schema that you gave us. If it doesn't match,
         | e.g. because the source changed or a required field is missing,
         | we rerun the jsonata generation and try to fix it.
         | 
         | I guess you could regularly run the api just to make sure the
         | mapping is still up to date and there are no delays when you
         | actually need the data, depending on how often the api changes.
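
       The runtime flow described above (apply the cached mapping,
       validate, regenerate on mismatch) might look roughly like this.
       It is a sketch under assumptions: `SelfHealingEndpoint`, `fetch`,
       `generate`, and `is_valid` are hypothetical names, and `generate`
       is a toy function standing in for the LLM-based jsonata
       generation.

```python
class SelfHealingEndpoint:
    """Caches a mapping and regenerates it only when validation fails."""

    def __init__(self, fetch, generate, is_valid, max_attempts=3):
        self.fetch = fetch          # returns raw data from the source
        self.generate = generate    # stand-in for the LLM generation step
        self.is_valid = is_valid    # checks output against target schema
        self.max_attempts = max_attempts
        self.mapping = None         # cached mapping, if any
        self.generations = 0        # how often a mapping was (re)generated

    def _try(self, raw):
        """Apply the current mapping; return None if it fails or is invalid."""
        try:
            result = self.mapping(raw)
        except Exception:
            return None
        return result if self.is_valid(result) else None

    def call(self):
        raw = self.fetch()
        if self.mapping is not None:
            result = self._try(raw)
            if result is not None:
                return result       # fast path: cached mapping still works
        for _ in range(self.max_attempts):
            self.mapping = self.generate(raw)
            self.generations += 1
            result = self._try(raw)
            if result is not None:
                return result
        self.mapping = None
        raise RuntimeError("could not produce a valid mapping")


# Toy source whose field names change between calls (assumption).
source = {"data": {"name": "Villa", "cost": 100}}


def fetch():
    return source["data"]


def generate(raw):
    # Picks whichever price-like key the source currently exposes.
    key = "cost" if "cost" in raw else "price_eur"
    return lambda r: {"title": r["name"], "price": r[key]}


def is_valid(result):
    return isinstance(result.get("title"), str) and isinstance(
        result.get("price"), int
    )


endpoint = SelfHealingEndpoint(fetch, generate, is_valid)
first = endpoint.call()    # no cached mapping yet: generates one
second = endpoint.call()   # reuses the cached mapping
source["data"] = {"name": "Villa", "price_eur": 120}  # upstream change
third = endpoint.call()    # cached mapping fails, so it is regenerated
print(third, endpoint.generations)
```

       The regeneration cost is only paid on the first call and after
       the source changes shape, which is why running the call on a
       schedule, as suggested above, would move that delay out of the
       request path.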
        
       | asdev wrote:
       | why would I use this when I can just add API docs to my LLM
       | context and have it generate the integration code?
        
         | nimar wrote:
         | because it's easier :) , see:
         | https://news.ycombinator.com/item?id=9224
        
         | sfaist wrote:
         | depends on your usecase:
         | - this abstracts away a lot of the complexity, including
         |   pagination and format conversion. Also integrated logging
         |   and schema validation.
         | - this is self-healing, so when data comes through that you
         |   have never seen before or if the api changes it is a lot
         |   less likely to break.
         | - if you need to integrate a lot of APIs, or if you have
         |   multiple apps needing access to these apis, it is much
         |   easier to set up here than writing 1000s of lines of
         |   integration code.
         | If none of this is important / applies to you and the generated
         | code works well, then you could also just do that.
        
       | dboreham wrote:
       | Doesn't someone own a trademark in that general area?
        
         | adinagoerres wrote:
         | not that we're aware of!
        
       | tayloramurphy wrote:
       | Does this have any connection to the previous "Supaglue" startup
       | [0]? Similar problem space, slightly different/pre-llm solution.
       | 
       | [0] https://docs.supaglue.com/
        
         | adinagoerres wrote:
         | we're not affiliated
        
       | AvImd wrote:
       | Access to XMLHttpRequest at 'https://graphql.superglue.cloud/'
       | from origin 'https://app.superglue.cloud' has been blocked by
       | CORS policy: No 'Access-Control-Allow-Origin' header is present
       | on the requested resource.
        
         | sfaist wrote:
         | Thanks for flagging this. Odd. Did this happen on the website
         | or in the actual app? Might be a server overload looking at our
         | logs.
        
       ___________________________________________________________________
       (page generated 2025-02-27 23:00 UTC)