[HN Gopher] Launch HN: Integuru (YC W24) - Reverse-engineer inte...
___________________________________________________________________
Launch HN: Integuru (YC W24) - Reverse-engineer internal APIs using
LLMs
Hey HN! We're Richard and Alan from Integuru (https://integuru.ai).
We build low-latency integrations with platforms lacking official
APIs. We take custom requests and manage creation, hosting, and
authentication. To automate our work, we built an open-source AI
agent that reverse-engineers internal APIs to generate integration
code. Here's a demo: https://www.youtube.com/watch?v=7OJ4w5BCpQ0.

Many products need integrations with third-party services, but
platforms often lack official APIs. Examples include logistics
software, financial services, electronic health records (EHRs), and
government websites. To build low-latency integrations, developers
must reverse-engineer internal APIs, but this can get complicated.
With Integuru, you can have easier access to integrations.

We started as recent college grads trying to make US income tax
data accessible. We contacted banks, brokerages, payroll software,
and more to request access to their APIs, but none took us
seriously. We resorted to building integrations with these systems
to extract documents like W-2s and 1099s. We initially used browser
automation but ran into two big problems: our integrations (1)
weren't reliable due to changing UIs and (2) had slow execution
speeds due to spinning up browsers and waiting for pages to load.
We experimented with AI-based automation maintenance, but it didn't
solve slow speeds. So, we concluded that browser automation is
useful when execution speed isn't essential, but reverse
engineering is often the only path for performant integrations.

Through reverse-engineering dozens of platforms, we noticed many
internal API design patterns that LLMs could decipher. We built an
agent to automate the creation of integrations. Today, Integuru can
analyze a platform's internal API designs and build an integration
in minutes.

The agent mimics what a human does when reverse-engineering. Say
you want to download utility bills from a utility website. You'd
first use Integuru to generate a file of network requests and a
file of cookies. You pair these two files with a prompt about your
desired action--in this case, to download utility bills.

Integuru identifies the final request that downloads utility bills.
The request URL might look like this:
https://www.example.com/utility-bills?accountId=123&userId=4....

It then identifies parts of the request that depend on other
requests. The example URL contains dynamic parts--accountId and
userId--that usually are in the response of previous request(s). It
then finds other requests whose response contains any of these and
adds them to the dependency graph. The newly found request URLs
might look like https://www.example.com/getAccountId,
https://www.example.com/getUserId, and so on.

This process repeats until the most recently found request doesn't
depend on any other request. Integuru then traverses up the graph,
starting from the requests without dependencies while converting
each request into a runnable function.

Integuru supports a surprising number of use cases like downloading
documents, sending money, creating virtual cards... People already
use the agent to build low-latency APIs for platforms like
Robinhood, transportation management systems (TMS), and more. The
agent still has limitations due to current LLM capabilities and
long-tail edge cases, but we give each platform to the agent for
the first attempt. When the agent does struggle, we find the
generated graphs and code still helpful as references for us to
complete the work manually.

The agent and all integrations are open-source under AGPL-3.0. We
charge for services to (1) build custom integrations when the agent
struggles or for your convenience, (2) handle hosting, and (3)
manage authentication using authentication cookies from
authenticated browser sessions. We charge per API call with an
implementation fee for new platforms.

We're currently working to increase the agent's coverage and
improve code generation. We will continue to iterate and want to
one day allow developers to integrate with all platforms instantly.
Integuru is still an early effort. We're passionate about
automating integrations and would love your feedback!
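
The dependency-graph walk described in the post can be sketched
roughly as follows. This is only an illustration and not Integuru's
actual code: the Request record, the build_graph and execution_order
names, the example values A1 and U2, and the plain substring match
are all assumptions made here (the post says the agent relies on an
LLM to spot the dynamic parts).

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical record of one captured network request."""
    url: str
    response_body: str
    # Values in this request that must be produced by an earlier request.
    dynamic_values: list = field(default_factory=list)


def build_graph(final, captured):
    """Map each request URL to the URLs of the requests it depends on."""
    graph = {}
    queue = [final]
    while queue:
        req = queue.pop()
        if req.url in graph:
            continue  # already processed
        # A request is a dependency if its response contains a needed value.
        deps = [
            other for other in captured
            if other.url != req.url
            and any(v in other.response_body for v in req.dynamic_values)
        ]
        graph[req.url] = [d.url for d in deps]
        queue.extend(deps)
    return graph


def execution_order(graph):
    """Return URLs so that every request comes after its dependencies."""
    order, seen = [], set()

    def visit(url):
        if url in seen:
            return
        seen.add(url)
        for dep in graph[url]:
            visit(dep)
        order.append(url)

    for url in graph:
        visit(url)
    return order


# Hypothetical capture: two lookups feeding one download request.
captured = [
    Request("https://www.example.com/getAccountId", '{"accountId": "A1"}'),
    Request("https://www.example.com/getUserId", '{"userId": "U2"}'),
]
final = Request(
    "https://www.example.com/utility-bills?accountId=A1&userId=U2",
    "<bill bytes>",
    dynamic_values=["A1", "U2"],
)
graph = build_graph(final, captured)
order = execution_order(graph)
```

In this sketch the two lookup requests have no dependencies, so they
come first in the order and the download request comes last, which
matches the "traverse up the graph" step in the post.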
Author : richardzhang
Score : 136 points
Date : 2024-10-29 13:00 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| dewey wrote:
| If your landing page doesn't look like this, you've launched too
| late: https://integuru.ai
| qsort wrote:
| Literally peak graphics.
| ocean_moist wrote:
| I wish I could do this... best part of building for devs is
| being able to provide simple, good UX with minimal UI.
| geoctl wrote:
| Still looks more interesting than that Next.js landing page
| template used by every startup these days.
| ramenlover wrote:
| I don't know what my PM would say but to me this is "excellent
| and appealing design"
| bryant wrote:
| Page source is amazing. I can't remember the last time I've
| seen a serious YC company launch page with absolutely zero
| JavaScript. Even the CSS is just a single selector.
|
| I'm a fan.
| silvanocerza wrote:
| Their website is this one though. :) https://www.taiki.ai/
| swyx wrote:
| @richardzhang what is the relationship between taiki and
| integuru? is this a pivot?
| richardzhang wrote:
| We should definitely further clarify this! We built
| Integuru as an internal tool while building the products
| for Taiki. Then we realized that other developers may need
| the agent, too, so we decided to open-source Integuru. In
| terms of the current focus for our team, we are spending
| most of our time on Integuru because newly requested
| integrations take some of our resources to build, and we
| want to continue improving the agent. I think the correct
| way to frame this is a market expansion, where we're
| expanding beyond the tax industry.
| btbuildem wrote:
| This is what happens when your daily grind is cutting through
| all kinds of atrocious and excessive "web design" in order to
| get at information.
| Prosammer wrote:
| Very cool, congratulations! Would this work for graphql APIs with
| introspection disabled?
| alanloo wrote:
| Thank you! As long as the network request contains the query,
| it should work as expected. So yes, it should work with GraphQL
| APIs that have introspection disabled. Excited to see what you
| do with it!
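
To illustrate why introspection is not needed here: the query text
travels in the captured request itself. The sketch below pulls
GraphQL operations out of a HAR capture. It assumes the common
convention of a JSON POST body with a "query" field and the
standard HAR layout (log.entries[].request.postData.text); the
graphql_operations helper and the sample data are hypothetical and
not part of Integuru.

```python
import json


def graphql_operations(har):
    """Yield (url, query, variables) for each captured GraphQL request."""
    for entry in har["log"]["entries"]:
        request = entry["request"]
        text = request.get("postData", {}).get("text", "")
        try:
            body = json.loads(text)
        except json.JSONDecodeError:
            continue  # no body, or the body is not JSON
        if isinstance(body, dict) and "query" in body:
            yield request["url"], body["query"], body.get("variables") or {}


# Hypothetical HAR fragment with one GraphQL call and one static asset.
har = {
    "log": {
        "entries": [
            {
                "request": {
                    "method": "POST",
                    "url": "https://www.example.com/graphql",
                    "postData": {
                        "mimeType": "application/json",
                        "text": json.dumps(
                            {"query": "query Bills { bills { id } }"}
                        ),
                    },
                }
            },
            {
                "request": {
                    "method": "GET",
                    "url": "https://www.example.com/logo.png",
                }
            },
        ]
    }
}
operations = list(graphql_operations(har))
```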
| toomuchtodo wrote:
| Brilliant. Is the next part to monitor and autocorrect breakage
| when the API in scope changes unexpectedly underneath the system?
| This is a pain point of workflow automation systems that
| integrate with APIs in my experience, typically requiring a human
| to triage an alert (due to an unexpected external API change),
| pause worker queues, ship a fix, and then resume queue
| processing.
|
| Love the landing page, please keep it.
| alanloo wrote:
| Thanks and yes that's part of the roadmap!
|
| Currently you need to trigger the UI actions manually to
| generate the network requests used by Integuru. But we're
| planning to automate the whole thing by having another agent
| auto-trigger the UI actions to generate the network requests
| first, and then have Integuru reverse-engineer the requests.
| _hl_ wrote:
| This is awesome, but I'm not sure what the long-term use case for
| the intersection of low-latency integration and non-production-
| stable is? I'm saying this as someone with way more experience
| than I'd like to in using reverse-engineered APIs as part of
| production products... You inevitably run into breakages,
| sometimes even actively hostile platforms, which will degrade
| user experience as users wait for your 1-day window to fix their
| product again.
|
| Though I suppose if you can auto-fix and retry issues within
| ~1 minute or so it could work?
| alanloo wrote:
| This is a very important question. Thank you for bringing this
| up! Currently, fixing integrations requires human intervention,
| as someone needs to trigger the correct network request. We are
| planning on having another agent that triggers the network
| requests by interacting with the UI and then passes the network
| requests to Integuru.
| lo0dot0 wrote:
| NewPipe breaks regularly. It's almost like YouTube changes the
| API on purpose to hurt 3rd party clients that don't show ads.
| miki123211 wrote:
| Either that, or they just straight up don't care.
|
| I think it's pretty likely that they just don't look at or
| test NewPipe when they change their APIs. If the change
| doesn't break any official clients, it goes through.
|
| With how large YouTube is, I imagine API changes are not
| infrequent.
| shmatt wrote:
| I just noticed over the weekend that the new Claude agreed to
| reverse engineer a GraphQL server with introspection turned off,
| something I'm pretty sure it would have refused for ethical
| reasons before the new version.
|
| It kept writing scripts, I would paste the output, and it would
| keep going, until it was able to create its own working discount
| code on an actual retail website.
|
| The only issue with these kinds of things is breaking robots.txt
| rules and the possibility things will break without notice, and
| often
|
| The use of unofficial APIs can be legally questionable [1]
|
| [1] https://law.stackexchange.com/questions/93831/legality-of-
| us...
|
| As the authors of essentially a hacking tool, I would expect at
| least some legal boilerplate language about not being liable
| richardzhang wrote:
| We are working on a way to auto-patch internal APIs that change
| by having another agent trigger the requests.
|
| Regarding the legality aspects -- really appreciate you
| mentioning this -- we've put a lot of thought into these
| issues, and it's something we're continually working on and
| refining.
|
| Ultimately, our goal is to allow each developer to make their
| own informed decision regarding the policies of the platforms
| that they're working with. There are situations where
| unofficial APIs can be both legal and beneficial, such as when
| they're used to access data that the end user rightfully owns
| and controls.
|
| For our hosted service, we aim to balance serving legitimate
| data needs with safeguarding against bad actors, and we're
| fully aware this can be a tricky line to navigate. What this
| looks like in reality would be to prioritize use cases where
| the end-user truly owns the data. But we know this is not
| always black-and-white, and will come up with the right legal
| language as you recommended. What does help our case is that
| many companies are making unofficial APIs for their own
| purposes, so there are legal precedents that we can refer to.
| shmatt wrote:
| I have to disagree, it is definitely not legal in the US to
| use unauthorized access points to access authorized data.
| That's like saying you're allowed to get into your apartment
| by breaking your neighbor's door and climbing between the
| windows
|
| In the US this is pretty simply covered by Computer Misuse
| Act and Computer Fraud and Abuse Act, both federal laws
|
| I'm not claiming you're liable, just surprised no lawyer
| pointed this out at YC
| daveguy wrote:
| This analogy is completely off. A closer analogy is someone
| calls you on your phone letting you know they're here. You
| were expecting them, so you say "come on in." But, they
| were at the back door instead of the front door. I don't
| think anyone would consider that your friend did something
| illegal.
| korkybuchek wrote:
| Yeah, the CFAA doesn't work by analogy unfortunately.
| erohead wrote:
| CFAA has recently (2021) been limited by Van Buren
| ruling.
| conradev wrote:
| There is a carve out if the data is "publicly available":
| https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn
|
| If I open the Safeway app and it fetches what is available
| in a given store without any authentication and everyone
| sees the same data, that could possibly fall under that
| exemption.
| chatmasta wrote:
| If my browser is downloading some data, then what's the
| difference if my AI agent is doing the same? I'll even tell
| you it's my browser. Who are you to say what qualifies as a
| browser?
| jerrygenser wrote:
| Will this work for SSR applications? E.g. think old-school .NET
| or JSP apps, which make network requests and then receive HTML
| that needs to be parsed to extract the key pieces of
| information for additional network requests?
|
| I've found it relatively straightforward to reverse engineer SPA
| requests; however, with server-side rendered apps, how would your
| service handle that?
| toomuchtodo wrote:
| Would be cool to use a proxy to MITM to twiddle the bits (with
| its own API) if the use case isn't supported by a browser or
| robotic process automation driving the app's client side UX.
| jerrygenser wrote:
| I was talking about web apps. But yeah, for old-school
| desktop apps or Windows-native apps, a MITM proxy works
| alanloo wrote:
| Good question. Finding the request that's responsible for the
| action you want will be a bit trickier for SSR, but it's still
| possible for most cases. It auto-generates regex (for now) to
| parse the needed info out of the HTML template.
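
As a rough illustration of the regex approach mentioned above, the
sketch below pulls a value out of server-rendered HTML so it can
feed a later request. The extract_input_value helper, the sample
markup, and the assumption that the name attribute appears before
the value attribute are all hypothetical; the regexes Integuru
generates may look quite different.

```python
import re


def extract_input_value(html, name):
    """Return the value of the first <input> with the given name, or None."""
    # Assumes name="..." appears before value="..." inside the tag.
    pattern = re.compile(
        r'<input[^>]*\bname="' + re.escape(name) + r'"[^>]*\bvalue="([^"]*)"',
        re.IGNORECASE,
    )
    match = pattern.search(html)
    return match.group(1) if match else None


# Hypothetical server-rendered fragment.
page = (
    '<form action="/bills">'
    '<input type="hidden" name="accountId" value="A1">'
    "</form>"
)
account_id = extract_input_value(page, "accountId")
missing = extract_input_value(page, "userId")
```

A regex like this is brittle against markup changes, which is
presumably why the reply above says "for now"; an HTML parser would
be the sturdier choice when attribute order is not fixed.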
| jerrygenser wrote:
| Another thing I've seen is that some of these old school apps
| are sending certain requests that don't modify the page but
| set server side context which subsequent requests are
| dependent on.
|
| For example, set context to a particular group and then
| subsequent navigation is filtered on that group even though
| the filter is not explicit on the client side but due to
| state stored in the session remotely.
|
| This can also have implications on concurrency for a given
| session where you need to either create separate sessions or
| make sure there is some lock on particular parts of server
| side state.
|
| Would this type of thing eventually be possible? Or at least
| hooks that let us add custom code such as session locks?
| alanloo wrote:
| Very interesting to hear about your experience here! We
| haven't come across a website that has this design and
| don't offer support for this just yet. We can certainly
| implement it if more people face a similar situation.
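
The session-lock idea raised in this sub-thread could look roughly
like the sketch below on the caller's side. It is not an Integuru
feature (the reply above says this is not supported yet); the
SessionContext class and the set_group callable are hypothetical
stand-ins for whatever request sets the server-side context.

```python
import threading
from contextlib import contextmanager


class SessionContext:
    """Serialize work that depends on server-side session state."""

    def __init__(self, set_group):
        self._lock = threading.Lock()
        self._set_group = set_group  # callable that sends the context request

    @contextmanager
    def group(self, group_id):
        # Hold the lock from the context switch until dependent calls finish,
        # so another thread cannot change the group in between.
        with self._lock:
            self._set_group(group_id)
            yield


# Hypothetical usage: record calls instead of sending real requests.
calls = []
session = SessionContext(set_group=lambda g: calls.append(("set_group", g)))
with session.group("team-a"):
    calls.append(("list_items", "team-a"))
with session.group("team-b"):
    calls.append(("list_items", "team-b"))
```

The alternative mentioned in the thread, one separate session per
concurrent worker, avoids the lock entirely at the cost of extra
logins.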
| nkotov wrote:
| This is really awesome. There are several platforms that
| intentionally gatekeep their API, and it makes it really annoying
| to build integrations with them. How do you go about these
| platforms without breaking their TOS?
| compootr wrote:
| I don't think it really matters to them. As a provider giving
| access to these platforms, they're not the user (and they
| didn't agree to the terms). the end user did, so it's on them
| to decide whether they risk getting terminated or whatnot
| DougWebb wrote:
| If they have deeper pockets than the user, they're the ones
| who will get sued for abuse they enable.
| richardzhang wrote:
| Thank you! There are definitely platforms that intentionally
| gate-keep their APIs. A good example is LinkedIn, which many
| companies still try to force-build their own integrations with.
| Our goal is to allow each developer to make their own informed
| decision regarding the policies of the platforms that they're
| working with. For our hosted service, we want to prioritize use
| cases where the end-user truly owns the data. We can also refer
| to legal precedent cases where many other companies make
| unofficial APIs.
| rumpelstilzchen wrote:
| Nice work, congrats! How do you deal with security related stuff
| like recaptcha, signed requests and so on?
|
| Do you also support internal APIs of mobile applications? If so,
| how do you deal with AppCheck / PlayIntegrity / Android Key
| Attestation / Apple App Attest?
| alanloo wrote:
| Thank you! Integuru itself doesn't handle recaptchas and signed
| requests, but we have a hosted solution where we use third-
| party services to handle recaptchas and manually create
| integrations for handling signed requests.
|
| We do not directly support APIs for mobile applications;
| however, if you use MITM software and get all the network
| requests into a .har file, Integuru should work as expected. We
| do not handle AppCheck at the moment, unfortunately.
| blakeburch wrote:
| Really digging this idea.
|
| I've spent plenty of time trying to dig into the network tab to
| automate requests to a website without an API. Cool to see the
| process streamlined with LLMs. Wishing you all the best of luck!
| richardzhang wrote:
| Thank you!
| bstanfield15 wrote:
| Hell yeah! Love to see this launch. We have spent a lot of time
| at Wren recently trying to reverse engineer some local law APIs
| to help make renewable energy developer lives easier (less
| parsing through hundreds of PDFs, dead links, etc.) -- going to
| try this out and see if it can speed up our workflow.
| richardzhang wrote:
| Thank you! Would love your feedback after you use it!
| imranq wrote:
| Wow this is great! I think this is kind of the future of
| automation and "computer use" once LLMs become powerful enough.
|
| Every task on the web can be reduced down to a series of backend
| calls, and the key is extracting out the minimal graph that can
| replicate that task.
| richardzhang wrote:
| Thank you!
| mdaniel wrote:
| Ah, by clicking on the Taiki logo to see what the ... parent
| company? ... builds, I now understand how this came about. And
| I'll be honest, as someone who hates all that tax paperwork
| gathering with all my heart, this launch may have gotten you a
| new customer for Taiki :-)
|
| Also, just as a friendly suggestion, given what both(?) products
| seemingly do, this section could use some love other than "we use
| TLS":
| https://www.taiki.ai/faq#:~:text=How%20does%20Taiki%20handle...
| since TLS doesn't care about storing credentials in plain text in
| a DB, for example
|
| ---
|
| _p.s. the GitHub organization in your .gitmodules is still
| pointing to Unofficial-APIs which I actually think you should
| have kept o /_
| alanloo wrote:
| Thank you for your suggestions, and really glad to hear you're
| excited about Taiki! We will update the FAQ with your
| suggestions -- honestly, this part of the website is a bit
| outdated, and we will make sure to change it.
|
| Regarding the Unofficial-APIs name, it was a really tough
| decision. We liked the name a lot but just thought it was a bit
| long. A real pleasant surprise that you found it :)
| andrewski77 wrote:
| congratulations! this is such a cool idea
| richardzhang wrote:
| Thank you!
| kevo1ution wrote:
| going from tax API reverse engineering to making it easier to
| reverse engineer any API is a smart pivot
___________________________________________________________________
(page generated 2024-10-29 23:00 UTC)