[HN Gopher] MitmProxy2Swagger: Automagically reverse-engineer RE...
       ___________________________________________________________________
        
       MitmProxy2Swagger: Automagically reverse-engineer REST APIs
        
       Author : AbuAssar
       Score  : 513 points
       Date   : 2025-01-02 08:08 UTC (14 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | Gamemaster1379 wrote:
       | This is a nice tool. A game I liked to play announced end of
       | service back in 2023. They gave enough notice to let me capture
       | some logs from their coordinator service.
       | 
       | I captured them in mitmproxy and ran those through this to help
       | me identify all the endpoints and their general structure. (A few
       | things were misleading, like the examples suggesting certain
       | values could be floats when they could only be integers.)
       | 
       | I was able to get a team together and we were able to stand up
       | private servers as a result.
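One likely reason a generated spec mislabels a field is that types are guessed from whichever example values happen to be in the capture. A minimal sketch of merging observed types across several samples follows; it is not mitmproxy2swagger's actual code, and the function names and sample payloads are hypothetical.

```python
from typing import Any


def json_type(value: Any) -> str:
    """Map a value decoded from JSON to a JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def infer_field_types(samples: list[dict]) -> dict[str, list[str]]:
    """Collect the sorted set of types seen for each top-level field."""
    seen: dict[str, set[str]] = {}
    for sample in samples:
        for key, value in sample.items():
            seen.setdefault(key, set()).add(json_type(value))
    result = {}
    for key, types in seen.items():
        # If both were observed, "number" already covers "integer".
        if {"integer", "number"} <= types:
            types = types - {"integer"}
        result[key] = sorted(types)
    return result


# Hypothetical captured response bodies.
samples = [
    {"score": 10, "ratio": 1, "name": "a"},
    {"score": 12, "ratio": 0.5, "name": None},
]
schema = infer_field_types(samples)
```

A field only stays "integer" while every sample agrees, so the result is only as good as the capture; the observed types still need checking by hand.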
        
         | simonjgreen wrote:
         | Amazing! What game was this for? I was involved in the RE
         | efforts around UO way back in the day.
        
           | kirici wrote:
           | Gundam Evolution, going by comment history.
        
             | ge96 wrote:
             | Different plot/game mechanics, but Armored Core 6 is great
             | if you like mecha.
        
       | andrewstuart wrote:
       | This is something that would be easy to do an ordinary job of,
       | missing lots of edge cases and not making something thorough and
       | complete.
       | 
       | A really professional and thorough job would be extremely time
       | consuming and hard.
        
         | matthewolfe wrote:
         | I do this a lot for my work. A tool like this that can help get
         | me to a nice starting point is huge. Instead of developing a
         | mental model of the API in my head by manually looking through
         | API requests/responses in ProxyMan, this can start me off much
         | more quickly. From there, the edge cases can be worked out.
        
       | zython wrote:
       | This is so cool. Thanks for sharing!
        
       | tinchox5 wrote:
       | Coool!
        
       | zebomon wrote:
       | I looked through this earlier today when I saw it mentioned in
       | that thread about the closed source tool for the same purpose.
       | 
       | Having done a good bit of this type of reverse engineering the
       | hard way over the years, it's a very exciting find. I had been
       | talking with my partner about building something similar for the
       | past six months. How exciting to learn that it's already out
       | there and open source too!
        
       | colesantiago wrote:
       | Again, this is the very easy part of the reverse engineering API
       | process that most tools can do, similar to API Parrot and the
       | rest of them. This is not hard to do.
       | 
       | The hard part is that inevitably, all these internal APIs will
       | just add aggressive CAPTCHAs, Device Check, fingerprinting, etc
       | to prevent common drive by re'ing. Easy to add these on the
       | defence side, and extremely difficult to bypass on the other
       | side.
       | 
       | I can imagine all developer teams now upping their security with
       | the combination of the above mentioned to prevent this.
        
         | sebmellen wrote:
         | Depends on the age of the tool. We work with a lot of legacy
         | systems that actually _want_ us to integrate with them but
         | don't have the dev resources to build a proper API surface. As
         | a result, we end up doing a lot of painful reverse engineering.
         | These tools look promising for purposes like this.
        
         | devjab wrote:
         | I'm curious as to why people would have a public API to begin
         | with if they wanted to protect it from people using it. Then
         | again, why would anyone have a public undocumented API in 2024
         | when an LLM can give you a cli tool to auto-generate 90% of the
         | OpenAPI spec in a couple of hours? The last question isn't
         | serious, I've worked in enterprise for decades and almost none
         | of the tools organisations end up buying have good
         | documentation for their APIs. Not that those are publicly
         | available, but still.
        
           | lesuorac wrote:
           | I think you have a misunderstanding here.
           | 
           | The API needs to be "public" because the app uses the
           | internet to communicate back to the home server.
           | 
           | The API is not "public" in the sense that the app developers
           | want anybody to use it; they just want their app to use this
           | API. So they don't write publicly accessible documentation
           | about it because they don't want to encourage its use.
           | 
           | A tool like MitmProxy2Swagger lets you run the app and record
           | all of its API calls so that you can use this unadvertised
           | API.
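A rough sketch of the idea of turning recorded calls into an endpoint list: collapse ID-like path segments into placeholders and count what remains. This illustrates the general technique only, not the tool's actual algorithm; the `{id}` placeholder, the ID heuristic, and the example URLs are assumptions.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Segments that look like numeric IDs or UUIDs are treated as path parameters.
ID_PATTERN = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def path_template(url: str) -> str:
    """Replace ID-like path segments with a placeholder; drop the query string."""
    path = urlparse(url).path
    parts = ["{id}" if ID_PATTERN.match(p) else p for p in path.split("/")]
    return "/".join(parts)


def summarize(requests):
    """Count captured (method, url) pairs per (method, path template)."""
    return Counter((method, path_template(url)) for method, url in requests)


# Hypothetical capture: (HTTP method, full URL) pairs.
captured = [
    ("GET", "https://api.example.com/users/42"),
    ("GET", "https://api.example.com/users/7?verbose=1"),
    ("POST", "https://api.example.com/users"),
    ("GET", "https://api.example.com/users/42/orders/1001"),
]
endpoints = summarize(captured)
```

Here the two `GET /users/<number>` calls fold into a single `/users/{id}` entry, which is the kind of starting point a spec generator can then refine.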
        
         | mad_vill wrote:
         | There are many cases where users are behind a forward proxy for
         | security/compliance reasons. Most applications need to support
         | these types of users.
        
         | jampekka wrote:
         | Making a mitmproxy dump from a manual browsing session is more
         | or less unblockable, barring some TPM or similar fuckery.
         | 
         | Usage of the API even with the protocol known OTOH can be quite
         | easily made really hard.
        
       | notcrazylol wrote:
       | I was wondering how it would take in GraphQL endpoints and
       | convert them to Swagger, since it's just a single POST API with
       | a change in params. But that's more of a Swagger issue than the
       | tool's. Has anyone dealt with this? Would be really helpful if you
       | could share your ideas too :)
        
         | asabla wrote:
         | Why would you tho?
         | 
         | If you're working against a GraphQL-based API, you should be
         | able to pull a schema file. And use that to implement your own
         | API.
         | 
         | All you would get from mitmproxy is example queries and
         | mutations. With the additional complexity of extra tooling to
         | stitch together the schema file.
        
           | jampekka wrote:
           | Pulling the schema file can be, and often is, disabled server
           | side. And GraphQL APIs can, and often do, decline to serve
           | anything other than persisted queries, and those can't really
           | be inferred even with a known schema.
        
       | swyx wrote:
       | did i miss something or why are there TWO (2) "magically reverse
       | engineer REST APIs" projects on the HN front page right now? is
       | there some offline beef going on?
       | 
       | (screenshot in case this goes away
       | https://x.com/swyx/status/1874762725383188502)
        
         | Quarrel wrote:
         | Presumably, because the closed source one got some traction, so
         | people are pointing out the open source alternative.
        
         | littlestymaar wrote:
         | Likely because of this comment[1] in the other thread which
         | made people submit this link, and when multiple independent
         | people submit the same link in a short period of time you're
         | very likely to end up on the front page (this exact situation
         | happened to me once)
         | 
         | [1] https://news.ycombinator.com/item?id=42568121
        
           | AbuAssar wrote:
           | Yeah, that's where I got the link from.
        
         | TechDebtDevin wrote:
         | No one asked for your Twitter.
        
           | jereees wrote:
           | I put swyx up there with sama in the category of extremely
           | smart people that give me the ick for reasons I cannot
           | articulate
        
       | youngNed wrote:
       | perhaps a n00b question, but would this work, or is there
       | something similar for apps, specifically android apps?
        
         | rhaps0dy wrote:
         | Depends on the app. If it uses some online functionality
         | probably yes. You could also try decompilation, it's decent on
         | Java apps like Android's.
        
         | whilenot-dev wrote:
         | A MITM proxy isn't specific to any app, it's a forward proxy
         | for your outgoing network connection. In case of an Android app
         | you'd need to run mitmproxy on a machine in your network and
         | set up the connection as a proxy in your Android's network
         | settings. Then you'd need to follow http://mitm.it to install
         | mitmproxy's root certificate on the Android device (to trust the
         | connection with TLS) and off you go.
         | 
         | EDIT: or rather follow the docs[0]
         | 
         | [0]: https://docs.mitmproxy.org/stable/howto-install-system-
         | trust...
        
         | tecleandor wrote:
         | I've used this specific tool to help me reverse engineer the
         | private API of an Android App.
         | 
         | The thing is, depending on how hardened the app is, you'll have
         | to play with Android to allow this interception, mostly because
         | of certificate pinning. Also I remember something about apps
         | not using the system wide trusted certificates you install
         | (IIRC).
         | 
         | I remember using a rooted device with LineageOS, and
         | downloading the APK and modifying it with a tool so the self
         | signed certificate for the mitm proxy works with it.
         | 
         | The mitm proxy docs have some links to tools that can do that
         | [0] and you could also use an Android emulator if you don't
         | have an extra phone to mess with it [1]
         | 
         | [0]: https://docs.mitmproxy.org/stable/concepts-certificates/
         | [1]: https://docs.mitmproxy.org/stable/howto-install-system-trusted-ca-android/
        
         | jazz9k wrote:
         | I use Burp Suite combined with Frida (which can remove root
         | check and override SSL pinning).
        
           | nsteel wrote:
           | Yes, this. The Frida tools method to remove cert pinning is
           | the only method that has worked for me. The mitmproxy docs
           | for android (as referred to by another commenter) didn't work
           | for any apps I tried.
        
       | construct0 wrote:
       | Yeah - does this get nullabilities right?
        
       | tecleandor wrote:
       | I've used this tool in the past with success. Not perfect but it
       | accelerates the work greatly if you can launch a mitm proxy
       | quickly and are familiar with the tool.
       | 
       | I've been fighting lately with an API, though. It's not very,
       | let's say, RESTy. It has only one endpoint, and the different
       | "sections" of the API are defined in parameters, so
       | MitmProxy2Swagger doesn't detect them properly :(
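For a single-endpoint API like the one described, one possible workaround is to group the captured requests by the parameter that selects the "section" and treat each group as its own operation. A minimal sketch, assuming the discriminator is a query parameter; the name `section` and the URLs are hypothetical, and this is not a feature of MitmProxy2Swagger.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlparse


def group_by_param(urls, param="section"):
    """Group URLs by one query parameter; list the other parameters seen per group."""
    groups = defaultdict(set)
    for url in urls:
        query = parse_qs(urlparse(url).query)
        # Requests lacking the discriminator land in a "<missing>" bucket.
        key = query.get(param, ["<missing>"])[0]
        groups[key].update(k for k in query if k != param)
    return {key: sorted(names) for key, names in groups.items()}


# Hypothetical capture from a one-endpoint API.
captured = [
    "https://api.example.com/rpc?section=profile&user=1",
    "https://api.example.com/rpc?section=profile&user=2&lang=en",
    "https://api.example.com/rpc?section=inventory&page=3",
]
operations = group_by_param(captured)
```

Each key in `operations` can then be documented as a separate pseudo-endpoint, even though every request goes to the same path.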
        
         | nejsjsjsbsb wrote:
         | Nothing is RESTy
        
         | quectophoton wrote:
         | > It's not very, let's say, RESTy. It has only one endpoint,
         | 
         | To be fair, from what I understand an actual(tm) REST API would
         | only have a single defined endpoint[1]: the entry point. With
         | every other endpoint being discovered from the responses. And
         | also from your message I'm guessing a URI still uniquely
         | identifies a resource (specifically through the "query" part of
         | the URI, instead of the more common "path").
         | 
         | So, _technically_, assuming there's nothing too weird with
         | that API, it seems like MitmProxy2Swagger is failing to detect
         | a REST API.
         | 
         | [1]: Corollary: If an API is RESTful, it should be possible to
         | rename any endpoint (except the entry point) at any moment in
         | time without prior notice, and clients would not break as long
         | as the response types/schemas are still supported by the
         | clients. In-flight requests might fail with a 4xx, but after a
         | retry they should go to the correct endpoint without any code
         | change required.
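The corollary can be illustrated with a toy client that only knows the entry point and the names of link relations, never the URLs themselves. This is a sketch over in-memory dictionaries standing in for a server; the `_links` key follows a HAL-like convention, and all paths and payloads are made up.

```python
def follow(server, entry_point, relations):
    """Start at the entry point and follow named link relations in order."""
    doc = server[entry_point]
    for rel in relations:
        # The client reads the next URL from the response, not from its own code.
        doc = server[doc["_links"][rel]]
    return doc


# Two hypothetical deployments: same relations, different URLs.
SERVER_V1 = {
    "/": {"_links": {"orders": "/orders"}},
    "/orders": {"_links": {"latest": "/orders/981"}},
    "/orders/981": {"id": 981, "status": "shipped"},
}
SERVER_V2 = {
    "/": {"_links": {"orders": "/v2/o"}},
    "/v2/o": {"_links": {"latest": "/v2/o/981"}},
    "/v2/o/981": {"id": 981, "status": "shipped"},
}

before = follow(SERVER_V1, "/", ["orders", "latest"])
after = follow(SERVER_V2, "/", ["orders", "latest"])
```

Every endpoint except the entry point was renamed between the two deployments, yet the same client call reaches the same resource.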
        
           | zdragnar wrote:
           | This is HATEOAS, basically the core feature of REST that very
           | few people actually use. Most of what the industry calls REST
           | or RESTful is just structured and inefficient RPC.
        
           | pests wrote:
           | I don't think anyone has ever used REST in the way you are
           | using it - the sibling comment points out that HATEOAS is
           | probably what you mean - this generally embeds links to all
           | resources, full data navigation, next/prev links, and so on.
           | It is true that a proper HATEOAS client should be able to
           | navigate an endpoint completely with just a starting address.
        
       | srameshc wrote:
       | Obvious question: How to protect against this?
        
         | smallnix wrote:
         | What specifically do you want to protect?
        
         | K0nserv wrote:
         | Your first line of defence should be a secure API where an
         | attacker doesn't gain anything by knowing it.
         | 
         | You can add obfuscation, but ultimately if the client is
         | shipped to the user you must assume an attacker can reverse
         | engineer it.
        
         | mathgeek wrote:
         | Build your API assuming anything public facing will be known.
         | This includes anything downloaded to a device.
        
         | bandrami wrote:
         | I find this confusing because the point of an API is to be
         | known, yes? Otherwise who's accessing it?
        
           | quesera wrote:
           | It's a valid desire, but you have to be really dedicated to
           | the effort to block it, in practice.
           | 
           | You might intend your API to be consumed only by your own
           | clients. E.g. your published mobile apps.
           | 
           | A well-designed API won't allow a third-party client to do
           | anything that your own client wouldn't allow of course.
           | Permissions are always enforced on the back end.
           | 
           | But there are many cases where a user might want a
           | custom/different client:
           | 
           | If your mobile apps are not awesome, or if they deprioritize
           | a specific use case, _or if they serve ads_ ... or even if
           | your users want to automate some action in your service...
           | 
           | If your service is popular enough (or you attract a certain
           | kind of user), you will have some people building their own
           | clients.
        
             | bandrami wrote:
             | Those sound like bad use cases for a client-server model
             | with public endpoints, then? I mean, you could cert-pin
             | yourself in the client app, I guess.
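For reference, the core of certificate pinning is a comparison between the fingerprint of the certificate the server presents and a fingerprint shipped inside the client. A minimal sketch of that check follows; the byte strings are placeholders rather than real certificates, and a real client would obtain the DER bytes from its TLS library (in Python, for example, via `ssl.SSLSocket.getpeercert(binary_form=True)`).

```python
import hashlib
import hmac


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pinned_fingerprints: set[str]) -> bool:
    """Return True if the presented certificate matches any pinned fingerprint."""
    actual = fingerprint(der_cert)
    return any(hmac.compare_digest(actual, pin) for pin in pinned_fingerprints)


# Placeholder bytes standing in for certificates (assumption, not real DER data).
real_cert = b"placeholder for the server's DER certificate"
proxy_cert = b"placeholder for a certificate issued by an intercepting proxy"

pins = {fingerprint(real_cert)}
```

A certificate minted by an intercepting proxy has a different fingerprint and fails this check, which is why the thread's Android workarounds involve patching the app or hooking it with Frida.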
        
           | kube-system wrote:
           | Not necessarily. A common pattern is to build a 'private API'
           | intended to be used by one's own front-end applications. For
           | example: most client-rendered applications, like the Airbnb
           | example on this page.
        
       | mkagenius wrote:
       | If only someone could automate[1] the clicking and navigating
       | part by writing in plaintext something like "Open airbnb and
       | explore all the features as much as possible" :)
       | 
       | 1. https://github.com/BandarLabs/clickclickclick - It does that
       | and I am one of the authors.
        
       ___________________________________________________________________
       (page generated 2025-01-02 23:00 UTC)