[HN Gopher] Show HN: Mitmproxy2swagger - Automagically reverse-e...
       ___________________________________________________________________
        
       Show HN: Mitmproxy2swagger - Automagically reverse-engineer REST
       APIs
        
       Author : alufers
       Score  : 645 points
       Date   : 2022-05-12 13:49 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jwong_ wrote:
       | Really neat! Gives me an idea on using something like this to
       | generate e.g., CURL commands to mimic SSO flows.
       | 
       | Even just documenting an SSO flow as a diagram would be quite
       | neat.
        
         | john-tells-all wrote:
         | Note that for single resources, Chrome/Edge can do this now.
         | There's a semi-hidden "copy this resource as Curl" option:
         | 
         | https://everything.curl.dev/usingcurl/copyas#:~:text=From%20...
         | .
         | 
         | When it works, it's effing magic! Spectacular for very quickly
         | knocking out Bash scripts that test multiple APIs.
        
       | a-dub wrote:
       | lol!
       | 
       | step 2: features for training a language model on the request and
       | response variables in the mitm stream and a shim for standing up
       | a fully ml data driven zero code mock backend.
        
       | POPOSYS wrote:
       | Can we have this as a browser dev tool please? F12 -> Tab REST ->
       | Create spec from API
        
       | h1fra wrote:
       | very nice !
        
       | ducktective wrote:
       | Is it possible to do this on wireshark/tcpdump pcap dumps? Like
       | for finding out hostnames, endpoints and request packets of HTTPS
       | requests that an android app is making?
        
         | alufers wrote:
         | The problem with pcap is that the requests there would be
         | encrypted and basically there is no way to practically decrypt
         | them.
         | 
         | Mitmproxy solves that by being between the client and server
         | and injecting its own self-signed certificate (which you need
         | to add to the trusted certificates on the phone, which requires
         | root).
        
           | resoluteteeth wrote:
           | See SSLKEYLOGFILE
        
             | ducktective wrote:
             | Explain a bit more, please? Do you mean root is not needed?
             | Isn't that a curl feature?
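
On the SSLKEYLOGFILE pointer: it is an environment variable naming a file to which some TLS clients (curl, Firefox and Chrome among them) write per-session secrets, so that Wireshark can decrypt a pcap captured alongside. It only helps when the client's TLS library honors it, which an arbitrary Android app may not. A minimal Python sketch of the same mechanism, assuming Python 3.8+ built against OpenSSL 1.1.1 or newer; the file name and the helper name are made up for illustration:

```python
import os
import ssl
import tempfile


def make_logging_context(keylog_path):
    """Return a client TLS context that appends session secrets to keylog_path."""
    context = ssl.create_default_context()
    # keylog_filename is the programmatic counterpart of SSLKEYLOGFILE.
    # Secrets are written in the NSS key log format that Wireshark reads.
    context.keylog_filename = keylog_path
    return context


# Hypothetical location for the key log; any writable path works.
keylog_path = os.path.join(tempfile.mkdtemp(), "tls-keys.log")
context = make_logging_context(keylog_path)
```

Connections made with this context would log their secrets to that file, which can then be given to Wireshark in its TLS protocol preferences.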
        
       | julianlam wrote:
       | This is great work!
       | 
       | This would come in very handy for codebases where an OpenAPI v3
       | spec would be welcome, but is too onerous to create by hand. Run
       | this for a bit, have it spit out a nearly complete spec, and
       | tweak it a bit to output the final product.
       | 
       | In fact, it is precisely what we did to generate the OpenAPI docs
       | for NodeBB [1]. We had an undocumented API that we turned into an
       | OpenAPI v3 file.
       | 
       | [1] https://docs.nodebb.org/api/read
        
       | mutant wrote:
       | This is absolutely phenomenal!
        
       | mro_name wrote:
       | awesome take
        
       | dnnssl2 wrote:
       | Starred. Does this work with non-emulated iOS or Android http
       | calls in which you may need to disable app level security?
        
         | jeroenhd wrote:
         | For Android you'll probably need root access (unless the app
         | developer has opted in to loading your user-imported
         | certificate authorities). For iOS this should be easier.
         | 
         | However, many apps apply cert pinning in production builds,
         | which will require tools like Frida to disable them, which in
         | turn requires root access/a jailbreak to function.
         | 
         | Alternatively, you could pull the apps from your phone without
         | root (at least on Android), patch the most obvious cert pinning
         | out (usually in the network manifest file) and install the new
         | version.
        
       | Sytten wrote:
       | Super nice! We might integrate something similar in Caido proxy.
        
       | aleksiy123 wrote:
       | Really awesome, I tried my hand at writing something similar and
       | was surprised at how well it actually ended up working.
       | 
       | I feel like the next step is automatically generating load tests
       | and/or fuzzing tests. Felt like that could be a real product.
        
         | ludovicianul wrote:
         | Here you go: https://github.com/Endava/cats
        
       | Labo333 wrote:
       | Very nice!
       | 
       | On the same note, I wrote a program to generate Python code
       | (requests) from a HAR capture:
       | https://github.com/louisabraham/har2requests
       | 
       | I think using HAR captures is simpler for the end user than
       | spawning mitmproxy as they don't require any installation and are
       | extracted from the network tab of the browser devtools. Is there
       | a reason why you didn't use them?
       | 
       | EDIT: I realized that mitmproxy can also get traffic from other
       | devices like phones. Very cool project, I will think about
       | modifying mine to support mitmproxy captures!
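
For readers comparing the two capture formats: a HAR file is JSON with the requests under `log.entries`, so listing the endpoints it contains takes little code. The following is a rough sketch, not how har2requests or mitmproxy2swagger work internally; the sample capture and the `list_endpoints` name are invented for illustration:

```python
from urllib.parse import urlsplit


def list_endpoints(har):
    """Return unique (method, path) pairs in first-seen order."""
    seen = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        # Drop the query string so ?page=1 and ?page=2 count as one endpoint.
        pair = (request["method"], urlsplit(request["url"]).path)
        if pair not in seen:
            seen.append(pair)
    return seen


# Hypothetical capture, shaped like a browser devtools HAR export.
sample_har = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "https://example.com/api/users?page=1"}},
            {"request": {"method": "GET", "url": "https://example.com/api/users?page=2"}},
            {"request": {"method": "POST", "url": "https://example.com/api/users"}},
        ]
    }
}

endpoints = list_endpoints(sample_har)
```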
        
         | olabyne wrote:
         | Oh, I used a python script to generate pre-made requests from
         | HAR recently, I'm pretty sure it was your git ! Very useful :)
        
       | klyr wrote:
       | Hi, I would also like to add another tool I'm contributing to at
       | work (cisco) called APIClarity [1]. It aims at reconstructing
       | swagger specifications of REST microservices running in K8S, but
       | can also be run locally.
       | 
       | This is a challenging task and we don't support OpenAPI v3 specs
       | yet (we are working on it).
       | 
       | Feel free to have a look, and get ideas from it :)
       | 
       | We'll also be presenting it at next Kubecon 2022.
       | 
       | [1]: https://github.com/openclarity/apiclarity
        
         | sohaibtariq wrote:
         | Try out https://www.apimatic.io/transformer/ for converting
         | Swagger Specs to OpenAPI
        
       | SemanticStrengh wrote:
       | Can this be used to generate REST documentation for your own
       | frontend just by interacting with it? This should be augmented
       | via a crawler that clicks every clickable element recursively.
        
         | alufers wrote:
         | Totally, but you would need to do some manual cleanup and
         | naming afterwards to make it more useful than just reading the
         | source code. You could also for example use your integration
         | tests if you have some to capture as many routes as possible.
        
           | SemanticStrengh wrote:
           | of course the generated doc should be refined (e.g. filling
           | missing types, error codes) but your lib would save us a lot
           | of work and make the world a better place.
        
             | tomatowurst wrote:
             | _"...and we expect it to be free and open source as our
             | budget for this is zero. "_
        
               | SemanticStrengh wrote:
               | The relationship between actual utility/value and price
               | is only vaguely correlated. Many of the most useful
               | things on earth can't be marketed, not because they're
               | not worth the money but because people are extremely
               | greedy for some kinds of domains and simultaneously are
               | bad at realizing the impact on their lives. E.g. I have
               | never spent a single dollar to access music despite it being
               | one of the few things in life that brings me intense joy.
        
               | tomatowurst wrote:
               | It's vaguely correlated because you don't value the work
               | of others in general. This means that at some point in
               | your life, others did not value your work and showed you
               | that was perfectly acceptable.
        
               | motoxpro wrote:
               | I'm glad I can subsidize your music hobby and that you
               | feel no sense of guilt for not supporting the people who
               | "bring you intense joy"
        
       | useful wrote:
       | bravo, I've wanted something like this
        
       | Divyeshkharade wrote:
       | This looks amazing. Will it also capture data types like
       | enumerators by somehow detecting patterns?
        
         | alufers wrote:
         | I thought about it, but it would be hard to distinguish between
         | an enumerator and just static data. For example if you logged
         | in with only one account it could classify the "username" field
         | as an enumeration, because there is only one captured value.
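
The difficulty described above can be made concrete with a small heuristic. This is only a sketch of one possible approach, not something the tool does; the thresholds `min_samples` and `max_distinct` and the sample data are arbitrary assumptions:

```python
from collections import defaultdict


def enum_candidates(records, min_samples=20, max_distinct=5):
    """Guess which string fields look like enumerations."""
    values = defaultdict(list)
    for record in records:
        for field, value in record.items():
            if isinstance(value, str):
                values[field].append(value)

    candidates = {}
    for field, seen in values.items():
        distinct = sorted(set(seen))
        # A single observed value proves nothing: it may be a constant or
        # just under-sampled data (the one-account "username" case).
        if len(seen) >= min_samples and 2 <= len(distinct) <= max_distinct:
            candidates[field] = distinct
    return candidates


# Hypothetical capture: one logged-in account, three status values.
records = [
    {"username": "alice", "status": status}
    for status in ["active", "banned", "pending"] * 10
]
result = enum_candidates(records)
```

Here `status` is reported as a candidate while `username` is not, but a field with a few distinct free-text values would still be misclassified, which is the problem raised above.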
        
           | freedomben wrote:
           | Yeah I imagine that is nearly impossible without capturing
           | data at scale. Awesome tool! I'm super grateful :-)
        
       | difu_disciple wrote:
       | This is fantastic. Thank you
        
       | alufers wrote:
       | Wanted to show off my little project which helps with reverse
       | engineering APIs used by various apps. It takes HTTP traffic
       | captured by mitmproxy and generates an OpenAPI specification for
       | a given REST API.
       | 
       | I have used it already on two apps and the results are good
       | enough to write an alternative client or quickly automate some
       | stuff.
        
         | upupandup wrote:
         | does it capture route/server rendered pages too?
        
           | alufers wrote:
           | It does, but it will only generate schema descriptions for
           | JSON endpoints. This means that the URL and method will
           | appear in the spec, but not the response/request schema.
        
         | ludovicianul wrote:
         | This is great :) You can then fuzz your APIs for issues using
         | https://github.com/Endava/cats.
        
         | mhils wrote:
         | mitmproxy dev here, very awesome! :) This seems to be
         | particularly useful to quickly generate clients for reverse-
         | engineered APIs.
        
           | mohsen1 wrote:
           | Swagger Editor dev who now works at Airbnb here. This is
           | hilarious!
        
             | SOLAR_FIELDS wrote:
             | Hilarious indeed! The first thing I thought of with this
             | project is actually AirBnB, because the sort/filter/map
             | view is so terrible and missing features. AirBnB captures
             | data on a bunch of stuff, but doesn't make it possible to
             | search for in the UI (ever want a property with a lake view
             | or a sauna? AirBnB knows which ones have those things, but
             | they won't let you look for them!)
             | 
             | AirBnB doesn't have an official API but changes the tags so
             | often that scrapers people put up on Github go out of date
             | quickly. Now I can run this whenever I want to have actual
             | search functionality (instead of the hobbled crap available
             | on the website) and ensure that whatever flavor of API is
             | available on the website that day is easily queryable!
        
               | metadat wrote:
               | How will this let you search for a sauna?
        
               | SOLAR_FIELDS wrote:
               | Easier to modify requests vs doing it using browser
               | tools. The ability to search for the things I mentioned
               | is actually there, but only via an undocumented url
               | parameter that erases itself every time you pan the map.
               | Doing it via REST calls is much easier than trying to do
               | it in the UI.
        
         | lancebeet wrote:
         | This is a really clever project. It seems like an obvious idea
         | once you've seen it, but it clearly isn't. Thank you for
         | sharing it.
        
         | anitil wrote:
         | What a fantastic idea! I have so many half baked things that
         | some idiot (me) built without documenting the underlying API.
         | This will make life so much easier
        
       | captn3m0 wrote:
       | Almost exactly a fit against my idea[1] to generate OpenAPI from
       | HAR files. Going to read through to see if I can add HAR support.
       | 
       | [1]: https://github.com/captn3m0/ideas#openapi-specification-
       | gene...
        
         | efitz wrote:
         | OpenAPI is just the latest version of swagger. Should not be
         | hard to change.
         | 
         | I was able to translate HAR to OpenAPI with this web site's
         | free preview: https://www.apimatic.io/transformer/
         | 
         | I also see others are working on the same thing:
         | https://github.com/dcarr178/har2openapi
        
           | kaidon wrote:
           | Also https://github.com/anbuksv/avantation
        
       | instagary wrote:
       | How did you bypass cert pinning in the video for the Airbnb app?
        
         | alufers wrote:
         | I didn't, just added a self-signed cert to my keychain on macOS
         | and launched the app as downloaded from App Store.
         | 
         | I guess Airbnb doesn't use cert pinning.
        
         | paxys wrote:
         | It doesn't have anything to do with mobile. The web client uses
         | the same APIs.
        
       | BWStearns wrote:
       | This is fantastic!
        
       | efitz wrote:
       | This is awesome; I'm going to try it as soon as I get back to my
       | desk. I've been working on trying to glue together tools to
       | translate Charles proxy output to OpenAPI (swagger). I think it
       | would be a great tool to have in a web app reverse engineering
       | toolbox.
        
       | eligro91 wrote:
       | Really amazing.
       | 
       | We're having hundreds of undocumented endpoints created over the
       | years, and running this tool on our backends will create
       | instantly good documentation
       | 
       | Thanks for that! Will give feedback if there are any issues.
        
       | Cilvic wrote:
       | The question is maybe a bit off-topic and vague. That's because I
       | struggle to express it with the right terms:
       | 
       | I'm looking for a generic tool to build and then serve:
       | 
       | - Accept incoming request (API contract A)
       | 
       | - Send outgoing request (API contract B), potentially with
       | parameters from the incoming request
       | 
       | - Receive incoming response (API contract B)
       | 
       | - Do some translations/string manipulation
       | 
       | - Send outgoing response (API contract A)
        
         | jeroenhd wrote:
         | mitmproxy (https://mitmproxy.org/) has scripting support that
         | will let you do most of this.
         | 
         | For example, you can expose mitmproxy, listen to HTTP requests
         | for a specific host (using this API:
         | https://docs.mitmproxy.org/stable/api/mitmproxy/http.html),
         | intercept the request, do whatever API calls you need, and
         | inject a response without ever forwarding the request to the
         | original server.
         | 
         | Alternatively, you could modify the request and then change the
         | request destination, like in this example here:
         | https://docs.mitmproxy.org/stable/addons-examples/#http-
         | redi.... Using the WSGI support, you could even use normal
         | Python annotations to build your own API without doing too much
         | pattern matching: https://docs.mitmproxy.org/stable/addons-
         | examples/#wsgi-flas...
        
           | Cilvic wrote:
           | Ok. This sounds great for easy developing. But when I'm
           | hosting this I'm not a mitmproxy. I want to act like a normal
           | server/endpoint for API A.
        
             | jeroenhd wrote:
             | I don't know any libraries for this in any good backend
             | languages, but I've worked with these packages in NodeJS to
             | do something like that:
             | 
             | - https://www.npmjs.com/package/http-proxy
             | 
             | - https://www.npmjs.com/package/connect
             | 
             | - https://www.npmjs.com/package/harmon
             | 
             | If you don't want to act like a proxy, you're going to
             | approach this like a normal web application that does HTTP
             | requests using whatever HTTP client your framework of
             | choice uses.
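
The "normal web application" approach boils down to two translation steps around one outgoing HTTP call. A minimal sketch of just those steps as pure functions follows; both contracts, the upstream URL and every field name are hypothetical, and the web framework and HTTP client around them are deliberately left out:

```python
from urllib.parse import urlencode


def to_contract_b(request_a):
    """Build the upstream (contract B) URL from an incoming contract A request."""
    query = urlencode({"q": request_a["search"], "limit": request_a.get("max", 10)})
    # Hypothetical upstream host; .invalid never resolves.
    return "https://b.example.invalid/v2/items?" + query


def to_contract_a(response_b):
    """Reshape a contract B response body into the contract A format."""
    # The "string manipulation" step: here, simply upper-casing names.
    return {"results": [item["name"].upper() for item in response_b["items"]]}


upstream_url = to_contract_b({"search": "lake view", "max": 3})
reply = to_contract_a({"items": [{"name": "cabin"}, {"name": "sauna"}]})
```

In a real service, the request handler would call `to_contract_b`, fetch the resulting URL with its HTTP client, and return `to_contract_a` of the parsed response.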
        
       | dsfiguer wrote:
       | Oh I love this so much! This would help me with scraping certain
       | sites.
        
       | chrisweekly wrote:
       | Awesome idea! Thank you for creating and sharing!
        
       | andrewstuart2 wrote:
       | I've always wanted to build something similar to this, by reading
       | HAR files captured right out of the devtools. Have you given any
       | thought to that as an alternative input?
        
       | dudus wrote:
       | This is a great idea. Kudos.
        
       | jeroenhd wrote:
       | Very interesting! Would this also be able to determine what kind
       | of auth (header tokens, cookies, etc) the APIs require or is that
       | something you still need to detect manually?
        
         | alufers wrote:
         | At this point you still need to detect it manually, but I am
         | working on adding this.
        
       | oneweekwonder wrote:
       | little bit off-topic, but does anybody know of something similar
       | for soap/wsdl? I'm aware of soapui mock service.
        
         | alufers wrote:
         | Doesn't wsdl just expose the schema on the server?
        
           | efitz wrote:
           | WSDL and OpenAPI/Swagger solve similar problems.
           | 
           | Roughly speaking: WSDL is to XML web services as OpenAPI is
           | to REST
           | 
           | They both model the API and message structure of an API.
           | AFAICT WSDL goes a little farther in that you can declare
           | message sequences (I might be giving short shrift to OpenAPI
           | here).
        
             | flatiron wrote:
             | Short of "this requires oauth" I think you are right about
             | openapi
        
       | andrewstuart wrote:
       | Be interesting to run a fuzzer on the API whilst doing this.
        
       | upupandup wrote:
       | this is absolutely insane!!! I understand capturing the REST api
       | network part, is it then examining the request body, headers
       | being sent back and forth to figure out the API?
        
         | alufers wrote:
         | Yes, this is basically what this program does.
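
To illustrate the general idea of deriving a schema from an observed body: the sketch below maps one parsed JSON value to an OpenAPI-style schema fragment. It is an assumption about how such inference can work, not the tool's actual code, and the sample payload is invented:

```python
import json


def infer_schema(value):
    """Map a parsed JSON value to an OpenAPI-style schema fragment."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        # The type of a null cannot be known from one sample.
        return {"nullable": True}
    if isinstance(value, list):
        # Simplification: only the first element is inspected.
        return {"type": "array", "items": infer_schema(value[0]) if value else {}}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
        }
    raise TypeError("not a JSON value: %r" % (value,))


# Hypothetical response body captured from some endpoint.
body = json.loads('{"id": 7, "name": "Lake house", "tags": ["sauna"], "price": 120.5}')
schema = infer_schema(body)
```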
        
           | ninkendo wrote:
           | From what I understand it's also somewhat how JIT works in
           | various JavaScript engines: observe the sorts of objects
           | (which naively have the performance characteristics of hash
           | tables) you see, and start defining static offsets for fields
           | you observed. The JIT'd (fast) objects may morph over time as
           | new fields are observed, but I'd imagine it's a similar idea
           | to creating documentation... "this object tends to have these
           | fields, so just pretend those are the only fields it can
           | have, until another request proves otherwise", with similar
           | guess/checking for their types/etc.
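
The "pretend those are the only fields until another request proves otherwise" idea can be sketched as a running record of which fields were seen. This is an illustration of the analogy only, with a hypothetical `ObservedShape` class; it says nothing about how a JavaScript engine or the tool implements it:

```python
class ObservedShape:
    """Track which fields appear across observed objects of one kind."""

    def __init__(self):
        self.samples = 0
        self.field_counts = {}

    def observe(self, obj):
        """Record one more observed object."""
        self.samples += 1
        for field in obj:
            self.field_counts[field] = self.field_counts.get(field, 0) + 1

    def required(self):
        """Fields present in every sample so far."""
        return sorted(f for f, n in self.field_counts.items() if n == self.samples)

    def optional(self):
        """Fields seen in some samples but missing from others."""
        return sorted(f for f, n in self.field_counts.items() if n < self.samples)


shape = ObservedShape()
shape.observe({"id": 1, "name": "a"})
first_guess = shape.required()
# A later request proves the first guess incomplete.
shape.observe({"id": 2, "name": "b", "nickname": "bee"})
```

After the second observation the shape still treats `id` and `name` as required, and `nickname` shows up as optional.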
        
       | renewiltord wrote:
       | This is great. Good example too since Airbnb could do with some
       | improvement to the user chrome: include cleaning fees, etc
        
       | thefilmore wrote:
       | This is one of the most clever projects I've seen in a while.
       | Nice work.
        
       | nickysielicki wrote:
       | This is really incredible. With a rooted android phone and these
       | tools, plus a couple others [1,2,3], you can get a skeleton to
       | implement a backend for any app you want.
       | 
       | [1]: https://github.com/koxudaxi/fastapi-code-generator
       | 
       | [2]: https://github.com/ioxiocom/openapi-to-fastapi
       | 
       | [3]: https://infosecwriteups.com/hail-frida-the-universal-ssl-
       | pin...
        
         | [deleted]
        
         | andreidd wrote:
         | That's interesting, but it won't work with native code that
         | statically links a SSL implementation.
        
           | jeroenhd wrote:
           | In many applications you can bypass built-in verifications
           | with some Frida [1] code. It requires more effort to do so,
           | of course, as you'd need to find the OpenSSL methods (with a
           | script like this [2]) and bypass the verification in there.
           | 
           | If you're really intent on getting it to work, downloading
           | the binary, patching out the verification function and
           | putting it back is also possible if you're root.
           | 
           | [1]: https://frida.re/docs/android/
           | 
           | [2]: https://mobsecguys.medium.com/exploring-native-
           | functions-wit...
        
       ___________________________________________________________________
       (page generated 2022-05-13 23:03 UTC)