hngopher.com

       [HN Gopher] Show HN: Auto-generate an OpenAPI spec by listening ...
       ___________________________________________________________________
        
       Show HN: Auto-generate an OpenAPI spec by listening to localhost
        
       Hey HN! We've developed OpenAPI AutoSpec, a tool for automatically
       generating OpenAPI specifications from localhost network traffic.
       It's designed to simplify the creation of API documentation by just
       using your website or service, especially useful when you're
       pressed for time.

       Documenting endpoints one by one sucks. This project originated
       from us needing it at our past jobs when building 3rd-party
       integrations.

       It acts as a local server proxy that listens to your application's
       HTTP traffic and automatically translates this into OpenAPI 3.0
       specs, documenting endpoints, requests, and responses without much
       effort.

       Installation is straightforward with NPM, and starting the server
       only requires a few command-line arguments to specify how and where
       you want your documentation generated, e.g.:

           npx autospec --portTo PORT --portFrom PORT --filePath openapi.json

       It's designed to work with any local website or application
       without extensive setup or interference with your existing code,
       making it flexible for different frameworks. We tried capturing
       network traffic with a Chrome extension, but it didn't give us the
       full picture of backend and frontend interactions.

       In future updates we aim to add features like HTTPS and OpenAPI
       3.1 specification support.

       For more details and to get started, visit our GitHub page
       (https://github.com/Adawg4/openapi-autospec). We also have a
       Discord community (https://discord.com/invite/CRnxg7uduH) for
       support and discussions around using OpenAPI AutoSpec effectively.

       We're excited to hear what you all think!
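
To make "translating traffic into a spec" concrete, here is a minimal sketch of how one observed JSON response could be turned into an OpenAPI-style fragment. It is not taken from AutoSpec's source; the function names `infer_schema` and `operation_from_exchange` are hypothetical, and the inference rules (first array element only, no merging across requests) are simplifying assumptions.

```python
import json


def infer_schema(value):
    """Return a minimal schema description for one decoded JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        # OpenAPI 3.0 has no "null" type; it uses the `nullable` keyword.
        return {"nullable": True}
    if isinstance(value, list):
        # Assumption: the first element is representative of the whole array.
        return {"type": "array", "items": infer_schema(value[0]) if value else {}}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
        }
    raise TypeError(f"not a JSON value: {value!r}")


def operation_from_exchange(method, path, status, body_text):
    """Build a `paths` fragment from a single observed request/response pair."""
    schema = infer_schema(json.loads(body_text))
    return {
        path: {
            method.lower(): {
                "responses": {
                    str(status): {
                        "description": "Observed response",
                        "content": {"application/json": {"schema": schema}},
                    }
                }
            }
        }
    }
```

A real tool would also need to merge many exchanges per endpoint and recognise path parameters, which this sketch leaves out.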
        
       Author : adawg4
       Score  : 92 points
       Date   : 2024-03-25 15:49 UTC (7 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | 0cf8612b2e1e wrote:
       | Could this work with previously saved traffic dumps?
        
         | leoqa wrote:
         | Just replay into nc?
        
       | DidYaWipe wrote:
       | This is probably cool and useful, but there's no way to know how
       | much of the API you're covering, right?
       | 
       | I find that the real shortage of tools exists going the other
       | way: from OpenAPI to code. The ecosystem has proven to be a huge
       | disappointment there, comprising a janky collection of defective
       | tools that still (years later) don't support version 3.1 (a
       | critical update).
        
         | ivan888 wrote:
          | I think going from code to OpenAPI makes a lot more sense, at
          | least for strongly typed languages. And even if not directly
          | translated from code, at least closer to the actual code, in
          | annotations or something. Generating the spec from code removes
          | a step: you simply update the code, rather than updating the
          | spec and then updating the code.
        
       | spxneo wrote:
        | I think pairing this tool with something that recursively clicks
        | through the app would be insanely helpful. (The latter is what I
        | have trouble finding.)
        
         | adawg4 wrote:
         | Open to ideas! We're thinking of adding agents/crawler
         | suggestions to the github if there's a package that clicks
         | around in that fashion
        
           | samstave wrote:
           | Forgive the naive question, but to pair with the GP,
           | thoughts:
           | 
            | 1. Wouldn't this also be helpful in understanding the exact
            | nature of all traffic/calls against a particular page or user
            | workflow moving through your site, from a UX perspective?
            | 
            | 2. Could one make a proxy from this on a local home egress
            | such that you could see the nature of outbound network
            | traffic to sites you visit (more importantly, traffic heading
            | to 3rd-party trackers'/cookies' APIs via your site visits)?
            | 
            | 3. Could it be used to nefariously map open API endpoints
            | against a system one is (whiteHat) pen testing?
        
         | cipherself wrote:
         | One slightly related thing you can do is to test the API with
         | schemathesis[0]
         | 
         | [0] https://github.com/schemathesis/schemathesis
        
         | knicholes wrote:
         | This seems pretty simple to me to do. Search the html of the
         | main page for anchor tags. Add the links in those tags to an
         | array as your exploration frontier. Once done parsing that
         | html, load the next link. Add deduplication to avoid loops and
         | just run a depth-first search. What am I missing?
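
The approach described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch` is a hypothetical callable that returns the HTML for a URL (so the sketch runs without a network), and only `<a href>` links are followed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit pages reachable through anchor tags, depth-first, without repeats.

    `fetch` is an assumed helper: any callable mapping a URL to its HTML text.
    """
    visited = []            # pages in the order they were loaded
    seen = {start_url}      # deduplication set, prevents loops
    frontier = [start_url]  # used as a stack, which gives depth-first order
    while frontier:
        url = frontier.pop()
        visited.append(url)
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return visited
```

Anything that is not an anchor with an `href` (buttons, forms, elements with script handlers) is invisible to this traversal.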
        
           | dns_snek wrote:
           | In many web apps there are going to be buttons and links that
           | are not represented as <a>. You would realistically have to
           | enumerate everything that has any kind of event handler
           | attached since it could potentially trigger an API call.
           | 
           | You would also have to fill and submit forms with valid and
           | invalid data. You would have to toggle checkboxes, change
            | radio buttons, click buttons (e.g. "Apply filters" after
           | changing values in a product filter section), and generally
           | go through many combinations of inputs to find all valid
           | parameters and possible responses.
        
           | somethingAlex wrote:
           | For brochure / static content sites this is definitely the
           | beginnings of a web crawler but it can be a lot trickier for
           | web apps.
           | 
           | For example, clicking a link which loads some data, then
           | clicking edit (which isn't even an anchor), typing in &
           | clicking stuff, then clicking the save button (don't click
           | the cancel button!) would not be an interaction that would
           | get picked up with your suggestion. Detecting loops becomes
           | much more ambiguous and backtracking to get all the
           | permutations of interactions becomes a whole other problem to
           | solve.
        
       | hnrodey wrote:
       | I thought I recalled something similar to this posted in the
       | past.
       | 
       | https://news.ycombinator.com/item?id=38012032
       | 
       | (5 months ago)
        
         | adawg4 wrote:
         | Love what Andrew is doing! We built this for localhost when
         | testing vs. the client side with a Chrome extension.
        
       | nattaylor wrote:
       | Reminds me of https://github.com/alufers/mitmproxy2swagger which
       | I discovered from this thread
       | https://news.ycombinator.com/item?id=31354130
       | 
       | I generated some specs from that!
       | 
       | I ran into trouble keeping them up to date.
        
         | adawg4 wrote:
         | Super curious - How did you try keeping them up to date?
        
       | soneil wrote:
        | This looks interesting - it seems like it wouldn't be a huge lift
        | to turn this into something that compares against an existing
        | spec too. Create an AutoSpec, take your defined spec, and spot
        | the difference.
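
A comparison along these lines could start as a simple set difference over (method, path) pairs. The sketch below assumes both specs are already loaded as Python dicts in OpenAPI shape; `operations` and `compare_specs` are hypothetical names, not part of AutoSpec.

```python
# Keys of a Path Item that are HTTP operations rather than metadata.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations(spec):
    """Return the set of (METHOD, path) pairs declared under `paths`."""
    return {
        (method.upper(), path)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method.lower() in HTTP_METHODS
    }


def compare_specs(defined, observed):
    """Report operations present in only one of the two specs."""
    in_defined = operations(defined)
    in_observed = operations(observed)
    return {
        # Seen in traffic but missing from the hand-written spec.
        "undocumented": sorted(in_observed - in_defined),
        # Declared in the spec but never exercised by the captured traffic.
        "unobserved": sorted(in_defined - in_observed),
    }
```

The "unobserved" list doubles as a rough coverage report. Paths are compared as literal strings here, so a concrete path such as /people/1 would not match a templated /people/{personId} without extra normalisation.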
        
       | remoquete wrote:
       | A nice tool for research, or for documenting third-party APIs.
       | Let's not forget, though, that one of the goals of OpenAPI is to
       | serve as a design and documentation artifact in design-first API
       | development; generating OpenAPI from code or, as in this case,
       | from network traffic, is an interesting complement and something
       | you can use to test the implementation against the design.
        
         | adawg4 wrote:
          | It makes sense, and we love API-first companies. How are
          | frameworks prioritizing this? We've seen DRF and Litestar, but
          | a tool like this still seemed needed at places we worked, and
          | API market reports point to companies that haven't adopted
          | those standards yet.
        
       | arscan wrote:
       | I have a similar need but for the FHIR[1] spec, which has its own
       | way of describing RESTful http endpoints that serve FHIR data.
       | 
       | I was looking into how this works for inspiration, and it seems
       | like the work of inferring the OpenAPI definition from recorded
       | requests/responses is handled by the har-to-openapi nodejs
       | library [2]. Is this by the same team? If not, kudos for
       | packaging this up in a proxy -- seems like a useful interface
       | into that library.
       | 
       | 1. https://www.hl7.org/fhir/
       | 
       | 2. https://github.com/jonluca/har-to-openapi
        
       | w3news wrote:
       | When you build an API, please start with the OpenAPI
       | specification, before you write any code for your API. It can be
       | iterative, but for every part, just start with the OpenAPI, and
       | think about what you want from the API, what do you want to send,
       | and what to receive.
       | 
       | It is like the TDD approach, design before build.
       | 
       | Writing or generating tests after you build the code, is the same
       | as this. It is guessing what it should do. The OpenAPI
       | specification, and the tests should tell you what it should do,
       | not the code.
       | 
        | If you have the specification, everyone (and also AI) can write
        | the code for you to make it work. But the specification is about
        | what you think it should do. Those are the questions and
        | requirements that you have about the system.
        
         | paholg wrote:
         | I don't want to write OpenApi. Yaml is a terrible programming
         | language, and keeping it in sync with actual code is always a
         | nightmare.
         | 
         | I've been using a tool to generate OpenApi from code, and am
         | pretty happy with that workflow. Even if writing the API before
         | logic, I'd much rather write the types and endpoints in a real
         | programming language, and just have a `todo` in the body.
         | 
         | You can still write API-driven code without literally writing
         | OpenApi first.
        
           | rnts08 wrote:
           | Absolutely, and yes YAML is trash.
        
           | adawg4 wrote:
           | Which tool?
        
         | brosciencecode wrote:
         | I get the feeling you may not have gone 0-1 on an API before.
         | In general, you have 1 consumer when you're starting off, and
         | if you're lucky your API gathers more consumers over time.
         | 
         | In that initial implementation period, it's more time-consuming
         | to have to update a spec nobody uses. Maintaining specs
         | separately from your actual code is also a great way to get
         | into situations where your map != your territory.
         | 
         | I'd instead ask: support and use API frameworks that allow you
         | to automatically generate OpenAPI specs, or make a lot of noise
         | to get frameworks that don't support generating specs to
         | support that feature. Don't try to maintain OpenAPI specs
         | without automation :)
        
           | tevon wrote:
           | I agree with this as well!
           | 
            | OpenAPI spec seems intended to be consumed, not written. It's
            | a great way to convey what your API does, but is pretty awful
            | to write from scratch.
           | 
           | I do wish there was a simpler language to write in... JSON-
           | based as well that would allow this approach of writing the
           | spec first. But alas, there is not, and I have looked a
           | loooot. If anyone has suggestions for other spec languages
           | I'd love to learn!
        
         | yashap wrote:
         | I completely agree as a general design principle, but I still
         | think there's a place for the above tool.
         | 
         | Example: I used to work at a place that had a massive PHP
         | monolith, developed by hundreds of devs over the course of a
         | decade, and it was the worst pile of hacky spaghetti code I've
         | ever seen. Unsurprisingly, it had no API spec. We were later
         | doing tonnes of work to clean it up, which included plans to
         | add an API spec, and switch to a spec-first design process
         | (which we were already doing in services split from the
         | monolith), but there was a massive existing surface area to
         | spec out first. A tool like this would've been useful to get a
         | first draft of the API spec up and running quickly for this
         | huge legacy backend.
        
         | mdasen wrote:
         | As a curiosity, how do you feel about languages/frameworks
         | where APIs can be pretty self-documenting? For example,
         | Java/JAX-RS creates pretty self-documenting APIs:
          | 
          |     @Path("/people")
          |     public class PeopleApi {
          |         @Path("{personId}")
          |         @GET
          |         public Person getPerson(@PathParam("personId") int personId) {
          |             return db.getPerson(personId);
          |         }
          |     }
         | 
         | It's easy to generate a spec for a JAX-RS class because it has
         | the paths, parameters, types, etc. right there. There's a GET
         | at /people/{personId} which returns a Person and takes a path
         | parameter personId which is an integer.
         | 
          | If we're talking about a Go handler which doesn't have that
          | information easily accessible, I understand wanting to start
          | with a spec:
          | 
          |     func GetPerson(w http.ResponseWriter, r *http.Request) {
          |         // pseudocode: pull the ID out of the request path
          |         personId, _ := strconv.Atoi(r.URL.Path.something)
          |         person := db.GetPerson(personId)
          |         json.NewEncoder(w).Encode(person)
          |     }
          | 
          |     // or with something like Echo/Gin
          |     func GetPerson(c echo.Context) error {
          |         id := c.Param("id")
          |         person := db.GetPerson(id)
          |         return c.JSON(http.StatusOK, person)
          |     }
         | 
          | In Go's case, there's nothing which can tell me what the method
          | takes as input without being able to reason about the whole
          | method. With JAX-RS, it's easy to reflect on the method
          | signature and see what it takes as input and what it gives
          | back, but that's not available with Go (with the Go tools that
          | most people are using).
         | 
         | This isn't meant as a Go/Java debate, but more a question of
         | whether some languages/frameworks basically already give you
         | the spec you need to the point where you can easily generate an
         | OpenAPI spec from the method definition. Part of that is that
         | the language has types and part of it is the way JAX-RS does
         | things such that things you're grabbing from the request become
         | method parameters before the method is called rather than the
         | method just taking a request object.
         | 
         | JAX-RS makes you define what you want to send and what you want
         | to receive in the method signature. I totally agree that people
         | should start with thinking about what they want from an API,
         | what to send, and what to receive. But is starting with OpenAPI
         | something that would be making up for languages/frameworks that
         | don't do development in that way naturally?
         | 
         | ----------
         | 
         | Just to show I'm not picking on Go, I'm pretty sure one could
         | create a Go framework more like this, I just haven't seen it:
          | 
          |     type GetPersonRequest struct {
          |         Request  `path:"/people/{personId}"`
          |         PersonId int `param:"path"`
          |     }
          | 
          |     func GetPerson(personRequest GetPersonRequest) Person {
          |         return db.GetPerson(personRequest.PersonId)
          |     }
         | 
         | I think you'd have to have the request object because Go can
         | annotate struct fields (with struct tags), but can't annotate
         | method parameters or functions (but I could be wrong). The
         | point is that most languages/frameworks don't have the spec
         | information in the code in an easy way to reflect on like JAX-
         | RS, ASP.NET APIs, and some others do.
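
The point about reflecting on a method signature can be illustrated in a few lines. The sketch below uses Python type hints rather than JAX-RS annotations, purely as an analogy; `path_parameters`, `OPENAPI_TYPES`, and `get_person` are hypothetical names, and every argument is assumed to be a required path parameter.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python types to OpenAPI primitive type names.
OPENAPI_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def path_parameters(handler):
    """Derive OpenAPI parameter objects from a handler's signature alone."""
    hints = get_type_hints(handler)
    params = []
    for name in inspect.signature(handler).parameters:
        # Unannotated or unknown types fall back to "string".
        type_name = OPENAPI_TYPES.get(hints.get(name), "string")
        params.append(
            {
                "name": name,
                "in": "path",
                "required": True,
                "schema": {"type": type_name},
            }
        )
    return params


def get_person(person_id: int) -> dict:
    """Example handler; the body is irrelevant to the generated parameters."""
    return {"id": person_id}
```

Calling `path_parameters(get_person)` never runs or inspects the handler body, which is the property the comment attributes to signature-driven frameworks.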
        
         | tkiolp4 wrote:
         | What's wrong with designing an API by writing its code? Code
         | itself is a design tool (and usually any decent programming
         | language is a better design tool than YAML)
        
           | spondylosaurus wrote:
           | As someone who documents APIs: it's easy to tell which APIs
           | were designed with intention and which ones were designed on
           | the fly. In part because it's much, much easier to document
           | the former :)
        
       | sbeckeriv wrote:
       | I wrote a tool for work that does the same thing based on request
       | logs. It would parse each line into a structure then merge the
       | same call point structures down to one spec. It was helpful to
       | see the api but in the end was not that helpful in back filling
       | the openapi spec.
       | 
        | Things to consider:
        | 
        | - Junk data from users will show up. Unless your downstream
        |   service rejects extra params, users will mess with you.
        | - It documents each endpoint, but it's harder to say if this
        |   "user" data is the same as another endpoint's "user".
        | - It is hard to tell if users are hitting all endpoint
        |   inputs/outputs without manual review.
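
A rough sketch of the log-merging idea described above, not the commenter's actual tool. It assumes access-log lines in Common Log Format style, treats purely numeric path segments as an `{id}` placeholder, and uses hypothetical names (`LOG_LINE`, `template`, `merge_logs`).

```python
import re
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit

# Assumed log shape: ... "GET /users/1?verbose=1 HTTP/1.1" 200 ...
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def template(path):
    """Collapse numeric segments so /users/1 and /users/2 share one endpoint."""
    return "/".join("{id}" if seg.isdigit() else seg for seg in path.split("/"))


def merge_logs(lines):
    """Merge log lines into one record per (method, templated path)."""
    endpoints = defaultdict(lambda: {"query": set(), "statuses": set()})
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like requests
        parts = urlsplit(match["target"])
        record = endpoints[(match["method"], template(parts.path))]
        # Every query name ever seen is kept, including junk sent by users.
        record["query"].update(
            name for name, _ in parse_qsl(parts.query, keep_blank_values=True)
        )
        record["statuses"].add(int(match["status"]))
    return dict(endpoints)
```

Because every observed query parameter is recorded, the first caveat in the list shows up directly: a stray parameter from a single request ends up in the merged record unless it is filtered by hand.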
        
       ___________________________________________________________________
       (page generated 2024-03-25 23:00 UTC)