[HN Gopher] API versioning has no "right way" (2017)
___________________________________________________________________
API versioning has no "right way" (2017)
Author : edward
Score : 117 points
Date : 2021-04-26 06:44 UTC (16 hours ago)
(HTM) web link (apisyouwonthate.com)
(TXT) w3m dump (apisyouwonthate.com)
| ainar-g wrote:
| I think I've also seen resource versions being called "layers"
| instead of "versions". So in theory you could use versions for
| the whole API, including error codes and other conventions, and
| layers to actually evolve and change your resources. Either as:
| v1.example.com/l1/users
|
| Or: example.com/api/v1/l1/users
| yamrzou wrote:
| Versioning is indeed a hard topic, especially for data
| science/engineering projects in production.
|
| When you have a pipeline defined as a complex DAG of operations,
| you can't just version the entire thing, unless you have enough
| resources to re-compute from scratch with every change, which is
| wasteful. So then, you have to keep track of data dependencies
| and their versions if you would like to ensure reproducibility.
|
| Versioning code isn't enough when you have runtime parameters
| that affect output data, and you want to stay flexible by
| allowing experimentation and re-running computations with
| different parameters so you can iterate quickly. That poses a
| lot of challenges.
|
| And there doesn't seem to be a framework that easily solves those
| issues out of the box. I'm closely watching Dagster
| (https://dagster.io), as they seem to be aware of those
| challenges (for example for versioning:
| https://docs.dagster.io/guides/dagster/memoization), but I didn't
| try it yet; it introduces a lot of concepts and has a steep
| learning curve.
| mumblemumble wrote:
| I've been playing with Dagster lately. Overall, I like it quite
| a bit, though I'm not sure it quite solves that problem, mostly
| just because it has too many different kinds of input.
|
| My new pet hypothesis is that the best way to solve this
| problem is to stop treating configuration as its own special
| thing. If it's just another kind of data that you can pass as
| input to a solid/operator/function/whatever, then configuration
| changing is treated exactly like any other kind of input data
| changing. Which is, I think, almost always the behavior I want.
| cletus wrote:
| I have to disagree somewhat: versioning objects or endpoints is
| never the right solution. I'd go so far as to say that versioning
| the whole API is simply the least bad option.
|
| What constitutes a major version? Simple: as soon as existing
| clients break it's a new major version. That means you can, for
| example, add new endpoints because that's not a breaking change.
| You simply can't remove or change any of the existing endpoints,
| fields or objects.
|
| It also means that a client can't mix-and-match what major
| versions to use, i.e. you can't use /v1/customer and /v2/account.
| Why? Because clients will do that as a quick hack and you need to
| save them from themselves.
|
| So why not version endpoints or objects? Because of environment
| bifurcation. Let's say you have 2 API versions (v1 and v2). To
| verify your API you can test each independently. That's
| relatively easy.
|
| But imagine you have 10 endpoints and each is versioned
| separately. Now you have to test 2^10 possibilities. Clients will
| do stupid things like use a different version as a quick hack or
| even when they don't need to and you'll be debugging those things
| forever. Don't give your API users footguns.
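The "no mix-and-match" rule above can be enforced server-side. A minimal sketch, with all names hypothetical, that pins each API key to the first major version it uses:

```python
# Hypothetical middleware sketch: pin each client (API key) to one
# major version so /v1/customer and /v2/account can't be mixed.
# In a real system the pin would live in persistent storage.

_pinned = {}  # api_key -> first major version seen

def check_version(api_key, path):
    """Allow the request only if it matches the key's pinned version."""
    version = path.split("/")[1]          # "/v2/account" -> "v2"
    pinned = _pinned.setdefault(api_key, version)
    if pinned != version:
        raise PermissionError(
            f"API key pinned to {pinned}; mixing with {version} not allowed")
    return version
```

With this in front of the routing layer, a client that quietly tries `/v2/` while still on `/v1/` gets an explicit error instead of a half-migrated integration.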
| mumblemumble wrote:
| My thoughts exactly. There may be no "right way," but I'm
| pretty sure there's a least wrong way.
| UncleMeat wrote:
| > as soon as existing clients break it's a new major version
|
| How do you know when this happens? You don't have integration
| tests with all of your clients. What if they break because of
| some insane dependency on your internal behavior? Let's say
| that they rely on your API taking >100ms to respond in all
| cases and you optimize things so now it returns in 50ms most of
| the time. Is that a breaking change?
|
| Clients will always always always find a way to depend on every
| single observable behavior of your system. This makes
| everything a breaking change, given enough users.
| audron wrote:
| You gotta draw a line somewhere and I'd probably draw it in
| proximity to the documented behavior, and apply some
| superheated sanity to it while at it.
|
| Every change ever will upset somebody. And at the point where
| you have enough users you have better stuff to care about
| than some people that natural selection will get to sooner
| than later.
| mike_d wrote:
| > You don't have integration tests with all of your clients
|
| Why not? Publish client libraries and test them. If someone
| makes a third party library or client, make sure it is
| appropriately documented as unofficial.
| closeparen wrote:
| These 2^N problems are really under appreciated, especially
| with experiments and feature flags. We ship an app with tens of
| thousands of flags! And chide each other for hacky,
| irresponsible engineering if we ever neglect to multiply the
| number of configuration states by two when releasing a new
| change. What percentage of the configuration space is actually
| tested? Perhaps one in a billion.
| travisjungroth wrote:
| If you have tens of thousands of flags, there is no way you
| are anywhere close to testing 1 in a billion of your states.
| 300 binary flags gives you more states than there are atoms
| in the universe.
| vsareto wrote:
| Aren't you supposed to remove feature flag code after it's
| found to be stable? Some products block paid features with
| flags and those stay in, of course.
|
| Do others get actual dev time to clear that up, or is it just
| rolled up into any technical debt work?
| deckard1 wrote:
| > Do others get actual dev time to clear that up
|
| Depends on how much influence your PM has over your team
| and whether your team really cares about going back and
| cleaning it up or would they rather go and do resume-
| building features instead. I feel like we already know the
| answer to this.
|
| I've often felt that all PMs should be required to take a
| class on combinatorics before being allowed to add a single
| new feature flag.
|
| Of course, devs spend countless hours unit testing and
| going through QA and then some marketing dickbag can come
| along and stuff a random script tag into the website via
| Google Tag Manager and destroy the performance or even kill
| the site entirely.
|
| Yeah, I'm a little bit cynical these days.
| closeparen wrote:
| AFAIK there are automated cleanups for flags abandoned in
| the "off" position, but "on" flags are forever, in case the
| feature ever needs to be turned off in the future.
|
| In theory this promotes reliability, but I think it's
| rarely understood what non-local effects there might be if
| a given flag is turned off, and the more time goes on, the
| less we understand it.
| ghoward wrote:
| I had some of the same thoughts reading the article, but I
| could not articulate them as well as you did.
|
| A good resource for the details about what constitutes a
| breaking change is Rich Hickey's talk about Clojure's Spec. [1]
|
| [1]: https://youtube.com/watch?v=oyLBGkS5ICk
| noalarmsplease wrote:
| I don't know if it is industry standard, but it would be good
| if a user session (ie. bearer token, session id, cookie) was
| valid only on the designated API version (no mixed usage among
| different versions using the same token). The user then would
| have to manage distinct tokens if he decided to use two
| versions of the same API simultaneously.
| thehappypm wrote:
| "If clients don't request a specific version, should they get the
| earliest supported version, or the latest?" (in reference to MIME
| versioning)
|
| This one feels like the "right" answer to me: if the header
| isn't supplied, reject and demand one!
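Rejecting version-less requests can be sketched as a small content-negotiation check. The vendor media type `application/vnd.example.vN+json` below is a common convention but entirely made up for this example:

```python
import re

def negotiate_version(accept_header):
    """Reject requests that don't name an API version in Accept.

    Expects something like:
        Accept: application/vnd.example.v2+json
    Returns (status, version) on success, (406, message) otherwise.
    """
    match = re.search(r"application/vnd\.example\.v(\d+)\+json",
                      accept_header or "")
    if not match:
        # Fail loudly with an explanatory body, rather than silently
        # serving some "default" version.
        return 406, "Specify a version, e.g. application/vnd.example.v2+json"
    return 200, int(match.group(1))
```

Returning 406 with a hint keeps the failure self-documenting, instead of leaving clients to guess why a bare `Accept: application/json` didn't work.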
| jopsen wrote:
| Yes, but if your service didn't do this initially then you're
| stuck.
|
| Of course you can always migrate to URL versioning :)
| cryptonector wrote:
| You can also use headers for API version negotiation, for non-
| browser UAs anyways.
| throwaway823882 wrote:
| It doesn't really matter how you version. Just pick something and
| be consistent.
|
| You should plan your API to be backwards-compatible, plan major
| version changes about 2 years apart, and put the sunset time of
| your version in big bold red lettering on the front of your docs.
| That still may not stop clients from begging for longer support,
| so again, the more backwards-compatible you are, the less of a
| pain in the ass it is to abandon old versions. And for God's
| sake, have the developers consider the pain of customer migration
| when they develop the new versions, and provide _clear, feature-
| complete migration paths_ from the old version to the new.
|
| Plan on using a single _non-Apex_ hostname (and possibly region-
| specific) for all your APIs, and do not change it, because again,
| pain of support/migration. Then put the version number in the
| URL, because it will be much easier to run analytics on usage
| from web access logs, and it works better with a shared domain.
| With this scheme you can always have a reverse proxy redirect
| specific URI prefixes to completely different backends. For
| simplicity, make each new major API version a fork, and run it as
| a separate set of backend network services.
|
| _Please_ make all of your teams follow the same API conventions.
| Don't make me write 5 different API calls 5 different ways
| because you had 5 teams that didn't work together. When I see
| APIs like this, I know the company's tech management are lazy
| morons, which means the rest of leadership probably is too.
|
| Being backwards-compatible is different from evolving API
| functionality. If you just evolve everything, eventually you will
| reach a high level of complexity for each call that requires
| massive flow charts to decipher. It's better to create entirely
| new functions that are simpler to reason about. But don't do this
| if you can wait for the next major version and consolidate
| functionality then.
|
| Make sure you follow good distributed computing practices, like
| giving clear error messages when your server is too busy and the
| client needs to enable an exponential backoff with jitter.
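The backoff-with-jitter behavior described above is commonly implemented as "full jitter": sleep a random duration in [0, min(cap, base * 2^attempt)]. A sketch (the 429 status code is the conventional "too busy" signal):

```python
import random
import time

def backoff_with_jitter(attempt, base=0.5, cap=30.0):
    """Full-jitter backoff: random delay in [0, min(cap, base * 2^attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retries(do_request, max_attempts=6, base=0.5):
    """Retry a request on 429, sleeping with jittered exponential backoff."""
    for attempt in range(max_attempts):
        status = do_request()
        if status != 429:            # not throttled: we're done
            return status
        time.sleep(backoff_with_jitter(attempt, base=base))
    raise RuntimeError("gave up after repeated 429s")
```

The jitter is what prevents a thundering herd of clients from all retrying at the same instant after an outage.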
|
| Please don't use gRPC, for the love of God. It's not compatible
| with every other API out there that (today, anyway) uses HTTP.
| The fact that there are multiple "gRPC proxies" is proof enough
| that an incompatible interface is just a future pain in the ass.
| Project owners / managers / team leads: Don't let developers pick
| the technology just because they think it's cool or modern or the
| best practice, and definitely don't let them choose something
| they've never used in production before. Consider the actual cost
| of the thing long-term to your organization.
| nucleardog wrote:
| > Consider the actual cost of the thing long-term to your
| organization.
|
| A thousand times this. Not just in API versioning, but in
| everything.
|
| Every new thing you introduce is a thing you must first develop
| competency in, and then maintain competency in for the life of
| the project. This cost is paid during development, it's paid on
| every new hire onto that project, and it's paid on-going as you
| need to work with that tool and have your ops team deploy,
| manage, debug, and tune that tool (where applicable).
|
| Choose boring technology where possible if it's not directly
| related to your line of business.
| slver wrote:
| > It doesn't really matter how you version.
|
| While there are many right ways to version, there are also a
| few wrong ones. Like putting the version in a header.
| glckr wrote:
| Why is that wrong?
| ccouzens wrote:
| If the different versions have different parameters but
| share a path you won't be able to document them using
| openAPI
| zerkten wrote:
| It's in the comment that the person was replying to:
|
| "...it will be much easier to run analytics on usage from
| web access logs..."
|
| Headers don't normally get logged the way that URLs do.
| Having easy and cheap analytics is very helpful in making
| the right calls. "Expensive" analytics is really painful
| with APIs because some people rightfully get scared about
| the impact of making changes while others will just plow
| ahead.
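The analytics point above is concrete: with the version in the URL, per-version usage falls out of ordinary access logs. A sketch (the log format is the common combined-log style; lines here are invented):

```python
import re
from collections import Counter

def version_usage(log_lines):
    """Count requests per major API version from standard access logs.

    This only works because the version is in the URL -- with
    header-based versioning this information never reaches the logs.
    """
    counts = Counter()
    for line in log_lines:
        m = re.search(r'"(?:GET|POST|PUT|DELETE) /(v\d+)/', line)
        if m:
            counts[m.group(1)] += 1
    return counts

logs = [
    '1.2.3.4 - - [26/Apr/2021] "GET /v1/users HTTP/1.1" 200 512',
    '1.2.3.4 - - [26/Apr/2021] "GET /v2/users HTTP/1.1" 200 498',
    '5.6.7.8 - - [26/Apr/2021] "POST /v1/orders HTTP/1.1" 201 64',
]
print(version_usage(logs))  # Counter({'v1': 2, 'v2': 1})
```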
| jayd16 wrote:
| What's specifically wrong with that? Caching issues?
| ProblemFactory wrote:
| Putting the version in a header looks "elegant" at first,
| as it keeps one URL for one resource. But in reality it
| adds accidental annoyance with no benefits over having the
| version somewhere in the URL.
|
| * Have to set up response headers and caching carefully to
| make sure different versions are cached separately.
|
| * A bit of extra complexity to set up load balancing.
|
| * A bit of extra complexity to set up web framework routing
| to controllers.
|
| * A bit of extra complexity for logging to track which
| endpoints were called.
|
| * Calling the URL without specifying a version gets you
| some undefined version...
|
| * ... or if you require a version header, can't preview GET
| endpoints in a standard browser.
|
| * If different parts of the API have different latest
| versions, you can't encode it in an URL and therefore can't
| return URLs for linking between resources.
|
| * On major version changes you might be removing, renaming
| or moving URLs, so why keep them pure and versionless in
| the first place.
| ben509 wrote:
| The wrongest is when it's stuck in an irrelevant header, as
| in Accept: application/json;v=3
|
| I'm sure that's specified somewhere, but it's a shamefully
| stupid specification, so don't follow it. Any sane person
| reading that is going to think you're accepting a specific
| version of JSON.
|
| Also, if you _must_ put versions in a header, STOP with the
| nonsense of pretending things don't exist. Give us some
| freaking clue: return a 404 and explain that we're missing
| your stupid header, or that the planets haven't aligned, or
| whatever we're supposed to fix.
| deckard1 wrote:
| I've worked at a company that probably hit every single one of
| these.
|
| Don't even get me started on microservices. If your company
| name isn't Amazon or Google, then what are you even doing? Just
| don't. One company I worked at did microservices. They never
| had enough headcount to actually do things properly. Every
| single service was this half-assed API that followed the
| current trend-of-the-day. Looking at the microservices was like
| looking at the rings of a tree. You could tell when certain
| people joined or left a company or when they decided to
| experiment with Clojure or Elixir or GraphQL, etc. Each service
| required a different header incantation to get it to work
| properly. You couldn't just curl a URL because that would be
| way too sensible. I wrote my own curl wrapper just to deal with
| this bullshit.
|
| And of course the biggest problem with microservices is that
| documentation is horrid. There is no incentive to maintain good
| documentation because the APIs are internal. I've spent many
| nights staring at the internals of various microservices to
| decipher what they are doing. It's not even my code or my
| department. But I'm stuck sifting through other people's
| messes.
|
| Oh, and it's really really cute that developers still believe
| in the mechanical documentation fairy that extracts
| documentation from the code with some geewhiz tool. No,
| GraphiQL is not documentation. Stop lying to yourself and each
| other.
|
| > Please don't use gRPC
|
| gRPC is another bandwagon. The gRPC proxies must be made out of
| papier-mâché with how quickly they crumble under the slightest
| load.
| ChrisArchitect wrote:
| bah
|
| all the different urls for these old posts.
|
| https://medium.com/apis-you-wont-hate/api-versioning-has-no-...
|
| https://blog.apisyouwonthate.com/api-versioning-has-no-right...
| nofunsir wrote:
| Benign app: MANDATORY UPDATE!!! YOU MUST UPDATE YOUR APP NOW!
| THERE'S NO CANCEL BUTTON BECAUSE BAD THINGS WILL HAPPEN IF YOU
| DON'T IMMEDIATELY UPDATE.
|
| Release notes: Bug fixes and performance improvements. Teehee.
| lbriner wrote:
| As others have said below and my own experience of running
| production APIs, the important thing is that you consider
| deprecation before you have deployed your first API so that you
| know how you will eventually re-version things.
|
| Individual versioning of endpoints is hell on earth so no dice
| there. Future versions might be completely differently shaped,
| hosted on different platforms, written in different languages
| etc. These changes might be for good reasons but will be a pain
| for your customers (depending on whether your APIs are 1st-class
| netizens or some back-channel/low traffic thing), so setting the
| tone and habits early on, including considering how you expect
| customers to migrate to different requests/responses etc. saves
| you the awkward question later.
|
| You can't support version 1 forever in most cases.
| hn_throwaway_99 wrote:
| I liked this article, but after trying many different approaches
| in the past, I've found that GraphQL's "evolvable" approach is by
| far the best. And while the author correctly points out that some
| REST APIs have used this approach for a while, there are a couple
| important things that make this approach really easy to use in
| GraphQL:
|
| 1. When you deprecate fields, they disappear from the "default"
| documentation in the Graphiql browser. Which means it's always
| easy to see the latest "version" of the API.
|
| 2. Since _clients_ request fields, you don't feel like you're
| unnecessarily returning an object that's twice as big as it needs
| to be with all the new and deprecated fields in the same
| response.
|
| 3. Corollary to #2, but since clients request fields it's easy to
| detect who hasn't migrated and who is using deprecated fields,
| which makes it easier to notify these clients or at least make an
| informed decision about when you can remove deprecated fields.
| slver wrote:
| GraphQL's evolvable approach will likely not work that well on
| the "mutation" (i.e. command) side, because while you can map
| read-only data and remap and remap until it's beaten into
| submission, commands tend to have semantic meaning. When the
| backend changes, many commands may need to change at once.
|
| Just something to consider. Otherwise I find GraphQL's approach
| indeed great.
| gbourne wrote:
| It is a good write up of the options, and sad to say there is no
| "right" answer.
|
| That being said, I think early companies (like mine with an API)
| need to go the "Migrations a'la Stripe" mode. Basically you take
| the brunt of translating old requests to new ones. We've done this
| several times and update our docs with the new calls/parameters
| so new users start using the latest version. Current users have
| no breakage (you hope) and new users are on the latest API
| version. Old users are also encouraged to move to the latest
| version since only it contains new functionality.
|
| It works well...however, the downside is that you now have a lot of
| translations in your code and some users are on the new and some
| on the old. This means you eventually have a bit of a mess on
| your hands. The way we plan on mitigating this is tracking the
| translations and over time informing our users of the
| deprecation.
|
| Larger companies with dozens or hundreds of hands in the code, or
| with legacy code, likely can't use this technique. Perhaps that's
| why Facebook over time moved to the /v1/ technique.
| derefr wrote:
| > informing users of the deprecation
|
| If you can get users to take notice from the start of the fact
| that your API has the possibility of spitting out certain
| 'temporary errors' that "MUST" trigger a client-side retry with
| backoff (e.g. 429 errors) -- and your API does actually emit
| these errors sometimes, such that clients' codebases are very
| likely to have this error-handling code in place -- then you're
| in a much better situation here: you can give deprecations like
| these technical force, by making deprecated APIs begin to
| randomly emit spurious failures, with the failure self-
| documenting with a response error message like "usage of this
| API has been deprecated and will _gradually_ cease to function.
| [link to blog post about transitioning to new API]"
|
| Start off with allowing 99% of requests through; then lower it
| following a sigmoid, e.g. 95%; 90%; 66%; etc; until eventually
| it's at some low number like 1%. Wait a few months at 1%, and
| _then_ turn it off. (And, obviously, email every user who your
| metrics say are still using the deprecated API, each time you
| ratchet it down, to nudge them once again to update their
| code.)
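The sigmoid ramp-down described above can be sketched as a probability schedule. The midpoint and steepness numbers here are illustrative, not a recommendation:

```python
import math
import random

def pass_probability(days_since_deprecation, midpoint=90, steepness=0.1):
    """Descending sigmoid: ~99% pass-through at day 0, 50% at the
    midpoint, leveling off at a 1% floor."""
    return 0.01 + 0.98 / (
        1 + math.exp(steepness * (days_since_deprecation - midpoint)))

def allow_request(days_since_deprecation):
    """Randomly admit a deprecated-API call at the current probability;
    rejected calls would get the self-documenting deprecation error."""
    return random.random() < pass_probability(days_since_deprecation)
```

Whitelisted enterprise keys would simply bypass `allow_request` until the final sunset date.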
|
| Any app developer who doesn't notice that your API is now
| requiring ~100 retries to contact successfully, either isn't
| using the results for anything; or just plain isn't around any
| more to update their app, such that the app is now abandonware.
| Either way, you're likely safe to shut off the API at that
| point, and finally clean up that code.
|
| Of course, you can also make side-deals with any big enterprise
| user who needs more time, putting their API keys on a whitelist
| so that they'll get 100% success from the API until the very
| end. (Try not to allow them to slip the sunset date for the
| API, though; that would force you to keep the code around
| longer, which is what you're trying to avoid!)
| cpeterso wrote:
| Or gradually increase the API latency so the system still
| produces correct results but the users are increasingly
| motivated to upgrade to the new API.
|
| I like the idea of encouraging clients to handle API errors
| more robustly, but the type of client developers that are
| slow to upgrade to the new API might also write sloppy code.
| Their client will break but they're likely to blame your
| service.
| tomschlick wrote:
| The key to doing migrations on the request is having the
| migrations live in single classes/files. It makes it very clean
| to have the app migrate from version 3 to version 12, with the 9
| or so classes handing the request off from one to the next.
|
| I made a package a few years ago in PHP/Laravel that focuses on
| this: https://github.com/tomschlick/request-migrations
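The chained-migration idea can be sketched in a few lines of Python. The classes and field names are hypothetical, not from the linked package:

```python
# Hypothetical request-migration chain: each class rewrites a request
# from one version's shape to the next, so an old request is threaded
# through every migration up to the current version.

class RenameFullName:              # v1 -> v2
    def migrate(self, request):
        request["name"] = request.pop("full_name")
        return request

class SplitName:                   # v2 -> v3
    def migrate(self, request):
        first, _, last = request.pop("name").partition(" ")
        request["first_name"], request["last_name"] = first, last
        return request

MIGRATIONS = {1: RenameFullName(), 2: SplitName()}
CURRENT_VERSION = 3

def upgrade(request, from_version):
    """Apply each single-step migration in order until current."""
    for v in range(from_version, CURRENT_VERSION):
        request = MIGRATIONS[v].migrate(request)
    return request

print(upgrade({"full_name": "Ada Lovelace"}, from_version=1))
# {'first_name': 'Ada', 'last_name': 'Lovelace'}
```

Each migration only needs to know about two adjacent versions, which is what keeps the scheme maintainable as versions accumulate.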
| derefr wrote:
| Or doing the migrations as load-balancer-side request
| rewrites. (Nginx is particularly good for this use-case; it's
| what sets it apart from simpler LBs like HAProxy.) Then the
| cruft of old versions doesn't have to live in your codebase
| at all, but can live in the same infrastructure-config repo
| that holds your backend-service-unifying route-map; your SSL
| config; your rate-limiting setup; etc.
| tomschlick wrote:
| My main concern there would be the management and
| orchestration of the releases for that, since it's detached
| from the codebase that is changing. It sounds very
| performant though. Have any examples of this setup?
| derefr wrote:
| I don't, but that's because I keep my own infrastructure
| config living inside the same repo the code lives in.
|
| We do k8s GitOps with [a simulacrum of] Google Kubernetes
| Engine's "Application Delivery"
| (https://cloud.google.com/kubernetes-
| engine/docs/concepts/add...). In this approach, you keep
| all your k8s manifests relating to the app in the app's
| repo, and then the app is "released" by running a command
| that does the following:
|
| 1. tags a particular commit, which locks down the release
| as being of a particular codebase + a particular target
| converged k8s state;
|
| 2. generates a Docker image, tags it with the commit tag,
| and pushes it;
|
| 3. makes a copy of the manifests;
|
| 4. burns the build-image's SHA into the copy of the
| manifests;
|
| 5. compiles the source manifests (which are using
| Kustomize) into a single static manifest, which is a
| complete definition of the new target converged cluster
| state for the app's k8s namespace;
|
| 6. commits that static manifest to a separate tagged
| "deployment" repo.
|
| A separate "deploy" command is then used to reach out to
| a cluster-side converger component and tells it to pull a
| particular commit from the "deployment" repo and converge
| to it.
|
| -----
|
| We do have subcomponents that live outside this repo,
| though; we manage them by:
|
| 1. running a "release" within the subcomponent repo
| (which does create a git tag + push a build-image tagged
| with that tag; but _doesn't_ generate/commit any k8s
| manifests, since the subcomponent repos don't have any
| k8s config of their own); and then
|
| 2. manually (for now) taking the git tag of the built
| image, and updating the main app repo's k8s per-env
| Kustomization.yaml file with it, i.e. updating a stanza
| like this with a new value for "newTag":
| images:
|   - name: gcr.io/our-project/subcomponent
|     newTag: v20210409094011
|
| In theory, our main component could also be tracked as a
| subcomponent in this manner, such that all our infra
| config would live in its own basically-empty "app" repo;
| but that would lose one of the main advantages of GitOps,
| which is being able to see from the git log exactly what
| went into a deployed release, both in terms of build-
| image and infra-config. Our subcomponent services don't
| evolve nearly as quickly as our main app does, so we
| don't lose much in the way of release comprehensibility
| by having them "symbolically linked" to the release like
| this.
|
| As it is, though, our app's LB config for api.example.com
| lives as a k8s Ingress manifest at
| /.appctl/config/base/unified-api/ingress.yaml within our
| app repo. (We're using ingress-nginx, so there's a pretty
| direct 1:1 mapping between this manifest and an Nginx
| server{} block.)
| gregmac wrote:
| I've done this in a codebase as well, and it worked quite
| well.
|
| The "main" code was always updated to be the latest stuff,
| and the old methods were moved to a class named like
| `WhateverApi_v1_0` (based on the last version it was
| available in). A routing engine picked up the proper
| controller based on the requested API version.
|
| The other thing we had to help make this easier was automated
| API "shape" tests. These are per-version tests that basically
| just call every API method and check that the response
| matches a specific schema. For every release, we made a new
| directory containing a copy of all the tests from the last
| version, and then never touched any of the old directories.
| If an API ever changed in a backwards-incompatible way, one
| of these tests would catch it.
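A per-version "shape" test like the one described can be sketched as a schema check where extra fields are allowed (additive changes aren't breaking). The schemas and response here are invented:

```python
# Frozen at each version's release and never touched afterwards:
V1_USER_SCHEMA = {"id": int, "email": str}
V2_USER_SCHEMA = {"id": int, "email": str, "name": str}

def matches_shape(response, schema):
    """Every schema field must exist with the right type; extra
    fields are fine, since additive changes don't break clients."""
    return all(
        field in response and isinstance(response[field], expected)
        for field, expected in schema.items()
    )

v2_response = {"id": 7, "email": "a@b.c", "name": "Ada"}
assert matches_shape(v2_response, V1_USER_SCHEMA)  # v1 clients still fine
assert matches_shape(v2_response, V2_USER_SCHEMA)
```

Removing or retyping a field in a response would fail the old version's frozen schema, which is exactly the regression these tests exist to catch.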
|
| All this was relatively painless to maintain, and we also
| never had any API regression issues over dozens of releases
| spanning years. I'll note we did sometimes "cheat" and add
| new properties to an existing model (without making the
| backwards-compatible controller stuff), but this doesn't
| break any consumers because of the nature of JSON (at least
| we never had anyone complain about it, and I'm not aware of
| any language where that would happen).
| tomschlick wrote:
| Yup, that's exactly what the package I linked does as well,
| and it has worked without a hitch on the project I have
| implemented it on. All you have to do to test a specific
| version is send that header in the integration test and it's
| good to go.
| k__ wrote:
| Talk to your users.
|
| It's nice if you follow some strict rules, but even if your
| customers violate them, you could still be inclined to help them.
| After all, they are the source of your income.
| simonw wrote:
| Something that does help a little bit here is requiring clients
| to specify the fields that they need (as seen in GraphQL) rather
| than doing the equivalent of "select *" and giving them back
| everything.
|
| This is useful because it lets you turn on detailed logging in
| order to understand exactly what fields are being used by which
| clients (identified by their API key or similar).
|
| If you want to make a backwards compatibility breaking change
| like removing a field you now at least have a way forward:
| announce the field is going away, then watch your logs to see if
| it's still being used and who is using it. Then you can actively
| reach out to clients that use the deprecated feature and
| eventually make an informed decision about the overall impact
| when you finally remove it.
| travisjungroth wrote:
| An interesting talk about versioning by Rich Hickey:
| https://www.youtube.com/watch?v=oyLBGkS5ICk
| [deleted]
___________________________________________________________________
(page generated 2021-04-26 23:01 UTC)