[HN Gopher] The Idempotency-Key HTTP header field
       ___________________________________________________________________
        
       The Idempotency-Key HTTP header field
        
       Author : detaro
       Score  : 248 points
       Date   : 2021-07-04 13:49 UTC (9 hours ago)
        
 (HTM) web link (datatracker.ietf.org)
 (TXT) w3m dump (datatracker.ietf.org)
        
       | gbolcer wrote:
       | Why don't they generate the key from the payload if it has to be
       | unique and request specific?
        
         | lanstin wrote:
         | You might have two identical requests, say for two identical
         | credit card purchases, or you might have faulty retry on the
         | same request.
        
         | hit8run wrote:
         | Maybe because the client should define what is the payload he
         | wants to treat as idempotent. A request with the same message
         | might have the same payload hash but it is sent on purpose
         | twice.
        
           | jmull wrote:
           | That's not it... this RFC requires that the server know how
           | to distinguish two requests with different payloads but the
           | same idempotency-key value (and return a 422 response in that
           | case).
           | 
           | Not to mention, it makes little sense for a server to be able
           | to implement idempotency without understanding the
           | idempotency semantics of the payloads it accepts.
        
             | brandur wrote:
             | Your parent is correct. Think of a simplified example of
             | posting a $1 charge to a customer.
             | 
             |     POST /v1/charges `{amount: 100, currency: usd, customer: 123}`
             | 
             | You might do that, and then later charge them another $1:
             | 
             |     POST /v1/charges `{amount: 100, currency: usd, customer: 123}`
             | 
             | The request payload would be identical, so the API can't
             | use just that to distinguish requests. Adding an
             | `Idempotency-Key` lets you tell the server that these are
             | purposely two independent charges.
             | 
             | > _this RFC requires that the server know how to
             | distinguish two requests with different payloads but the
             | same idempotency-key value (and return a 422 response in
             | that case)._
             | 
             | This is an error checking facility to make sure that the
             | idempotency keys are being used correctly. Sending a
             | different request with the same idempotency key is non-
             | sensical, so the server tells its client about the problem.
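The behaviour discussed in this subthread can be sketched roughly as follows. This is a minimal in-memory illustration, not the draft's or any vendor's implementation: `IdempotencyStore`, `handle` and `create_charge` are hypothetical names, and a real service would need a durable, shared store. The same key with the same payload replays the saved response, a new key creates a second identical charge, and the same key with a different payload gets a 422.

```python
import hashlib
import json


class IdempotencyStore:
    """In-memory stand-in for a durable store (hypothetical, for illustration)."""

    def __init__(self):
        self._seen = {}  # key -> (payload fingerprint, saved response)

    def handle(self, key, payload, perform):
        # Fingerprint the payload so a reused key with a new body is detectable.
        fingerprint = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key in self._seen:
            saved_fingerprint, saved_response = self._seen[key]
            if saved_fingerprint != fingerprint:
                return 422, {"error": "idempotency key reused with a different payload"}
            return saved_response  # retry: replay, do not perform again
        response = perform(payload)
        self._seen[key] = (fingerprint, response)
        return response


charges = []


def create_charge(payload):
    # Hypothetical side effect: record the charge and number it.
    charges.append(payload)
    return 201, {"charge_number": len(charges)}


store = IdempotencyStore()
body = {"amount": 100, "currency": "usd", "customer": 123}
first = store.handle("key-1", body, create_charge)
retry = store.handle("key-1", body, create_charge)    # same key: replayed
second = store.handle("key-2", body, create_charge)   # new key: a second $1 charge
conflict = store.handle("key-1", {**body, "amount": 500}, create_charge)
```

Only two charges are recorded here even though the identical payload was submitted three times; the key, not the payload, tells the server which submissions are retries.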
        
         | vishnugupta wrote:
         | Because only the caller knows if two payloads, despite being
         | identical, are genuinely different requests or retries of the
         | same request.
         | 
         | That said, in this age of micro services and large-scale
         | distributed systems a request may get retried by a framework
         | layer down a call-chain regardless of whether the request
         | originator intended it or not.
        
       | 0xbadcafebee wrote:
       | It's so funny to watch REST get "patched" with headers rather
       | than just update the protocol. Path of least resistance I guess
        
       | kyrra wrote:
       | Googler, opinions are my own.
       | 
       | I work in payment and idempotency is always a discussion point.
       | This draft links to one of our docs (though the draft points to a
       | 404 URL, the correct link being[0]).
       | 
       | As the draft calls out, Google Standard Payments uses the
       | "requestId" field as our idempotency key, which is part of the
       | request body, rather than in the HTTP header. We did this
       | specifically to decouple the protocol from the behavior of the
       | request. We could take the existing definitions and put that
       | payload into any other protocol and it would work without having
       | to use that protocol's sidechannel data (aka, the HTTP header).
       | 
       | Our design makes idempotency part of the application layer,
       | rather than any middlebox or load balancer.
       | 
       | [0] https://developers.google.com/standard-
       | payments/guides/conne...
        
         | herodoturtle wrote:
         | Thanks for the insightful comment; makes good sense.
         | 
         | Could you kindly share your view on the security implications
         | of this approach - in particular with reference to the security
         | context outlined in section 5 of the IETF draft:
         | 
         | > For idempotent request handling, the resources MAY make use
         | of the value in the idempotency key to look up a cache or a
         | persistent store for duplicate requests matching the key. If
         | the resource does not validate the value of the idempotency key
         | prior to performing such a lookup, it MAY lead to various forms
         | of security attacks and compromise.
        
         | brandur wrote:
         | A nice feature of keeping the idempotency key separate from the
         | payload is that a service like Stripe can build tools to help
         | users with idempotency even if the user has no idea what an
         | idempotency key is.
         | 
         | For example, take a look at stripe-go's implementation, which
         | automatically tags a request with a key if the user didn't
         | specify one:
         | 
         | https://github.com/stripe/stripe-go/blob/67034d2205c0240ade9...
         | 
         | This works for all mutating requests, and is useful because the
         | built-in retry system will automatically reuse the same key
         | that was generated. Users can get the benefits of idempotency
         | without really having to understand very well what's going on
         | under the hood.
         | 
         | I suppose you could still do that by munging each request body,
         | but IMO it's a nice feature to make sure that requests are the
         | same as what the user specified. Also note that in practice the
         | implementations are probably not that wildly different under
         | the hood -- despite being in a header, Stripe's idempotency is
         | still being handled by the same application stack which
         | processes the payment (i.e. not a middle box or load balancer).
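A rough sketch of the client-side idea described above, assuming a hypothetical transport callable `send(path, body, headers)` that raises `ConnectionError` on a network failure. This is not stripe-go's code; `post_with_retries` and `flaky_send` are made-up names, and backoff between attempts is omitted for brevity. The point is only that the key is generated once, outside the retry loop.

```python
import uuid


def post_with_retries(send, path, body, idempotency_key=None, max_attempts=3):
    """Call send(path, body, headers) until it succeeds, reusing one key."""
    # Generate the key once, before the loop, if the caller gave none.
    key = idempotency_key or str(uuid.uuid4())
    headers = {"Idempotency-Key": key}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(path, body, headers)
        except ConnectionError as error:
            last_error = error  # same headers, hence same key, on the next try
    raise last_error


# Fake transport that fails twice and records the key seen on each attempt.
seen_keys = []


def flaky_send(path, body, headers):
    seen_keys.append(headers["Idempotency-Key"])
    if len(seen_keys) < 3:
        raise ConnectionError("simulated timeout")
    return {"status": 200}


result = post_with_retries(flaky_send, "/v1/charges", {"amount": 100})
```

All three attempts carry the same generated key, so a server that deduplicates on it would perform the charge at most once.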
        
           | bbss wrote:
           | The downside with "automagically" trying to handle
           | idempotency is users may not be aware of it and retries may
           | happen across different processes (maybe they are running
           | their application on k8s with multiple pods), which doesn't
           | work with Stripe's default behaviour.
           | 
           | IMO the idempotency key should be required to be set and make
           | the user aware that they need to handle retries properly.
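One way to handle the cross-process case raised here is to derive the key from something the application already persists, so that any process retrying the same logical operation computes the same key. A minimal sketch, assuming a hypothetical namespace name and an `order-1001` style business identifier; neither comes from the thread.

```python
import uuid

# Hypothetical namespace for this application; any fixed UUID would do.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "payments.example.com")


def key_for_operation(operation, business_id):
    """Derive a stable idempotency key from data every process can see."""
    return str(uuid.uuid5(APP_NAMESPACE, f"{operation}:{business_id}"))


key_on_pod_a = key_for_operation("charge", "order-1001")
key_on_pod_b = key_for_operation("charge", "order-1001")  # same key elsewhere
other_order = key_for_operation("charge", "order-1002")
```

The trade-off is that the caller now decides what counts as "the same operation", which is exactly the awareness this comment argues users should have.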
        
             | brandur wrote:
             | You might find differing philosophies depending on where
             | you look, but a recurring theme you'll find with Stripe's
             | is that they try to make it as easy as possible to get an
             | integration up and running. When you're building out a
             | payment integration, you're already awash in non-trivial
             | concepts that are probably new and novel to you, so things
             | that can be abstracted away for the time being to make
             | things easier generally are.
             | 
             | In the situation you describe, I think it would make more
             | sense to just retry the call a couple times from the same
             | pod so you don't have the sizable overhead of discarding it
             | and creating a new one for every failure, in which case the
             | automatic keys would work fine. And if there's a really
             | good reason they're not, setting the keys manually is
             | _very_ easy. At some point if you're far enough off the
             | beaten path, you have to expect to read some docs.
        
         | hit8run wrote:
         | Interesting and I totally get your point. What protocols do you
         | support besides HTTP?
        
           | kyrra wrote:
           | There is the possibility of grpc, but there hasn't been a
           | case for it yet (hence why none of our docs even mention it).
           | Google also loves talking grpc/stubby internally, so any
           | Google internal application to application calls would use
           | that.
        
             | vettyvignesh wrote:
             | Pardon my ignorance but why does it matter that you're
             | using RPC/graphQL or REST based service, isn't the
             | underlying protocol still HTTP? So this spec would still be
             | applicable, right?
        
               | mcqueenjordan wrote:
               | The underlying protocol is not necessarily HTTP.
        
               | staticassertion wrote:
               | I think it is necessarily HTTP/2, but would be happy to
               | learn I'm mistaken - it's not something I'm overly
               | confident about.
               | 
               | https://developers.googleblog.com/2015/02/introducing-
               | grpc-n...
        
               | denysvitali wrote:
               | AFAIK, in Google's case the end application doesn't
               | really see the "GRPC HTTP headers" but instead they
               | convert an incoming HTTP request to one of their
               | frontends and route it to multiple backends (sending
               | effectively the HTTP body serialized via GRPC), and then
               | the frontends will simply re-create the response by
               | unifying them.
               | 
               | Not a Googler, this is what I understood from reading
               | similar things over multiple posts. Feel free to correct
               | me.
        
           | dorianmariefr wrote:
           | Probably all kinds of RPC
        
         | felixhuttmann wrote:
         | Strongly agree that idempotency should be understood as part of
         | the application, and not as part of the transport.
         | 
         | I find it insightful to think about a transport layer that is
         | much less reliable than the internet: traditional/snail mail.
         | In a proper business communication, every invoice has an
         | invoice number such that the recipient can know whether they
         | processed that invoice previously or not.
        
       | jmull wrote:
       | I don't get it.
       | 
       | For a particular request, a server must implement its documented
       | idempotency guarantee, and the client needs to understand the
       | server's guarantee to know whether it's safe to retry.
       | 
       | How does this header help or change anything?
       | 
       | Its only real use (that I see, please correct my
       | misunderstanding if there is one) is to paper over an incorrect
       | API design, where you want idempotent behavior but can have
       | identical payloads for distinct requests.
       | 
       | (e.g., imagine a chat app, where the message being sent is the
       | POST request body, and there's no other metadata about the
       | message. In that case, there's no way to distinguish a new "you
       | up?" message from a retry. However, such an API should simply
       | have a message id value sent with each message.)
        
       | est31 wrote:
       | It's interesting that this standardization effort hasn't happened
       | earlier, because it seems such an obvious feature addition to
       | POST requests. HTTP is decades old after all, and nobody thought
       | about this prior to the payment processors? I guess nobody really
       | cares if you have two posts in a forum with the same content, but
       | if you do two payments it's way worse. After all, double spend
       | prevention is what bitcoin does its hugely expensive proof of
       | work for.
        
       | tptacek wrote:
       | I'm not sure I understand why this needs to be standardized. Is
       | there something that middleboxes and CDNs need to do with
       | messages marked idempotent? Would a standard header enable
       | browsers to do something new that they're all likely to actually
       | do? Otherwise: HTTP already specifies a bag-of-attributes header
       | system, and this is just an application of it. Why is it in the
       | HTTP standards?
        
         | grishka wrote:
         | A real-world example from my previous job. You have a chat in
         | your mobile app. Messages are sent using HTTP requests, and
         | received using HTTP long polling. Now imagine sending a message
         | over a slow network while on a departing subway train (service
         | is only at stations). TCP & SSL are established successfully,
         | then the request itself is sent, but you leave the network
         | coverage before you receive the response. So you have no idea
         | whether the message was actually sent. When you reconnect at
         | the next station, you receive a message over long poll, but --
         | because the request failed as far as you're concerned -- you
         | have no idea whether it's the message you sent. To resolve
         | this, you need to have the ability to ask the server for a
         | response to a previous request, but without actually performing
         | that request a second time. Or, alternatively in this
         | particular case, you need the server to accept a unique client-
         | generated ID in the request and return it to you over long
         | poll.
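The second option described above, a client-generated ID that the server echoes back over long poll, might look roughly like this on the client side. The class and field names (`ChatClient`, `client_id`, `text`) are assumptions for illustration, not the API of the app in question.

```python
import uuid


class ChatClient:
    def __init__(self):
        self.pending = {}   # client_id -> text awaiting confirmation
        self.timeline = []  # (status, text) pairs shown to the user

    def compose(self, text):
        # Tag every outgoing message with an ID generated on the client.
        client_id = str(uuid.uuid4())
        self.pending[client_id] = text
        return {"client_id": client_id, "text": text}

    def on_long_poll_event(self, event):
        # An echoed ID we recognise confirms our own send; anything else is new.
        client_id = event.get("client_id")
        if client_id in self.pending:
            del self.pending[client_id]
            status = "sent by me"
        else:
            status = "received"
        self.timeline.append((status, event["text"]))


client = ChatClient()
outgoing = client.compose("you up?")  # POST goes out, the response is lost
# At the next station the long poll delivers two events:
client.on_long_poll_event({"client_id": outgoing["client_id"], "text": "you up?"})
client.on_long_poll_event({"client_id": "someone-else-1", "text": "yes"})
```

Because the echoed ID matches a pending entry, the client knows the message went through and does not need to resend it.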
        
           | quesera wrote:
           | That's one valid use of an idempotency token. Also e.g. POST
           | submission of an order or transaction.
           | 
           | But these are implementation-specific, and available (and in
           | active use) today under a few common HTTP header names.
           | 
           | It's not clear why the header name should be standardized.
           | 
           | You could move idempotency implementation up to the load
           | balancer or the proxy layer. I'm not convinced that would be
           | useful, because to implement the hard parts of idempotency
           | (esp across instances), you'd need to add a bunch of extra
           | complexity there, which is already a committed cost at the
           | application layer.
           | 
           | But if, say, haproxy or nginx release a new distributed
           | idempotency caching feature, maybe that would change my mind.
        
             | ec109685 wrote:
             | All that layer really needs to do is support caching
             | objects. When a response is returned, cache it under the
             | idempotency key, and on subsequent requests, look up that
             | value before making downstream requests.
             | 
             | Also, it should extend the key's value with an unguessable
             | session id, so folks can't guess the key and read someone
             | else's data.
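A sketch of the caching layer suggested here, with the cache key scoped by session so that one caller cannot read another's response by guessing a key. `ScopedResponseCache` and `downstream` are hypothetical names; a real proxy would also need expiry and a store shared across instances.

```python
class ScopedResponseCache:
    def __init__(self):
        self._responses = {}  # (session_id, idempotency_key) -> response

    def get_or_call(self, session_id, idempotency_key, call_downstream):
        # Scope the cache key by session so keys cannot collide across callers.
        cache_key = (session_id, idempotency_key)
        if cache_key not in self._responses:
            self._responses[cache_key] = call_downstream()
        return self._responses[cache_key]


calls = []


def downstream():
    # Stand-in for the real application request.
    calls.append(1)
    return {"charge_number": len(calls)}


cache = ScopedResponseCache()
a1 = cache.get_or_call("session-A", "key-1", downstream)
a2 = cache.get_or_call("session-A", "key-1", downstream)  # served from cache
b1 = cache.get_or_call("session-B", "key-1", downstream)  # different session
```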
        
               | quesera wrote:
               | Yep, but that idempotency caching layer would need to
               | share keys across all instances, possibly even
               | geographically.
               | 
               | This is a solved problem at the application layer of
               | course, but adding complexity at the higher (LB or proxy)
               | layer ... well, adds complexity there and the benefit is
               | not clear to me.
        
           | tptacek wrote:
           | I think you're misunderstanding my question. I understand why
           | idempotency tokens are valuable. What's not clear to me is
           | why HTTP would need an "official" idempotency token. The
           | behavior of APIs that run over HTTP is not properly part of
           | the HTTP specification.
        
             | aenis wrote:
             | Maybe to make it easier for people to understand the
             | concept - and reinforce a good design pattern? I
             | implemented this stuff my own way whenever I needed it
             | (e.g., for a chatbot), and every time I needed to take the
             | time to explain it to the client app dev teams. Sometimes
             | it took longer than desired, and pointing them to a well
             | written RFC would help.
        
               | tptacek wrote:
               | This is a little like saying there should be an RFC for
               | understanding and defending against SQL injection, isn't
               | it? Like, there are a lot of concepts that would be well-
               | served by an "official document", but that's not the
               | purpose of IETF RFCs.
        
         | quesera wrote:
         | Seems like a tail wag. The header is in common use in APIs in
         | the wild, so IETF wants to formalize it.
         | 
         | Not unreasonable, but doesn't seem to add much value either.
         | 
         | With standardization, load balancers or reverse proxies could
         | add a limited idempotency guarantee to reduce the
         | load/implementation cost on the application layer. I don't
         | think anyone is asking for this though, nor do I think they
         | should be...but maybe I'm wrong about that. "Free (quasi)
         | idempotency for small environments" could be a feature.
        
         | chalst wrote:
         | This is potentially of interest to any application making use
         | of POST. Its generality and simplicity justify
         | standardisation.
        
           | tptacek wrote:
           | The concept is worth knowing about, but that doesn't mean it
           | belongs in the standard, especially since there's more than
           | one way to do it.
           | 
           | I'm wondering if someone knows a particular reason why this
           | way needed to be standardized.
        
       | unilynx wrote:
       | This (partly) solves some cases of the two generals problem,
       | where you didn't receive a response to eg. a resource creation
       | request but aren't sure whether that means that you can safely
       | retry it, or need to use other APIs to figure out the ID of the
       | created resource, if it exists
       | 
       | Some AWS APIs have it, eg
       | https://docs.aws.amazon.com/medialive/latest/apireference/in... -
       | I wish all their creation APIs did
        
       | shawnz wrote:
       | Seems like kind of a wordy name, why not just "nonce"?
        
         | SideburnsOfDoom wrote:
         | Not recommended to use "nonce" casually in the UK. Urban
         | Dictionary will tell you why.
        
           | shawnz wrote:
           | IMO that ship has sailed given the word has been extensively
           | used in software contexts for some time.
        
             | SideburnsOfDoom wrote:
             | It's still somewhat a specialised word, even in software
             | contexts. Or at least, UK people prefer not to use it.
        
         | detaro wrote:
         | It probably makes some sense to reserve "nonce" for
         | security/cryptographic use.
        
         | topogios wrote:
         | I think nonces are generally thought of as numbers, and as a
         | client it might be simplest in some cases to be able to reuse
         | or derive from an existing non-number key.
        
           | chrismorgan wrote:
           | The HTML nonce attribute is a string:
           | https://developer.mozilla.org/en-
           | US/docs/Web/HTML/Global_att...
        
       | chrismorgan wrote:
       | I find the definition of idempotencyvalue in the grammar
       | surprising: it looks like RFC 7230 quoted-string, but is actually
       | just anything except for control characters, whitespace (!) and
       | quotes, surrounded by quotes.
       | 
       | Then I remember that the ETag header does the same thing since
       | RFC 7232, where it _used_ to use quoted-string with its escaping
       | behaviour in RFC 2616. (And so they recommend avoiding
       | backslashes because some recipients may mangle it.)
       | 
       | Anyone have any idea why they've gone this way (ETag and
       | Idempotency-Key), rather than using quoted-string?
       | 
       | I will also note that people are already using a header named
       | Idempotency-Key without quotes, e.g.
       | https://stripe.com/docs/api/idempotent_requests. ETag has the
       | "W/" indicator which requires some sort of disambiguation, but at
       | least at this time I don't feel Idempotency-Key needs quoting.
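To make the grammar as described in this comment concrete, here is a small validator. It encodes only the description above (a quoted value containing no control characters, whitespace or quotes) and is an assumption about the draft, not a transcription of its ABNF; `is_valid_key_header` is a made-up name.

```python
import re

# Quoted value; inside the quotes: no control chars, no space, no DEL, no quote.
_KEY_VALUE = re.compile(r'"[^\x00-\x20\x7f"]*"')


def is_valid_key_header(value):
    return _KEY_VALUE.fullmatch(value) is not None


quoted = is_valid_key_header('"8e03978e-40d5-43e8-bc93-6894a57f9324"')
unquoted = is_valid_key_header("8e03978e-40d5-43e8-bc93-6894a57f9324")
with_space = is_valid_key_header('"has space"')
with_quote = is_valid_key_header('"has"quote"')
```

Under this reading, an unquoted key of the kind already sent to existing APIs would be rejected, which is the compatibility concern raised above.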
        
       | gigatexal wrote:
       | " Idempotence is the property of certain operations in
       | mathematics and computer science whereby they can be applied
       | multiple times without changing the result beyond the initial
       | application. It does not matter if the operation is called only
       | once, or 10s of times over. The result SHOULD be the same."
       | 
       | Does not the use of "SHOULD" instead of "MUST" make this dead on
       | arrival?
       | 
       | Edit: I see a number of others have the same thoughts.
        
       | zepto wrote:
       | "It does not matter if the operation is called only once, or 10s
       | of times over. The result SHOULD be the same."
       | 
       | The result MUST be the same.
        
         | jefftk wrote:
         | I think that's too strong: the response may have a component
         | that is non-deterministic. The important thing is that no
         | matter how many times you call it the server state is the same.
        
           | sam0x17 wrote:
           | Yeah, it's more about there being no side effects after the
           | first time than asserting anything about the actual response
        
           | zepto wrote:
           | If you distinguish the 'response' from the result, then I
           | agree. Typically I think of the result as denoting the
           | effect, and not the response message.
        
           | bastawhiz wrote:
           | Agreed. The state since the original success may have been
           | mutated by another request since (or simply changed state as
           | a result of any other async process), and the response may
           | simply be a rendering of the latest server state.
           | 
           | The response of the server absolutely should not be specified
           | as being identical, since that's not how idempotent requests
           | work. At that point, you're really just specifying a caching
           | key for verbs that have side effects.
        
         | jhayward wrote:
         | > _The result MUST be the same._
         | 
         | Transient errors happen, as well as loss of state errors, and a
         | good protocol allows for that fact.
        
         | SvenL wrote:
         | I think this is due to side effects which are not "undoable",
         | like sending an e-mail. If the timeout happens while the
         | response is being sent, an email might already have been sent.
         | If the request is sent again, another email is sent. I think
         | that's why the RFC says "result" and not "state".
        
           | lazide wrote:
           | Then it is a fundamentally non-idempotent operation?
           | 
           | In the case of an email you could use the idempotency key in
           | the account's outbox so you could detect if you've already
           | sent the message and then no-op if that is the case, and
           | you're idempotent.
           | 
           | The response must reflect something useful, which is probably
           | going to be different in those two cases.
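The outbox idea can be sketched with SQLite: the idempotency key is the primary key of the outbox table, so a repeated request fails to insert and becomes a no-op. Table, column and function names are hypothetical, and `delivered` stands in for a real mail transport. Note that this records before sending, so a crash between the two steps would lose the email; a real system would need a separate delivery worker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (idempotency_key TEXT PRIMARY KEY, recipient TEXT, body TEXT)"
)

delivered = []  # stand-in for actually handing the message to a mail server


def send_email_once(key, recipient, body):
    with conn:  # commit the outbox row
        cursor = conn.execute(
            "INSERT OR IGNORE INTO outbox VALUES (?, ?, ?)", (key, recipient, body)
        )
    if cursor.rowcount == 0:
        return "already-sent"  # key seen before: no-op
    delivered.append((recipient, body))
    return "sent"


first = send_email_once("k1", "a@example.com", "hi")
again = send_email_once("k1", "a@example.com", "hi")
```

The two calls return different responses while only one email goes out, which matches the point that the response can differ even when the operation is idempotent.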
        
             | skybrian wrote:
             | Yes, writing status to log would be a better example; you
             | might want to log retries.
        
         | Spivak wrote:
         | Request 1:
         |     Status: OK
         |     Modified: True
         | 
         | Request 2:
         |     Status: OK
         |     Modified: False
         | 
         | Request 3:
         |     Status: EAGAIN
         |     Modified: False
         | 
         | Request 4:
         |     Status: TOOFAST
         |     Modified: False
         | 
         | Request 5:
         |     Status: EACCESS
         |     Description: Your OAuth token has expired.
         |     Modified: False
        
         | bhk wrote:
         | The context is a paragraph describing a mathematical concept,
         | not specifying the protocol. _Any_ standardese imperative
         | (SHOULD/MUST/...) is out of place here. It should read "The
         | result is the same."
        
       | vishnugupta wrote:
       | I'm glad for this much-needed change. I hope it also makes it to
       | frameworks and libraries such as JBoss, ExpressJS.
       | 
       | Idempotency is simple to understand but tricky to implement in
       | practice because of a bunch of edge cases: handling of business
       | failures, response (should it be the first successful response?),
       | how to document the behaviour without confusing the developers
       | and so on. So developers either don't consider it at all or when
       | they do, they forget to consider all these aspects.
       | 
       | FWIW we, as a team, spent quite some time debating and going back
       | and forth; I'm happy with the final behaviour we settled on
       | though the documentation could have been better[1]
       | 
       | [1] https://developer.uber.com/docs/payments/glossary
        
       | pokoleo wrote:
       | It's a bit funny to see an IETF draft for this!
       | 
       | Adding idempotency support to Stripe's API was one of the things
       | I built while an intern in 2015. We've since rewritten it almost
       | entirely, but many of the technical decisions made in 2015 still
       | stand.
       | 
       | One of the things that IETF doesn't specify ("an appropriate
       | response") is that with Stripe's idempotency API, the response
       | body has the latest rendering of the resource. Here's what I
       | mean:
       | 
       |     # Set tax exempt status
       |     curl /v1/customers/cus_AJ6m5vWl7scnn6 \
       |       -H "Idempotency-Key: aaaaaaaaaa" \
       |       -d tax_exempt=reverse
       | 
       |     - {id: 'cus_AJ6m5vWl7scnn6', object: 'customer', tax_exempt: 'reverse', ...}
       | 
       |     # An unrelated request sets the customer's email
       |     curl /v1/customers/cus_AJ6m5vWl7scnn6 \
       |       -H "Idempotency-Key: bbbbbbbbbb" \
       |       -d email='foo@bar.com'
       | 
       |     - {id: 'cus_AJ6m5vWl7scnn6', object: 'customer', tax_exempt: 'reverse', email: 'foo@bar.com', ...}
       | 
       |     # Set tax exempt status with the first idempotency key:
       |     curl /v1/customers/cus_AJ6m5vWl7scnn6 \
       |       -H "Idempotency-Key: aaaaaaaaaa" \
       |       -d tax_exempt=reverse
       | 
       |     # The customer is re-rendered on the latest version
       |     - {id: 'cus_AJ6m5vWl7scnn6', object: 'customer', tax_exempt: 'reverse', email: 'foo@bar.com', ...}
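The re-rendering behaviour described above can be modelled in a few lines. This is a toy in-memory sketch with hypothetical names (`customers`, `applied_keys`, `update_customer`, the `cus_123` ID), not Stripe's implementation: a replayed key skips the mutation and returns the resource as it is now.

```python
customers = {"cus_123": {"id": "cus_123", "tax_exempt": "none"}}
applied_keys = set()  # keys whose mutation has already been applied


def update_customer(customer_id, key, changes):
    if key not in applied_keys:
        customers[customer_id].update(changes)
        applied_keys.add(key)
    # Replay or not, render the latest state rather than a stored response.
    return dict(customers[customer_id])


r1 = update_customer("cus_123", "aaaaaaaaaa", {"tax_exempt": "reverse"})
r2 = update_customer("cus_123", "bbbbbbbbbb", {"email": "foo@bar.com"})
r3 = update_customer("cus_123", "aaaaaaaaaa", {"tax_exempt": "reverse"})  # replay

# A later change is not undone by replaying the first key again.
r4 = update_customer("cus_123", "cccccccccc", {"tax_exempt": "none"})
r5 = update_customer("cus_123", "aaaaaaaaaa", {"tax_exempt": "reverse"})
```

In this model only the key needs to be stored, not the response body, at the cost that a retry may see a different body than the original attempt would have returned.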
        
         | milofeynman wrote:
         | That's very useful! I can't decide if the architecture is more
         | complicated or simpler because of that. It helps avoid stale
         | data showing up in the UI, etc. It seems like you just need to
         | store the idempotency key and the request data (in case the
         | data changes, to catch bad requests using same key...), but not
         | the response.
         | 
         | I think Amazon actually stores the response in a table for some
         | amount of time (24 hours?), iirc
         | https://aws.amazon.com/builders-library/making-retries-safe-...
         | but I could be wrong.
        
         | felixhuttmann wrote:
         | What I have always wondered about, in the stripe docs it says
         | "Stripe's idempotency works by saving the resulting status code
         | and body of the first request made for any given idempotency
         | key, regardless of whether it succeeded or failed. Subsequent
         | requests with the same key return the same result, including
         | 500 errors." which indicates that the idempotency functionality
         | is created in a kind of layer around the main application
         | functionality, and thus the request from the idempotency layer
         | to the main app is not itself idempotent. So the idempotency
         | key protects against faults on the network between the client
         | and stripe's idempotency layer, but not against stripe-internal
         | faults between the idempotency layer and the application. Is
         | that the case? Why is the idempotency not achieved by using an
         | idempotent database transaction or atomic operation? What is
         | the value of responding repeatedly with a 500 if the original
         | 500 was caused by a transient error?
         | 
         | Adyen seems to similarly implement the idempotency in a
         | separate data store (https://docs.adyen.com/development-
         | resources/api-idempotency...).
         | 
         | Both stripe's and adyen's implementation seem to not treat
         | transient faults within their own systems correctly.
        
           | foolfoolz wrote:
           | these seem extreme for most apis but are likely very good at
           | preventing double charging. a "normal" api call usually costs
           | nothing. i bet stripe and adyen have apis with an average
           | cost per call (to someone) being like $20-$30
           | 
           | the cost to undo a charge is high in people and reputation
        
             | polynomial wrote:
             | How could their API possibly cost $20-$30 per call? How
             | could that even be a business model? Clearly, I am missing
             | something here.
        
               | foolfoolz wrote:
               | unrelated but fun fact: AWS CloudHSM v1 (deprecated now)
               | had a $5,000 api call. that was the cost to create a
               | cluster.
        
               | terinjokes wrote:
               | I suspect the OP meant to say charge instead of cost.
        
               | polynomial wrote:
               | ah that makes sense.
        
           | brandur wrote:
           | > _So the idempotency key protects against faults on the
           | network between the client and stripe's idempotency layer,
           | but not against stripe-internal faults between the
           | idempotency layer and the application. Is that the case?_
           | 
           | Yes, in practice this is the case -- the idempotency key
           | insertion could succeed and then subsequent queries fail.
           | Practically though, it doesn't happen very often -- if a
           | request gets as far as the idempotency layer successfully,
           | the rest of it tends to work too.
           | 
           | Where it doesn't, Stripe pays a lot of attention to 500s,
           | especially where those pertain to charges, and a lot of time
           | and energy is spent cleaning up state that might have
           | resulted in an invalid transaction.
           | 
           | > _Why is the idempotency not achieved by using an idempotent
           | database transaction or atomic operation?_
           | 
           | The most honest answer is that Stripe wasn't built on a data
           | store where transactions are supported, so it wasn't even a
           | possibility until quite recently, and by then the system was
           | already well established in its current form.
           | 
           | Beyond that though, once your requests are making their own
           | requests to modify state in foreign systems (which is
           | happening at Stripe in the form of banks, partners, internal
           | systems, etc.), a single transaction isn't enough to keep
           | things entirely in order anymore because it can't roll back
           | that remote state. It is still possible to build a very
           | robust system that is transaction-based, but it becomes a
           | much more complex problem than a simple `BEGIN`/`COMMIT`.
        
             | wonton53 wrote:
             | > Practically though, it doesn't happen very often
             | 
             | It probably happens much more often than you think if you
             | say this as someone with an insider perspective. I have had
             | to integrate with banking systems and payment systems that
             | use this approach, and it is extremely frustrating and
             | comes off as a way of offloading work to the client. If a
             | payment capture succeeds but subsequently always returns
             | 500, an api client has to first query the status and then
             | execute if-then logic for something that could easily just
             | have been a retry of the request (3x the logic). This is
             | acceptable since there are a million other ways to mess up
             | idempotency so integrations kind of end up this way
             | anyway. But the worst part of such an approach is debugging
             | the issues as a client. A client cannot see the internal
             | logs and therefore has to call support which will ALWAYS
             | answer <<the transaction seems fine with us>> and basically
             | just close the case to protect their KPIs. I'm pretty sure
             | this type of issue has a name but cannot find it (the state
             | the client sees doesn't match the real state). I don't mean
             | to offend the work people put into these APIs, but I cannot
             | see the good qualities of this approach other than saving
             | development hours (and possibly saving one db query, but as
             | an effect you get a status request plus another capture
             | request as a workaround from your clients).
        
               | brandur wrote:
               | Just to clarify: I was speaking specifically about the
               | case where you have a series of DB calls (like: auth
               | user, retrieve account record, insert idempotency key, do
               | more stuff), and the first one succeeds and the next ones
               | fail. It can happen where the DB suddenly drops out as
               | the request is executing, but it's more likely that it's
               | either available or it isn't, so either the request
               | succeeds, or it can't start.
               | 
               | And this comment was just meant to talk about faults with
               | your own database. Once you are reaching out to other
               | systems you see all kinds of problems regularly, but
               | those tend to be handled more robustly because you kind
               | of have to.
        
             | felixhuttmann wrote:
             | Thank you for the insightful answer.
             | 
             | I see the problem with mutations in foreign systems if
             | those foreign systems do not support idempotency
             | themselves. IMHO, though, stripe should abstract away
             | faults in banks, and figure out how to work around faults
             | in bank's systems using e.g. automated refunds when a
             | duplicate charge is detected, and not just bubble up a 500
             | to stripe's customers and leave it to them to figure out.
             | If stripe cannot figure it out in an automated way toward
             | the bank whether the request succeeded, stripe's api
             | customers certainly can't either, and stripe should risk
             | double-charging the end customer knowing that the end
             | customer will complain and request a chargeback.
             | 
             | > The most honest answer is that Stripe wasn't built on a
             | data store where transactions are supported
             | 
             | Transactions are not necessary if one can do an insert
             | conditional on the key not yet existing, but then it is
             | required to have the idempotency key from the client enter
             | into the primary key.
        
               | brandur wrote:
               | > _IMHO, though, stripe should abstract away faults in
               | banks, and figure out how to work around faults in bank's
               | systems using e.g. automated refunds when a duplicate
               | charge is detected, and not just bubble up a 500 to
               | stripe's customers and leave it to them to figure out._
               | 
               | Yeah, this is what Stripe tries to do. Most problems
               | during calls to foreign systems are handled in ways
               | that try hard not to send back an internal error. 500s
               | are tracked carefully because they're painful to the
               | user, but also because they leave behind potentially bad
               | state that'll eventually cause problems internally and
               | externally.
               | 
               | If they can be reconciled, a webhook will be fired to
               | give the caller a more determinate answer (obviously less
               | convenient for them to handle, but at least some sort of
               | message makes its way back). More documentation on that
               | here:
               | 
               | https://stripe.com/docs/error-handling#server-errors
               | 
               | > _Transactions are not necessary if one can do an insert
               | conditional on the key not yet existing, but then it is
               | required to have the idempotency key from the client
               | enter into the primary key._
               | 
               | This is how the implementation works more or less -- an
               | insert on a unique index that will error on a duplicate
               | so you know it happened already. You'd probably implement
               | it similarly in basically any major database whether
               | Mongo, Postgres, MySQL, etc.
               | 
               | This is only a very small part of what transactions get
               | you though -- if a transaction-based request makes it
               | midway through its lifecycle and then fails, it can roll
               | back to a fresh slate. In a transaction-less system, you
               | need to come up with some other answer for what to do
               | with the partial state that was left behind.
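
The "insert on a unique index that will error on a duplicate" idea above can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module; the table name `idempotency_keys` and the helpers `make_store` and `claim_key` are made up for the example and are not Stripe's implementation.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory table whose primary key is the idempotency key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, response TEXT)"
    )
    return conn


def claim_key(conn: sqlite3.Connection, key: str) -> bool:
    """Return True if this call inserted the key, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO idempotency_keys (key) VALUES (?)", (key,)
            )
        return True
    except sqlite3.IntegrityError:
        # The uniqueness constraint fired: this key was seen before.
        return False


conn = make_store()
first = claim_key(conn, "key-123")   # new key
second = claim_key(conn, "key-123")  # duplicate key
```

As the comment points out, this covers only duplicate detection; it does nothing about partial state left behind when a later step fails.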
        
           | joshribakoff wrote:
           | Many developers conflate idempotency and purity. It appears
           | stripe is trying for purity, two requests always have the
           | same response. This is likely misguided.
           | 
           | Idempotency would be like the API failed to generate the
           | success response, but the backend did issue the money
           | movement. On subsequent retries what matters is that it
           | doesn't do a duplicate money movement. The client ought to
           | get a response indicating the money was successfully moved
           | (since it was), perhaps with the earlier timestamp as a way
           | of indicating it was already moved. Or it ought to get a
           | special response that indicates the client can stop retrying
           | 
           | Sending the same error upon success sounds like the client
           | has no way to know when to stop retrying. Sure it makes the
           | API pure, but what problem does it even solve?
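
The behaviour described above (no duplicate money movement on retry, and a response that tells the client the earlier attempt already succeeded) might look something like this in miniature. The `Ledger` class, its field names and the `replayed` flag are all hypothetical, and the sketch keeps everything in memory, so it ignores the hard case of a crash between moving the money and recording the result.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Ledger:
    movements: list = field(default_factory=list)  # one entry per real movement
    results: dict = field(default_factory=dict)    # idempotency key -> first result

    def charge(self, key: str, amount: int) -> dict:
        if key in self.results:
            # Retry of a request that already went through: do not move money
            # again, report the original outcome with its original timestamp.
            return {**self.results[key], "replayed": True}
        self.movements.append(amount)
        result = {"status": "succeeded", "amount": amount, "moved_at": time.time()}
        self.results[key] = result
        return {**result, "replayed": False}


ledger = Ledger()
original = ledger.charge("key-1", 100)
retry = ledger.charge("key-1", 100)  # same key, e.g. after a lost response
```

Here the retry sees `status: succeeded` plus the earlier `moved_at`, which gives the client a clear signal to stop retrying.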
        
             | rob-olmos wrote:
             | For Stripe, there's two headers that can help clients with
             | retrying or not: "Idempotent-Replayed: true"[1] and
             | "Stripe-Should-Retry"[2].
             | 
             | Whether they're helpful in all server-side error situations
             | I'm not sure.
             | 
             | 1: https://stripe.com/docs/idempotency#sending-idempotency-keys
             | 
             | 2: https://stripe.com/docs/error-handling#the-stripe-
             | should-ret...
        
         | slt2021 wrote:
         | are you mixing command and query together in one request? i
         | thought it is a bad practice.
         | 
         | mutating data is command and is a separate request and fetching
         | customer details is another request
        
           | chrismorgan wrote:
           | On the contrary, for remote APIs like this having commands
           | return the resulting state is the only sane path; otherwise
           | you get mandatory inefficiency (doubling latency) and race
           | conditions.
        
             | joshribakoff wrote:
             | CQRS doesn't obviate race conditions. Especially if
             | multiple commands can be in flight at the same time (the
             | response can still be received out of order), or if there
             | are multiple clients mutating the same resource.
        
               | chrismorgan wrote:
               | > _CQRS doesn't obviate race conditions._
               | 
               | I think you may be talking at cross purposes with me
               | here. I'm saying that command-query separation[1] _causes_
               | race conditions. Having the mutating API return state is
               | the only way to obtain the value of the resource _at the
               | time of mutation_.
               | 
               | --
               | 
               | [1] CQS; CQRS, where you use different representations for
               | command and query, is not necessarily applicable here.
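
A toy illustration of the point above, with two clients simulated by interleaving calls by hand. The `Account` class and its method names are invented for the example; no real API is implied.

```python
class Account:
    def __init__(self) -> None:
        self.balance = 0

    def deposit(self, amount: int) -> None:
        """Command only: mutates, returns nothing."""
        self.balance += amount

    def get_balance(self) -> int:
        """Query only: reads current state."""
        return self.balance

    def deposit_and_get(self, amount: int) -> int:
        """Command that also returns the state at the time of mutation."""
        self.balance += amount
        return self.balance


# Separate command and query: client B's deposit lands between
# client A's deposit and client A's follow-up query.
separated = Account()
separated.deposit(10)                          # client A
separated.deposit(5)                           # client B
seen_by_a_separate = separated.get_balance()   # A sees B's change too

# Combined: client A gets the balance as of its own mutation.
combined = Account()
seen_by_a_combined = combined.deposit_and_get(10)  # client A
combined.deposit(5)                                # client B
```

In the separated version client A also needs a second round trip, which is the doubled latency mentioned earlier in the thread.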
        
               | joshribakoff wrote:
               | Yes you're right. I meant to write that abstinence from
               | CQ[R]S doesn't prevent race conditions.
        
           | manigandham wrote:
           | Command and Query is really just named that in CQRS, which is
           | specifically a pattern that splits read/write into separate
           | traffic paths. It's not the only way to do things, and I find
           | it usually isn't that great in practice.
        
           | joshribakoff wrote:
           | It's not bad practice. CQRS is just a pattern with pros and
           | cons.
        
       | vp8989 wrote:
       | It's more appropriate IMO to name it something like "Request-
       | Identifier". All the client should indicate is that if you get 2
       | or more POSTs with that same Id, you are free to no-op on any
       | after the 1st successfully handled one.
       | 
       | Idempotency is an emergent property of a system so it doesn't
       | make sense that a client would dictate the key that's used to
       | provide that behavior. So if the Idempotency-Key is quite likely
       | not to be the actual "Idempotency-Key" in any sufficiently
       | complex system, then don't name it that.
        
         | bhawks wrote:
         | Providing 'idempotency' in this case commonly means to be
         | robust against transient network partitions and timing delays.
         | This goal can be met efficiently with a timebound nonce.
         | (Indeed that is a common approach).
         | 
         | A request identifier is a much stronger and user visible
         | property to your API. Where an 'idempotency' key is used simply
         | to prevent replays, a request identifier is a property that a
         | backend needs to store, index and serve queries against for a
         | much longer period of time. Furthermore this data must be
         | client generated by definition which would require some
         | namespacing mechanism to keep the data model logically correct.
         | Alternatively, just give me a random number (as this rfc
         | suggests).
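
A rough sketch of the "timebound nonce" approach mentioned above: remember each key only for a limited window and treat a repeat inside that window as a replay. The `ExpiringKeys` class, its `first_use` method and the 60-second window are assumptions made for illustration; a real service would keep this in shared storage rather than a local dict.

```python
import time


class ExpiringKeys:
    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the behaviour can be tested
        self._seen = {}      # key -> time at which it expires

    def first_use(self, key: str) -> bool:
        """Return True the first time a key is seen within its window."""
        now = self.clock()
        # Forget keys whose window has passed.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if key in self._seen:
            return False
        self._seen[key] = now + self.ttl
        return True


fake_now = [0.0]
keys = ExpiringKeys(ttl_seconds=60, clock=lambda: fake_now[0])
accepted = keys.first_use("abc")        # first time: accepted
replayed = not keys.first_use("abc")    # inside the window: treated as a replay
fake_now[0] = 61.0
accepted_again = keys.first_use("abc")  # window expired: accepted again
```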
        
         | jrochkind1 wrote:
         | It matters that the client knows the server is offering
         | idempotency, because then the client can safely do things like
         | re-try a request if it didn't get confirmation -- at least I
         | think this makes that possible? It is surprising to me the
         | standard doesn't mention it though, so maybe I'm missing the
         | boat?
         | 
         | If a client just knows it provided a "request-id", but not that
         | the server provides idempotent semantics, the client can not
         | safely re-try a request in which confirmation was not
         | retrieved.
         | 
         | Or, you know how when you click the back button and see that
         | message "this was the result of a POST request, but we can't
         | actually show you the response" -- because the response wasn't
         | cacheable/cached, and the client can't safely re-do the request
         | to get a new response, because the request was not idempotent.
         | 
         | The client knowing the request is idempotent avoids all sorts
         | of inconvenient (and sometimes confusing) client/user-agent
         | behavior. For both ordinary human-readable HTML, and APIs
         | (although by being an HTTP header, I guess this standard is
         | only about APIs as a use case?). Just knowing "we supplied a
         | request-id we don't know what the server plans to do with it"
         | is not sufficient. This is why HTTP distinguishes between
         | methods that are idempotent and not -- which just means methods
         | that it's, by the standard, the server's obligation to
         | implement as idempotent. This is a way of making any method
         | opt-in idempotent.
         | 
         | I think?
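
The client-side consequence described above (being able to safely re-try when no confirmation arrived) could be sketched like this. The `send` callable, the `TransientError` exception and `send_with_retries` are hypothetical stand-ins for whatever HTTP client is in use; only the `Idempotency-Key` header name comes from the draft under discussion.

```python
import uuid


class TransientError(Exception):
    """Stand-in for a timeout or dropped connection."""


def send_with_retries(send, payload, max_attempts: int = 3):
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # One key per logical operation, reused unchanged on every retry.
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(payload, {"Idempotency-Key": key})
        except TransientError as exc:
            last_error = exc
    raise last_error


seen_keys = []


def flaky_send(payload, headers):
    """Fails on the first call, succeeds afterwards."""
    seen_keys.append(headers["Idempotency-Key"])
    if len(seen_keys) == 1:
        raise TransientError("no confirmation received")
    return {"ok": True, "payload": payload}


result = send_with_retries(flaky_send, {"amount": 100})
```

The retry is only safe if the server actually promises idempotent handling for that key, which is the point being made above.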
        
         | karmakaze wrote:
         | I want both. If there are two network requests for the same
         | action, I want to be able to trace them distinctly and also
         | know that they are doing one thing.
        
         | joshribakoff wrote:
         | Yes I have seen idempotency keys generated on the client that
         | were comprised of the date and amount of money used to make a
         | payment. Don't do this. Just use a randomly generated request
         | ID when the request hits the API. The idempotency key is
         | (usually) there to prevent things like an RPC or event (in the
         | backend) being retried causing duplicate side effects.
         | Presumably you still want users to be able to intentionally
         | submit two payments for the same amount on the same date (csrf
         | tokens are more appropriate to prevent accidental duplicate
         | submission)
        
       | jdnier wrote:
       | I see an interesting application for the Validity, Expiry,
       | Enforcement, Fingerprint sections.
       | 
       | Many web apps generate HTML forms but submit form values via an
       | API endpoint. With browser web tools and standard options like
       | "copy-as-curl", it's very easy to run a "replay attack", where an
       | adversary submits the form manually once, does a copy-as-curl (or
       | similar) on that POST, then replays the POST repeatedly, updating
       | particular values in the generated curl output for each
       | subsequent POST.
       | 
       | This gets around form protections like captcha and allows for
       | rapid abuse via the API that backs the form. Enforcing that an
       | idempotency key/token be generated on the server side or that it
       | match the submitted content or be valid for a short timespan
       | could make replay attacks less convenient.
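
One way to picture a server-generated, short-lived, single-use token as floated above. This is a minimal sketch using Python's standard hmac and secrets modules; the names `issue_token` and `redeem_token`, the token layout and the 300-second default are all assumptions, and the set of used nonces lives in process memory purely for illustration.

```python
import hashlib
import hmac
import secrets

SECRET = secrets.token_bytes(32)  # server-side signing secret (placeholder)
_used_nonces = set()              # nonces that were already redeemed


def _sign(nonce: str, expires: int) -> str:
    message = f"{nonce}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def issue_token(now: float, ttl: int = 300) -> str:
    """Issue a token of the form nonce:expires:mac, valid for ttl seconds."""
    nonce = secrets.token_hex(16)
    expires = int(now) + ttl
    return f"{nonce}:{expires}:{_sign(nonce, expires)}"


def redeem_token(token: str, now: float) -> bool:
    """Accept a token once, and only if it is authentic and unexpired."""
    parts = token.split(":")
    if len(parts) != 3:
        return False
    nonce, expires, mac = parts
    if not (expires.isascii() and expires.isdigit()):
        return False
    expected = _sign(nonce, int(expires))
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return False
    if int(expires) < now or nonce in _used_nonces:
        return False
    _used_nonces.add(nonce)
    return True


token = issue_token(now=1000.0)
first_submit = redeem_token(token, now=1001.0)   # accepted
replay_submit = redeem_token(token, now=1002.0)  # same token again: rejected
```

A copied curl command carrying the old token would be rejected on replay, though nothing here stops a script from requesting a fresh token each time.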
        
         | zemnmez wrote:
         | This is not related to replay. This is just a rate limiting
         | issue. Throwing arbitrary auth mechanisms between the server
         | and the client like that just means they need to make more
         | requests per fulfilled request.
         | 
         | The client is authorised to make the request - it would only be
         | a replay attack if the client were not authorised to make the
         | request and that limitation was bypassed by capturing and then
         | replaying it.
         | 
         | if you can replay the same captcha challenge over and over your
         | captcha is broken.
        
         | DangitBobby wrote:
         | Is this protection not already baked into the CSRF token?
        
           | zemnmez wrote:
           | No, a CSRF token only introduces information that is not part
           | of the ambient authority of the HTTP client (i.e. an attacker
           | cannot implicitly use it) - it is not uncommon to have
           | deterministic or relatively constant CSRF tokens (e.g. the
           | double-send cookie method). A nonce would fulfil your
           | criteria but is importantly _not_ a valid CSRF token unless
           | it is also correlated or derived from something the attacker
           | can't know like a session; otherwise the attacker can use
           | their own nonce to submit a victim request.
        
           | Sayrus wrote:
           | That and also, most captcha I've used so far require a
           | server-side validation that is not replayable with Curl (as
           | the token has been used during the first submission).
        
             | jdnier wrote:
             | Yes, but captcha is on the form. I'm thinking more about a
             | single-page app that submits form data via an API. That API
             | call can generally be replayed.
        
         | an_ko wrote:
         | With that scheme, I imagine the server would need to maintain a
         | per-client timestamp of when a token was last generated for
         | them, so as not to generate them too often.
         | 
         | But at that point, why not use that timestamp directly, to
         | rate-limit each client's form submissions? What's the benefit
         | of also issuing tokens? (Do you mean to thwart copy-as-curl
         | script kiddies? Parsing a token out of a previous message to
         | add it to a new one is a pretty low bar.)
        
           | jdnier wrote:
           | > Do you mean to thwart copy-as-curl script kiddies?
           | 
           | Yes, that's what I had in mind.
        
       | hcarvalhoalves wrote:
       | > Let's say a client of an HTTP API wants to create (or update) a
       | resource using a "POST" method. Since "POST" is NOT an idempotent
       | method (...)
       | 
       | Use PUT instead? The deterministic key that would go in the
       | Header goes as part of the URI. I'm not seeing why standardize a
       | roundabout way of doing the same. Am I missing something?
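
For comparison, the PUT alternative suggested above, reduced to a toy in-memory store where the client-chosen key is part of the path. The `resources` dict, the `put` function and the `/charges/...` path are illustrative assumptions only.

```python
resources = {}  # path -> stored representation


def put(path: str, body: dict) -> int:
    """Create or replace the resource at path; return an HTTP-style status."""
    created = path not in resources
    resources[path] = body
    return 201 if created else 200


charge = {"amount": 100, "currency": "usd", "customer": 123}
first_status = put("/charges/client-key-1", charge)   # creates the resource
retry_status = put("/charges/client-key-1", charge)   # retry: same end state
```

Repeating the PUT leaves exactly one resource behind, which is the idempotency the method promises.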
        
         | p4bl0 wrote:
         | Yes, that was my thought too. The HTTP verb PUT is specifically
         | made for idempotency, why would HTTP need a header field for
         | that?
        
           | bpicolo wrote:
           | Because the use case isn't necessarily modifying a resource.
           | Charging somebody for a product, for example, doesn't have
           | natural idempotency. They could buy the same item twice
        
             | [deleted]
        
         | ec109685 wrote:
         | The semantics don't quite work out. If you don't know the ID of
         | a transaction, for example, using PUT with a client derived key
         | wouldn't make sense given a subsequent GET would be made
         | against the transaction ID, not the idempotency key.
        
           | hcarvalhoalves wrote:
           | An alternative is for the client to request an ID, another is
           | for the server to respond to the PUT with a 307 Temporary
           | Redirect to GET the resource, idempotency is transparent for
           | the client.
        
       ___________________________________________________________________
       (page generated 2021-07-04 23:00 UTC)