[HN Gopher] Smashing the state machine: the true potential of we...
___________________________________________________________________
Smashing the state machine: the true potential of web race
conditions
Author : chatmasta
Score : 103 points
Date : 2023-09-19 21:15 UTC (1 day ago)
(HTM) web link (portswigger.net)
(TXT) w3m dump (portswigger.net)
| josephg wrote:
| The core problem here is that web application developers don't
| understand or use database transactions. I've seen this myself at
| almost every web shop I've ever worked at - queries / updates
| would be made to the database in series - "fetch this, then
| update that, then update this other thing". Your code works fine
| locally, but when multiple sessions interact with that data all
| at the same time, who knows what will happen.
|
| I hadn't thought of using these race conditions to trigger
| security vulnerabilities. That makes all sorts of sense, and I'm
| going to use this work to justify my persnicketiness in future
| projects.
|
| It doesn't help that popular ORMs either don't expose
| transactions at all or they consider transaction handling to be
| an advanced feature. It looks like mongodb only (finally) added
| transactions in mongo 4.2.
|
| The general rule for how to avoid these problems isn't locking.
| It's using a database transaction per HTTP request, or "operation"
| - however you define that. Transactions are perfect because if
| any part of the request fails you can rollback the entire
| transaction. And if you really need to, you can emulate something
| similar yourself by having a "last modified" nonce on your record
| (a random number will do). If you ever do a read-write cycle -
| say from a web frontend - then the read should include reading
| the nonce, and the write should fail on the server if the nonce
| has changed since whatever value you previously read. (And you
| can quietly fix this by simply retrying the entire operation).
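|
| A minimal sketch of that nonce scheme (optimistic locking),
| using Python's sqlite3 and a made-up accounts table purely for
| illustration:
|
|     import secrets
|     import sqlite3
|
|     db = sqlite3.connect(":memory:")
|     db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY,"
|                " balance INTEGER, nonce TEXT)")
|     db.execute("INSERT INTO accounts VALUES (1, 100, ?)",
|                (secrets.token_hex(8),))
|     db.commit()
|
|     def withdraw(account_id, amount, retries=3):
|         for _ in range(retries):
|             # Read the current value *and* the nonce.
|             balance, nonce = db.execute(
|                 "SELECT balance, nonce FROM accounts"
|                 " WHERE id = ?", (account_id,)).fetchone()
|             if balance < amount:
|                 raise ValueError("insufficient funds")
|             # The write succeeds only if the nonce is unchanged,
|             # i.e. nobody raced us between the read and the write.
|             cur = db.execute(
|                 "UPDATE accounts SET balance = ?, nonce = ?"
|                 " WHERE id = ? AND nonce = ?",
|                 (balance - amount, secrets.token_hex(8),
|                  account_id, nonce))
|             db.commit()
|             if cur.rowcount == 1:
|                 return
|             # Nonce changed under us: quietly retry the whole
|             # read-write cycle, as described above.
|         raise RuntimeError("too much contention; giving up")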
|
| From a security point of view, this is a fantastic find. I
| anticipate a massive amount of code out there is at best buggy,
| and at worst vulnerable to these sorts of attacks.
| mickeyp wrote:
| > The core problem here is that web application developers
| don't understand or use database transactions. I've seen this
| myself at almost every web shop I've ever worked at - queries /
| updates would be made to the database in series - "fetch this,
| then update that, then update this other thing". Your code
| works fine locally, but when multiple sessions interact with
| that data all at the same time, who knows what will happen.
|
| I mean, you're right, and it's a bugbear of mine, too. But
| let's not pretend that SELECT FOR UPDATE and out-of-DB
| manipulation of said data fixes everything. (Nor that a direct
| UPDATE statement would.)
|
| Ultimately, either your updates are linearisable or they are
| not. Changing someone's first name with or without this
| methodology does not alter the fact that "last one to change
| wins". Which change is the right one is in this case in the eye
| of the beholder and the domain you work in.
|
| But, yes, updates that build on the existing values to derive
| new ones are common data races because people, indeed, do not
| understand databases, transactions or isolation levels. There
| are easy fixes for this type of problem.
|
| But it's never quite as simple as making everything
| SERIALIZABLE and FOR UPDATE and expecting that to resolve
| version conflicts, because two different things can have
| different expectations of what a row value should be.
| dontlaugh wrote:
| SELECT FOR UPDATE is a much better default, though. Updates
| end up ordered by which transaction acquired the lock first,
| which tends to match intuition.
|
| Of course the callers may be surprised at being out-raced, so
| you still need to deal with that. Often it's enough to return
| the updated value.
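|
| A minimal sketch of that pattern, assuming psycopg2 and a
| hypothetical accounts table (Postgres is used for illustration
| since sqlite has no FOR UPDATE):
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")  # illustrative DSN
|
|     def withdraw(account_id, amount):
|         # "with conn" wraps the block in a single transaction:
|         # commit on success, rollback on any exception.
|         with conn, conn.cursor() as cur:
|             # FOR UPDATE locks the row: a concurrent call on
|             # the same row blocks here until we commit, so
|             # updates apply in the order the locks were taken.
|             cur.execute("SELECT balance FROM accounts"
|                         " WHERE id = %s FOR UPDATE",
|                         (account_id,))
|             (balance,) = cur.fetchone()
|             if balance < amount:
|                 raise ValueError("insufficient funds")
|             cur.execute("UPDATE accounts SET balance = %s"
|                         " WHERE id = %s",
|                         (balance - amount, account_id))
|             # Returning the new value lets an out-raced caller
|             # see what actually happened.
|             return balance - amount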
| josephg wrote:
| > But it's never quite as simple as making everything
| SERIALIZABLE and FOR UPDATE and expecting that to resolve
| version conflicts because two different things have different
| expectations of what a row value should be.
|
| Can you give some examples where making everything
| serializable is a poor choice?
|
| I think most of the vulnerabilities and bugs happen because
| people don't think about isolation / atomicity _at all_. And
| to me, SERIALIZABLE is probably the right default for 99% of
| regular websites. The example that comes to mind for me is
| realtime collaborative editing, though you can build that on
| top of serializable database transactions just fine. In what
| other situations do you need a different strategy?
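|
| For reference, a minimal sketch of transaction-per-operation
| under SERIALIZABLE, assuming psycopg2 and a hypothetical
| accounts table. Serialization failures are expected at this
| isolation level and are handled by retrying the whole
| transaction:
|
|     import psycopg2
|     from psycopg2 import errors
|     from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE
|
|     conn = psycopg2.connect("dbname=app")  # illustrative DSN
|     conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)
|
|     def transfer(src, dst, amount, retries=5):
|         for _ in range(retries):
|             try:
|                 # "with conn" commits on success and rolls
|                 # back on error - one transaction per block.
|                 with conn, conn.cursor() as cur:
|                     cur.execute(
|                         "UPDATE accounts SET balance ="
|                         " balance - %s WHERE id = %s",
|                         (amount, src))
|                     cur.execute(
|                         "UPDATE accounts SET balance ="
|                         " balance + %s WHERE id = %s",
|                         (amount, dst))
|                 return
|             except errors.SerializationFailure:
|                 continue  # a conflicting txn won; retry ours
|         raise RuntimeError("could not commit after retries")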
| tzs wrote:
| Are there are any relational databases, or add-ons or frontends
| for relational databases, that allow for transaction sharing?
|
| What I mean by "transaction sharing" is for separate
| connections to be simultaneously using the same transaction.
|
| Consider something whose logic is like this:
|
|     begin_transaction()
|     select_something()
|     update_something()
|     compute_something()
|     update_some_more()
|     commit_transaction()
|
| where compute_something() includes calling a service running in
| another process or on another server to return some data, and
| that service uses the same database, and we want it to see the
| data that update_something() updated.
|
| With transaction sharing it might instead work something like
| this:
|
|     tid = begin_shared_transaction()
|     select_something()
|     update_something()
|     compute_something(tid)
|     update_some_more()
|     commit_transaction()
|
| and in the service that compute_something(tid) calls, it could
| take tid as a parameter and do something like:
|
|     open_shared_transaction(tid)
|     ...
|     close_transaction()
| rockostrich wrote:
| I mean, it's possible to build something like that but why
| would you ever want to do that? It's so much simpler to just
| use a transaction per request and let the database's
| transaction logic handle which requests succeed and which
| ones fail. Clients should have retry logic to handle the
| failure scenarios. It's a lot simpler to reason about than
| trying to turn a distributed system into a single threaded
| one.
| josephg wrote:
| I presume the idea is to make microservices able to work
| with database transactions. Microservices: Proof that not
| all ideas can be winners.
| rockostrich wrote:
| Not really sure what the point is here. Microservices
| work great with database transactions.
| dontlaugh wrote:
| No, they don't. You can have a function in a library, to
| which you pass a transaction. You can't pass an open
| transaction to another service, in the general case. The
| service boundary forces multiple separate transactions.
| josephg wrote:
| I'm not sure if many databases offer this, but you can do it
| in foundationdb like this:
|
|     version = txn.get_read_version().wait()
|     compute_something(version)
|
| Then on the remote computer, you can create a new transaction
| and call:
|
|     def compute_something(version):
|         txn2 = db.create_transaction()
|         txn2.set_read_version(version)
|         # ... Then use the transaction as normal.
|
| This will ensure both transactions are looking at the same
| snapshot of the data in the database.
|
| Docs: https://apple.github.io/foundationdb/api-
| python.html#version...
| flir wrote:
| Hmm. I'd solve this by marshalling all the data in the parent
| process, and passing that, rather than the tid, to the
| compute process.
|
| Maybe you could do it with a database proxy. The whole
| approach feels unclean, though.
| johnnyworker wrote:
| Thanks for that! It seems trivial to enhance a DB wrapper to
| begin a transaction on the first query, and commit when the
| program finishes without error. Like honestly, unless I totally
| missed something, why is that not built into everything it
| could be built into? Or put differently, is there ever a good
| reason to _not_ use transactions?
| josephg wrote:
| To be clear, you want to begin a transaction at the start of
| each HTTP request and commit the transaction just before
| sending the response. (And error or retry if the transaction
| fails, depending on how your database works).
|
| But generally I agree. This pattern should be built into
| everything.
|
| Whenever I'm setting up an application server I usually whip
| up some middleware which takes care of this for me. The code
| is database specific - the semantics can differ slightly
| depending on what database you're using. And it only matters
| for requests which issue multiple commands to the database
| before returning. (But this happens all the time in many
| applications. Especially if you're using an ORM.)
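|
| A minimal sketch of that middleware shape, assuming Flask and
| sqlite3 purely for illustration - the same idea ports to any
| framework/database pair:
|
|     import sqlite3
|     from flask import Flask, g
|
|     app = Flask(__name__)
|
|     @app.before_request
|     def begin_transaction():
|         # isolation_level=None puts sqlite3 in manual
|         # transaction mode, so BEGIN/COMMIT are explicit.
|         g.db = sqlite3.connect("app.db", isolation_level=None)
|         g.db.execute("BEGIN")  # one transaction per request
|
|     @app.teardown_request
|     def end_transaction(exc):
|         db = g.pop("db", None)
|         if db is not None:
|             if exc is None:
|                 db.commit()    # request succeeded
|             else:
|                 db.rollback()  # any error undoes everything
|             db.close()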
| robocat wrote:
| I presume that a DB lock for the length of the request
| processing time would cause contention when the number of
| concurrent session requests increases (with user growth).
|
| The article mentions this example:
|
|     Some data structures aggressively tackle concurrency
|     issues by using locking to only allow a single worker to
|     access them at a time. One example of this is PHP's
|     native session handler - if you send PHP two requests in
|     the same session at the same time, they get processed
|     sequentially! This approach is secure against
|     session-based race conditions but it's terrible for
|     performance, and quite rare as a result.
| ishjoh wrote:
| Great article. I had no idea this situation had a name - 'time
| of check, time of use' (TOCTOU) - which is great, since now I
| can name it instead of having to describe it.
|
| I'm building an app on top of Django where I have to worry
| about this. If you're using Django, check out its support for
| select_for_update, and if your database supports it,
| nowait=True can be a great option: it fails the read
| immediately if a SELECT FOR UPDATE is already holding the lock:
|
| https://docs.djangoproject.com/en/4.2/ref/models/querysets/#...
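|
| A minimal sketch of that pattern, with Account standing in as
| a hypothetical model:
|
|     from django.db import DatabaseError, transaction
|     from myapp.models import Account  # hypothetical model
|
|     def withdraw(account_id, amount):
|         with transaction.atomic():
|             try:
|                 # nowait=True fails immediately if another
|                 # request already holds the row lock, instead
|                 # of queueing behind it.
|                 account = (Account.objects
|                            .select_for_update(nowait=True)
|                            .get(pk=account_id))
|             except DatabaseError:
|                 raise RuntimeError("row locked; try again")
|             if account.balance < amount:
|                 raise ValueError("insufficient funds")
|             account.balance -= amount
|             account.save()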
|
| Also worth mentioning is optimistic locking, if you're looking
| to solve the issue in a different way; there is more involved
| on the application side, but it has some advantages as well. I
| tend to prefer select_for_update with nowait=True since it's
| simpler on the application side, but I have used optimistic
| locking in the past with great success, and some systems
| support it OOTB.
| Here is a description from AWS for those curious:
|
| https://docs.aws.amazon.com/amazondynamodb/latest/developerg...
| 1vuio0pswjnm7 wrote:
| HTTP/2 is nightmarish.
| josephg wrote:
| The problem isn't HTTP/2. HTTP/1.1 is also vulnerable to these
| attacks; they're just harder to find, reproduce and ultimately
| fix.
| [deleted]
| rewmie wrote:
| > HTTP/2 is nightmarish.
|
| Why do you think that?
| robocat wrote:
| From article:
|
|     Finally, use the single-packet attack (or last-byte sync
|     if HTTP/2 isn't supported) to issue all the requests at
|     once. You can do this in Turbo Intruder using the
|     single-packet-attack template, or in Repeater using the
|     'Send group in parallel' option.
| rob-olmos wrote:
| Post on ACIDRain as an example of the ecomm time-gap issue back
| in the day: https://news.ycombinator.com/item?id=20027532
| imetatroll wrote:
| What I love about this is that HTTP/2, which is non-trivial
| compared to HTTP/1.1, is what allows this to happen. The same
| HTTP/2 that, in my opinion, takes away from the beauty of the
| web.
| albinowax_ wrote:
| At least HTTP/2 pretty much kills request smuggling (assuming
| there's no downgrading behind the scenes)
| socketcluster wrote:
| This is an excellent article.
|
| >> Every pentester knows that multi-step sequences are a hotbed
| for vulnerabilities, but with race conditions, everything is
| multi-step.
|
| This is something that we spent a lot of time thinking about and
| why we decided to upgrade SocketCluster (an open source WebSocket
| RPC + pub/sub system https://socketcluster.io/) to support async
| iterables (with for-await-of loops) as first-class citizens to
| consume messages/requests instead of callback listeners.
|
| Listener callbacks are inherently concurrent and, therefore,
| prone to vulnerabilities as adroitly described in this article.
| It's very difficult to enforce that certain actions are processed
| in-order using callbacks and the resulting code is typically
| anything but succinct...
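|
| A rough Python analogue of that difference (SocketCluster
| itself is JavaScript; an asyncio.Queue stands in for the
| socket):
|
|     import asyncio
|
|     async def handle(msg):
|         await asyncio.sleep(0.1)  # stand-in for real work
|         print("handled", msg)
|
|     async def consume(queue):
|         # Pulling messages off the queue one at a time
|         # serialises the handlers: each message finishes
|         # before the next starts, unlike a callback that
|         # fires concurrently for every message.
|         while True:
|             msg = await queue.get()
|             await handle(msg)
|             queue.task_done()
|
|     async def main():
|         queue = asyncio.Queue()
|         consumer = asyncio.create_task(consume(queue))
|         for i in range(3):
|             await queue.put(f"msg {i}")
|         await queue.join()  # wait for in-order processing
|         consumer.cancel()
|
|     asyncio.run(main())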
|
| Some users have still not upgraded to the latest SC version
| because it's just so different from what they're used to, but
| articles like this help to confirm our own observations and
| reinforce that it may be a good decision in the long term.
|
| For all of its benefits, though, one of the gotchas of a queue-
| based system to be aware of is the potential for backpressure to
| build up. In our case, we had to expose an API for backpressure
| monitoring/management.
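|
| A bounded queue is the simplest version of that idea - a
| sketch in the same asyncio terms as above (the maxsize is an
| arbitrary illustrative limit):
|
|     import asyncio
|
|     queue = asyncio.Queue(maxsize=100)  # cap the backlog
|
|     def enqueue_or_reject(msg):
|         # Fail fast instead of letting backpressure grow
|         # without bound while consumers fall behind.
|         try:
|             queue.put_nowait(msg)
|         except asyncio.QueueFull:
|             raise RuntimeError("consumer is falling behind")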
| aidos wrote:
| Hadn't thought about the possibility of using http/2 in order to
| generate race conditions. Nice. Interesting trick.
|
| Years ago, Stripe had a capture the flag related to distributed
| systems (I believe it was the second one they ran). One of the
| final levels involved attacking a system that would call, in
| turn, 3 other systems that each held part of the key, before
| calling back to your webhook.
| had spoken to by watching for incremented port numbers between
| the requests.
|
| I got killed by the jitter, but I seem to recall someone saying
| there was a way to stream your requests at a lower level on
| the same socket. Caveat, 10 years ago, details hazy.
|
| Found the HN thread from the time though
| https://news.ycombinator.com/item?id=4453857
| lifeisstillgood wrote:
| Is there a good shortish course / book / site out there that
| would take a developer from "I get the ideas and happily use
| burp" to "so that's how pentesting works"?
|
| It seems a valuable way to spend rapidly shrinking training
| budgets.
| yao420 wrote:
| The PortSwigger courses, from the creators of Burp and the
| employer of this researcher, are the best. Free too.
| lifeisstillgood wrote:
| Makes me look a bit silly now ...
|
| thank you
| [deleted]
| [deleted]
___________________________________________________________________
(page generated 2023-09-20 23:02 UTC)