[HN Gopher] The AWS S3 Denial of Wallet Amplification Attack
       ___________________________________________________________________
        
       The AWS S3 Denial of Wallet Amplification Attack
        
       Author : croes
       Score  : 165 points
       Date   : 2024-05-01 19:00 UTC (4 hours ago)
        
 (HTM) web link (blog.limbus-medtec.com)
 (TXT) w3m dump (blog.limbus-medtec.com)
        
       | andersa wrote:
       | > Potential remedies
       | 
       | - Stop using S3 and other AWS (perhaps it stands for Amazon Web
       | Scams?) things already and switch to Cloudflare R2...
        
         | vuln wrote:
         | CIAWS
        
       | voidwtf wrote:
       | The way billing is calculated should be clearly labeled along
       | with the pricing. Azure does this too, it's super unclear what
       | metric they're using to determine what will be billed for
        | requests. We're having to find out via trial and error. If we
        | request bytes 0-2GB of a 6GB file but the client cancels after
        | 400MB, are we paying for 2GB, 400MB, or 6GB?
       | 
       | Is there a billed difference between Range: 0-, no "Range"
       | header, and Range: 0-1GB if the client downloads 400MB in each
       | scenario?
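The three scenarios in this question can be sketched with a plain stdlib HTTP client. This is only an illustration of the request shapes being asked about; the host, path, and byte counts are placeholders, not anything from the thread.

```python
import http.client

def range_header(start, end=None):
    """Build an HTTP Range header; open-ended ('bytes=N-') when `end` is None."""
    return {"Range": f"bytes={start}-{'' if end is None else end}"}

def partial_get(host, path, headers, read_limit):
    """GET `path`, read up to `read_limit` bytes, then close the socket
    mid-transfer -- the 'client cancels' case from the comment."""
    conn = http.client.HTTPSConnection(host)
    conn.request("GET", path, headers=headers)
    resp = conn.getresponse()
    received = 0
    while received < read_limit:
        chunk = resp.read(1 << 20)  # read in 1 MiB pieces
        if not chunk:
            break
        received += len(chunk)
    conn.close()  # abort before the full range has arrived
    return received

# The three scenarios, each aborting after ~400 MB (HOST is hypothetical):
# partial_get(HOST, "/big.bin", range_header(0), 400 * 2**20)           # Range: bytes=0-
# partial_get(HOST, "/big.bin", {}, 400 * 2**20)                        # no Range header
# partial_get(HOST, "/big.bin", range_header(0, 2**30 - 1), 400 * 2**20)  # bounded 0..~1GB
```

The open question in the thread is which of the three numbers (requested range, bytes read, or file size) the egress meter records in each case.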
        
         | __roland__ wrote:
          | Sorry for not making this clearer (we'll fix this part of
          | the post): the gotcha is not that AWS fails to honor range
          | requests; it's that canceling one still adds the full
          | requested range of bytes to your egress bill (and this can
          | add up quickly), even though few or no bytes have actually
          | been transferred.
        
           | alchemist1e9 wrote:
            | On the other hand, you did ask for those bytes, so what
            | does "canceling" mean here? Just playing devil's advocate:
            | they likely did start fetching the data for you, and that
            | takes resources. Otherwise they would be open to a DoS
            | attack that initiates many requests and then cancels them.
        
             | __roland__ wrote:
              | Sure, that's true. The thing is: this was the same
              | requested (and cancelled) range on the same file(s),
              | over and over (it was a bug). Looking at this from the
              | outside, even some internal S3 caching should have
              | produced many cache hits, so S3 should not have had to
              | re-fetch the requested ranges internally every time
              | (there were dozens of identical requests per second,
              | each immediately cancelled).
             | 
             | On top of this, S3 already bills (separately) for any
             | request against a bucket (see the other current issue with
             | the invalid PUT requests against a secured bucket, which
             | still got billed to the bucket owner;
             | https://news.ycombinator.com/item?id=40203126). So I'd say
             | both the requests and the cancellations were already paid
             | for; the surprise was the 'egress' cost on top, of data
             | that was not actually leaving the AWS network.
             | 
              | Still, you are right that this consumes _some_
              | additional AWS resources, and it is probably a
              | non-trivial issue to fix in the 'billing system'.
        
       | CharlesW wrote:
       | "Thank you to everyone who brought this article to our attention.
       | We agree that customers should not have to pay for unauthorized
       | requests that they did not initiate. We'll have more to share on
       | exactly how we'll help prevent these charges shortly." -- Jeff
       | Barr, Chief Evangelist, Amazon Web Services
       | 
       | https://twitter.com/jeffbarr/status/1785386554372042890
        
         | wmf wrote:
         | Note that there are two separate issues being discussed.
        
         | gnabgib wrote:
         | Don't think that's the same problem - that's about the failed
         | puts still costing a dev money [0](272 points, 2 days ago, 99
         | comments).
         | 
         | [0]: https://news.ycombinator.com/item?id=40203126
        
       | wmf wrote:
       | "With range requests, the client can request to retrieve a part
       | of a file, but not the entire file. ... Due to the way AWS
       | calculates egress costs the transfer of the entire file is
       | billed." WTF if true.
        
         | fabian2k wrote:
          | That sounds egregious enough that I have trouble believing
          | it can be correct. My understanding is that AWS bills for
          | egress in every service; parts of the file that aren't
          | transferred are not part of that, so they can't be billed.
          | There could certainly be S3-specific charges that affect
          | cases like this, no idea. But if AWS bills the full egress
          | traffic cost for a range request, I'd consider that
          | essentially fraud.
        
           | belter wrote:
           | https://github.com/ZJONSSON/node-unzipper/issues/308
        
             | paulddraper wrote:
             | tl;dr
             | 
             | AWS user believes that testing on a 1Gbps connection for 45
             | min can't be more than $10 of egress.
             | 
             | Gets a $500 bill instead.
             | 
             | Note: This user specified a _lower_ range but not an
             | _upper_ range on the request (and closed the connection
             | prematurely). Essentially read() with an offset, for a ZIP
             | tool.
             | 
             | See also: https://news.ycombinator.com/item?id=40205213
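As a sanity check on that linked issue, the physics of the link bound what could actually have been transferred. A rough sketch; the $0.09/GB figure is an assumption (a common S3 internet-egress list price, varying by region and volume tier):

```python
# Upper bound on what a 1 Gbps link can deliver in 45 minutes, vs. the
# volume implied by a $500 egress bill (price assumed: $0.09/GB).
link_bps = 1_000_000_000          # 1 Gbps connection
seconds = 45 * 60                 # 45 minutes of testing
max_bytes = link_bps / 8 * seconds
max_gb = max_bytes / 1e9          # ~337.5 GB: the most that could arrive

price_per_gb = 0.09               # assumed egress price
max_cost = max_gb * price_per_gb  # ~$30 even at full line rate
billed_gb = 500 / price_per_gb    # ~5,556 GB implied by the $500 bill
```

Under these assumed prices, even a fully saturated link for the whole 45 minutes tops out around $30, so a $500 bill implies roughly 16x more "egress" than the wire could physically carry.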
        
               | yonixwm wrote:
                | So I guess the OP's attack is a case where AWS
                | calculates the price based on the unbounded Range
                | header and not on the actual egress.
        
               | scottlamb wrote:
               | More or less. The article quotes AWS as saying the
               | following:
               | 
               | > Amazon S3 attempts to stop the streaming of data, but
               | it does not happen instantaneously.
               | 
               | ...which doesn't really explain it. It shouldn't send
               | more than a TCP window after the connection is closed,
               | and TCP windows are at most 1 GiB [1], usually much less,
               | so this completely fails to explain the article's
               | observed 3 TB sent vs 130 TB billed.
               | 
               | The article goes on to say:
               | 
               | > Okay, this is half the explanation. AWS customers are
               | not billed for the data actually transferred to the
               | Internet but instead for some amount of data that is
               | cached internally.
               | 
               | In other words, how much they bill really isn't bounded
               | by how much is sent at all. This is unacceptable.
               | 
               | [1] https://en.wikipedia.org/wiki/TCP_window_scale_option
        
               | easton wrote:
               | > this completely fails to explain the article's observed
               | 3 TB sent vs 130 TB billed
               | 
                | I interpreted that as their code doing this over and
                | over again, so in total they retrieved 3TB across a
                | set of requests. Still horrifying, but mildly more
               | explainable.
        
               | slt2021 wrote:
                | This can be explained if it's not egress out of AWS,
                | but egress out of the S3 system itself.
                | 
                | S3 stores objects in internal chunks, so retrieving an
                | object from such a high-availability, high-performance
                | service means it pulls some block X of data and caches
                | it before sending it through the socket.
                | 
                | That block X of data leaves internal S3 storage, just
                | not the bigger Internet egress subsystem.
                | 
                | So technically AWS may argue this is egress for S3,
                | just not for AWS.
        
               | vermilingua wrote:
               | Then subsequent requests that hit the cache shouldn't be
               | charged by that logic.
        
               | slt2021 wrote:
                | S3 is a complex system; with subsequent requests you
                | could be hitting a different node where that cache
                | entry does not exist yet.
                | 
                | If you think egress is expensive, well, storing data
                | in RAM for caching purposes is 1,000,000x more
                | expensive.
                | 
                | A lot of stuff could be happening. The main problem is
                | that AWS (I think) is charging for egress out of the
                | S3 system, but customers are looking at their ingress
                | on the client side, and there is a mismatch.
        
               | paulddraper wrote:
               | > AWS customers are not billed for the data actually
               | transferred to the Internet but instead for some amount
               | of data that is cached internally.
               | 
                | But egress fees only apply to S3 transfers outside of
                | AWS?
               | 
               | So which is it? Data transferred to the Internet? Or data
               | processed internally?
        
               | fabian2k wrote:
                | There is probably a small area where it's difficult to
                | measure, so I would not expect billing to be exact to
                | the byte here. But billing for the full requested
                | range when the entire range was not actually
                | transferred is just not correct and not acceptable.
        
               | paulddraper wrote:
               | Certainly if you are charging for _internet egress_.
               | 
               | Like, charging for requests or internal data processing,
               | sure okay.
               | 
                | But this is a charge specifically for the _data
                | transferred from AWS to the internet_. So if you're
                | not transferring data to the internet....
        
               | fabian2k wrote:
               | The part where I think there is some flexibility is about
               | the difference between "bytes attempted to transfer" and
               | "bytes actually transferred". I think it is pretty fair
               | to bill for the former, as long as you abort requests in
                | a reasonable way. So I don't expect billing to be
                | exact to the transferred byte, but I do expect it not
                | to exceed the actual transfer by more than whatever
                | the transfer chunk size is.
        
               | paulddraper wrote:
               | Sure. In this case specifically AWS is attempting to
               | transfer 70Gbps through a 1Gbps pipe.
        
               | klabb3 wrote:
                | That's an orthogonal issue. There's no interpretation
                | of "egress" that means "stuff we do internally before
                | it leaves AWS data centers". If the TCP connection is
                | reset, only a few MB would leave AWS frontend servers.
                | Instead, it appears they've been basing the number on
                | the range in the request and/or whatever internal
                | caching/loading they're doing within S3, which again
                | has nothing to do with egress.
               | 
               | I mean, we already know egress is short for egregious.
               | It's an incredibly bad look to be overestimating the
               | "fuck you" part of the bill.
        
           | __roland__ wrote:
            | Sorry, I think that part of our write-up is misleading (I
            | was involved in analyzing the issue described here). To
            | the best of our understanding, what happens is the
            | following:
           | 
           | - A client sends range requests and cancels them quickly.
           | 
           | - The full range request data will be billed (NOT the whole
           | file), so I think this should read that the entire requested
           | _range_ gets billed, even if it never gets transferred (the
            | explanation we received for this is that it's due to some
           | internal buffering S3 is doing, and they do count this as
           | egress).
           | 
           | In any case, if you send and cancel such requests quickly
           | (which is easy enough, this was not even an adversarial
           | situation, just a bug in some client API code) the egress
           | cost is many times higher than your theoretical bandwidth
           | (and about 80x higher than in the AWS documentation, hence
           | the blogpost).
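The amplification described above can be put in rough numbers. All figures below are hypothetical, chosen only to mirror the shape of the bug (identical ranged GETs cancelled almost immediately):

```python
# Hypothetical figures mirroring the bug: a 1 GiB requested range,
# 50 cancelled requests per second ("dozens"), ~5 MiB actually on
# the wire before each cancel.
range_bytes = 1 * 2**30       # what each request asks for (and gets billed)
requests_per_s = 50           # identical requests per second
wire_bytes = 5 * 2**20        # what actually leaves before the cancel

billed_per_s = requests_per_s * range_bytes   # counted as egress
actual_per_s = requests_per_s * wire_bytes    # actually transferred
amplification = billed_per_s / actual_per_s   # ratio of billed to real
```

Under these assumed numbers the billed egress runs about 205x the bytes actually sent, which is the same order as the "many times your theoretical bandwidth" observation above.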
        
             | nijave wrote:
             | This is a problem with lots of services. Blocking large
             | quantities of legitimate looking requests is a hard
             | problem. Request cancellation is also tricky and not
             | supported well in a lot of frameworks/programming
             | languages.
        
         | nicklecompte wrote:
         | This must be a regression bug in AWS's internal system. At a
         | past job (2020) we used S3 to store a large amount of genomic
          | data, and a web application used range requests to visualize
         | tiny segments of the genetic sequence in relevant genes - like
         | 5kb out of 50GB. If AWS had billed the cost of an entire
         | genome/exome every time we did that, we would have noticed. I
         | monitored costs pretty closely, S3 was never a problem compared
         | to EC2.
         | 
         | It also seemed like the root cause was an _interrupted_ range
          | request (although I wasn't fully clear on that). Even so, that
         | seems like a recent regression. It took me ages to get that
         | stupid app working, I interrupted a lot of range requests :)
        
           | nielsole wrote:
           | S3 egress costs are free if the traffic stays within AWS.
           | Sounds like your clients were EC2 instances so this wouldn't
           | apply to you, would it?
        
             | mikepurvis wrote:
             | If it was a web application as stated in the GP, then it
             | would indeed be egress as the request would be coming from
             | a browser.
        
               | nicklecompte wrote:
               | Yes, it was client-side JavaScript making the range
               | requests, asking for a string of genomic data to render
               | in the browser. It was only to give the scientists a
               | pretty picture :) The EC2 costs were largely
               | ElasticSearch for a different function, which never
               | looked at the data in S3.
        
           | __roland__ wrote:
           | You are right, this is about _canceling_ range requests and
           | still getting billed, not about requesting ranges and getting
            | billed for the complete file egress. Sorry; we'll make the
           | post clearer.
        
       | belter wrote:
       | https://news.ycombinator.com/item?id=40203126
       | 
       | https://news.ycombinator.com/item?id=40221108
        
         | itsdrewmiller wrote:
         | Those are about a different issue - not a great time for S3
         | billing!
        
           | belter wrote:
            | Correct. It's like a game of negative chess: whoever
            | racks up the biggest bill, in the shortest amount of time,
            | with the least amount of activity, wins :-)
        
       | andrewstuart wrote:
       | Why is anyone using S3 when Cloudflare R2 is free?
        
         | ezekiel68 wrote:
         | Because of "The Rise of Worse is Better" (search it) and
         | because a 900-lb industry gorilla is never displaced quickly or
         | easily.
        
         | surfingdino wrote:
         | Because of the other AWS services you get access to.
        
         | zedpm wrote:
         | Lots of reasons. My company started using AWS (and specifically
         | S3) something like 9 years ago; R2 wasn't even on the radar
         | back then. If I were starting from scratch today, I'd be
         | looking seriously at Cloudflare as a platform, but it's only in
         | the last year or two that they've offered these services that
         | would make it possible to build substantial applications.
        
         | waiwai933 wrote:
         | R2 bandwidth is free, but storage is not.
         | 
         | R2 also doesn't have all the features that S3 does - including
         | an equivalent of S3 Glacier, which is cheaper storage than R2.
         | R2 also doesn't have object tagging, object-level permissions,
         | or object locking. Sure, you could build your own layer in
         | front of R2 that gives you these features, but are you
         | necessarily saving money over just using S3?
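The storage-vs-egress trade-off behind this comment is easy to sketch. The rates below are illustrative assumptions only (list prices change, and tiers, Glacier, and the missing R2 features all shift the picture):

```python
# Assumed monthly rates (illustrative only, not current list prices):
# S3: $0.023/GB-month storage + $0.09/GB internet egress
# R2: $0.015/GB-month storage + $0 egress
def monthly_cost(storage_gb, egress_gb, storage_rate, egress_rate):
    """Simple storage + egress monthly bill."""
    return storage_gb * storage_rate + egress_gb * egress_rate

# 1 TB stored, 200 GB/month served to the internet:
s3 = monthly_cost(1000, 200, 0.023, 0.09)   # ~ $41/month
r2 = monthly_cost(1000, 200, 0.015, 0.0)    # ~ $15/month
```

The gap is entirely egress-driven: at zero egress the two are close, which is why the comment's point about Glacier and missing features matters for storage-heavy workloads.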
        
       | bearjaws wrote:
        | I hate that it's essentially half ChatGPT-generated,
        | especially given the huge explanation of AWS.
        
         | tills13 wrote:
         | A "AI Generated" label would be nice, here.
        
           | anon373839 wrote:
           | These AI accusations are becoming a tired trope. What,
           | exactly, about the article gives you the impression that it
           | was generated by an LLM?
        
             | bakugo wrote:
             | If writers don't want people to think their content is AI
             | generated, maybe they shouldn't put ugly AI generated
             | images on top of everything they write.
        
               | anon373839 wrote:
               | Ah, so it's the illustrations?
        
           | flockonus wrote:
            | AI-phobic
        
         | __roland__ wrote:
         | I can assure you this was not AI-generated, apart from the
         | 'symbolic image' (which should be fairly obvious :).
         | 
         | Maybe that's just our non-native English shining through. In
         | any case, as a small European company in the healthcare space,
         | we are quite used to having to explain "the cloud" (with all
         | potential and pitfalls) to our customers. They are also (part
         | of) the target audience for this post, hence the additional
         | explanations.
         | 
         | (Not OP and not author of the article, but was involved in the
         | write-up.)
        
       | bennettnate5 wrote:
       | "Denial of Wallet" seems a misnomer--it makes it sound like
       | source of payment is being blocked. They should really use the
       | same term cellular systems have been for decades to describe this
       | kind of threat, namely an "overbilling attack".
        
         | KomoD wrote:
         | "Denial of Wallet" has been used in countless articles (incl.
         | academic) and places to refer to attacks that increase usage
         | bills.
        
       | akira2501 wrote:
       | We use CloudFront and we deny public users the ability to access
       | S3 directly. You can even use Signed URLs with CloudFront if you
        | like. I'm not sure I'd ever feel comfortable letting the
        | public at large hit my S3 endpoints.
        
         | INTPenis wrote:
         | As it should be, but recently on HN it was posted that AWS will
         | charge you for any unauthorized PUT request to your S3 buckets.
         | Meaning even 4xx errors will rack up a charge.
         | 
         | So your S3 bucket names must be hidden passphrases now that
         | stand between an attacker and your budget.
        
           | nijave wrote:
            | In all fairness, systems administrators have always had
            | to pay for unauthorized requests, and for systems to
            | mitigate the risk.
           | 
           | The new thing is hyperscalers have so much capacity you can
           | get flooded by these long before the service degrades or goes
           | offline
        
             | kazen44 wrote:
              | Also, the cost of doing this per request is insane
              | compared to either absorbing or rate-limiting the
              | bandwidth the requests take.
             | 
             | Cloud computing charges you by the request/byte/cpu cycle.
             | Servers do not have this issue.
             | 
              | Also, is it simply not possible to rate limit this on a
              | per-IP basis? Make clients only able to do X requests
              | per second from each unique IP/network flow.
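The per-IP limiting floated here is usually implemented as a token bucket. A minimal in-process sketch (real deployments put this at the load balancer or CDN, and per-IP limits are easy to evade with botnets, as other comments note):

```python
import time
from collections import defaultdict

class TokenBucket:
    """Minimal per-client token bucket: each IP may make `rate` requests
    per second with bursts up to `burst` -- the X-requests-per-IP idea
    from the comment, sketched in-process."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens = defaultdict(lambda: float(burst))  # tokens per IP
        self.last = defaultdict(float)                   # last refill time per IP

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens[ip] = min(self.burst,
                              self.tokens[ip] + (now - self.last[ip]) * self.rate)
        self.last[ip] = now
        if self.tokens[ip] >= 1:
            self.tokens[ip] -= 1
            return True
        return False
```

Usage: `TokenBucket(rate=10, burst=20).allow(client_ip)` before serving each request; rejected calls return False and can be answered with 429.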
        
               | nijave wrote:
               | >Cloud computing charges you by the request/byte/cpu
               | cycle. Servers do not have this issue.
               | 
               | Sure they do. Processing requests takes bandwidth, CPU,
               | memory, disk I/O
               | 
               | >Also, is it simply not possible to rate limit this on a
               | per IP basis
               | 
                | It's largely useless. You'll block legitimate
                | bots/programs, people on CGNAT, and people on
                | corporate networks, while bad actors will use botnets,
                | residential IPs, and VPNs to gain access to thousands
                | or millions of unique IPs.
        
           | akira2501 wrote:
           | Wow. Okay. New horrors brought to us by the modern world
           | we've created.
           | 
           | Thankfully, it does look like AWS is appropriately
           | embarrassed over this, and is going to maybe do something.
           | 
           | https://twitter.com/jeffbarr/status/1785386554372042890
        
         | nijave wrote:
         | Direct S3 is pretty common for file distribution where latency
         | is less of a concern.
         | 
          | e.g. build an installer and distribute it, or generate a
          | report and a signed URL.
        
       | Havoc wrote:
       | It's almost like the combination of public accessible + charged
       | per use + big cloud refusing to allow hardcaps on spend is a
       | terrible idea...
        
         | jsheard wrote:
         | Azure is probably the most egregious example of this, AWS and
         | GCP can at least _claim_ they have architectural barriers to
         | implementing a hard spending cap, but Azure _already has one_
         | and arbitrarily only allows certain subscription types to use
         | it. If you have a student account then you get a certain amount
         | of credit each month and if you over-spend it then most
         | services are automatically suspended until the next month,
         | unless you explicitly opt-out of the spending limit and commit
         | to paying the excess out of pocket. However if you have a
          | standard account you're not allowed to set a spending limit
         | for, uh... reasons.
         | 
         | https://learn.microsoft.com/en-us/azure/cost-management-bill...
        
           | anonymousDan wrote:
           | AWS Educate has the same ability to impose a hard cap I
           | believe...
        
           | carbotaniuman wrote:
            | I guess it's a matter of students not having money to
            | spend (and bad optics), while a company might be cowed
            | into paying the bill.
        
           | dylan604 wrote:
            | That's insane as well. They already built the system, but
            | you just can't use it, because they want the option for
            | you to screw up and pad their billing. There are many
            | projects I've worked
           | on where a service not being available until the 1st of the
           | next month would not be anything more than a minor annoyance,
           | and would much rather that happen than an unexpected bill.
           | This is also something that I think would be a nice CYA tool
           | when developing something in the cloud for the first time.
            | It's easy to make an expensive mistake when learning
            | cloud services, as TFA shows.
        
       | tomp wrote:
       | how about the "PUT deny" attack?
       | 
        | AFAIK it cannot be protected against.
       | 
       | https://twitter.com/Lauramaywendel/status/178506487864384308...
        
         | zedpm wrote:
         | Jeff Barr posted that AWS is actively working on a resolution
         | for this:
         | https://twitter.com/jeffbarr/status/1785386554372042890 . Given
         | who he is, I take this as a strong indication that there will
         | be a reasonable fix in the near future.
        
       | lulznews wrote:
        | Does this apply to CloudFront requests also?
        
       | KomoD wrote:
       | So much fluff, just get to the point.
       | 
       | At least 500-600 words weren't needed and just added noise to the
       | article, making it harder to read.
        
       | surfingdino wrote:
       | AWS APIs need a cleanup. I am constantly running into issues not
       | documented in the official doc, boto3 docs, or even on
        | StackOverflow. It's not even funny when a whole day goes by
        | trying to figure out why I see nothing in the body of a 200 OK
       | response when I request data which I know is there in the bowels
       | of AWS. Then it turns out that one param doesn't allow values
       | below a certain number, even though the docs say otherwise.
        
         | Twirrim wrote:
          | Historically, they've been scared of versioning their APIs
          | (not many services have done it; DynamoDB has, for example).
         | 
         | It leads to a "bad customer experience", having to update lots
         | of code, and also increases maintenance costs while you keep
         | two separate code paths functional.
         | 
         | There's a lot about the S3 API that would be changed, including
         | the response codes etc., if S3 engineers had freedom to change
         | it! I remember many conversations on the topic when I worked
         | alongside them in AWS.
        
           | andrewxdiamond wrote:
            | It's quite insane the level of effort S3 engineers put
            | into maintaining perfect API compatibility. Even tiny
            | details such as whitespace or ordering have messed up
            | project timelines and blocked important launches.
        
             | Twirrim wrote:
              | All to match some random, arbitrary, maybe not even
              | conscious decision made by an early S3 engineer while
              | implementing something.
        
       | adverbly wrote:
       | Sounds like they were using the Range header on large files. I
        | have built systems in the past using exactly this pattern
        | (without the intentionally dropped requests).
       | 
       | I hope this doesn't result in any significant changes as I really
       | liked using this pattern for sequential data processing of
       | potentially large blobs.
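The sequential-processing pattern described here can be sketched with bounded ranges, where each request names an explicit end byte rather than the open-ended `bytes=N-` form implicated in the billing surprise. The boto3 call shape is standard, but the bucket, key, and sizes are placeholders:

```python
def chunk_ranges(total_size, chunk_size):
    """Yield bounded Range header values ('bytes=start-end') covering
    `total_size` bytes in `chunk_size` pieces."""
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1
        yield f"bytes={start}-{end}"

# With boto3 (bucket, key, and sizes are hypothetical):
# import boto3
# s3 = boto3.client("s3")
# for rng in chunk_ranges(obj_size, 8 * 2**20):  # 8 MiB pieces
#     part = s3.get_object(Bucket="my-bucket", Key="big.blob", Range=rng)
#     process(part["Body"].read())
```

Reading each bounded part to completion before issuing the next keeps the requested range and the transferred bytes identical, which should sidestep the cancelled-range accounting entirely.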
        
       | ignoreusernames wrote:
        | Early Athena (managed prestodb by AWS) had a similar bug when
        | measuring columnar file scans. If it touched the file, it
        | counted the whole file instead of just the column chunks read.
        | If I'm not mistaken, this was a bug in Presto itself, but a
        | simple patch landed upstream long before we did our tests.
        | This was the first and only time we considered using a
        | relatively early AWS product. It was so bad that our
        | half-assed self-deployed version outperformed Athena on every
        | metric we cared about.
        
       ___________________________________________________________________
       (page generated 2024-05-01 23:01 UTC)