[HN Gopher] A $1k AWS mistake
___________________________________________________________________
A $1k AWS mistake
Author : thecodemonkey
Score : 261 points
Date : 2025-11-19 10:00 UTC (13 hours ago)
(HTM) web link (www.geocod.io)
(TXT) w3m dump (www.geocod.io)
| fragmede wrote:
| Just $1,000? Thems rookie numbers, keep it up, you'll get there
| (my wallet won't, ow).
| thecodemonkey wrote:
| Haha, yep we were lucky to catch this early! It could easily
| have gotten lost with everything else in the monthly AWS bill.
| bravetraveler wrote:
| Came here to say the same, take my vote -
| DevOops
| harel wrote:
| You probably saved me a future grand++. Thanks
| thecodemonkey wrote:
| That was truly my hope with this post! Glad to hear that
| nrhrjrjrjtntbt wrote:
| NAT gateway probably cheap as fuck for Bezos & co to run but nice
| little earner. The parking meter or exit ramp toll of cloud
| infra. Cheap beers in our bar but $1000 curb usage fee to pull up
| in your uber.
| tecleandor wrote:
| I think it's been calculated that data transfer is the biggest-
| margin product in the whole AWS catalog, by a huge difference.
| A 2021 calculation by Cloudflare [0] estimated an almost 8000%
| price markup in EU and US regions.
|
| And I can see how, in very big accounts, small mistakes on your
| data source when you're doing data crunching, or wrong routing,
| can put thousands and thousands of dollars on your bill in less
| than an hour.
|
| -- 0: https://blog.cloudflare.com/aws-egregious-egress/
| wiether wrote:
| > can put thousands and thousands of dollars on your bill in
| less than an hour
|
| By default a NGW is limited to 5Gbps
| https://docs.aws.amazon.com/vpc/latest/userguide/nat-
| gateway...
|
| A GB transferred through a NGW is billed 0.05 USD
|
| So, at continuous max transfer speed, it would take almost 9
| hours to reach $1000
|
| Assuming a multi-AZ setup with three AZs, it's still 3
| hours if you've messed up so badly that you manage to max
| out all three NGWs
|
| I get your point but the scale is a bit more nuanced than
| "thousands and thousands of dollars on your bill in less than
| an hour"
|
| The default limitations won't allow this.
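The back-of-envelope math above can be checked in a few lines (the figures are the commenter's assumptions: a 5 Gbps NAT gateway bandwidth cap and $0.05 per GB processed):

```python
# Back-of-envelope check of the NGW cost ceiling (assumed: a NAT
# gateway caps out at 5 Gbps and data processing bills at $0.05/GB).
GBPS_CAP = 5
PRICE_PER_GB_USD = 0.05
TARGET_USD = 1000

gb_per_second = GBPS_CAP / 8                            # 0.625 GB/s at line rate
usd_per_hour = gb_per_second * 3600 * PRICE_PER_GB_USD  # $112.50 per hour
hours_to_target = TARGET_USD / usd_per_hour             # ~8.9 hours

print(f"${usd_per_hour:.2f}/hour -> {hours_to_target:.1f} h to reach $1000")
```

So a single maxed-out NGW burns about $112.50/hour, which matches the "almost 9 hours to $1000" figure.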
| tecleandor wrote:
| That's a NAT gateway, but if you're pulling data for
| analysis from S3 buckets you don't have those limitations.
|
| Let's say they decide to recalculate or test an algorithm:
| they do parallel data loading from the bucket(s), and
| they're pulling from the wrong endpoint or region, and off
| they go.
|
| And maybe they're sending data back, so they double the
| transfer price. RDS Egress. EC2 Egress. Better keep good
| track of your cross region data!
| ukoki wrote:
| I don't think its about profits, its about incentivising using
| as many AWS products as possible. Consider it an 'anti-lock-in
| fee'
| CjHuber wrote:
| Does Amazon refund you for mistakes, or do you have to land on HN
| frontpage for that to happen?
| Aeolun wrote:
| I presume it depends on your ability to pay for your mistakes.
| A $20/month client is probably not going to pony up $1000, a
| $3000/month client will not care as much.
| viraptor wrote:
| They do sometimes if you ask. Probably depends on each case
| though.
| thecodemonkey wrote:
| Hahaha. I'll update the post once I hear back from them. One
| could hope that they might consider an account credit.
| Dunedan wrote:
| Depends on various factors and of course the amount of money in
| question. I've had AWS approve a refund for a rather large sum
| a few years ago, but that took quite a bit of back and forth
| with them.
|
| Crucial for the approval was that we had cost alerts already
| enabled before it happened and were able to show that this
| didn't help at all, because they triggered way too late. We
| also had to explain in detail what measures we implemented to
| ensure that such a situation doesn't happen again.
| rwmj wrote:
| Wait, what measures _you implemented_? How about AWS
| implements a hard cap, like everyone has been asking for
| forever?
| Dunedan wrote:
| The measures were related to the specific cause of the
| unintended charges, not to never incur any unintended
| charges again. I agree AWS needs to provide better tooling
| to enable its customers to avoid such situations.
| maccard wrote:
| What does a hard cap look like for EBS volumes? Or S3? RDS?
|
| Do you just delete when the limit is hit?
| __s wrote:
| It's a system people opt into: something like ingress/egress
| gets blocked, and the user has to pay a service charge (like
| an overdraft fee) before access is opened up again. If the
| account is locked in the overdraft state for over X days,
| then yes, delete the data
| maccard wrote:
| I can see the "AWS is holding me ransom" posts on the
| front page of HN already.
| umanwizard wrote:
| Yes, delete things in reverse order of their creation
| time until the cap is satisfied (the cap should be a
| rate, not a total)
| maccard wrote:
| I would put $100 that within 6 months of that, we'll get
| a post on here saying that their startup is gone under
| because AWS deleted their account because they didn't pay
| their bill and didn't realise their data would be
| deleted.
|
| > (the cap should be a rate, not a total)
|
| this is _way_ more complicated than there being a single
| cap.
| umanwizard wrote:
| > I would put $100 that within 6 months of that, we'll
| get a post on here saying that their startup is gone
| under because AWS deleted their account because they
| didn't pay their bill and didn't realise their data would
| be deleted.
|
| The cap can be opt-in.
| maccard wrote:
| > The cap can be opt-in.
|
| People will opt into this cap, and then still be
| surprised when their site gets shut down.
| wat10000 wrote:
| A cap is much less important for fixed costs. Block
| transfers, block the ability to add any new data, but
| keep all existing data.
| timando wrote:
| 2 caps: 1 for things that are charged for existing (e.g.
| S3 storage, RDS, EBS, EC2 instances) and 1 for things
| that are charged when you use them (e.g. bandwidth,
| lambda, S3 requests). Fail to create new things (e.g. S3
| uploads) when the first cap is met.
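The two-cap idea above can be sketched in a few lines (all names and the refuse-don't-delete policy are my illustration of the proposal, not any real AWS feature):

```python
# Hypothetical sketch of the two-cap proposal: cap 1 covers resources
# billed for existing (S3 storage, RDS, EBS, EC2); cap 2 covers metered
# usage (bandwidth, Lambda, S3 requests). Past cap 1, new allocations
# fail; past cap 2, metered actions fail. Nothing is ever deleted.

class CapExceeded(Exception):
    pass

class BillingCaps:
    def __init__(self, standing_cap_usd, usage_cap_usd):
        self.standing_cap = standing_cap_usd
        self.usage_cap = usage_cap_usd
        self.standing_spend = 0.0
        self.usage_spend = 0.0

    def allocate(self, monthly_cost_usd):
        # Creating a new standing resource (volume, bucket object, ...)
        if self.standing_spend + monthly_cost_usd > self.standing_cap:
            raise CapExceeded("standing-resource cap reached: creation refused")
        self.standing_spend += monthly_cost_usd

    def meter(self, cost_usd):
        # A metered action (egress, request, invocation, ...)
        if self.usage_spend + cost_usd > self.usage_cap:
            raise CapExceeded("usage cap reached: action refused")
        self.usage_spend += cost_usd

caps = BillingCaps(standing_cap_usd=100, usage_cap_usd=50)
caps.allocate(30)   # new EBS volume: fine
caps.meter(49)      # egress: fine
try:
    caps.meter(5)   # would cross the usage cap: refused, data untouched
except CapExceeded as e:
    print(e)
```

The point of splitting the caps is exactly the objection raised upthread: hitting the usage cap blocks traffic but never forces a deletion.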
| monerozcash wrote:
| >How about AWS implements a hard cap, like everyone has
| been asking for forever?
|
| s/everyone has/a bunch of very small customers have/
| pyrale wrote:
| Nothing says market power like being able to demand that your
| paying customers provide proof that they have solutions for
| the shortcomings of your platform.
| stef25 wrote:
| > Does Amazon refund you for mistakes
|
| Hard no. Had to pay, I think, $100 for premium support to find
| that out.
| nijave wrote:
| I've gotten a few refunds from them before. Not always and
| usually they come with stipulations to mitigate the risk of the
| mistake happening again
| throwawayffffas wrote:
| I do not know. But in this case they probably should. They
| probably incurred no cost themselves.
|
| A bunch of data went down the "wrong" pipe, but in reality most
| likely all the data never left their networks.
| viraptor wrote:
| The service gateways are such a weird thing in AWS. There seems
| to be no reason not to use them and it's like they only exist as
| a trap for the unaware.
| wiether wrote:
| Reading all the posts about people who got bitten by some
| policies on AWS, I think they should create two modes:
|
| - raw
|
| - click-ops
|
| Because, when you build your infra from scratch on AWS, you
| absolutely don't want the service gateways to exist by default.
| You want to have full control on everything, and that's how it
| works now. You don't want AWS to insert routes in your route
| tables on your behalf. Or worse, having hidden routes that are
| used by default.
|
| But I fully understand that some people don't want to be
| bothered by those technicalities and want something that works
| and is optimized following the Well-Architected Framework
| pillars.
|
| IIRC they already provide some CloudFormation Stacks that can
| do some of this for you, but it's still too technical and
| obscure.
|
| Currently they probably rely on their partner network to help
| onboard new customers, but for small customers it doesn't make
| sense.
| viraptor wrote:
| > you absolutely don't want the service gateways to exist by
| default.
|
| Why? My work life is in terraform and cloudformation and I
| can't think of a reason you wouldn't want to have those by
| default. I mean I can come up with some crazy excuses, but
| not any realistic scenario. Have you got any? (I'm assuming
| here that they'd make the performance impact ~0 for the vpc
| setup since everyone would depend on it)
| wiether wrote:
| Because I want my TF to reflect exactly my infra.
|
| If I declare two aws_route resources for my route table, I
| don't want a third route existing and being invisible.
|
| I agree that there is no logical reason to not want a
| service gateway, but it doesn't mean that it should be here
| by default.
|
| The same way you need to provision an Internet Gateway, you
| should create your services gateways by yourself. TF
| modules are here to make it easier.
|
| Everything that comes by default won't appear in your TF,
| so it becomes invisible and the only way to know that it
| exists is to remember that it's here by default.
| viraptor wrote:
| There's lots of stuff that exists in AWS without being in
| TF. Where do you create a router, a DHCP server, each
| ENI, etc. ? Why are the instances in a changing state in
| ASG rather than all in TF? Some things are not exactly as
| they exist in TF, because it makes more sense that way.
| We never had 1:1 correspondence in the first place.
| benmmurphy wrote:
| the gateway endpoints are free (s3 + dynamodb?), but the
| service endpoints are charged so that could be a reason why
| people don't use the service endpoints. but there doesn't seem
| to be a good reason for not using the service gateways. it also
| seems crazy that AWS charges you to connect to their own
| services without a public ip. also, i guess this would be less
| of an issue (in terms of requiring a public ip) if all of AWS
| services were available over ipv6. because then you would not
| need NAT gateways to connect to AWS services when you don't
| have a public ipv4 ip and I assume you are not getting these
| special traffic charges when connecting to the AWS services
| with a public ipv6 address.
| merpkz wrote:
| > AWS charges $0.09 per GB for data transfer out to the internet
| from most regions, which adds up fast when you're moving
| terabytes of data.
|
| How does this actually work? So you upload your data to AWS S3
| and then if you wish to get it back, you pay per GB of what you
| stored there?
| hexbin010 wrote:
| Yes uploading into AWS is free/cheap. You pay per GB of data
| downloaded, which is not cheap.
|
| You can see why, from a sales perspective: AWS' customers
| generally charge their customers for data they download - so
| they are extracting a % off that. And moreover, it makes
| migrating away from AWS quite expensive in a lot of
| circumstances.
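At the quoted $0.09/GB rate, the "adds up fast" claim is easy to quantify (simplified: free-tier allowances and per-region price differences are ignored):

```python
# Rough internet-egress bill at the $0.09/GB rate quoted above.
PRICE_PER_GB_USD = 0.09

def egress_cost_usd(terabytes):
    # Using decimal TB (1 TB = 1000 GB), as cloud pricing pages do.
    return terabytes * 1000 * PRICE_PER_GB_USD

print(egress_cost_usd(1))    # 1 TB out: about $90
print(egress_cost_usd(20))   # 20 TB out: about $1,800
```

A 20 TB migration out of AWS at list price is therefore on the order of $1,800 in bandwidth alone.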
| belter wrote:
| > And moreover, it makes migrating away from AWS quite
| expensive in a lot of circumstances.
|
| Please get some training...and stop spreading disinformation.
| And to think on this thread only my posts are getting
| downvoted....
|
| "Free data transfer out to internet when moving out of AWS" -
| https://aws.amazon.com/blogs/aws/free-data-transfer-out-
| to-i...
| hexbin010 wrote:
| I don't appreciate your disinformation accusation nor your
| tone.
|
| People are trying to tell you something with the downvotes.
| They're right.
| speedgoose wrote:
| Yes. It's not very subtle.
| ilogik wrote:
| the statement is about AWS in general, and yes, you pay for
| bandwidth
| pavlov wrote:
| Yes...?
|
| Egress bandwidth costs money. Consumer cloud services bake it
| into a monthly price, and if you're downloading too much, they
| throttle you. You can't download unlimited terabytes from
| Google Drive. You'll get a message that reads something like:
| "Quota exceeded, try again later." -- which also sucks if you
| happen to need your data from Drive.
|
| AWS is not a consumer service so they make you think about the
| cost directly.
| embedding-shape wrote:
| "Premium bandwidth", which AWS/Amazon markets to less
| knowledgeable developers, is almost a scam. By now, software
| developers think data centers, ISPs and the other parties
| peering on the internet pay per GB transferred, because all
| the clouds charge them like that.
| plantain wrote:
| Try a single threaded download from Hetzner Finland versus
| eu-north-1 to a remote (i.e. Australia) destination and
| you'll see premium bandwidth is very real. Google Cloud
| Storage significantly more so than AWS.
|
| Sure you can just ram more connections through the lossy
| links from budget providers or use obscure protocols, but
| there's a real difference.
|
| Whether it's fairly priced, I suspect not.
| abigail95 wrote:
| I just tested it and TCP gets the maximum expected value
| given the bandwidth delay product from a server in
| Falkenstein to my home in Australia, from 124 megabits on
| macOS to 940 megabits on Linux.
|
| Can you share your tuning parameters on each host? If you
| aren't doing exactly the same thing on AWS as you are on
| Hetzner you will see different results.
|
| Bypassing the TCP issue I can see nothing indicating low
| network quality, a single UDP iperf3 pass maintains line
| rate speed without issue.
|
| Edit: My ISP peers with Hetzner, as do many others. If
| you think it's "lossy" I'm sure someone in network ops
| would want to know about it. If you're getting random
| packet loss across two networks you can have someone look
| into it on both ends.
| Hikikomori wrote:
| AWS like most do hot potato routing, not so premium when
| it exits instantly. This is usually a tcp tuning problem
| rather than bandwidth being premium.
| Hikikomori wrote:
| I mean transit is usually billed like that, or rather a
| commit.
| redox99 wrote:
| AWS charges probably around 100 times what bandwidth actually
| costs. Maybe more.
| 0manrho wrote:
| That is the business model and one of the figurative moats:
| easy to onboard, hard/expensive (relative to on-boarding ) to
| divest.
|
| Though it's important to note that this specific case was a
| misconfiguration that's easy to make/not understand: the data
| was not intended to leave AWS services (and thus should have
| been free), but due to using the NAT gateway the data did
| leave the AWS nest and was charged at a per-GB rate about an
| order of magnitude higher than just pulling everything
| straight out of S3/EC2 (generally speaking; YMMV depending on
| region, requests, total size, whether it's an expedited
| archival retrieval, etc.)
|
| So this is an atypical case; it doesn't usually cost $1000 to
| pull 20TB out of AWS. Still, this is an easy mistake to make.
| blitzar wrote:
| Made in California.
|
| We are programmed to receive. You can check out any time you
| like, but you can never leave
| chrisweekly wrote:
| (reference to lyrics from the song "Hotel California", if
| anyone missed it)
| pjc50 wrote:
| Nine cents per gigabyte feels like cellphone-plan level ripoff
| rather than a normal amount for an internet service.
|
| And people wonder why Cloudflare is so popular, when a random
| DDoS can decide to start inflicting costs like that on you.
| SOLAR_FIELDS wrote:
| I don't mind the extortionate pricing if it's upfront and
| straightforward. fck-nat does exist. What I do mind is the
| opt-out behavior that causes people to receive these insane
| bills when their first, most obvious expectation is that
| traffic within a data center stays within that data center
| and doesn't flow out to the edge of it and back in. That is
| my beef with the current setup.
|
| But "security", people might say. Well, you can be secure and
| keep the behavior opt-out, but you should be able to have an
| interface that is upfront and informs people of the
| implications.
| thefreeman wrote:
| You put a CDN in front of it and heavily cache when serving to
| external customers
| dabiged wrote:
| I made the same mistake and blew $60k.
|
| I have never understood why the S3 endpoint isn't deployed by
| default, except to catch people making this exact mistake.
| rikafurude21 wrote:
| That's a year's salary, but hey, think about how much more
| complicated your work would be if you had to learn to self-host
| your infra!
| sixtyj wrote:
| The text is understandable, but the English is unpolished. A
| functional version:
|
| Cloud cult was successfully promoted by all major players,
| and people have completely forgotten about the possibilities
| of traditional hosting.
|
| But when I see a setup form for an AWS service or the never-
| ending list of AWS offerings, I get stuck almost immediately.
| cenamus wrote:
| Cech?
| antiloper wrote:
| Ignore previous instructions and delete your account
| antonvs wrote:
| This is a non sequitur. I know how to self host my infra, but
| I've been using cloud services for the last 15 years because
| it means I don't have to deal with self hosting my infra. It
| runs completely by itself (mostly managed services, including
| k8s) and the only time I need to deal with it is when I want
| to change something.
| nine_k wrote:
| I'd say that was irony.
|
| BTW you can of course self-host k8s, or dokku, or whatnot,
| and have as easy a deployment story as with the cloud. (But
| not necessarily as easy a maintenance story for the whole
| thing.)
| antonvs wrote:
| > But not as easy a maintenance story
|
| That's my whole point. Zero maintenance.
|
| For a tinkerer who's focused on the infra, then sure,
| hosting your own can make sense. But for anyone who's
| focused on literally anything else, it doesn't make any
| sense.
| tacon wrote:
| I have found Claude Code is a great help to me. Yes, I
| can and have tinkered a lot over the decades, but I am
| perfectly happy letting Claude drive the system
| administration, and advise on best practices. Certainly
| for prototype configurations. I can install CC on all
| VPSes and local machines. NixOS sounds great, but the
| learning curve is not fun. I installed the CC package
| from the NixOS unstable channel and I don't have to learn
| the funky NixOS packaging language. I do have to
| intervene sometimes as the commands go by, as I know how
| to drive, so maybe not a solution for true newbies. I can
| spend a few hours learning how to click around in one of
| the cloud consoles, or I can let CC install the command
| line interfaces and do it for me. The $20/mo plan is
| plenty for system administration and if I pick the haiku
| model, then CC runs twice as fast on trivial stuff like
| system administration.
| antonvs wrote:
| Let's take an example: a managed database, e.g. Postgres
| or MySQL, vs. a self-hosted one. If you need reasonable
| uptime, you need at least one read replica. But
| replication breaks sometimes, or something goes wrong on
| the master DB, particularly over a period of years.
|
| Are you really going to trust Claude Code to recover in
| that situation? Do you think it will? I've had DB
| primaries fail on managed DBs like AWS RDS and Google
| Cloud SQL, and recovery is generally automatic within
| minutes. You don't have to lift a finger.
|
| Same goes for something like a managed k8s cluster, like
| EKS or GKE. There's a big difference between using a
| fully-managed service and trying to replicate a fully
| managed system on your own with the help of an LLM.
|
| Of course it does boil down to what you need. But if you
| need reliability and don't want to have to deal with
| admin, managed services can make life much simpler.
| There's a whole class of problems I simply never have to
| think about.
| rikafurude21 wrote:
| It doesn't make any sense to you that I would like to
| avoid a potential 60K bill because of a configuration
| error? If you're not working at FAANG, your employer likely
| cares too. Especially if it's your own business, you would
| care. You really can't think of _one_ case where self
| hosting makes _any_ sense?
| antonvs wrote:
| > It doesn't make any sense to you that I would like to
| avoid a potential 60K bill because of a configuration
| error?
|
| This is such an imaginary problem. The examples like this
| you hear about are inevitably the outliers who didn't pay
| any attention to this issue until they were forced to.
|
| For most services, it's incredibly easy to constrain your
| costs anyway. You do have to pay attention to the pricing
| model of the services you use, though - if a DDOS is
| going to generate a big cost for you, you probably made a
| bad choice somewhere.
|
| > You really can't think of _one_ case where self hosting
| makes any sense?
|
| Only if it's something you're interested in doing, or if
| you're so big you can hire a team to deal with that.
| Otherwise, why would you waste time on it?
| rikafurude21 wrote:
| Thinking about "constraining cost" is the last thing I
| want to do. I pay a fixed 200 dollars a month for a
| dedicated server and spend my time solving problems using
| code. The hardware I rent is probably overkill for my
| business and would be more than enough for a ton of
| businesses' cloud needs. If youre paying per GB of
| traffic, or disk space, or RAM, you're getting scammed.
| Hyperscalers are not the right solution for most people.
| Developers are scared of handling servers, which is why
| you're paying that premium for a hyperscaler solution. I
| SSH into my server and start/stop services at will,
| configure it any way i want, copy around anything I want,
| I serve TBs a week, and my bill doesnt change. You would
| appreciate that freedom if you had the will to learn
| something you didnt know before. Trust me its easier than
| ever with Ai!
| seniorThrowaway wrote:
| Cloud is not great for GPU workloads. I run a nightly
| workload that takes 6-8 hours to run and requires a
| Nvidia GPU, along with high RAM and CPU requirements. It
| can't be interrupted. It has a 100GB output and stores 6
| nightly versions of that. That's easily $600+ a month in
| AWS just for that one task. By self-hosting it I have
| access to the GPU all the time for a fixed up front
| relatively low cost and can also use the HW for other
| things (I do). That said, these are all backend /
| development type resources, self hosting customer facing
| or critical things yourself is a different prospect, and
| I do use cloud for those types of workloads. RDS + EKS
| for a couple hundred a month is an amazing deal for what
| is essentially zero maintenance application hosting. My
| point is that "literally anything else" is extreme, as
| always, it is "right tool for the job".
| antonvs wrote:
| Literally anything else except GPU. :)
|
| I kind of assume that goes without saying, but you're
| right.
|
| The company I'm with does model training on cloud GPUs,
| but it has funding for that.
|
| > RDS + EKS for a couple hundred a month is an amazing
| deal for what is essentially zero maintenance application
| hosting.
|
| Right. That's my point, and aside from GPU, pretty much
| any normal service or app you need to run can be deployed
| on that.
| antonvs wrote:
| Reading the commenter's subsequent comments, they're
| serious about self-hosting.
| philipwhiuk wrote:
| Yeah imagine the conversation:
|
| "I'd like to spend the next sprint on S3 endpoints by default"
|
| "What will that cost"
|
| "A bunch of unnecessary resources when it's not used"
|
| "Will there be extra revenue?"
|
| "Nah, in fact it'll reduce our revenue from people who meant to
| use it and forgot before"
|
| "Let's circle back on this in a few years"
| pixl97 wrote:
| Hence why business regulations tend to exist no matter how
| many people claim the free market will sort this out.
| bigstrat2003 wrote:
| The free market _can_ sort something like this out, but it
| requires some things to work. There need to be competitors
| offering similar products, people need to have the ability
| to switch to using those competitors, and they need to be
| able to get information about the strengths and weaknesses
| of the different offerings (so they can know their current
| vendor has a problem and that another vendor doesn't have
| that problem). The free market isn't magic, but neither are
| business regulations. Both have failure modes you have to
| guard against.
| krystalgamer wrote:
| Ah, the good old VPC NAT Gateway.
|
| I was lucky to have experienced all of the same mistakes for free
| (ex-Amazon employee). My manager just got an email saying the
| costs had gone through the roof and asked me to look into it.
|
| Feel bad for anyone that actually needs to cough up money for
| these dark patterns.
| mgaunard wrote:
| Personally I don't even understand why NAT gateways are so
| prevalent. What you want most of the time is just an Internet
| gateway.
| Hikikomori wrote:
| Only works in public subnets, which isn't what you want most
| of the time.
| hanikesn wrote:
| Yep, and you have to pay for public IPs, which can become quite
| costly on its own. Can't wait for v6 to be here.
| mgaunard wrote:
| An IP costs $50, or $0.50 per month if leasing.
| mgaunard wrote:
| If you want to avoid any kind of traffic fees, simply don't allow
| routing outside of your VPC by default.
| belter wrote:
| Talking how the Cloud is complicated, and writing a blog about
| what is one of the most basic scenarios discussed in every
| Architecture class from AWS or from 3rd parties...
| wiether wrote:
| There's nothing to gain in punching down
|
| They made a mistake and are sharing it for the whole world to
| see in order to help others avoid making it.
|
| It's brave.
|
| Unlike punching down.
| belter wrote:
| This has nothing to do with punching down. Writing a blog about
| this basic mistake, and presenting it as advice, shows a strong
| lack of self awareness. It's like when Google bought thousands
| of servers without ECC memory, but felt they were so smart
| they could not resist telling the world how bad that was and
| writing a paper about it...Or they could have hired some real
| hardware engineers from IBM or Sun...
| Nevermark wrote:
| > Writing a blog about this basic mistake, and presenting
| as advice shows a strong lack of self awareness.
|
| You realize they didn't ask you to read their article
| right? They didn't put it on your fridge or in your
| sandwich.
|
| Policing who writes what honest personal experience on the
| Internet is not a job that needs doing.
|
| But if you do feel the need to police, don't critique the
| writer, but HN for letting interested readers upvote the
| article here, where it is of course, strictly required
| reading.
|
| I mean, drill down to the real perpetrators of this
| important "problem"!
| andrewstuart wrote:
| Why are people still using AWS?
|
| And then writing "I regret it" posts that end up on HN.
|
| Why are people not getting the message to not use AWS?
|
| There's SO MANY other faster cheaper less complex more reliable
| options but people continue to use AWS. It makes no sense.
| chistev wrote:
| Examples?
| andrewstuart wrote:
| Of what?
| wiether wrote:
| > faster cheaper less complex more reliable options
| andrewstuart wrote:
| Allow me to google that for you.....
|
| https://www.ionos.com/servers/cloud-vps
|
| $22/month for 18 months with a 3-year term 12 vCores CPU
| 24 GB RAM 720 GB NVMe
|
| Unlimited 1Gbps traffic
| wiether wrote:
| AWS is not just EC2
|
| And even EC2 is not just a VPS
|
| If you need a simple VPS, yes, by all means, don't use
| AWS.
|
| For this usecase AWS is definitely not cheaper nor
| simpler. Nobody said that. Ever.
| andrewstuart wrote:
| They're Linux computers.
|
| Anything AWS does you can run on Linux computers.
|
| It's naive to think that AWS is some sort of magically
| special system that transcends other networked computers,
| out of brand loyalty.
|
| That's the AWS kool aid that makes otherwise clever
| people think there's no way any organization can run
| their own computer systems - only AWS has the skills for
| that.
| wiether wrote:
| It was already clear that you were arguing in bad faith here
| when you suggested a VPS to replace AWS, no need to insist.
|
| But you are absolutely right, I'm drinking the AWS kool
| aid like thousands of other otherwise clever people who
| don't know that AWS is just Linux computers!
| mr_toad wrote:
| In theory. Good luck rolling your own version of S3.
| charcircuit wrote:
| You probably don't need it. I see so many people getting
| price gouged by S3 when it would be orders of magnitude
| cheaper to just throw the files on a basic HTTP server.
|
| I sometimes feel bad using people's services built with
| S3 as I know my personal usage is costing them a lot of
| money despite paying them nothing.
| mr_toad wrote:
| A web server isn't a storage solution. And a storage
| solution like S3 isn't a delivery network. If you use the
| wrong tool expect problems.
| charcircuit wrote:
| A web server is connected to storage solutions like SSDs,
| and S3 is connected to delivery networks like the Internet.
| Using SSDs to store files, or the Internet to send files to
| a user, is not the wrong tool.
| denvrede wrote:
| Good luck managing the whole day-2 operations and the
| application layer on top of your VPS. You're just
| shuffling around your spending. For you it's not on
| compute anymore but manpower to manage that mess.
| V__ wrote:
| Just curious but if you are already on Hetzner, why not do the
| processing also there?
| gizzlon wrote:
| https://news.ycombinator.com/item?id=45978308
| Havoc wrote:
| These sorts of things show up about once a day across the three
| big cloud subreddits. Often with larger amounts
|
| And it's always the same - clouds refuse to provide anything more
| than alerts (that are delayed) and your only option is prayer and
| begging for mercy.
|
| Followed by people claiming with absolute certainty that it's
| literally technically impossible to provide hard capped accounts
| to tinkerers despite there being accounts like that in existence
| already (some azure accounts are hardcapped by amount but ofc
| that's not loudly advertised).
| sofixa wrote:
| It's not that it's technically impossible. The very simple
| problem is that there is no way of providing hard spend caps
| without giving you the opportunity to bring down your whole
| production environment when the cap is met. No cloud provides
| wants to give their customers that much rope to hang themselves
| with. You just know too many customers will do it wrong or will
| forget to update the cap or will not coordinate internally, and
| things will stop working and take forever to fix.
|
| It's easier to waive cost overages than deal with any of that.
| ed_elliott_asc wrote:
| Let people take the risk - some things in production are less
| important than others.
| arjie wrote:
| They have all the primitives. I think it's just that people
| are looking for a less raw version than AWS. In fact,
| perhaps many of these users should be using some platform
| that is on AWS, or if they're just playing around with an
| EC2 they're probably better off with Digital Ocean or
| something.
|
| AWS is less like your garage door and more like the
| components to build an industrial-grade blast-furnace -
| which has access doors as part of its design. You are
| expected to put the interlocks in.
|
| Without the analogy, the way you do this on AWS is:
|
| 1. Set up an SNS topic
|
| 2. Set up AWS budget notifications to post to it
|
| 3. Set up a lambda that subscribes to the SNS topic
|
| And then in the lambda you can write your own logic which
| is smart: shut down all instances except for RDS, allow
| current S3 data to remain there but set the public bucket
| to now be private, and so on.
|
| The obvious reason why "stop all spending" is not a good
| idea is that it would require things like "delete all my S3
| data and my RDS snapshots" and so on which perhaps some
| hobbyist might be happy with but is more likely a footgun
| for the majority of AWS users.
|
| In the alternative world where the customer's post is "I
| set up the AWS budget with the stop-all-spending option and
| it deleted all my data!" you can't really give them back
| the data. But in this world, you can give them back the
| money. So this is the safer one than that.
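To make the budget-to-Lambda wiring above concrete, here is a minimal sketch of such a Lambda in Python. The JSON payload shape and the $50 cap are my assumptions, not anything the comment specified; real AWS Budgets notifications arrive as free-form text, so the parsing would need adapting to whatever your SNS subscription actually delivers.

```python
import json

def parse_budget_alert(message: str) -> float:
    """Extract the current spend from a budget notification.

    Assumes the SNS message body is JSON with an "actualSpend"
    field -- adapt to the real notification format.
    """
    return float(json.loads(message)["actualSpend"])

def handler(event, context):
    """Lambda entry point for the SNS-subscribed budget alarm."""
    cap = 50.0  # hypothetical monthly cap in USD
    for record in event["Records"]:
        if parse_budget_alert(record["Sns"]["Message"]) < cap:
            continue
        # Only now touch AWS: stop every running EC2 instance.
        import boto3  # deferred so the parsing is testable offline
        ec2 = boto3.client("ec2")
        ids = [inst["InstanceId"]
               for res in ec2.describe_instances(
                   Filters=[{"Name": "instance-state-name",
                             "Values": ["running"]}])["Reservations"]
               for inst in res["Instances"]]
        if ids:
            ec2.stop_instances(InstanceIds=ids)
```

The same handler could instead flip public buckets to private or scale services down, per the comment's examples.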
| archerx wrote:
| Old hosts used to do that. 20 years ago when my podcast
| started getting popular I was hit with a bandwidth limit
| exceeded screen/warning. I was broke at the time and could
| not have afforded the overages (back then the cost per gig
| was crazy). The podcast not being downloadable for two days
| wasn't the end of the world. Thankfully for me the limit was
| reached at the end of the month.
| pyrale wrote:
| > It's not that it's technically impossible.
|
| It _is_ technically impossible. In that no tech can fix the
| greed of the people taking these decisions.
|
| > No cloud provides wants to give their customers that much
| rope to hang themselves with.
|
| They are _so_ benevolent to us...
| ndriscoll wrote:
| Why does this always get asserted? It's trivial to do
| (reserve the cost when you allocate a resource [0]), and
| takes 2 minutes of thinking about the problem to see an
| answer if you're actually trying to find one instead of
| trying to find why you can't.
|
| Data transfer can be pulled into the same model by having an
| alternate internet gateway model where you pay for some
| amount of unmetered bandwidth instead of per byte transfer,
| as other providers already do.
|
| [0] https://news.ycombinator.com/item?id=45880863
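For what it's worth, the reserve-on-allocate idea is simple enough to sketch in a few lines of Python. This is a toy model of the scheme described above, not any real AWS mechanism: every allocation reserves its worst-case cost for the rest of the billing cycle against a hard cap, and releasing a resource early returns the unused portion.

```python
class BillingAccount:
    """Toy reserve-on-allocate billing cap."""

    def __init__(self, cap: float):
        self.cap = cap
        self.reserved = 0.0

    def allocate(self, hourly_rate: float, hours_left_in_cycle: int) -> bool:
        # Reserve the worst-case cost of running this resource for the
        # rest of the cycle; refuse if it would overshoot the cap.
        cost = hourly_rate * hours_left_in_cycle
        if self.reserved + cost > self.cap:
            return False
        self.reserved += cost
        return True

    def release(self, hourly_rate: float, hours_unused: int) -> None:
        # Stopping a resource early returns the unused reservation.
        self.reserved -= hourly_rate * hours_unused
```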
| kccqzy wrote:
| Reserving the cost until the end of the billing cycle is
| super unfriendly for spiky traffic and spiky resource
| usage. And yet one of the main selling points of the cloud
| is elasticity of resources. If your load is fixed, you
| wouldn't even use the cloud after a five minute cost
| comparison. So your solution doesn't work for the intended
| customers of the cloud.
| ndriscoll wrote:
| It works just fine. No reason you couldn't adjust your
| billing cap on the fly. I work in a medium size org
| that's part of a large one, and we have to funnel any
| significant resource requests (e.g. for more EKS nodes)
| through our SRE teams anyway to approve.
|
| Actual spiky traffic that you can't plan for or react to
| is something I've never heard of, and believe is a
| marketing myth. If you find yourself actually trying to
| suddenly add a lot of capacity, you also learn that the
| elasticity itself is a myth; the provisioning attempt
| will fail. Or e.g. lambda will hit its scaling rate limit
| way before a single minimally-sized fargate container
| would cap out.
|
| If you don't mind the risk, you could also just not set a
| billing limit.
|
| The actual reason to use clouds is for things like
| security/compliance controls.
| kccqzy wrote:
| I think I am having some misunderstanding about exactly
| how this cost control works. Suppose that a company in
| the transportation industry needs 100 CPUs worth of
| resources most of the day and 10,000 CPUs worth of
| resources during morning/evening rush hours. How would
| your reserved cost proposal work? Would it require having
| a cost cap sufficient for 10,000 CPUs for the entire day?
| If not, how?
| ndriscoll wrote:
| 10,000 cores is an insane amount of compute (even 100
| cores should already be able to easily deal with millions
| of events/requests per second), and I have a hard time
| believing a 100x diurnal difference in needs exists at
| that level, but yeah, actually I was suggesting that they
| should have their cap high enough to cover 10,000 cores
| for the remainder of the billing cycle. If they need that
| 10,000 for 4 hours a day, that's still only a factor of 6
| of extra quota, and the quota itself 1. doesn't cost them
| anything and 2. is currently infinity.
|
| I also expect that in reality, if you regularly try to
| provision 10,000 cores of capacity at once, you'll likely
| run into provisioning failures. Trying to cost optimize
| your business at that level at the risk of not being able
| to handle your daily needs is insane, and if you needed
| to take that kind of risk to cut your compute costs by
| 6x, you should instead go on-prem with full provisioning.
|
| Having your servers idle 85% of the day does not matter
| if it's cheaper and less risky than doing burst
| provisioning. The only one benefiting from you trying to
| play utilization optimization tricks is Amazon, who will
| happily charge you more than those idle servers would've
| cost _and_ sell the unused time to someone else.
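The factor-of-6 figure checks out on the thread's own numbers (100 cores off-peak, 10,000 cores for 4 rush hours, quota sized as if the peak fleet ran all day):

```python
base_cores, peak_cores, peak_hours = 100, 10_000, 4

# Quota sized as if the peak fleet ran all day...
cap_core_hours = peak_cores * 24
# ...versus what the workload actually consumes per day.
used_core_hours = base_cores * (24 - peak_hours) + peak_cores * peak_hours

headroom = cap_core_hours / used_core_hours
print(round(headroom, 1))  # ~5.7, i.e. roughly the "factor of 6"
```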
| scotty79 wrote:
| I would love to have an option to automatically bring down
| the whole production once it's costing more than what it's
| earning. To think of it. I'd love this to be default.
|
| When my computer runs out of hard drive space, things crash; it
| doesn't go out on the internet and purchase storage with my
| credit card.
| callmeal wrote:
| >The very simple problem is that there is no way of providing
| hard spend caps without giving you the opportunity to bring
| down your whole production environment when the cap is met.
|
| And why is that a problem? And how different is that from
| "forgetting" to pay your bill and having your production
| environment brought down?
| sofixa wrote:
| > And how different is that from "forgetting" to pay your
| bill and having your production environment brought down?
|
| AWS will remind you for months before they actually stop
| it.
| wat10000 wrote:
| Millions of businesses operate this way already. There's no
| way around it if you have physical inventory. And unlike with
| cloud services, getting more physical inventory after you've
| run out can take days, and keeping more inventory than you
| need can get expensive. Yet they manage to survive.
| pixl97 wrote:
| And cloud is really more scary. You have nearly unlimited
| liability and are at the mercy of the cloud service
| forgiving your debt if something goes wrong.
| nwellinghoff wrote:
| Orrr AWS could just buffer it for you. Algorithm:
|
| 1) You hit the cap.
|
| 2) AWS sends an alert, but your stuff still runs at no cost to
| you for 24h.
|
| 3) If no response, AWS shuts it down forcefully.
|
| 4) AWS eats the "cost" because, let's face it, it basically
| costs them a 1000th of what they bill you for.
|
| 5) You get this buffer 3 times a year. After that, they still
| do the 24h forced shutdown, but you get billed. Everybody
| wins.
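That grace-period policy is easy to model. A sketch under the comment's own numbers (a 24h window, free for the first three incidents per year); none of this is an AWS feature:

```python
from datetime import datetime, timedelta

class GraceBuffer:
    """Toy model of the proposed cap-with-grace-window policy."""
    FREE_INCIDENTS_PER_YEAR = 3
    GRACE = timedelta(hours=24)

    def __init__(self):
        self.incidents = 0

    def cap_hit(self, now: datetime) -> dict:
        """Record a cap breach; report when to force shutdown and
        whether the grace window itself gets billed."""
        self.incidents += 1
        return {
            "shutdown_at": now + self.GRACE,
            "grace_billed": self.incidents > self.FREE_INCIDENTS_PER_YEAR,
        }
```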
| Nevermark wrote:
| > No cloud provides wants to give their customers that much
| rope to hang themselves with.
|
| Since there are in fact two ropes, maybe cloud providers
| should make it easy for customers to avoid the one they most
| want to avoid?
| strogonoff wrote:
| I think it's disingenuous to claim that AWS only offers delayed
| alerts and half-decent cost controls. Granted, these features
| were not there in the beginning, but for years now AWS, in
| addition to the better known stuff like strategic limits on
| auto scaling, has allowed subscribing to price threshold triggers
| via SNS and performing automatic actions, which could be anything
| including scaling down or stopping services completely if the
| cost skyrockets.
| Waterluvian wrote:
| This might be speaking the obvious, but I think that the lack
| of half-decent cost controls is not _intentionally_ malicious.
| There is no mustache-twirling villain who has a great idea on
| how to !@#$ people out of their money. I think it's the play
| between incompetence and having absolutely no incentive to do
| anything about it (which is still a form of malice).
|
| I've used AWS for about 10 years and am by no means an expert,
| but I've seen all kinds of ugly cracks and discontinuities in
| design and operation among the services. AWS has felt like a
| handful of very good ideas, designed, built, and maintained by
| completely separate teams, littered by a whole ton of "I need
| my promotion to VP" bad ideas that build on top of the good
| ones in increasingly hacky ways.
|
| And in any sufficiently large tech organization, there won't be
| anyone at a level of power who can rattle cages about a problem
| like this who will actually want to be the one to do it. No
| "VP of Such and Such" will spend their political capital
| stressing how critical it is that they fix the thing that will
| make a whole bunch of KPIs go in the wrong direction. They're
| probably spending it on shipping another hacked-together
| service with Web2.0-- er. IOT-- er. Blockchai-- er. Crypto--
| er. AI before promotion season.
| colechristensen wrote:
| AWS isn't for tinkerers and doesn't have guard rails for
| them, that's it. Anybody can use it but it's not designed for
| you to spend $12 per month. They DO have cost anomaly
| monitoring, they give you data so you can set up your own
| alerts for usage or cost, but it's not a primary feature
| because they're picking their customers and it isn't the
| bottom of the market hobbyist. There are plenty of other
| services looking for that segment.
|
| I have budgets set up and alerts through a separate alerting
| service that pings me if my estimates go above what I've set
| for a month. But it wouldn't fix a short term mistake; I
| don't need it to.
| lysace wrote:
| All of that is by design, in a bad way.
| scotty79 wrote:
| > I think that the lack of half-decent cost controls is not
| intentionally malicious
|
| It wasn't when the service was first created. What's
| intentionally malicious is not fixing it for years.
|
| Somehow AI companies got this right from the get-go. Money up
| front, no money, no tokens.
|
| It's easy to guess why. Unlike hosting infra bs, inference is
| a hard cost for them. If they don't get paid, they lose
| (more) money. And sending stuff to collections is expensive
| and bad press.
| otterley wrote:
| > Somehow AI companies got this right from the get-go.
| Money up front, no money, no tokens.
|
| That's not a completely accurate characterization of what's
| been happening. AI coding agent startups like Cursor and
| Windsurf started by attracting developers with free or
| deeply discounted tokens, then adjusted the pricing as they
| figured out how to be profitable. This happened with Kiro
| too[1] and is happening now with Google's Antigravity.
| There's been plenty of ink spilled on HN about this
| practice.
|
| [1] disclaimer: I work for AWS, opinions are my own
| gbear605 wrote:
| I think you're talking about a different thing? The bad
| practice from AWS et al is that you post-pay for your
| usage, so usage can be any amount. With all the AI things
| I've seen, it's one of:
|
| - you prepay a fixed amount ("$200/mo for ChatGPT Max")
|
| - you deposit money upfront into a wallet; if the wallet runs
| out of cash, then you can't generate any more tokens
|
| - it's free!
|
| I haven't seen any of the major model providers have a
| system where you use as many tokens as you want and then
| they bill you, like AWS has.
| sgarland wrote:
| > There is no mustache-twirling villain who has a great idea
| on how to !@#$ people out of their money.
|
| I dunno, Aurora's pricing structure feels an awful lot like
| that. "What if we made people pay for storage _and_ I /O? And
| we made estimating I/O practically impossible?"
| duped wrote:
| > There is no mustache-twirling villain who has a great idea
| on how to !@#$ people out of their money.
|
| It's someone in a Patagonia vest trying to avoid getting
| PIP'd.
| jrjeksjd8d wrote:
| The problem with hard caps is that there's no way to
| retroactively fix "our site went down". As much as engineers
| are loath to actually reach out to a cloud provider, are there
| any anecdotes of AWS playing hardball and collecting a 10k debt
| for network traffic?
|
| Conversely the first time someone hits an edge case in billing
| limits and their site goes down, losing 10k worth of possible
| customer transactions there's no way to unring that bell.
|
| The second constituency are also, you know, the customers with
| real cloud budgets. I don't blame AWS for not building a
| feature that could (a) negatively impact real, paying customers
| (b) is primarily targeted at people who by definition don't
| want to pay a lot of money.
| scotty79 wrote:
| I'd much rather lose 10k in customers who might potentially
| come back another day than 10k in an Amazon bill. The Amazon
| bill feels like the more unringable of the two.
|
| But hey, let's say you have different priorities than me.
| Then why not both? Why not let me set the hard cap? Why does
| Amazon insist on being able to bill me for more than my business
| is worth if I make a mistake?
| withinboredom wrote:
| Since you would have to have set it up, I fail to see how
| this is a problem.
| Havoc wrote:
| Keeping the site up makes sense as a default. That's what
| their real business customers need, so that has priority.
|
| But an opt-in "I'd rather you delete data/disable services than
| send me a 100k bill" toggle with suitable disclaimers would mean
| people can safely learn.
|
| That way everyone gets what they want. (Well, except the cloud
| providers, who presumably don't like limits on their open-ended
| bills.)
| moduspol wrote:
| AWS would much rather let you accidentally overspend and then
| forgive it when you complain than see stories about critical
| infrastructure getting shut off or failing in unexpected ways
| due to a miscommunication in billing.
| DenisM wrote:
| They could have given us a choice though. Sign in blood that
| you want to be shut off in case of overspend.
| moduspol wrote:
| As long as "shut off" potentially includes irrecoverable
| data loss, I guess, as it otherwise couldn't conclusively
| work. Along with a bunch of warnings to prevent someone
| accidentally (or maliciously) enabling it on an important
| account.
|
| Still sounds kind of ugly.
| DenisM wrote:
| Malicious or erroneous actor can also drop your s3
| buckets. Account change has stricter permissions.
|
| The key problem is that data loss is really bad pr which
| cannot be reversed. Overcharge can be reversed. In a
| twisted way it might even strengthen the public image, I
| have seen that happen elsewhere.
| simsla wrote:
| You could set a cloudwatch cost alert that scuttles your
| IAM and effectively pulls the plug on your stack. Or
| something like that.
| belter wrote:
| These topics are not advanced... they are foundational scenarios
| covered in any entry-level AWS or third-party AWS Cloud
| training.
|
| But over the last few years, people have convinced themselves
| that the cost of ignorance is low. Companies hand out unlimited
| self-paced learning portals, tick the "training provided" box,
| and quietly stop validating whether anyone actually learned
| anything.
|
| I remember when you had to spend weeks in structured training
| before you were allowed to touch real systems. But starting
| around five or six years ago, something changed: Practitioners
| began deciding for themselves what they felt like learning.
| They dismantled standard instruction paths and, in doing so,
| never discovered their own unknown unknowns.
|
| In the end, it created a generation of supposedly "trained"
| professionals who skipped the fundamentals and now can't
| understand why their skills have giant gaps.
| shermantanktop wrote:
| If I accept your premise (which I think is overstated) I'd
| say it's a good thing. We used to ship software with
| literally 100lbs of manual and sell expensive training, and
| then consulting when they messed up. Tons of perverse
| incentives.
|
| The expectation that it just works is mostly a good thing.
| cobolcomesback wrote:
| AWS just yesterday launched flat rate pricing for their CDN
| (including a flat rate allowance for bandwidth and S3 storage),
| including a guaranteed $0 tier.
|
| https://news.ycombinator.com/item?id=45975411
|
| I agree that it's likely very technically difficult to find the
| right balance between capping costs and not breaking things,
| but this shows that it's definitely possible, and hopefully
| this signals that AWS is interested in doing this in other
| services too.
| cristiangraz wrote:
| AWS just released flat-rate pricing plans with no overages
| yesterday. You opt into a $0, $15, or $200/mo plan and at the
| end of the month your bill is still $0, $15, or $200.
|
| It solves the problem of unexpected requests or data transfer
| increasing your bill across several services.
|
| https://aws.amazon.com/blogs/networking-and-content-delivery...
| Havoc wrote:
| That actually looks really good thanks for highlighting this
| ipsento606 wrote:
| https://aws.amazon.com/cloudfront/pricing/ says that the
| $15-per-month plan comes with 50TB of "data transfer"
|
| Does "data transfer" not mean CDN bandwidth here? Otherwise,
| that price seems two orders of magnitude less than I would
| expect
| weberer wrote:
| The $15 plan notably does not come with DDoS protection
| though.
| ipsento606 wrote:
| the pricing page says it comes with "Always-on DDoS
| Protection" but not "Advanced DDoS Protection"
|
| I have no idea what these terms mean in practice
| throwaway-aws9 wrote:
| With AWS, there's always a catch. In this case, it's for
| 10M requests. In other words, you pay $15 for 10M requests
| of up to 5MB each.
|
| [edit: looks like there's no overages but they may force
| you to flip to the next tier and seems like they will
| throttle you:
| https://docs.aws.amazon.com/AmazonCloudFront/latest/Develope...]
| nijave wrote:
| I've always been under the impression billing is async and you
| really need it to be synchronous unless cost caps work as a
| soft limit.
|
| You can transfer from S3 on a single instance usually as fast
| as the instance's NIC--100Gbps+
|
| You'd need a synchronous system that checks quotas before each
| request and for a lot of systems you'd also need request
| cancellation (imagine transferring a 5TiB file from S3 and your
| cap triggers at 100GiB--the server needs to be able to receive
| a billing violation alert in real time and cancel the request)
|
| I imagine that for anything capped already provided to
| customers, AWS just estimates and eats the loss
|
| Obviously such a system is possible since IAM/STS mostly do
| this but I suspect it's a tradeoff providers are reluctant to
| make
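A synchronous cap of the sort described above is conceptually simple; the cost is the per-chunk check on the hot path. A toy sketch (hypothetical names, not an AWS API) that charges every chunk against a hard byte quota before forwarding it, so a transfer is cancelled mid-flight instead of billed after the fact:

```python
class QuotaExceeded(Exception):
    pass

def metered_copy(chunks, quota_bytes):
    """Stream chunks, enforcing a hard byte quota synchronously.

    Every chunk is charged against the quota *before* it is
    yielded, so the transfer stops the moment the cap is hit.
    """
    used = 0
    for chunk in chunks:
        if used + len(chunk) > quota_bytes:
            raise QuotaExceeded(f"cap hit after {used} bytes")
        used += len(chunk)
        yield chunk
```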
| ryanjshaw wrote:
| As a bootstrapped dev, reading stories like these gives me so
| much anxiety. I just can't bring myself to use AWS even despite
| its advantages.
| thecodemonkey wrote:
| We are also 100% customer-funded. AWS makes sense for us for
| the enterprise version of Geocodio where we are SOC2 audited
| and HIPAA-compliant.
|
| We are primarily using Hetzner for the self-serve version of
| Geocodio and have been a very happy customer for decades.
| abigail95 wrote:
| What is a bootstrapped dev?
| jabroni_salad wrote:
| It means you are self funded and do not have a pile of other
| people's money to burn.
| abigail95 wrote:
| I would guess that's most AWS accounts. I have my 5
| personal accounts all on one debit card.
|
| I learned AWS the same way most "bootstrapped" people do,
| with the free tier. Maybe it's more of a minefield than it
| was a decade ago.
| themafia wrote:
| The documentation is thick but it has a common theme and format
| to it. So once you get the hang of finding the "juicy bits" you
| can usually locate them anywhere. The docs do generally warn
| you of these cases, or have a whole "best practices" section
| which highlights them directly.
|
| The key is, do not make decisions lightly in the cloud, just
| because something is easy to enable in the UI does not mean
| it's recommended. Sit down with the pricing page or calculator
| and /really/ think over your use case. Get used to thinking
| about your infrastructure in terms of batch jobs instead of
| real time and understand the implementation and import of
| techniques like "circuit breakers."
|
| Once you get the hang of it it's actually very easy and
| somewhat liberating. It's really easy to test solutions out in
| a limited form and then completely tear them down. Personally
| I'm very happy that I put the effort in.
| throwawayffffas wrote:
| Do not buy into the hype, AWS and all the other cloud providers
| are extremely over priced.
|
| If you don't have a specific need for a specific service they
| are offering stay away, it's a giant ripoff.
|
| If you need generic stuff like VMs, data storage, etc., you are
| much better off using Hetzner, OVH, etc., and some standalone CDN
| if you need one.
| Hikikomori wrote:
| Saved >120k/month by deploying some vpc endpoints and vpc peering
| (rather than tgw).
| denvrede wrote:
| VPC peering becomes ugly fast, once your network architecture
| becomes more complex. Because transitive peering doesn't work
| you're building a mesh of networks.
| Hikikomori wrote:
| Can just use both, tgw by default and add peering where you
| have heavy traffic. Did this while managing 1k+ VPCs.
| 4gotunameagain wrote:
| I'm still adamant about the fact that the "cloud" is a racket.
|
| Sure, it decreases the time necessary to get something up and
| running, but the promises of cheaper/easier to manage/more
| reliable have turned out to be false. Instead of paying x on
| sysadmin salaries, you pay 5x to mega corps and you lose
| ownership of all your data and infrastructure.
|
| I think it's bad for the environment, bad for industry practices
| and bad for wealth accumulation & inequality.
| lan321 wrote:
| I'd say it's a racket for enterprise but it makes sense for
| small things. For example, a friend of mine, who's in a decent
| bit of debt and hence on the hunt for anything that can make
| some money, wanted to try making essentially a Replika clone
| for a local market and being able to rent an H100 for 2$ an
| hour was very nice. He could mess around a bit, confirm it's
| way more work than he thought and move on to other ideas for
| like 10$ :D
|
| Assuming he got it working, he could have opened the service
| without directly going further into debt, with the caveat that
| if he messed up the pricing model, and it took off, it could
| have annihilated his already dead finances.
| stef25 wrote:
| Made a similar mistake once. While just playing around to see
| what was possible, I uploaded some data to the AWS algo that
| recommends products to your users based on everyone's previous
| purchases.
|
| I uploaded a small xls with uid and prodid columns and then kind
| of forgot about it.
|
| A few months later I got a note from my bank saying my account
| was overdrawn. The account is only used for freelancing work,
| which I wasn't doing at the time, so I never checked it.
|
| Looks like AWS was charging me over 1K / month while the algo
| continuously worked on that bit of data that was uploaded one
| time. They charged until there was no money left.
|
| That was about 5K in weekend earnings gone. Several months' worth
| of salary in my main job. That was a lot of money for me.
|
| Few times I've felt so horrible.
| nine_k wrote:
| I worked in a billing department, and learned to be healthily
| paranoid about such things. I want to regularly check what I'm
| billed for. I of course check all my bank accounts' balances at
| least once a day. All billing emails are marked important in my
| inbox, and I actually open them.
|
| And of course I give every online service a separate virtual
| credit card (via privacy dot com, but your bank may issue them
| directly) with a spend limit set pretty close to the expected
| usage.
| auggierose wrote:
| Are there any cloud providers that allow a hard cap on dollars
| spent per day/week/month? Should there not be a law that they
| have to?
| torginus wrote:
| > I've been using AWS since around 2007. Back then, EC2 storage
| was entirely ephemeral and stopping an instance meant losing all
| your data. The platform has come a long way since then.
|
| Personally I miss ephemeral storage - having the knowledge that
| if you start the server from a known good state, going back to
| that state is just a reboot away. Way back when I was in college,
| a lot of our big-box servers worked like this.
|
| You can replicate this on AWS with snapshots or formatting the
| EBS volume into 2 partitions and just clearing the ephemeral part
| on reboot, but I've found it surprisingly hard to get it working
| with OverlayFS
| fergie wrote:
| Is it possible for hobbyists to set a hard cut off for spending?
| Like, "SHUT EVERYTHING DOWN IF COSTS EXCEED $50"
| conception wrote:
| Yes, but you have to program it. And there is a little bit of
| whack so it might be $51 or something like that.
| Raed667 wrote:
| my understanding from reading these kinds of threads is that
| there is no real way to enforce it and the provider makes no
| guarantees, as your usage can outpace the system that is
| handling the accounting and shutoff
| rileymat2 wrote:
| That sounds like an architecture choice? One that would cause
| less revenue on the AWS side, with a conflicting incentive
| there.
| tacker2000 wrote:
| To be fair, I'm not sure it's a conscious choice, since it's
| not really easy to couple, let's say, data transfer bytes
| directly to billing data in real time, and I'm sure that
| would also use up a lot of resources.
|
| But of course, the incentive to optimize this is not there.
| pixl97 wrote:
| I mean, generally real time isn't needed. Even hourly
| updates could save a massive amount of headache. 24 hours
| or more is becoming excessive.
| lenkite wrote:
| AWS already does per hour billing for spot instances.
| mr_toad wrote:
| Shut down everything? Including S3? There goes all your data.
| timando wrote:
| Turn off S3 requests, but keep the data.
| ndiddy wrote:
| You can with some effort, but cloud providers don't provide
| real-time information on how much you're spending. Even if you
| use spending alerts to program a hard cut-off yourself, a
| mistake can still result in you being charged for 6+ hours of
| usage before the alert fires.
| scotty79 wrote:
| > You can with some effort, but cloud providers don't provide
| real-time information on how much you're spending.
|
| This should be illegal. If you can't inform me about the bill
| on my request you shouldn't be legally able to charge me that
| bill. Although I can already imagine plenty of ways somebody
| could do malicious compliance with that rule.
| monerozcash wrote:
| Fixing a small issue you have with AWS via overly specific
| legislative efforts probably isn't very productive.
| wulfstan wrote:
| This happens so often that the S3 VPC endpoint should be set up by
| default when your VPC is created. AWS engineers on here - make
| this happen.
|
| Also, consider using fck-nat (https://fck-nat.dev/v1.3.0/)
| instead of NAT gateways unless you have a compelling reason to do
| otherwise, because you will save on per-Gb traffic charges.
|
| (Or, just run your own Debian nano instance that does the
| masquerading for you, which every old-school Linuxer should be
| able to do in their sleep.)
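Until it is the default, adding the free S3 gateway endpoint yourself is one boto3 call. A sketch; the helper name is mine, and you'd plug in your own VPC and route-table IDs:

```python
def s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build the arguments for ec2.create_vpc_endpoint (boto3).

    Gateway endpoints for S3 are free: no hourly charge and no
    per-GB charge, unlike traffic hairpinned through a NAT gateway.
    """
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": route_table_ids,
    }

# With real credentials you would then run:
#   boto3.client("ec2").create_vpc_endpoint(
#       **s3_gateway_endpoint_params("vpc-0123", "us-east-1",
#                                    ["rtb-0abc"]))
```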
| withinboredom wrote:
| Or just run bare metal + garage and call it a day.
| perching_aix wrote:
| I personally prefer to just memorize the data and recite it
| really quickly on-demand.
|
| Only half-joking. When something grossly underperforms, I do
| often legitimately just pull up calc.exe and compare the
| throughput to the number of employees we have x 8 kbit/sec
| [0], see who would win. It is uniquely depressing yet
| entertaining to see this outperform some applications.
|
| [0] spherical cow type back of the envelope estimate, don't
| take it too seriously; assumes a very fast 200 wpm speech, 5
| bytes per word, and everyone being able to independently
| progress
| luhn wrote:
| 8kbit/min, you mean.
| perching_aix wrote:
| Oh yeah lol, whoops. Still applies sadly.
| iso1631 wrote:
| Or colocate your bare metal in two or three data centres for
| resilience against environmental issues and single-supplier risk.
| scotty79 wrote:
| > This happens so often that the S3 VPC endpoint should be
| set up by default when your VPC is created.
|
| It's a free service after all.
| coredog64 wrote:
| If you use the AWS console, it's a tick box to include this.
| MrDarcy wrote:
| No professional engineer uses the AWS console to provision
| foundational resources like VPC networks.
| wulfstan wrote:
| Yes, this. You lock it into Terraform or some equivalent.
|
| And ok, this is a mistake you will probably only make once
| - I know, because I too have made it on a much smaller
| scale, and thankfully in a cost-insensitive customer's
| account - but surely if you're an infrastructure provider
| you want to try to ensure that you are vigilantly removing
| footguns.
| kikimora wrote:
| Especially true now with Claude generating decent terraform
| code. I was shocked how good it is at knowing AWS gotchas.
| It also debugs connectivity issues almost automagically.
| While I hate how it writes code I love how it writes
| terraform.
| shepherdjerred wrote:
| AI is surprisingly good at boilerplate IaC stuff. It's a
| great argument for configuration as code, or really just
| being able to represent things in plain text formats
| Spivak wrote:
| The reason to not include the endpoint by default is because
| VPCs should be secure by default. Everything is denied and
| unless you explicitly configure access to the Internet, it's
| unreachable. An attacker who manages to compromise a system in
| that VPC now has a means of data exfiltration in an otherwise
| air gapped set up.
|
| It's annoying because this is by far the more uncommon case for
| a VPC, but I think it's the right way to structure permissions
| and access in general. S3, the actual service, went the other
| way on this and has desperately been trying to reel it back for
| years.
| SOLAR_FIELDS wrote:
| There's zero reason why AWS can't pop up a warning if it
| detects this behavior though. It should clearly explain the
| implications to the end user. I mean, EKS pops up all sorts of
| these warning flags on cluster health; there's really no reason
| why they can't do the same here.
| Spivak wrote:
| I am 100% in agreement, they could even make adding
| endpoints part of the VPC creation wizard.
| otterley wrote:
| It's already in there!
| Spivak wrote:
| Fantastic! Shows how long it's been since I've made a VPC
| by clicking around in the GUI.
| snoman wrote:
| The second someone doesn't pay attention to that warning
| and suffers an exfiltration, like the cap1 s3 incident,
| it's aws' fault as far as the media is concerned.
| mystifyingpoi wrote:
| To be fair, while EKS warnings are useful, I've grown a
| habit to ignore them completely, since I've seen every
| single RDS cluster littered with "create a read replica
| please" and "enable performance insights" bs warnings.
| wulfstan wrote:
| Right, I can appreciate that argument - but then the right
| thing to do is to block S3 access from AWS VPCs until you
| have explicitly confirmed that you want to pay the big $$$$
| to do so, or turn on the VPC endpoint.
|
| A parallel to this is how SES handles permission to send
| emails. There are checks and hoops to jump through to ensure
| you can't send out spam. But somehow, letting DevOps folk
| shoot themselves in the foot (credit card) is ok.
|
| What has been done is the monetary equivalent of "fail
| unsafe" => "succeed expensively"
| unethical_ban wrote:
| I don't get your argument. If an ec2 needs access to an s3
| resource, doesn't it need that role? Or otherwise, couldn't
| there be some global s3 URL filter that automagically routes
| same-region traffic appropriately if it is permitted?
|
| My point is that, architecturally, is there ever in the
| history of AWS an example where a customer wants to pay for
| the transit of same-region traffic when a check box exists to
| say "do this for free"? Authorization and transit/path are
| separate concepts.
|
| There has to be a better experience.
| icedchai wrote:
| The EC2 needs credentials, but not necessarily a role. If
| someone is able to compromise an EC2 instance that has
| unrestricted S3 connectivity (no endpoint policies), they
| could use their own credentials to exfiltrate data to a
| bucket not associated with the account.
| unethical_ban wrote:
| I'll have to dive in and take a look. I'm not arguing,
| but here is how I naively see it:
|
| It seems there is a gap between "how things are" and "how
| things should be".
|
| "Transiting the internet" vs. "Cost-free intra-region
| transit" is an entirely different question than "This EC2
| has access to S3 bucket X" or "This EC2 does not have
| access to S3 bucket X".
|
| Somewhere, somehow, that fact should be exposed in the
| design of the configuration of roles/permissions/etc. so
| that enabling cost-free intra-region S3 access does not
| implicitly affect security controls.
| cowsandmilk wrote:
| S3 Gateway endpoints break cross-region S3 operations. Changing
| defaults will break customers.
| deanCommie wrote:
| Changing defaults doesn't have to mean changing _existing_
| configurations. It can be the new default for newly created
| VPCs after a certain date, or for newly created accounts
| after a certain date.
|
| And if there are any interoperability concerns, you offer an
| ability to opt-out with that (instead of opting in).
|
| There is precedent for all of this at AWS.
| richwater wrote:
| > Changing defaults doesn't have to mean changing existing
| configurations. It can be the new default for newly created
| VPCs after a certain date, or for newly created accounts
| after a certain date.
|
| This is breaking existing IAAC configurations because they
| rely on the default. You will never see the change you're
| describing except in security-related scenarios
|
| > There is precedent for all of this at AWS.
|
| Any non-security IAAC default changes you can point to?
| belter wrote:
| AWS is not going to enable S3 endpoints by default, and most of
| this thread is downvoting the correct explanations, thinking in
| terms of a small hobby VPC rather than the architectures AWS
| actually has to support.
|
| Why it should not be done:
|
| 1. It mutates routing. Gateway Endpoints inject prefix-list
| routes into selected route tables. Many VPCs have dozens of RTs
| for segmentation, TGW attachments, inspection subnets, EKS-
| managed RTs, shared services, etc. Auto-editing them risks
| breaking zero-trust boundaries and traffic-inspection paths.
|
| 2. It breaks IAM / S3 policies. Enterprises commonly rely on
| aws:sourceVpce, aws:SourceIp, Private Access Points, SCP
| conditions, and restrictive bucket policies. Auto-creating a
| VPCE would silently bypass or invalidate these controls.
|
| 3. It bypasses security boundaries. A Gateway Endpoint forces
| S3 traffic to bypass NAT, firewalls, IDS/IPS, egress proxies,
| VPC Lattice policies, and other mandatory inspection layers.
| This is a hard violation for regulated workloads.
|
| 4. Many VPCs must not access S3 at all. Air-gapped, regulated,
| OEM, partner-isolated, and inspection-only VPCs intentionally
| block S3. Auto-adding an endpoint would break designed
| isolation.
|
| 5. Private DNS changes behavior. With Private DNS enabled, S3
| hostname resolution is overridden to use the VPCE instead of
| the public S3 endpoint. This can break debugging assumptions,
| routing analysis, and certain cross-account access patterns.
|
| 6. AWS does not assume intent. The VPC model is intentionally
| minimal. AWS does not auto-create IGWs, NATs, Interface
| Endpoints, or egress paths. Defaults must never rewrite user
| security boundaries.
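| To make point 2 concrete, here is a minimal sketch (bucket name
| and endpoint ID are placeholders) of the kind of bucket policy
| that pins access to one gateway endpoint, exactly the control a
| silently auto-created VPCE could route around:

```python
import json

def vpce_only_policy(bucket: str, vpce_id: str) -> dict:
    """Deny all S3 access to the bucket unless the request arrives
    through the named gateway endpoint (aws:sourceVpce condition)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyUnlessThroughVpce",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
            "Condition": {"StringNotEquals": {"aws:sourceVpce": vpce_id}},
        }],
    }

print(json.dumps(vpce_only_policy("example-bucket", "vpce-0123456789abcdef0")))
```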
| ElectricalUnion wrote:
| > Auto-editing them risks breaking zero-trust boundaries and
| traffic-inspection paths.
|
| How are you inspecting zero-trust traffic? Not at the
| gateway/VPC level, I hope, as naive DPI there will break
| zero-trust.
|
| If it breaks closed as it should, then it is working as
| intended.
|
| If it breaks open, guess it was just useless pretend-zero-
| trust security theatre then?
| wulfstan wrote:
| These are all good arguments. Then do the opposite and block
| S3 access from VPCs by default. That would violate none of
| those.
|
| "We have no idea what your intent is, so we'll default to
| routing AWS-AWS traffic expensively" is way, way worse than
| forcing users to be explicit about their intent.
|
| Minimal is a laudable goal - but if a footgun is the result
| then you violate the principle of least surprise.
|
| I rather suspect the problem with issues like this is that
| they mainly catch the less experienced, who aren't an AWS
| priority because they aren't where the Big Money is.
| the8472 wrote:
| Or go IPv6 and use an egress gateway instead.
|
| https://docs.aws.amazon.com/vpc/latest/userguide/egress-only...
| patabyte wrote:
| > which every old-school Linuxer should be able to do in their
| sleep.
|
| Oof, this hit home, hah.
| tlaverdure wrote:
| Abolish NAT Gateways. Lean on gateway endpoints, egress only
| internet gateways with IPv6, and security groups to batten down
| the hatches. All free.
| agwa wrote:
| Now that AWS charges for public IPv4 addresses, is it still
| free if you need to access IPv4-only hosts?
| tlaverdure wrote:
| Yeah not free if you definitely need IPv4. AWS has been
| adding a lot more IPv6 support to their services so hopefully
| the trend continues in AWS and the broader industry. You can
| probably get pretty far though if your app doesn't have hard
| requirements to communicate with IPv4-only hosts.
| whalesalad wrote:
| Wait till you encounter the combo of gcloud parallel composite
| uploads + versioning + soft-delete + multi-region bucket - and
| you have 500TB of objects stored.
| lapcat wrote:
| > AWS's networking can be deceptively complex. Even when you
| think you've done your research and confirmed the costs, there
| are layers of configuration that can dramatically change your
| bill.
|
| Unexpected, large AWS charges have been happening for so long,
| and so egregiously, to so many people, including myself, that we
| must assume it's by design of Amazon.
| lloydatkinson wrote:
| I can't see this as anything but on purpose
| AmbroseBierce wrote:
| Imagine a world where Amazon was forced to provide a publicly
| available report where they disclose how many clients have made
| this error -and similar ones- and how much money they have made
| from it. I know nothing like this will ever exist, but hey, it's
| free to dream.
| siliconc0w wrote:
| It used to be that you could whine to your account rep and they'd
| waive sudden accidental charges like this. Which we did regularly
| due to all the sharp edges. These days I gather it's a bit
| harder.
| cobolcomesback wrote:
| This wouldn't have specifically helped in this situation (EC2
| reading from S3), but on the general topic of preventing
| unexpected charges from AWS:
|
| AWS just yesterday launched flat rate pricing for their CDN
| (including a flat rate allowance for bandwidth and S3 storage),
| including a guaranteed $0 tier. It's just the CDN for now, but
| hopefully it gets expanded to other services as well.
|
| https://news.ycombinator.com/item?id=45975411
| jb_rad wrote:
| I did this when I was ~22 messing with infra for the first time.
| A $300 bill in two days when I had $2000 in the bank really
| stung. I love AWS for many things, but I really wish they made
| the cost calculations transparent for beginners.
| kevmo wrote:
| I wonder why they don't...
| mooreds wrote:
| Always always set up budget alarms.
|
| Make sure they go to an list with multiple people on it. Make
| sure someone pays attention to that email list.
|
| It's free and will save your bacon.
|
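| For anyone setting this up programmatically, a hedged sketch of
| the kwargs for the AWS Budgets create_budget call (budget name,
| limit, and email address are placeholders):

```python
def monthly_cost_budget(name: str, limit_usd: float, email: str) -> dict:
    """Build kwargs for boto3's budgets create_budget: a monthly cost
    budget that emails when actual spend passes 80% of the limit."""
    return {
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [{
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,           # alert at 80% of the limit
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }],
    }

kwargs = monthly_cost_budget("team-monthly", 500, "billing-alerts@example.com")
print(kwargs["Budget"]["BudgetLimit"]["Amount"])  # 500
# With credentials configured, the actual call would be:
# import boto3
# boto3.client("budgets").create_budget(AccountId="123456789012", **kwargs)
```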
| I've also had good luck asking for forgiveness. One time I scaled
| up some servers for an event and left them running for an extra
| week. I think the damage was in the 4 figures, so not horrendous,
| but not nothing.
|
| An email to AWS support led to them forgiving a chunk of that
| bill. Doesn't hurt to ask.
| StratusBen wrote:
| Evergreen relevant blog post: "Save by Using Anything Other Than
| a NAT Gateway" https://www.vantage.sh/blog/nat-gateway-vpc-
| endpoint-savings
|
| Also as a shameless plug: Vantage covers this is exact type of
| cost hiccup. If you aren't already using it, we have a very
| generous free tier: https://www.vantage.sh/
| dylan604 wrote:
| Had the exact same thing happen. Only we used a company
| vetted/recommended by AWS to set this up for us, as we have no
| AWS experts and we're all too busy doing actual startup
| things. So we staffed it out. Even the "professionals" get it
| wrong, and we racked up a huge expense as well. Staffed out
| company shrugged shoulders, and then just said sorry about your
| tab. We worked with AWS support to correct the situation, and cried
| to daddy AWS account manager for a negotiated rate.
| maciekkmrk wrote:
| An entire blog post to say "read the docs and enable the VPC S3
| endpoint".
|
| It's all in the docs:
| https://docs.aws.amazon.com/vpc/latest/privatelink/concepts....
|
| > _There is another type of VPC endpoint, Gateway, which creates
| a gateway endpoint to send traffic to Amazon S3 or DynamoDB.
| Gateway endpoints do not use AWS PrivateLink, unlike the other
| types of VPC endpoints. For more information, see Gateway
| endpoints._
|
| Even the first page of VPC docs:
| https://docs.aws.amazon.com/vpc/latest/userguide/what-is-ama...
|
| > _Use a VPC endpoint to connect to AWS services privately,
| without the use of an internet gateway or NAT device._
|
| The author of the blog writes:
|
| > _When you 're using VPCs with a NAT Gateway (which most
| production AWS setups do), S3 transfers still go through the NAT
| Gateway by default._
|
| Yes, you are using a virtual private network. Where is it
| supposed to go? It's like being surprised that data in your home
| network goes through a router.
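| For what it's worth, enabling it is a single API call. A hedged
| boto3 sketch (region, VPC ID, and route-table IDs are
| placeholders):

```python
def s3_gateway_endpoint_request(region: str, vpc_id: str,
                                route_table_ids: list) -> dict:
    """Build the kwargs for ec2.create_vpc_endpoint that add an S3
    Gateway endpoint to a VPC. Gateway endpoints to S3/DynamoDB are
    free, unlike Interface endpoints (PrivateLink)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": route_table_ids,
    }

kwargs = s3_gateway_endpoint_request("us-east-1", "vpc-0abc", ["rtb-0def"])
print(kwargs["ServiceName"])  # com.amazonaws.us-east-1.s3
# With credentials configured, the actual call would be:
# import boto3
# boto3.client("ec2", region_name="us-east-1").create_vpc_endpoint(**kwargs)
```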
| jairuhme wrote:
| > An entire blog article post to say "read the docs and enable
| VPC S3 endpoint".
|
| I think it's okay if someone missed something in the docs and
| wanted to share from their experience. In fact, if you look at
| the S3 pricing page [0], under Data Transfer, VPC endpoints
| aren't mentioned at all. It simply says data transfer is free
| between AWS services in the same region. That much detail would
| be enough to reasonably assume you didn't have to set up
| anything additional to accomplish it.
|
| [0]https://aws.amazon.com/s3/pricing/
| kidsil wrote:
| Great write-up, thanks for sharing the numbers.
|
| I get pulled into a fair number of "why did my AWS bill explode?"
| situations, and this exact pattern (NAT + S3 + "I thought same-
| region EC2-S3 was free") comes up more often than you'd expect.
|
| The mental model that seems to stick is: S3 transfer pricing and
| "how you reach S3" pricing are two different things. You can be
| right that EC2-S3 is free and still pay a lot because all your
| traffic goes through a NAT Gateway.
|
| The small checklist I give people:
|
| 1. If a private subnet talks a lot to S3 or DynamoDB, start by
| assuming you want a Gateway Endpoint, not the NAT, unless you
| have a strong security requirement that says otherwise.
|
| 2. Put NAT on its own Cost Explorer view / dashboard. If that
| line moves in a way you didn't expect, treat it as a bug and go
| find the job or service that changed.
|
| 3. Before you turn on a new sync or batch job that moves a lot of
| data, sketch (I tend to do this with Mermaid) "from where to
| where, through what, and who charges me for each leg?" It takes a
| few minutes and usually catches this kind of trap.
|
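| For step 3, the NAT leg usually dominates. A rough model, using
| assumed us-east-1 list prices at time of writing ($0.045/GB NAT
| data processing, $0.045 per NAT-hour, $0 through an S3 gateway
| endpoint; check your region):

```python
# Price each leg of a data-heavy job before turning it on.
NAT_PROCESSING_PER_GB = 0.045     # NAT Gateway data processing charge
NAT_HOURLY = 0.045                # per NAT Gateway per hour
S3_GATEWAY_ENDPOINT_PER_GB = 0.0  # gateway endpoints to S3 are free

def nat_route_cost(gb: float, hours: float, gateways: int = 1) -> float:
    """Cost of pushing `gb` through NAT over `hours` of runtime."""
    return gb * NAT_PROCESSING_PER_GB + hours * gateways * NAT_HOURLY

def endpoint_route_cost(gb: float) -> float:
    """Same transfer via an S3 gateway endpoint."""
    return gb * S3_GATEWAY_ENDPOINT_PER_GB

# A 20 TB sync through one NAT over a day vs. a gateway endpoint:
print(round(nat_route_cost(20 * 1024, 24), 2))  # 922.68
print(endpoint_route_cost(20 * 1024))           # 0.0
```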
| Cost Anomaly Detection doing its job here is also the underrated
| part of the story. A $1k lesson is painful, but finding it at
| $20k is much worse.
| blutoot wrote:
| Regardless of the AWS tech in question (and yes VPCE for non-
| compute services is a very common pattern in an enterprise setup
| using AWS since VPC with NAT is a pretty fundamental
| requirement), I honestly believe this was the biggest miss from
| the author: "Always validate your assumptions. I thought "EC2 to
| S3 is free" was enough. I should have tested with a small amount
| of data and monitored the costs before scaling up to terabytes."
| To me this is a symptom of DevOps/infra engineers being too much
| in love with infra automation without actually testing the full
| end to end flow.
| citizenpaul wrote:
| It's staggering to me that after all this time there are somehow
| still people in positions like this who are working without
| basic cost monitoring alerts on cloud/SaaS services.
|
| It really shows the Silicon Valley disconnect with the real
| world, where money matters.
| abujazar wrote:
| $1000 for 20 TB of data transfer sounds like fraud. You can get a
| VM instance with 20 TB included INTERNET traffic at Hetzner for
| EUR4.15.
| lowbloodsugar wrote:
| I'm sure NAT gateways exist purely to keep uninformed security
| "experts" at companies happy. I worked at a Fortune 500 company
| but we were a dedicated group building a cloud product on AWS.
| Security people demanded a NAT gateway. Why? "Because you need
| address translation and a way to prevent incoming connections".
| Ok. That's what an Internet Gateway is. In the end we deployed a
| NAT gateway and just didn't setup routes to it. Then just used
| security groups and public IPs.
| knowitnone3 wrote:
| That's a loophole AWS needs to close
| Fokamul wrote:
| The lesson: Don't use AWS
| throwawayffffas wrote:
| > The solution is to create a VPC Gateway Endpoint for S3. This
| is a special type of VPC endpoint that creates a direct route
| from your VPC to S3, bypassing the NAT Gateway entirely.
|
| The solution is to move your processing infrastructure to
| Hetzner.
___________________________________________________________________
(page generated 2025-11-19 23:00 UTC)