[HN Gopher] AWS S3: Sometimes you should press the $100k button
___________________________________________________________________
AWS S3: Sometimes you should press the $100k button
Author : korostelevm
Score : 332 points
Date : 2022-02-17 10:32 UTC (12 hours ago)
(HTM) web link (www.cyclic.sh)
(TXT) w3m dump (www.cyclic.sh)
| dekhn wrote:
| I had to chuckle at this article because it reminded me of some
| of the things I've had to do to clean up data.
|
| One time I had to write a special mapreduce that did a multiple-
| step map to convert my (deeply nested) directory tree into
| roughly equally sized partitions (a serial directory listing
| would have taken too long, and the tree was too unbalanced to
| partition in one step), then did a second mapreduce to map-delete
| all the files and reduce the errors down to a report file for
| later cleanup. This meant we could delete a few hundred terabytes
| across millions of files in 24 hours, which was a victory.
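|
| For flavor, a rough single-machine sketch of the same
| partition-then-parallel-delete idea in plain Python (the root
| path, shard depth, and worker count are made up; the original
| was a mapreduce, not threads):
|
|     import os
|     from concurrent.futures import ThreadPoolExecutor
|
|     ROOT = "/data/root"  # hypothetical tree to delete
|
|     def delete_shard(path):
|         # Map-delete one shard; return errors for the report.
|         errors = []
|         for dirpath, _, filenames in os.walk(path, topdown=False):
|             for name in filenames:
|                 target = os.path.join(dirpath, name)
|                 try:
|                     os.unlink(target)
|                 except OSError as e:
|                     errors.append((target, str(e)))
|         return errors
|
|     # Step 1 ("map"): shard the tree one level down so each
|     # worker gets a roughly comparable slice (top-level plain
|     # files are ignored in this sketch).
|     shards = [os.path.join(ROOT, d) for d in os.listdir(ROOT)
|               if os.path.isdir(os.path.join(ROOT, d))]
|
|     # Step 2 ("map-delete" then "reduce"): delete in parallel,
|     # reducing failures into one report file for later cleanup.
|     with open("delete-report.txt", "w") as report:
|         with ThreadPoolExecutor(max_workers=32) as pool:
|             for errs in pool.map(delete_shard, shards):
|                 for path, err in errs:
|                     report.write(f"{path}\t{err}\n")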
| valar_m wrote:
| Though it doesn't solve the problem in TFA, I recommend setting
| up billing alerts in AWS. They would have at least known about
| the issue sooner.
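|
| For anyone setting one up: a minimal sketch with AWS Budgets
| via boto3 (the account id, amount, and email address are
| placeholders):
|
|     import boto3
|
|     budgets = boto3.client("budgets")
|     budgets.create_budget(
|         AccountId="123456789012",  # placeholder account id
|         Budget={
|             "BudgetName": "monthly-cost-alert",
|             "BudgetLimit": {"Amount": "1000", "Unit": "USD"},
|             "TimeUnit": "MONTHLY",
|             "BudgetType": "COST",
|         },
|         NotificationsWithSubscribers=[{
|             "Notification": {
|                 "NotificationType": "ACTUAL",
|                 "ComparisonOperator": "GREATER_THAN",
|                 "Threshold": 80.0,  # percent of the budget
|                 "ThresholdType": "PERCENTAGE",
|             },
|             "Subscribers": [{"SubscriptionType": "EMAIL",
|                              "Address": "ops@example.com"}],
|         }],
|     )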
| pontifier wrote:
| DON'T PRESS THAT BUTTON.
|
| The egress and early deletion fees on those "cheaper options"
| killed a company that I had to step in and save.
| pphysch wrote:
| On a related note, suppose the Fed raises rates to mitigate
| inflation and indirectly kills thousands of zombie companies,
| including many SaaS renting the cloud. What happens to their
| data? Does the cloud unilaterally evict/delete it, or does it
| get handled like an asset -- auctioned off, etc?
| cmckn wrote:
| I'm not aware of a cloud provider that is contractually
| allowed to do such a thing (except maybe alibaba by way of
| the CCP). Dying companies get purchased and have their assets
| pilfered every day; the same thing would happen with cloud
| assets.
| Uehreka wrote:
| > does it get handled like an asset -- auctioned off, etc?
|
| Who would buy that? I guess if this happened enough then
| people would start "data salvager" companies that specialize
| in going through data they have no schema for, looking for a
| way to sell some of it to someone else. I have to
| imagine the margins in a business like that would be abysmal,
| and all the while you'd be in a pretty dark place ethically
| going through data that users never wanted you to have in the
| first place.
|
| Of course, all these questions are moot because if this
| happened the GDPR would nuke the cloud provider from orbit.
| Aeolun wrote:
| If they were already paying 100k per month for their storage, I
| doubt the additional 100k would severely impact their business.
|
| Proven by the fact that they happily went on to pay the bill
| for the next 6 months.
| charcircuit wrote:
| Can someone explain what happened in the end? From my
| understanding nothing happened (they deprioritized the story for
| fixing it) and they are still blowing through the cloud budget.
| snowwrestler wrote:
| They didn't resolve the issue.
|
| There's an important moment in the story, where they realize
| the fix will incur a one-time fee of $100,000. No one in
| engineering can sign off on that amount, and no one wants to
| try to explain it to non-technical execs.
|
| They don't explain why. But it's probably because they expect a
| negative response like "how could you let this happen?!" or
| "I'm not going to pay that, find another way to fix it."
|
| In a lot of organizations it's easier to live with a steadily
| growing recurring cost than a one-time fee... even if the total
| of the steady growth ends up much larger than the one-time fee!
|
| It's not necessarily pathological. Future costs will be paid
| from future revenue; whereas a big fee has to be paid from cash
| on-hand now.
|
| But sometimes the calculation is not even attempted because of
| internal culture. When the decision is "keep your head down"
| instead of "what's the best financial strategy," that could
| hint at even bigger potential issues down the road.
| hogrider wrote:
| Sounds more like non-technical leadership asleep at the
| wheel. I mean, if they could afford to just lose money like
| this, why bother with all that work to fix it?
| seekayel wrote:
| The way I read the article, nothing happened. I think it is a
| cautionary tale of why you should probably bite the bullet and
| press the button instead of doing the "easier" thing which ends
| up being harder and more expensive in the end.
| lloesche wrote:
| I had a similar issue at my last job. Whenever a user created a
| PR on our open source project, artifacts of 1GB in size,
| consisting of hundreds of small files, would be created and
| uploaded to a bucket. There was just no process that would ever
| delete anything. This went on for 7 years and resulted in a
| multi-petabyte bucket.
|
| I wrote some tooling to help me with the cleanup. It's available
| on Github:
| https://github.com/someengineering/resoto/tree/main/plugins/...
| consisting of two scripts, s3.py and delete.py.
|
| It's not exactly meant for end-users, but if you know your way
| around Python/S3 it might help. I built it for a one-off purge
| of old data. s3.py takes a `--aws-s3-collect` arg to create the
| index. It lists one or more buckets and can store the result in
| a sqlite file. In my case the directory listing of the bucket
| took almost a week to complete and resulted in an 80GB sqlite
| file.
|
| I also added a very simple CLI interface (calling it a virtual
| filesystem would be a stretch) that allows you to load the
| sqlite file and browse the bucket content, summarise "directory"
| sizes, order by last modification date, etc. It's what starts
| when calling s3.py without the collect arg.
|
| Then there is delete.py which I used to delete objects from the
| bucket, including all versions (our horrible bucket was versioned
| which made it extra painful). On a versioned bucket it has to
| run twice, once to delete the file and once to delete the
| then-created version, if I remember correctly - it's been a
| year since I built this.
|
| Maybe it's useful for someone.
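|
| The indexing step is conceptually just this (a simplified
| sketch, not the actual code from the repo; the bucket name is
| a placeholder):
|
|     import boto3, sqlite3
|
|     s3 = boto3.client("s3")
|     db = sqlite3.connect("bucket-index.sqlite")
|     db.execute("CREATE TABLE IF NOT EXISTS objects"
|                " (key TEXT, size INT, mtime TEXT)")
|
|     paginator = s3.get_paginator("list_objects_v2")
|     for page in paginator.paginate(Bucket="my-huge-bucket"):
|         rows = [(o["Key"], o["Size"], o["LastModified"].isoformat())
|                 for o in page.get("Contents", [])]
|         db.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
|         db.commit()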
| k__ wrote:
| What about the lifecycle stuff?
|
| I thought, S3 can move stuff to cheaper storage automatically
| after some time.
| lloesche wrote:
| Like I wrote, for us it was a one-off job to find and remove
| 6+ year old build artifacts that would never be needed again.
| I just looked for the cheapest solution of getting rid of
| them. I couldn't do it by prefix alone (prod files mixed in
| the same structure as the build artifacts) which is why
| delete.py supports patterns (the `--aws-s3-pattern` arg takes
| a regex).
|
| If AWS' own tools work for you, they're surely a better
| solution than my scripts. Esp. if you need something on an
| ongoing basis.
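|
| For reference, an ongoing policy is only a few lines with
| boto3 (the bucket, prefix, and day counts below are made up):
|
|     import boto3
|
|     s3 = boto3.client("s3")
|     s3.put_bucket_lifecycle_configuration(
|         Bucket="my-artifact-bucket",  # hypothetical bucket
|         LifecycleConfiguration={
|             "Rules": [{
|                 "ID": "tier-then-expire-artifacts",
|                 "Status": "Enabled",
|                 "Filter": {"Prefix": "artifacts/"},
|                 "Transitions": [
|                     {"Days": 30, "StorageClass": "STANDARD_IA"},
|                     {"Days": 90, "StorageClass": "GLACIER"},
|                 ],
|                 # Objects are deleted for free once they expire.
|                 "Expiration": {"Days": 365},
|             }],
|         },
|     )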
| coredog64 wrote:
| AWS has an inventory capability for S3:
| https://docs.aws.amazon.com/AmazonS3/latest/userguide/storag...
| wackget wrote:
| As a web developer who has never used anything except locally-
| hosted databases, can someone explain what kind of system
| actually produces billions or trillions of files which each need
| to be individually stored in a low-latency environment?
|
| And couldn't that data be stored in an actual database?
| abhishekjha wrote:
| An image service.
| wackget wrote:
| Yeah that use-case I get. Binary files which would be
| difficult/impractical to index in a database.
|
| However it feels like something at that scale will only ever
| realistically be dealt with by enterprise-level software, and
| I'd hazard a guess that _most_ developers - even those
| reading HN - are not working on enterprise-level systems.
|
| So I'm wondering what "regular devs" are using cloud buckets
| for at such a scale over regular DBs.
| rgallagher27 wrote:
| Things like mobile/website analytics events: User A clicked
| this menu item, User B viewed this image, etc. All streamed
| into S3 in chunks of smallish files.
|
| It's cheaper to store them in S3 over a DB and use tools like
| Athena or Redshift spectrum to query.
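|
| Querying those chunks then looks something like this (a boto3
| sketch; the database, table, and output bucket are made up):
|
|     import boto3
|
|     athena = boto3.client("athena")
|     resp = athena.start_query_execution(
|         QueryString="""
|             SELECT event_name, count(*) AS n
|             FROM analytics.click_events   -- hypothetical table
|             WHERE dt = '2022-02-17'
|             GROUP BY event_name
|             ORDER BY n DESC
|         """,
|         ResultConfiguration={
|             "OutputLocation": "s3://my-athena-results/"},
|     )
|     print(resp["QueryExecutionId"])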
| wackget wrote:
| Wow. What makes it cheaper than using a DB? Is it just
| because the DB will create some additional metadata about
| each stored row or something?
| zmmmmm wrote:
| The rationale for using cloud is so often that it saves you from
| complexity. It really undermines the whole proposition when you
| find out that the complexity it shields you from is only skin
| deep, and in fact you still need a "PhD in AWS" anyway.
|
| But as a bonus, now you face huge risks and liabilities from
| single button pushes and none of those skills you learned are
| transferrable outside of AWS so you'll have to learn them again
| for gcloud, again for azure, again for Oracle ....
| Mave83 wrote:
| Just avoid the cloud. You can get Ceph storage with the
| performance of Amazon S3, at the price point of Amazon S3
| Glacier, deployed in any datacenter worldwide if you want.
| There are companies that help you do this.
|
| Feel free to ask if you need help.
| solatic wrote:
| TL;DR: Object stores are not databases. Don't treat them like
| one.
| wooptoo wrote:
| They're also _not_ classic hierarchical filesystems, but k-v
| stores with extras.
| throwaway984393 wrote:
| Try telling that to developers; they love using S3 as both a
| database and a filesystem. It's gotten to the point where we
| need a training course for new devs to tell them what not to
| do in the cloud.
| Quarrelsome wrote:
| do you know if such sources exist publicly? I would be most
| interested in perusing recommended material on the subject.
| mst wrote:
| Honestly a Frequently Delivered Answers training for new
| developers is probably one of the best things you can include
| in onboarding.
|
| Every environment has its footguns, after all.
| hinkley wrote:
| Communicating through the filesystem is one of the Classic
| Blunders.
|
| It doesn't come up as often anymore since we generally have
| so many options at our fingertips, but when push comes to
| shove you will still discover this idea rattling around in
| people's skulls.
| ijlx wrote:
| Classic Blunders:
|
| 1. Never get involved in a land war in Asia
|
| 2. Never go in against a Sicilian when death is on the line
|
| 3. Never communicate through the filesystem
| solatic wrote:
| You can either train them with a calm tutorial or you can
| train them with angry billing alerts and shared-pain ex-post-
| facto muckraking.
|
| I, for one, prefer the calm way.
| lenkite wrote:
| _sigh_. My team is facing all these issues. Drowning in data.
| Crazy S3 bill spikes. And not just S3 - Azure, GCP, Alibaba, etc
| since we are a multi-cloud product.
|
| Earlier, we couldn't even figure out lifecycle policies to expire
| objects since naturally every PM had a different opinion on the
| data lifecycle. So it was old-fashioned cleanup jobs that were
| scheduled and triggered when a byzantine set of conditions were
| met. Sometimes they were never met - cue bill spike.
|
| Thankfully, all the new data privacy & protection regulations are
| a _life-saver_. Now, we can blindly delete all associated data
| when a customer off-boards or trial expires or when data is no
| longer used for its original purpose. Just tell the intransigent
| PMs that we are strictly following govt regulations.
| candiddevmike wrote:
| Are you multi-cloud because your customers need you to be
| multi-cloud?
| CydeWeys wrote:
| The data protection regulations really are so freeing, huh.
| It's amazing to be able to delete all this stuff without
| worrying about having to keep it forever.
| jeff_vader wrote:
| In the case of my previous employer it led to an incredibly
| complicated encryption system. It took a couple of years to
| implement in maybe 10% of the system. Deleting any old data
| was rejected.
| stingraycharles wrote:
| How is encryption compliant? I've implemented GDPR data
| infrastructures twice now, and as far as I'm aware, the
| only way to be compliant with encryption is when you throw
| the decryption key away.
| aeyes wrote:
| Sometimes it might be a single field in a 1MB nested
| structure that you have to remove. So it gets encrypted
| when the whole structure gets stored and when the field
| is to be deleted you just throw away the key instead of
| modifying the entire 1MB just to remove a few kB.
| dylan604 wrote:
| If you're comparing gov't regulations to delete data to
| saving a few KB, then I think you're looking at this
| wrong.
| spelunker wrote:
| As mentioned, encrypt something and throw away the key,
| often called "crypto shredding".
| stingraycharles wrote:
| Ahh I see, and that way you can quickly "remove" a whole
| lot of data by just removing the key, which makes for
| cheap operations, and/or more flexible workflow (you can
| periodically compact the database and remove entries for
| which you have no key).
|
| Is my understanding correct?
| hinkley wrote:
| I wonder sometimes if it would help if we collectively
| watched more anti-hoarding shows, in order to see how the
| consultants convince their customers they can get rid of
| stuff.
| mro_name wrote:
| humans spent their first 300k years as nomads - storing
| was just impossible and decrufting happened by itself when
| moving along.
|
| So maybe that's why we're not good at it yet.
| hinkley wrote:
| Being a renter definitely kept me lighter for a long
| time.
|
| When you have to box things up over and over you find
| that the physical and mental energy around keeping it
| aren't adding up. I wonder if migrating from cloud to
| cloud would simulate this experience.
| Bayart wrote:
| Being a renter just taught me to batch my $STUFF I/O to
| minimize read-writes to disk and maximize available low-
| latency space. ie. fill my bags to the brim with shit I
| didn't plan using whenever I'd go to my parents'.
| travisgriggs wrote:
| Two space garbage collector in action right there. Maybe
| all things software need a "move it or lose it" impetus.
| Features in apps, old data, you name it. If you've gotta
| keep transferring/translating it, it would definitely
| pare things down.
| whimsicalism wrote:
| now this is a spin i havent heard before.
| hvs wrote:
| You haven't heard it because it's not spin, it's from an
| engineer's point of view. That's not the view you hear in
| the news when it comes to these things.
| alisonkisk wrote:
| Eh, Retention and Deletion are both a pain for devs. Not
| having to care is the happy state.
| whimsicalism wrote:
| HN seems like an odd place to assume that people only
| hear about things from the news and aren't engineers
| themselves.
|
| i am a dev that has to deal with these regulations in my
| day to day. it is a pain, it is not freeing in any sense,
| and it makes my models worse.
|
| granted, i think there are good reasons for it, but it
| does not make my life easier for sure.
| jabroni_salad wrote:
| As a sysadmin I really wish you had. SO MANY problems have
| come to my desk because some dude 3 years ago did not
| consider retention or rotation and now I have to figure out
| what to do with a 4TB .txt that is apparently important.
| dylan604 wrote:
| Find out how important it is with a `mv 4TB.txt 4TB.old`
| type of thing. See how many people come screaming.
| briffle wrote:
| "You never know when you might need this info to debug"
| The developer says as their cronjob creates a 250MB csv
| file, and a few MB of debug logs per day, for the past
| few years. "Disk is cheap" they say.
|
| As a sysadmin, I hate that too.
| whimsicalism wrote:
| sometimes the data is just big...
| colechristensen wrote:
| Often a considerable portion of those logs are useless,
| trace level misclassified as info, kept for years for no
| reason.
|
| You should keep a minimal set of logs necessary for
| audit, logs for errors which are actually errors, and
| logs for things which happen unexpectedly.
|
| What people do keep are logs for everything which
| happens, almost all of which is never a surprise.
|
| One needs to go through logs periodically and purge the
| logging code for every kind of message which doesn't
| spark joy, I mean seem like it would ever be useful to
| know.
| whimsicalism wrote:
| sure, in a world where machine learning doesnt exist i
| would agree with you. for low level logs of things like
| "memory low, spawning a new container" i would also agree
| with you. not for user actions though (which is the topic
| closest to whats under discussion given what sort of data
| these regulations cover)
| theshrike79 wrote:
| Yep, having everything disappear at 2 months max is a life-
| saver.
|
| That "absolutely essential thing" isn't essential any more
| when there is a possible GDPR/CCPA violation with a
| significant fine just around the corner.
| koolba wrote:
| Just make sure you actually test your backups. Two months
| of unusable backups are just as useful as no backups.
| marcosdumay wrote:
| Well, you should have done this before GDPR too, but
| reminding people to test backups is never too late and
| never too often.
| StratusBen wrote:
| Disclosure: I'm Co-Founder and CEO of a cloud cost company
| named https://www.vantage.sh/ - I also used to be on the
| product management team at AWS and DigitalOcean.
|
| I'm not intentionally trying to shill but this is exactly why
| people choose to use Vantage. We give them a set of features
| for automating and understanding what they can do to manage and
| save on costs. We're also adding multi-cloud support (GCP is in
| early access, Azure is coming) to be a single pane of glass
| into cloud costs.
|
| If anyone needs help on this stuff, I really love it. We have a
| generous free tier and free trial. We also have a Slack
| community of ~400 people nerding out on cloud costs.
| imwillofficial wrote:
| I work on a team that computes bills; shoot me a Slack invite
| and perhaps I can offer insight.
| [deleted]
| cookiesboxcar wrote:
| I love vantage. Thank you for making it.
| samlambert wrote:
| Vantage is a seriously awesome product. We love it at
| PlanetScale. Obviously being a cloud product things can get
| pricy and so Vantage is essential.
| vdm wrote:
| https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fdocs.va...
|
| I gave vantage.sh 5 minutes and did not see anything for S3
| that is not already available from the built-in Cost
| Explorer, Storage Lens, Cost and Usage Reports, and taking 1
| hour to study the docs
| https://docs.aws.amazon.com/AmazonS3/latest/userguide/Bucket...
|
| Most "cloud optimisation" products want to tell you which EC2
| instance type to use, but can't actually give actionable
| advice for S3. Happy to be corrected on this.
| StratusBen wrote:
| We are in the process of updating the documentation because
| you're right that it needs more work. For the record, if
| you're doing everything on your own via Cost Explorer,
| Storage Lens and processing CUR you may be set. From what
| we hear, most folks do not want to deal with processing CUR
| (or even know what it is) and struggle with Cost Explorer.
|
| Vantage automates everything you just mentioned to allow
| you to make quicker decisions. Here's a screenshot of what
| we do on a S3 Bucket basis:
| https://s3.amazonaws.com/assets.vantage.sh/www/s3_example.pn...
|
| We'll profile storage classes and the number of objects,
| tell you the exact cost of turning on things like
| intelligent tiering and how much that will cost with
| specific potential savings. This is all done out of the
| box, automatically - and we profile List/Describe APIs to
| always discover newly created S3 Buckets.
|
| From speaking with hundreds of customers, I can also assure
| you that at a certain scale, billing does not take an
| hour...there are entire teams built around this at larger
| companies.
| simonw wrote:
| Saving people from learning how to use Cost Explorer,
| Storage Lens, Cost and Usage Reports - and then taking 1
| hour to study documentation - sounds to me like a
| legitimate market opportunity.
| alar44 wrote:
| Not really. Sometimes you actually have to understand
| things. If you're so concerned about your billing,
| someone on your team should probably invest a freaking
| hour to understand it. If that can't happen, you are just
| setting yourself up for failure.
| Jgrubb wrote:
| I've been learning the ins and outs of the major 3
| providers' cloud billing setups for the last year, and I'm
| just getting started. This is not a 1 hour job, but
| you're right that someone in your team needs to
| understand it.
| llbeansandrice wrote:
| At my last job we had a team spend an entire quarter just
| to help visualize and properly track all of our AWS
| expenditures. It's a huge job.
| beberlei wrote:
| No wonder there is talk about a software developer
| shortage when it seems a good amount of them work on this
| kind of nonsense. Talk about bs jobs.
| banku_brougham wrote:
| It's a lot more than an hour, in my experience.
| jopsen wrote:
| One of the biggest pains is that cloud services rarely mention
| what they don't do.
|
| I think it's really sad, because when I don't see docs clearly
| stating the limits, I assume the worst and avoid the service.
| pattycake23 wrote:
| Here's an article about Shopify running into the S3 prefix rate
| limit too many times, and tackling it:
| https://shopify.engineering/future-proofing-our-cloud-storag...
| sciurus wrote:
| Their solution was to introduce entropy into the beginning of
| the object names, which used to be AWS's recommendation for how
| to ensure objects are placed in different partitions. AWS
| claims this is no longer necessary, although how their new
| design actually handles partitioning is opaque.
|
| "This S3 request rate performance increase removes any previous
| guidance to randomize object prefixes to achieve faster
| performance. That means you can now use logical or sequential
| naming patterns in S3 object naming without any performance
| implications."
|
| https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3...
| pattycake23 wrote:
| Seems like it's a much higher rate limit, but it exists
| nonetheless, and Shopify's scale has also grown significantly
| since 2018 (when that article was written) - so it was
| probably a valid way for them to go.
| sciurus wrote:
| I think two things happened that are covered in that blog
| post
|
| 1) The performance per partition increased
|
| 2) The way AWS created partitions changed
|
| When I was at Mozilla, one thing I worked on was Firefox's
| crash reporting system. It's S3 storage backend wrote raw
| crash data with the key in the format
| `{prefix}/v2/{name_of_thing}/{entropy}/{date}/{id}`. If I
| remember correctly, we considered this a limitation since
| the entropy was so far down in the key. However, when we
| talked to AWS Support they told us their was no longer a
| need to have the entropy early on; effectively S3 would
| "figure it out" and partition as needed.
|
| EDIT: https://news.ycombinator.com/item?id=30373375 is a
| good related comment.
| ebingdom wrote:
| I'm confused about prefixes and sharding:
|
| > The files are stored on a physical drive somewhere and indexed
| someplace else by the entire string app/events/ - called the
| prefix. The / character is really just a rendered delimiter. You
| can actually specify whatever you want to be the delimiter for
| list/scan apis.
|
| > Anyway, under the hood, these prefixes are used to shard and
| partition data in S3 buckets across whatever wires and metal
| boxes in physical data centers. This is important because prefix
| design impacts performance in large scale high volume read and
| write applications.
|
| If the delimiter is not set at bucket creation time, but rather
| can be specified whenever you do a list query, how can the prefix
| be used to influence where objects are physically stored? Doesn't
| the prefix depend on what delimiter you use? How can the sharding
| logic know what the prefix is if it doesn't know the delimiter in
| advance?
|
| For example, if I have a path like
| `app/events/login-123123.json`, how does S3 know the prefix is
| `app/events/` without knowing that I'm going to use `/` as the
| delimiter?
| twistedpair wrote:
| This is where GCP's GCS (Google Cloud Storage) shines.
|
| You don't need to mess with prefixing all your files. They auto
| level the cluster for you [1].
|
| [1] https://cloud.google.com/storage/docs/request-rate#redistrib...
| kristjansson wrote:
| S3 does too, now.
|
| https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3...
| inopinatus wrote:
| There's no delimiter. There is only the appearance of a
| delimiter, to appease folks who think S3 is a filesystem, and
| fool them into thinking they're looking at folders.
|
| The object name is the entire label, and every character is
| equally significant for storage. When listing objects, a prefix
| filters the list. That's all. However, S3 also uses substrings
| to partition the bucket for scale. Since they're anchored at
| the start, they're also called prefixes.
|
| In my view, it's best to think of S3's object indexing as a
| radix tree.
|
| This article, as if you couldn't guess from the content, is
| written from a position of scant knowledge of S3, not
| surprising it misrepresents the details.
| charcircuit wrote:
| >There's no delimiter.
|
| What's the delimiter parameter for then?
|
| https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObje...
| inopinatus wrote:
| To help you fool yourself. It affects how object list
| results are presented in the api response.
| ec109685 wrote:
| They can't present a directory abstraction for list
| operations without a delimiter. E.g. CommonPrefixes.
| throwhauser wrote:
| "To help you fool yourself" seems like a euphemism for
| "to fool you". It's gotta be tough to go from "scant
| knowledge of S3" to genuine knowledge if the
| documentation is doing this to you.
|
| If the docs are misrepresenting the details, who can
| blame the author of the post?
| inopinatus wrote:
| The documentation is very clear on the purpose of the
| delimiter parameter.
|
| The OP does not read the docs, makes bad assumptions
| repeatedly throughout, and then reaps the consequences.
| nightpool wrote:
| To provide a consistent API response as part of the
| ListObjects call. It has nothing to do with the storage on
| disk.
| ebingdom wrote:
| So if I have a bunch of objects whose names are hashes like
| 2df6ad6ca44d06566cffde51155e82ad0947c736 that I expect to
| access randomly, is there any performance benefit to
| introducing artificial delimiters like
| 2d/f6/ad6ca44d06566cffde51155e82ad0947c736? I've seen this
| used in some places.
| jstarfish wrote:
| I don't know what impact that partitioning pattern has on
| s3, but it has some obvious benefits if your app needs to
| revert to write to a normal filesystem instead (like for
| testing).
| dale_glass wrote:
| To AWS S3, '/' isn't a delimiter, it's a character that's
| part of the filename.
|
| So for instance "/foo/bar.txt" and "/foo//bar.txt" are
| different files in S3, even though they'd be the same file
| in a filesystem.
|
| This gets pretty fun if you want to mirror a S3 structure
| on-disk, because the above suddenly causes a collision.
| elcomet wrote:
| No difference other than readability. And Amazon may
| partition your data under another prefix anyway,
| like "2d/f6/ad6c"
| korostelevm wrote:
| AWS does the optimizations over time based on access patterns
| for the data. Should have made that clearer in the article.
|
| The problem becomes unusual burst load - usually from
| infrequent analytics jobs. The indexing can't respond fast
| enough.
| ebingdom wrote:
| Thanks for the clarification. But now I'm confused about the
| limits:
|
| > 3,500 PUT/COPY/POST/DELETE requests per second per prefix
|
| > 5,500 GET/HEAD requests per second per prefix
|
| Most of those APIs don't even take a delimiter. So for these
| limits, does the prefix get inferred based on whatever
| delimiter you've used for previous list requests? What if
| you've used multiple delimiters in the past?
|
| Basically what I'm trying to determine is whether these
| limits actually mean something concrete (that I can use for
| capacity planning etc.), or whether their behavior depends on
| heuristics that S3 uses under the hood.
|
| I'm fine with S3 optimizing things under the hood based on
| my access patterns, but not if it means I can't reason about
| these limits as an outsider.
| ec109685 wrote:
| Delimiter isn't used for writes, only list operations.
|
| S3 simply looks at the common string prefixes in your
| object names and uses that to internally shard objects, so
| you can achieve a multiple of those request limits.
|
| aaa122348
|
| aaa484585
|
| bbb484858
|
| bbb474827
|
| Would have same performance as:
|
| aaa/122348
|
| aaa/484585
|
| bbb/484858
|
| bbb/474827
| Macha wrote:
| S3 does a lot of under the hood optimisation. e.g. Create a
| brand new bucket, leave it cold for a while, and start
| throwing 100 PUT requests a second at it. This is way less
| than the advertised 3500, but they'll have scaled the
| allocated resources down so much you'll get some
| TooManyRequests errors.
| acdha wrote:
| Those are what I would assume for performance when the
| system is stable. The concerns come from bursty behaviour
| -- for example, if you put something new into production
| you might have a period of time while S3 is adjusting
| behind the scenes where you'll get transient errors from
| some operations before it stabilizes (these have almost
| always been resolved by retry in my experience). This is
| reportedly something your AWS TAM can help with if you know
| in advance that you're going to need to handle a ton of
| traffic and have an idea of what the prefix distribution
| will be like -- apparently the S3 support team can optimize
| the partitioning for you in preparation.
| xyzzy_plugh wrote:
| The prefix isn't delimited, it's an arbitrary length based on
| access patterns.
|
| A fictitious example which is close to reality:
|
| In parallel, you write a million objects each to:
|
|     tomato/red/...
|     tomato/green/...
|     tomatoes/colors/...
|
| The shortest prefixes that evenly divide writes are thus:
|
|     tomato/r
|     tomato/g
|     tomatoes
|
| If you had an existing access pattern of evenly writing to:
|
|     tomatoes/colors/...
|     bananas/...
|
| the shortest prefixes would be `t` and `b`.
|
| So suddenly writing 3 million objects that begin with a t would
| cause an uneven load or hotspot on the backing shards. The
| system realizes your new access pattern and determines new
| prefixes and moves data around to accommodate what it thinks
| your needs are.
|
| --
|
| The delimiter is just a wildcard option. The system is just a
| key-value store, essentially. Specifying a delimiter tells the
| system to transform a delimiter at the end of a list query like
|
|     my/path/
|
| into a pattern match like
|
|     my/path/[^/]+/?
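|
| Concretely, in boto3 terms (the bucket name is a placeholder):
|
|     import boto3
|
|     s3 = boto3.client("s3")
|     resp = s3.list_objects_v2(
|         Bucket="my-bucket",
|         Prefix="my/path/",
|         Delimiter="/",
|     )
|     # Objects "directly under" my/path/:
|     for obj in resp.get("Contents", []):
|         print(obj["Key"])
|     # One entry per distinct next path segment ("subfolder"):
|     for cp in resp.get("CommonPrefixes", []):
|         print(cp["Prefix"])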
| stepchowfun wrote:
| Thank you! This is the first explanation that I think fully
| explains what I was confused about. So essentially the prefix
| is just the first N bytes of the object's name, where N is a
| per-bucket number that S3 automatically decides and adjusts
| for you. And it has nothing to do with delimiters.
|
| I find the S3 documentation and API to be really confusing
| about this. For example, when listing objects, you get to
| specify a "prefix". But this seems to be not directly related
| to the automatically-determined prefix length based on your
| access patterns. And [1] says things like "There are no
| limits to the number of prefixes in a bucket.", which makes
| no sense to me given that the prefix length is something that
| S3 decides under the hood for you. Like, how do you even know
| how many prefixes your bucket has?
|
| [1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...
| inopinatus wrote:
| It is related, in the sense both "prefixes" are a substring
| match anchored at the start of the object name. They're
| just not the same mechanism.
| xyzzy_plugh wrote:
| The sharding key is an implementation detail, so you're not
| supposed to care about it too much.
| kristjansson wrote:
| That's true now. Used to be the case that they'd
| recommend random or high-entropy parts of the keys go at
| the beginning to avoid overloading a shard as you
| described above.
|
| From [0]:
|
| > This S3 request rate performance increase removes any
| previous guidance to randomize object prefixes to achieve
| faster performance. That means you can now use logical or
| sequential naming patterns in S3 object naming without
| any performance implications. This improvement is now
| available in all AWS Regions. For more information, visit
| the Amazon S3 Developer Guide.
|
| [0]: https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3...
| wodenokoto wrote:
| I've never been in this situation, but I do wish you could query
| files with more advanced filters on these blob storage services.
|
| - But why SageMaker?
|
| - Why do some orgs choose to put almost everything in 1 bucket?
| tyingq wrote:
| >Why do some orgs choose to put almost everything in 1 buckets?
|
| The article seems to be making the case it's because the
| delimiter makes it seem like there's a real hierarchy. So the
| ramifications of /bucket/1 /bucket/2 versus /bucket1/ /bucket2/
| aren't well known until it's too late.
| charcircuit wrote:
| >So the ramifications of /bucket/1 /bucket/2 versus /bucket1/
| /bucket2/ aren't well known until it's too late.
|
| What's the difference?
| musingsole wrote:
| In the choice between a single bucket with hierarchical
| paths versus multiple buckets, there's a long list of
| nuances between either strategy.
|
| For the purposes of this article, you can probably have
| more intuitive, sensible lifecycle policies across multiple
| buckets than you can trying to set policies on specific
| paths within a single bucket. Something like
| "ShortLifeBucket" and "LongLifeBucket" would allow you to
| have items with similar prefixes (something like a
| "{bucket}/anApplication/file1.csv" in each bucket) that
| then have different lifecycle policies
| liveoneggs wrote:
| 1 athena?
|
| 2 some jobs make a lot of data
| korostelevm wrote:
| For many at orgs like this, SageMaker is probably the shortest
| path to an insane amount of compute with a python terminal.
|
| Why single bucket? Once someone refers to a bucket as "the"
| bucket - it is how it will forever be.
| akdor1154 wrote:
| > But why SageMaker?
|
| You could ask the same thing of most times it gets used for ML
| stuff as well.
|
| > Why do some orgs choose to put almost everything in 1
| buckets?
|
| Anecdote: ours does because we paid (Multinational Consulting
| Co)(tm) a couple of million to design our infra for us, and
| that's what the result was.
| liveoneggs wrote:
| I have caused billing spikes like this before those little
| warnings were invented and it was always a dark day. They are
| really a life saver.
|
| Lifecycle rules are also welcome. Writing them yourself was
| always a pain and tended to be expensive, with list operations
| running up the API-call bill.
|
| ----
|
| Once I supported an app that dumped small objects into s3 and
| begged the dev team to store the small objects in oracle as
| BLOBs, to be concatenated into normal-sized s3 objects after a
| reasonable timeout when no new small objects would reasonably
| be created. They refused (of course) and the bills for managing
| a bucket with millions and millions of tiny objects were just
| what you expect.
|
| I then went for a compromise solution asking if we could stitch
| the small objects together after a period of time so they would
| be eligible for things like infrequent access or glacier but,
| alas, "dev time is expensive you know" so N figure s3 bills
| continue as far as I know.
| sharken wrote:
| I suppose it's not just dev time on the line, but also that
| the risk of making the change is thought to be too high.
|
| If I ever get to be a manager I'd go for an idea such as yours.
| Though I suspect too many managers are too far removed from the
| technical aspect of things and don't listen nearly enough.
| vdm wrote:
| The warning should say "you have N million objects technically
| eligible for an archive storage class and hitting the button to
| transition them will cost $M".
|
| Also S3 should no-op transitions for objects smaller than the
| break-even size for each storage class, even if you ask it to.
| darkwater wrote:
| > I then went for a compromise solution asking if we could
| stitch the small objects together after a period of time so
| they would be eligible for things like infrequent access or
| glacier but, alas, "dev time is expensive you know" so N figure
| s3 bills continue as far as I know.
|
| This hits home so hard that it hurts. In my case it's not S3
| but compute bills, but the core concept is the same.
| WrtCdEvrydy wrote:
| Because the bill isn't a "dev problem". Once you move those
| bills to "devops", it becomes an infrastructure problem.
| zrail wrote:
| A big chunk of responsibility for teams doing cloud devops
| is cost attribution. Cloud costs are incurred by services
| and those services are owned by teams. Those teams should
| be billed for their costs and encouraged (via spiffs or the
| perf process if necessary) to manage them. Devops' job is
| to build the tooling that allows that to happen.
| [deleted]
| StratusBen wrote:
| On this topic, it's always surprising to me how few people even
| seem to know about different storage classes on S3...or even
| intelligent tiering (which I know carries a cost to it, but
| allows AWS to manage some of this on your behalf which can be
| helpful for certain use-cases and teams).
|
| We did an analysis of S3 storage levels by profiling 25,000
| random S3 buckets a while back for a comparison of Amazon S3 and
| R2* and nearly 70% of storage in S3 was StandardStorage which
| just seems crazy high to me.
|
| * https://www.vantage.sh/blog/the-opportunity-for-cloudflare-r...
| blurker wrote:
| I think that it's not just people not knowing about the
| lifecycle feature, but also that when they start putting data
| into a bucket they don't know what the lifecycle should be yet.
| Honestly I think overdoing lifecycle policies is a potentially
| bigger foot gun than not setting them. If you misuse glacier
| storage that will really cost you big $$$ quickly! And who
| wants to be the dev who deleted a bunch of data they shouldn't
| have?
|
| Lifecycle policies are simple in concept, but it's actually not
| simple to decide what they should be in many cases.
| rizkeyz wrote:
| I did the back-of-the-envelope math once. You get a Petabyte of
| storage today for $60K/year if you buy the hardware (retail
| disks, server, energy). It actually fits into the corner of a
| room. What do you get for $60K in AWS S3? Maybe a PB for 3 months
| (w/o egress).
|
| If you replace all your hardware every year, the cloud is 4x
| more expensive. If you manage to use your ghetto-cloud for 5
| years, you are 20x cheaper than Amazon.
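|
| The S3 side of that envelope, spelled out (assuming the
| public ~$0.023/GB-month Standard list price; real bills vary
| by tier, region, and request charges):
|
|     pb_in_gb = 1_000_000
|     s3_monthly = pb_in_gb * 0.023    # ~$23,000/month for 1 PB
|     print(60_000 / s3_monthly)       # ~2.6 months per $60K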
|
| To store one TB per person on this planet in 2022, it would take
| a mere $500M to do that. That's small change for a slightly
| bigger company these days.
|
| I guess by 2030 we should be able to record everything a human
| says, sees, hears and speaks in an entire life for every human on
| this planet.
|
| And by 2040 we should be able to have machines learning all
| about human life, expression and intelligence, slowly making
| sense of all of this.
| zitterbewegung wrote:
| I was at a presentation where HERE technologies told us that they
| went from being on the top ten (or top five) S3 users (by data
| stored) to getting off of that list. This was seen as a big deal
| obviously.
| harshaw wrote:
| AWS Budgets is one tool for cost containment (among other
| external services).
| 0x002A wrote:
| Each time a developer does something on a cloud platform, that
| moment the platform might start to profit for two reasons: vendor
| lock-in and accrued costs in the long term regardless of the unit
| cost.
|
| Anything limitless/easiest has a higher hidden cost attached.
| Tehchops wrote:
| We've got data in S3 buckets at nowhere near that scale, and
| managing them, god forbid trying a mass delete, is absolute
| tedium.
| amelius wrote:
| Mass delete also takes an eternity on my Linux desktop machine.
|
| The filesystem is hierarchical, but the delete operation still
| needs to visit all the leaves.
| amelius wrote:
| (This is a good example where Garbage Collection wins over
| schemes which track reference explicitly, like reference
| counting. A garbage collector can just throw away the
| reference, while other schemes need to visit every leaf
| resulting in hours of deletion time in some cases.)
| sokoloff wrote:
| Is S3 actually hierarchical? I always took the mental model
| that the S3 object namespace within a bucket was flat and the
| treatment of '/' as different was only a convenient fiction
| presented in the tooling, which is consistent with the claim
| in this article.
| cle wrote:
| This is mostly correct, with the additional feature that S3
| can efficiently list objects by "key prefix" which helps
| preserve the illusion.
| sokoloff wrote:
| Followup question: Is there something special about the
| PRE notations in the example output below? I can list
| objects by _any_ textual prefix, but I can't tell if the
| PRE (what we think of as folders) is more efficient than
| just the substring prefix.
|
| Full bucket list, then two text prefixes, then an (empty)
| folder list:
|
|     sokoloff@ Downloads % aws s3 ls s3://foo-asdf
|                                PRE bar-folder/
|                                PRE baz-folder/
|     2022-02-17 09:25:38          0 bar-file-1.txt
|     2022-02-17 09:25:42          0 bar-file-2.txt
|     2022-02-17 09:25:57          0 baz-file-1.txt
|     2022-02-17 09:25:49          0 baz-file-2.txt
|     sokoloff@ Downloads % aws s3 ls s3://foo-asdf/ba
|                                PRE bar-folder/
|                                PRE baz-folder/
|     2022-02-17 09:25:38          0 bar-file-1.txt
|     2022-02-17 09:25:42          0 bar-file-2.txt
|     2022-02-17 09:25:57          0 baz-file-1.txt
|     2022-02-17 09:25:49          0 baz-file-2.txt
|     sokoloff@ Downloads % aws s3 ls s3://foo-asdf/bar
|                                PRE bar-folder/
|     2022-02-17 09:25:38          0 bar-file-1.txt
|     2022-02-17 09:25:42          0 bar-file-2.txt
|     sokoloff@ Downloads % aws s3 ls s3://foo-asdf/bar-folder
|                                PRE bar-folder/
| jsmith45 wrote:
| Umm... that output seems confusing.
|
| The ListObjects api will omit all objects that share a
| prefix that ends in the delimiter, and instead put said
| prefix into the CommonPrefix element, which would be
| reflected as PRE lines. (So with a delimiter of '/', it
| basically hides objects in "subfolders", but lists any
| subfolders that match your partial text in the
| CommonPrefix element).
|
| By default `aws s3 ls` will not show any objects within a
| CommonPrefix but simply shows a PRE line for them. The
| cli does not let you specify a delimiter, it always uses
| '/'. To actually list all objects you need to use
| `--recursive`.
|
| The output there would suggest that bucket really did
| have object names that began with `bar-folder/`, and that
| last line did not list them out because you did not
| include the trailing slash. Without the trailing slash it
| was just listing objects and CommonPrefixes that match
| the string you specified after the last delimiter in your
| url. Since only that one common prefix matched, only it
| was printed.
| jrochkind1 wrote:
| I don't understand the answer to that question either.
| Other AWS docs say you can choose whatever you want for
| a delimiter; there's nothing special about `/`. So how
| does that apply to what they say about performance and
| "prefixes"?
|
| Here is some AWS documentation on it:
|
| https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...
|
| > For example, your application can achieve at least
| 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per
| second per prefix in a bucket. There are no limits to the
| number of prefixes in a bucket. You can increase your
| read or write performance by using parallelization. For
| example, if you create 10 prefixes in an Amazon S3 bucket
| to parallelize reads, you could scale your read
| performance to 55,000 read requests per second.
|
| Related to your question, even if we just stick to `/`
| because it seems safer, does that mean that
| "foo/bar/baz/1/" and "foo/bar/baz/2/" are two prefixes
| for the point of these request speed limits? Or does the
| "prefix" stop at the first "/" and files with these
| keypaths are both in the same "prefix" "foo/"?
|
| Note there was (according to docs) a change a couple
| years ago that I think some people haven't caught on to:
|
| > For example, previously Amazon S3 performance
| guidelines recommended randomizing prefix naming with
| hashed characters to optimize performance for frequent
| data retrievals. You no longer have to randomize prefix
| naming for performance, and can use sequential date-based
| naming for your prefixes.
| the8472 wrote:
| Most recursive deletion routines are not optimized for speed.
| This could be done much faster with multiple threads or
| batching the calls via io_uring.
|
| Another option are LVM or btrfs subvolumes which can be
| discarded without recursive traversal.
| res0nat0r wrote:
| Use delete-objects instead and it will be much faster, as
| you can supply up to 1000 keys to remove in a single API
| call.
|
| https://awscli.amazonaws.com/v2/documentation/api/latest/ref...
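|
| Roughly (a boto3 sketch; bucket and prefix are placeholders):
|
|     import boto3
|
|     s3 = boto3.client("s3")
|     paginator = s3.get_paginator("list_objects_v2")
|     for page in paginator.paginate(Bucket="my-bucket",
|                                    Prefix="tmp/"):
|         keys = [{"Key": o["Key"]}
|                 for o in page.get("Contents", [])]
|         if keys:
|             # A list page holds at most 1000 keys, which is
|             # also the per-call limit for delete_objects.
|             s3.delete_objects(Bucket="my-bucket",
|                               Delete={"Objects": keys})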
| deepsun wrote:
| I believe it's mostly a problem of latency between your
| machine and S3. Since each Delete call is issued separately
| in its own HTTP connection.
|
| 1. Try parallelization of your calls. Deleting 20 objects in
| parallel should take the same time as deleting 1.
|
| 2. Try to run deletion from an AWS machine in the same region
| as the S3 bucket (yes buckets are regional, only their names
| are global). Within-datacenter latency should be lower than
| between your machine and datacenter.
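|
| Point 1 in sketch form (boto3 clients are thread-safe; the
| bucket and keys here are placeholders):
|
|     from concurrent.futures import ThreadPoolExecutor
|     import boto3
|
|     s3 = boto3.client("s3")
|
|     def delete_one(key):
|         s3.delete_object(Bucket="my-bucket", Key=key)
|
|     keys = [f"tmp/obj-{i}" for i in range(10_000)]
|     # ~20 requests in flight hides most of the per-call
|     # network latency.
|     with ThreadPoolExecutor(max_workers=20) as pool:
|         list(pool.map(delete_one, keys))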
| anderiv wrote:
| Set a lifecycle rule to delete your objects. Come back a day
| later and AWS will have taken care of this for you.
| ramraj07 wrote:
| The issue is this isn't free. I played around and ended up
| with a few-hundred-million-object S3 bucket on a personal
| project and am trying to get rid of it without getting a
| bill. Seriously considering just getting suspended from aws
| if that's a viable path lol.
| anderiv wrote:
| "You are not charged for expiration or the storage time
| associated with an object that has expired."
|
| From: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecy...
| orf wrote:
| Lifecycle rules are free. Use them to empty the bucket.
| lnwlebjel wrote:
| Very true: it took me about a month of emptying, deleting and
| life cycling about a dozen buckets of about 20 TB (~20 million
| objects) to get to zero.
| vdm wrote:
| DeleteObjects takes 1000 keys per call.
|
| Lifecycle rules can filter by min/max object size. (since Nov
| 2021)
| electroly wrote:
| Thank you for mentioning that lifecycle rule change. I must
| have missed the announcement; that is exactly the functionality
| I needed.
| vdm wrote:
| Athena supports regexp_like(). By loading in an S3 inventory
| this can match what a wildcard would. Then a Batch Operations
| job can tag the result.
|
| Not easy, but is possible and effective.
| asim wrote:
| The AWS horror stories never cease to amaze me. It's like we're
| banging our heads against the wall expecting a different outcome
| each time. What's more frustrating, the AWS zealots are quite
| happy to tell you how you're doing it wrong. It's the user's
| fault for misusing the service. The reality is, AWS was built
| for a specific purpose and demographic of user. Its complexity
| and scale now make it unusable for newer devs. I'd argue we
| need a completely new experience for the next generation.
| jiggawatts wrote:
| My theory is that single-platform clouds actually make more
| sense than trying to be everything for everyone. While the
| latter can scale to $billions, the former might actually have
| higher margins because it delivers more value.
|
| An example might be something like a Kubernetes-only cloud
| driven entirely by Git-ops. Not TFVC, or CVS, or Docker Swarm,
| or some hybrid of a proprietary cloud and K8s. Literally just a
| Git repo that materialises Helm charts onto fully managed K8s
| clusters. That's it.
|
| If you try to do anything similar in, say, Azure, you'll
| discover that:
|
| Their DevOps pipelines are managed by a completely separate
| product group and don't natively integrate into the platform.
|
| You now have K8s labels and Azure tags.
|
| You now have K8s logging and Azure logging.
|
| You now have K8s namespaces and Azure resource groups.
|
| You now have K8s IAM and Azure IAM.
|
| You now have K8s storage and Azure disks.
|
| Just that kind of duplication of concepts _alone_ can take this
| one system's complexity to a level where it's impossible for a
| pure software development team to use without having a
| dedicated DevOps person!
|
| Azure App Service or AWS Elastic Beanstalk are similarly overly
| complex, having to bend over backwards to support scenarios
| like "private network integration". Yeah, that's what
| developers _want_ to do, carve up subnets and faff around with
| routing rules! /s
|
| For example, if you deploy a pre-compiled web app to App
| Service, it'll... compile it again. For compatibility with a
| framework you aren't using! You need a poorly documented
| environment variable flag to work around this. There's like a
| _dozen_ more issues like this, and they're stacking up fast.
|
| Developers just want a platform they can push code to and have
| it run with high availability and disaster recovery provided
| as-if turning on a tap.
| rmbyrro wrote:
| What do you see missing or not well explained in AWS
| documentation that newer devs wouldn't understand?
|
| I started using S3 early in my career and didn't see this
| problem. I always thought about data retention during the
| design phase.
|
| My opinion is that lazy, careless, or time-pressured
| developers will not, and will then get bitten. But that
| would happen with any tool. Maybe a different problem, but
| they'll always get bitten hard ...
| 015a wrote:
| I agree 110%.
|
| Actually, I disagree with one statement: "AWS was built for a
| specific purpose and demographic of user". AWS wasn't built for
| anyone. It was built for everyone, and is thus even reasonably
| productive for no one. AWS's entire product development
| methodology is "customer asks for this, build it"; there's no
| high level design, very few opinions, five different services
| can be deployed to do the same thing, it's absolute madness and
| getting worse every year. Azure's methodology is "copy whatever
| AWS is doing" (source: engineers inside Azure), so they inherit
| the same issues, which makes sense for Microsoft because
| they've always been an organization gone mad.
|
| If there's one guiding light for Big Cloud, its: they're built
| to be sold to people who buy cloud resources. I don't even feel
| this is entirely accurate, given that this demographic of
| purchaser should _at least_ , if nothing else, be considerate
| of the cost, and there's zero chance of Big Cloud winning that
| comparison without deceit, but if there was a demographic
| that's who it'd be.
|
| > I'd argue, we need a completely new experience for the next
| generation.
|
| Fortunately, the world is not all Big Cloud. The work
| Cloudflare is doing between Workers & Pages represents a
| _really_ cool and productive application environment. Netlify
| is another. Products like Supabase do a cool job of vendoring
| open source tech with traditional SaaS ease-of-use, with fair
| billing. DigitalOcean is also becoming big in the "easy cloud"
| space, between Apps, their hosted databases, etc. Heroku still
| exists (though I feel they've done a very poor job of
| innovating recently, especially in the cost department).
|
| The challenge really isn't in the lack of next-gen PaaS-like
| platforms; it's in countering the hypnosis puked out by Big
| Cloud Sales that they're the only "secure" "reliable"
| "whatever" option. This hypnosis has infected tons of otherwise
| very smart leaders. You ask these people "lets say we are at
| four nines now; how much are you willing to pay, per month, to
| reach five nines? and remember Jim, four-nines is one hour of
| downtime a year." No one can answer that. No one.
|
| End point being: anyone who thinks Big Cloud will reign supreme
| forever hasn't studied history. Enterprise contracts make it
| impossible for them to clean the cobwebs from their closets.
| They will eventually become the next Oracle or IBM, and the
| cycle repeats. It's not an argument to always run your own
| infra or whatever; but it _is_ an argument to lean on and
| support open source.
| jiggawatts wrote:
| > Azure's methodology is "copy whatever AWS is doing"
| (source: engineers inside Azure), so they inherit the same
| issues, which makes sense for Microsoft because they've
| always been an organization gone mad.
|
| I guessed this, but it's funny to see it confirmed.
|
| I got suspicious when I realised Azure has many of the same
| bugs and limitations as AWS despite being supposedly
| completely different / independent.
| inopinatus wrote:
| That's just it, though: it isn't an AWS horror story. It's the
| sorcerer's apprentice.
| deanCommie wrote:
| HackerNews loves to criticize the cloud. It always reminds me
| of this infamous Dropbox comment:
| https://news.ycombinator.com/item?id=9224
|
| The cloud abstracts SO MUCH complexity from the user. The fact
| that people are then gleefully taking these "simple" services
| and overloading them with way too much data, and way too much
| complexity on top is not a failure of the underlying
| primitives, but a success.
|
| Without these cloud primitives, the people footgunning
| themselves with massive bills would just not have a working
| solution AT ALL.
| jiggawatts wrote:
| > The people footgunning themselves with massive bills would
| just not have a working solution AT ALL.
|
| Sometimes guard rails are a good thing, and the AWS
| philosophy has _very firmly_ been against guard rails,
| especially related to spending. The issue has come up here
| again and again that AWS refuses to add cost limits, even
| though they are capable of it. Azure _copied_ this
| limitation. I don 't mean that they didn't implement cost
| limits. They _did!_ The Visual Studio subscriber accounts
| have cost limits. I mean that they refused to allow anyone to
| use this feature in PayG accounts.
|
| Let me give you a practical example: If I host a blog on some
| piece of tin with a wire coming out of it, my $/month is not
| just predictable, but _constant_. There 's a cap on the
| outbound bandwidth, and a cap on compute spending. If my blog
| goes viral, it'll slow down to molasses, but my bank account
| will remain unmolested. If a DDoS hits it, it'll go down...
| and then come back up when the script kiddie gets bored and
| moves on.
|
| Hosting something like this on even the _most efficient_
| cloud-native architecture possible, such as a static site on
| an S3 bucket or Azure Storage Account is _wildly dangerous_.
| There is literally nothing I can do to stop the haemorrhaging
| if the site goes popular.
|
| Oh... set up some triggers or something... you're about to
| say, right? The billing portal has a _multi-day_ delay on it!
| You can bleed $10K per _hour_ and not have a clue that this
| is going on.
|
| And even if you know... then what? There's no "off" button!
| Seriously, try looking for "stop" buttons on anything that's
| not a VM in the public cloud. S3 buckets and Storage Accounts
| certainly don't have anything like that. At best, you could
| implement a firewall rule or something, but each and every
| service has a unique and special way of implementing a "stop
| bleeding!" button.
|
| I don't have time for this, and I can't wear the risk.
|
| This is why the cloud -- as it is right now -- is just too
| dangerous for most people. The abstractions it provides
| aren't just leaky; the holes have razor-sharp edges that have
| cut the hands of many people who think it works just
| like on-prem but simpler.
| jollybean wrote:
| In this case it is absolutely the user 'doing it wrong'.
|
| AWS allows you to store gigantic amounts of data, thus lowering
| the bar dramatically for the kinds of things that we will keep.
|
| This invariably creates a different kind of problem when those
| thresholds are met.
|
| In this case, you have 'so much data you don't know what to do
| with it'.
|
| Akin to having 'really cheap warehouse storage space' that just
| gets filled up.
|
| "Its complexity and scale now make it unusable for newer
| devs."
|
| No - the 'complexity' bit is a bit of a problem, but not the
| scale.
|
| The 'complexity bit' can be overcome if you stick to some very
| basic things like running EC2 instances and very basic security
| configs. Beyond that, yes, it's hard. But the 'equivalent' of
| having your own infra would be simply to have a bunch of EC2
| instances on AWS and 'that's it' - and that's essentially
| achievable without much fuss. That's always an option for small
| companies, i.e. 'just run some instances' and don't touch
| anything else.
| dasil003 wrote:
| I'm not sure any sizable group is banging their head against a
| wall. Yes, AWS is complex. Yes, AWS has cost foot guns. These
| are natural outcomes of removing friction from scaling.
|
| Sure we could start with something simpler, but as you may have
| noticed, even the more basic hosting providers like
| DigitalOcean and Linode have been adding S3-compatible object
| storage because of its proven utility.
|
| In terms of making something meaningfully simpler, I think
| Heroku was the high water mark. But even though it was a great
| developer experience, the price/performance barriers were a lot
| more intractable than dealing with AWS.
| WaxProlix wrote:
| Heroku did so much right. I was recently toying with some bot
| frameworks (think Discord or IRC, nothing spammy or review-
| gaming), and getting everything set up on a free-tier dyno
| with free managed SQL backing it up, plus a GitHub test/build
| integration, all took an hour or so. Really exceeded my
| expectations.
|
| Not sure how it scales for production loads but my experience
| was so positive I'll probably go back for future projects.
| greiskul wrote:
| Yeah, heroku is absolutely the best in just getting
| something running. Truth is, most projects don't ever have
| to scale, either because they are hobby projects or because
| they just fail. Heroku is the simplest platform I know of
| for quickly testing something. If you do find a good
| market fit and then need to scale, then sure, spend some
| time getting off of it. But for proofs of concept, rapid
| iteration, etc., Heroku is awesome.
| ericpauley wrote:
| I'll argue that Fly.io is beginning to meet that need in
| a lot of ways, especially with managed Postgres now.
| marcosdumay wrote:
| > These are natural outcomes of removing friction from
| scaling.
|
| Yes, and making scaling frictionless brings a tiny bit of
| value to everybody, but a huge amount of value to the cloud
| operator. Even a little friction would remove the
| runaway-bill problem entirely.
|
| Also, focusing on scaling before efficiency benefits nobody
| but the cloud provider.
| _jal wrote:
| > we need a completely new experience for the next generation
|
| I mean, at some point, if you're (say) using some insane amount
| of storage, you're going to pay for that.
|
| I would agree that getting alerting right for billing-
| relevant events _at whatever scale you're currently
| operating_ should be a lot easier than it is. And I agree
| that there is a lot of room
| to baby-proof some of the less obvious mistakes that people
| frequently make, to better expose the consequences of some
| changes, etc.
|
| But the flip side is that infra has always been expensive, and
| vendors have always been more than happy to sell you far more
| than you need along with the next new shiny whatever.
|
| To the extent that these are becoming implicit decisions made
| by developers rather than periodic budgeted refresh events
| built by infra architects, developers need to take
| responsibility for understanding the implications of what
| they're doing.
| ignoramous wrote:
| > _It's the user's fault for misusing the service._
|
| I believe AWS' _usage_-based billing makes for long-tail
| surprises because its users are designing systems exactly as
| one would expect them to. For example, S3 was never meant for
| a bazillion small objects, which Kinesis Firehose makes it
| easy to deliver. In such cases, dismal retrieval performance
| aside [0], the costs to list/delete dominate abnormally.
|
| We spin up an AWS Batch job every day to coalesce all S3
| files ingested that day into large zlib'd parquets (kind of a
| reverse _VACUUM_ as in postgres / _MERGE_ as in
| elasticsearch). This setup is painful. I guess the lesson
| here is that one needs to architect for both billing and
| scale, right from the get-go.
|
| [0] https://news.ycombinator.com/item?id=19475726
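|
| For illustration, a single-process sketch of one coalescing
| pass, assuming boto3 and pyarrow (bucket, prefix, and the
| one-JSON-record-per-object layout are made-up stand-ins; a
| real Batch job would presumably shard this across workers):
|
|     import json
|
|     import boto3
|     import pyarrow as pa
|     import pyarrow.parquet as pq
|
|     s3 = boto3.client("s3")
|     bucket, prefix = "ingest-bucket", "events/2022/02/17/"
|
|     # Gather the day's worth of tiny Firehose-delivered
|     # objects, one JSON record per object.
|     records = []
|     pages = s3.get_paginator("list_objects_v2").paginate(
|         Bucket=bucket, Prefix=prefix)
|     for page in pages:
|         for obj in page.get("Contents", []):
|             body = s3.get_object(
|                 Bucket=bucket, Key=obj["Key"])["Body"].read()
|             records.append(json.loads(body))
|
|     # Rewrite them as one zlib-compressed parquet file.
|     table = pa.Table.from_pylist(records)
|     sink = pa.BufferOutputStream()
|     pq.write_table(table, sink, compression="gzip")
|     s3.put_object(
|         Bucket=bucket,
|         Key=prefix + "coalesced.parquet",
|         Body=sink.getvalue().to_pybytes(),
|     )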
| jwalton wrote:
| Your website renders as a big empty blue page in Firefox unless I
| disable tracking protection (and in my case, since I have
| noscript, I have to enable javascript for "website-files.com", a
| domain that sounds totally legit).
| mst wrote:
| I have tracking protection and ublock origin both enabled and
| it rendered fine (FF on Win10).
|
| (presented as a data point for any poor soul trying to
| replicate your problem)
| tazjin wrote:
| Chrome with uBlock Origin on default here, and it renders a big
| blue empty page for me, too. That's despite dragging in an
| ungodly amount of assets first.
|
| Here's an archive link that works without any tracking, ads,
| Javascript etc.: https://archive.is/F5KZd
| moffkalast wrote:
| Noscript breaking websites? Who woulda thunk.
|
| How do you manage to navigate the web with that on by default?
| It breaks just about everything since nothing is a static site
| these days.
| Sophira wrote:
| The problem is that the DIV that contains the main text has the
| attribute 'style="opacity:0"'. Presumably, this is something
| that the JavaScript turns off.
|
| A lot of sites like to do things like this for some reason. I
| haven't figured out why. I like to use Stylus to mitigate these
| if I can, rather than enabling JavaScript.
| test1235 wrote:
| mitigation for a flash of unstyled content (FOUC) maybe?
| ectopod wrote:
| A lot of these sites (including this one) do work in reader
| view.
| acdha wrote:
| This is a common anti-pattern -- I believe they're trying to
| ensure that the web fonts have loaded before the text
| displays, but it's really annoying for mobile users since it
| can add up to 2.5 seconds (their timeout) to the time before
| you can start reading, unless you're using reader mode, at
| which point it renders almost instantly.
| [deleted]
| MattRix wrote:
| The page animates in. I have no idea why it does, but it
| does, which explains why the opacity starts at 0%.
| cj wrote:
| Off topic: for people with a "million billion" objects, does the
| S3 console just completely freeze up for you? I have some large
| buckets that I'm unable to even interact with via the GUI. I've
| always wondered if my account is in some weird state or if
| performance is that bad for everyone. (This is a bucket with
| maybe 500 million objects, under a hundred terabytes)
| hakube wrote:
| I have millions (about 16m PDF and text files) of objects and
| it's completely freezing
| albert_e wrote:
| I suggest you raise a support ticket.
|
| AFAIK there is server-side paging implemented in the List* API
| operations, which the Console UI should be using, so the
| number of objects in a bucket shouldn't significantly impact
| the webpage's performance.
|
| But who knows what design flaws lurk beneath the console.
|
| Curious to know what you find.
|
| Does it happen only on opening heavy buckets, or the entire
| S3 console? Different browser / incognito / different machine
| ...don't make a difference?
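|
| For what it's worth, the server-side paging is easy to see
| from boto3; a sketch (bucket and prefix are made up):
|
|     import boto3
|
|     s3 = boto3.client("s3")
|
|     # ListObjectsV2 returns at most 1,000 keys or common
|     # prefixes per page no matter how big the bucket is,
|     # which is why listing itself shouldn't hang.
|     paginator = s3.get_paginator("list_objects_v2")
|     pages = paginator.paginate(
|         Bucket="my-huge-bucket", Prefix="logs/", Delimiter="/")
|     for page in pages:
|         for cp in page.get("CommonPrefixes", []):
|             print(cp["Prefix"])  # the console's "folders"
|         for obj in page.get("Contents", []):
|             print(obj["Key"], obj["Size"])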
| liveoneggs wrote:
| The newer S3 console works a little better. It gives
| pagination with "< 1 2 3 ... >".
| kristjansson wrote:
| Just checked, out of curiosity. A bucket at $WORK with ~4B
| objects / ~100TB is completely usable through the console.
| Keys are hierarchical, and relatively deep, so no one page on
| the GUI is trying to show more than a few hundred keys. If
| your keys are flatter, I could see how the console would be
| unhappy.
| grumple wrote:
| Sort of related: I faced such an issue when I had a GUI table
| that was triggering a count on a large object set via SQL so
| it could display the little "1 to 50 of 1000000". This is
| presumably why services like Google say "of many". Wonder if
| they have a similar issue.
| base698 wrote:
| Yes, and sometimes even listing can take days.
|
| I worked somewhere where a person decided that piping the
| Twitter Firehose into S3 was a good idea, keyed one tweet per
| file.
|
| Ended up figuring out a way to get them in batches and
| condense them. It cost about $800 per hour to fix, coupled
| with the lifecycle changes others have mentioned.
| properdine wrote:
| Doing an S3 object inventory can be a lifesaver here!
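|
| Setting one up is a single API call; a sketch assuming boto3
| (bucket names and the destination ARN are made up):
|
|     import boto3
|
|     s3 = boto3.client("s3")
|
|     # Ask S3 to drop a full manifest of the bucket into a
|     # second bucket every day, instead of paying to List it.
|     s3.put_bucket_inventory_configuration(
|         Bucket="my-huge-bucket",
|         Id="daily-inventory",
|         InventoryConfiguration={
|             "Id": "daily-inventory",
|             "IsEnabled": True,
|             "IncludedObjectVersions": "All",
|             "Schedule": {"Frequency": "Daily"},
|             "Destination": {"S3BucketDestination": {
|                 "Bucket": "arn:aws:s3:::inventory-reports",
|                 "Format": "Parquet",
|             }},
|         },
|     )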
| orf wrote:
| > Yes, and sometimes even listing can take days.
|
| You have a versioned bucket with a lot of delete markers in
| it. Make sure you've got a lifecycle policy to clean them up.
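|
| A sketch of such a rule, assuming boto3 (the bucket name and
| the 7-day window are made up):
|
|     import boto3
|
|     s3 = boto3.client("s3")
|
|     s3.put_bucket_lifecycle_configuration(
|         Bucket="my-versioned-bucket",
|         LifecycleConfiguration={"Rules": [{
|             "ID": "purge-delete-markers",
|             "Status": "Enabled",
|             "Filter": {"Prefix": ""},  # whole bucket
|             # Expire old noncurrent versions first...
|             "NoncurrentVersionExpiration": {
|                 "NoncurrentDays": 7
|             },
|             # ...then delete markers with no versions left
|             # behind them get cleaned up too.
|             "Expiration": {
|                 "ExpiredObjectDeleteMarker": True
|             },
|         }]},
|     )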
| CobrastanJorji wrote:
| I'm curious. If you have a bucket with perhaps half a billion
| objects, what is the use case that leads you to wanting to
| navigate through it with a GUI? Are you perhaps trying to go
| through folders with dates looking for a particular day or
| something?
| jq-r wrote:
| In my previous company we had around 15K instances in an EC2
| region, and the EC2 GUI was unusable if it was set to the
| "new GUI experience", so we always had to use the classic
| one. The new one would try to fetch all their details up
| front, so once loaded it was fast, but getting there would
| take many minutes or the request would just time out. Don't
| know if they've fixed that.
| twistedpair wrote:
| Honestly this is when most folks move to using their own
| dashboards, metrics, and tooling. The AWS GUIs were designed
| for small to moderate use cases.
|
| You don't peer into a bucket with a billion objects and ask for
| a complete listing, or accounting of bytes. There are tools and
| APIs for that.
|
| That's what I do with my thousands of buckets and billions of
| files (dashboards).
| scapecast wrote:
| It's also the reason why some AWS product teams have started
| acquiring IDE- or CLI-type start-ups. They don't want to
| be boxed in by the constraints of the AWS Console - which is
| run by a central team. For example, the Redshift team bought
| DataRow.
|
| Disclosure: co-founder here; we're building one of those
| CLIs. We started as an internal project at D2iQ (my co-
| founder Lukas commented further up), with tooling to collect
| an inventory of AWS resources and make it easy to search.
| gfd wrote:
| Does anyone have recommendations on how to compress the data
| (gzip or parquet)?
| gtirloni wrote:
| A "TLDR" that is not.
| hughrr wrote:
| For every $100k bill there's a hundred of us with 14TB that costs
| SFA to roll with.
___________________________________________________________________
(page generated 2022-02-17 23:00 UTC)