[HN Gopher] Comparing AWS S3 with Cloudflare R2: Price, Performa...
___________________________________________________________________
Comparing AWS S3 with Cloudflare R2: Price, Performance and User
Experience
Author : randomint64
Score : 162 points
Date : 2024-11-27 15:26 UTC (7 hours ago)
(HTM) web link (kerkour.com)
(TXT) w3m dump (kerkour.com)
| theryanteodoro wrote:
| love a good comparison!
| JOnAgain wrote:
| I _love_ articles like this. Hacker News peeps, please make more!
| kevlened wrote:
| It's not mentioned, but it's important to note that R2 lacks
| object versioning.
|
| https://community.cloudflare.com/t/r2-object-versioning-and-...
| UltraSane wrote:
| Ouch. Object versioning is one of the best features of object
| storage. It provides excellent protection from malware and
| human error. My company makes extensive use of versioning and
| Object Lock for protection from malware and data retention
| purposes.
| CharlesW wrote:
| As _@yawnxyz_ mentioned, versioning is straightforward to do
| via Workers (untested sample: https://gist.github.com/Charles
| Wiltgen/84ab145ceda1a972422a8...), and you can also configure
| things so any deletes and other modifications must happen
| through Workers.
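|
| The core of it is just copy-before-write in a Worker (again
| untested, and the BUCKET binding and "versions/" key scheme
| here are my own illustrative choices, not necessarily what the
| gist does):
|
|     // Hypothetical Worker: all writes go through here, never
|     // directly to the bucket.
|     interface Env { BUCKET: R2Bucket }
|
|     export default {
|       async fetch(req: Request, env: Env): Promise<Response> {
|         const key = new URL(req.url).pathname.slice(1);
|         if (req.method === "PUT") {
|           // Preserve the current object (if any) under a
|           // timestamped version key before overwriting it.
|           const existing = await env.BUCKET.get(key);
|           if (existing !== null) {
|             await env.BUCKET.put(
|               `versions/${key}/${Date.now()}`, existing.body);
|           }
|           await env.BUCKET.put(key, req.body);
|           return new Response("ok");
|         }
|         return new Response("method not allowed", { status: 405 });
|       },
|     };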
| UltraSane wrote:
| Interesting, thanks!
| yawnxyz wrote:
| I built a thin Cloudflare workers script for object versioning
| and it works great
| jw1224 wrote:
| Is this something you'd consider sharing? I know many of us
| would find it really useful!
| jzelinskie wrote:
| This is a great comparison and a great step towards pressure to
| improve cloud service pricing.
|
| The magic that moves the bucket's region sounds like a dealbreaker
| for any use case that isn't public and internet-facing. I use
| $CLOUD_PROVIDER because I can be in the same regions as customers
| and know the latency will (for the most part) remain consistent.
| Has anyone measured latencies from R2 -> AWS/GCP/Azure regions
| similar to this[0]?
|
| Also, does anyone know if R2 supports the CAS (compare-and-swap)
| operations that so many people are hyped about right now?
|
| [0]: https://www.cloudping.co/grid
| xhkkffbf wrote:
| This really is a good article. My only issue is that it
| pretends that the only competition is between Cloudflare and
| AWS. There are several other low-cost storage providers that
| offer an S3-compatible API. It's also worth looking at
| Backblaze and Wasabi, for instance. But I don't want to take
| anything away from this article.
| jsheard wrote:
| Is R2 egress _actually_ free, or is it like CF's CDN egress, which
| is "free" until they arbitrarily decide you're using it too much
| or using it for the wrong things so now you have to pay
| $undisclosed per GB?
| shivasaxena wrote:
| I would say don't run a casino on cloudflare
| MortyWaves wrote:
| I am also surprised that 4chan is using Cloudflare captcha
| and bot protection
| byyll wrote:
| What is surprising about that? Cloudflare also provides
| services to terrorists, CSAM websites and more.
| telgareith wrote:
| Nice job painting CF as the bad guy. They do NOT provide
| services to such sites; again and again they have terminated
| them for breach of ToS and cooperated with the legal
| system.
| dmd wrote:
| Good to know. Please make an uncontroversial list of all the
| human activities that you think shouldn't be allowed on
| cloudflare (or perhaps in general). Then we can all agree to
| abide by it, and human conflict will end!
| troyvit wrote:
| Cloudflare is a company, not a public utility. If they want
| to disallow any sites that make fun of cuttlefish they get
| to do that. If you want a CDN that follows the rules of a
| public utility I think you're out of luck on this planet.
| dmd wrote:
| I agree with you. I'm saying that _cloudflare_ gets to
| decide that, not a random HN commenter.
| troyvit wrote:
| Doh! Sorry I misunderstood you dmd
| neom wrote:
| In addition to this, if CF's, say, payment provider hated
| people making fun of cuttlefish, it might make sense for
| CF to ban marine-mollusc mocking there also.
| levifig wrote:
| You can very well run a casino on Cloudflare:
|
| - https://www.cloudflare.com/case-studies/softswiss/
|
| - https://www.cloudflare.com/case-studies/wa-technology/
| steelbrain wrote:
| Do you have any examples of the latter? From what I remember
| reading, the most recent case was a gambling website and
| cloudflare wanted them to upgrade to a tier where they'd have
| their own IPs. This makes sense because some countries blanket
| ban gambling website IPs.
|
| So apart from ToS abuse cases, do you know any other cases? I
| ask out of genuine curiosity because I'm currently paying for
| Cloudflare to host a bunch of our websites at work.
| jsheard wrote:
| Here are some anecdotes I dug up:
| https://news.ycombinator.com/item?id=38960189
|
| Put another way, if Cloudflare _really_ had free unlimited
| CDN egress then every ultra-bandwidth-intensive service like
| Imgur or Steam would use them, but they rarely do, because at
| their scale they get shunted onto the secret real pricing
| that often ends up being more expensive than something like
| Fastly or Akamai. Those competitors would be out of business
| if CF were really as cheap as they want you to think they
| are.
|
| The point where it stops being free seems to depend on a few
| factors, obviously how much data you're moving is one, but
| also the _type_ of data (1GB of images or other binary data
| is considered more harshly than 1GB of HTML/JS/CSS) and
| _where_ the data is served to (1GB of data served to
| Australia or New Zealand is considered much more harshly than
| 1GB to EU/NA). And how much the salesperson assigned to your
| account thinks they can shake you down for, of course.
| hiatus wrote:
| Their terms specifically address video/images:
|
| > Cloudflare's content delivery network (the "CDN") Service
| can be used to cache and serve web pages and websites.
| Unless you are an Enterprise customer, Cloudflare offers
| specific Paid Services (e.g., the Developer Platform,
| Images, and Stream) that you must use in order to serve
| video and other large files via the CDN. Cloudflare
| reserves the right to disable or limit your access to or
| use of the CDN, or to limit your End Users' access to
| certain of your resources through the CDN, if you use or
| are suspected of using the CDN without such Paid Services
| to serve video or a disproportionate percentage of
| pictures, audio files, or other large files. We will use
| reasonable efforts to provide you with notice of such
| action.
|
| https://www.cloudflare.com/service-specific-terms-
| applicatio...
| Aachen wrote:
| I was going to say that it's odd, then, that reddit
| doesn't serve all the posts' json via a free account at
| cloudflare and save a ton of money, but maybe that's
| actually just peanuts compared to the total costs? So
| cloudflare is basically only happy to host the peanuts for
| you to get you on their platform, but once you want to
| serve things where CDNs (and especially "free" bandwidth)
| really help, it stops being allowed?
| Aperocky wrote:
| I think the comment section of that story is a gold mine:
| https://robindev.substack.com/p/cloudflare-took-down-our-
| web.... Not necessarily authentic, but apply your own
| judgement.
| akira2501 wrote:
| Their ToS enforcement seems weak and/or arbitrary. There are
| a lot of scummy and criminal sites that use their services
| without any issues it seems. At least they generally
| cooperate with law enforcement when requested to do so but
| they otherwise don't seem to notice on their own.
| machinekob wrote:
| Happened before, will happen again. CF is a publicly traded
| company and when the squeeze comes, they'll just tax your
| egress as hard as Amazon does.
| tshaddox wrote:
| It's not unreasonable for a service provider to describe their
| service as "free" even though they will throttle or ban you for
| excessive use.
| breckognize wrote:
| To measure performance the author looked at latency, but most S3
| workloads are throughput oriented. The magic of S3 is that it's
| cheap because it's built on spinning HDDs, which are slow and
| unreliable individually, but when you have millions of them, you
| can mask the tail and deliver multi TBs/sec of throughput.
|
| It's misleading to look at S3 as a CDN. It's fine for that, but
| its real strength is backing the world's data lakes and cloud
| data warehouses. Those workloads have a lot of data that's often
| cold, but S3 can deliver massive throughput when you need it. R2
| can't do that, and as far as I can tell, isn't trying to.
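|
| To make the throughput point concrete: clients typically get
| there by fanning out parallel ranged GETs against one object.
| A rough sketch (url and size are assumed inputs; works against
| any S3-compatible HTTP endpoint, assuming size >= parts):
|
|     async function parallelGet(
|       url: string, size: number, parts = 16): Promise<Uint8Array> {
|       const chunk = Math.ceil(size / parts);
|       const pieces = await Promise.all(
|         Array.from({ length: parts }, async (_, i) => {
|           const start = i * chunk;
|           const end = Math.min(start + chunk, size) - 1;
|           const res = await fetch(url,
|             { headers: { Range: `bytes=${start}-${end}` } });
|           if (res.status !== 206)
|             throw new Error(`expected 206, got ${res.status}`);
|           return new Uint8Array(await res.arrayBuffer());
|         }));
|       // Stitch the ranges back together in order.
|       const out = new Uint8Array(size);
|       pieces.forEach((p, i) => out.set(p, i * chunk));
|       return out;
|     }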
|
| Source: I used to work on S3
| JoshTriplett wrote:
| Yeah, I'd be interested in the bandwidth as well. Can R2
| saturate 10/25/50 gigabit links? Can it do so with single
| requests, or if not, how many parallel requests does that
| require?
| moralestapia wrote:
| Yes, they absolutely can [1].
|
| 1: https://blog.cloudflare.com/how-cloudflare-auto-mitigated-
| wo...
| fragmede wrote:
| Cloudflare's paid DDoS protection product being able to
| soak up insane L3/4 DDoS attacks doesn't answer the
| question of whether the specific product - R2, which has
| free egress - is able to saturate a pipe.
|
| Cloudflare has the network to do that, but they charge
| money to do so with their other offerings, so why would
| they give that to you for free? R2 is not a CDN.
| moralestapia wrote:
| >Can do 3.8 Tbps
|
| >Can't do 10 Gbps
|
| k
| fragmede wrote:
| > can't read CDN
|
| > Can't read R2
|
| k
| bananapub wrote:
| that's completely unrelated. the way to soak up a ddos at
| scale is just "have lots of peering and a fucking massive
| amount of ingress".
|
| neither of these tell you how fast you can serve static
| data.
| moralestapia wrote:
| >that's completely unrelated
|
| Yeah, I'm sure they use a completely different network
| infrastructure to serve R2 requests.
| JoshTriplett wrote:
| That's unrelated to the performance of (for instance) the
| R2 storage layer. All the bandwidth in the world won't help
| you if you're blocked on storage. It isn't clear whether
| the _overall_ performance of R2 is capable of saturating
| user bandwidth, or whether it'll be blocked on something.
|
| S3 can't saturate user bandwidth unless you make many
| parallel requests. I'd be (pleasantly) surprised if R2 can.
| moralestapia wrote:
| I'm confused, I assumed we were talking about the network
| layer.
|
| If we are talking about storage, well, SATA can't give
| you more than ~5Gbps so I guess the answer is no? But
| also no one else can do it, unless they're using super
| exotic HDD tech (hint: they're not, it's actually the
| opposite).
|
| What a weird thing to argue about, btw, literally
| everybody is running a network layer on top of storage
| that lets you have _much_ higher throughput. When one
| talks about R2/S3 throughput no one (in my circle, ofc.)
| would think we are referring to the speed of their HDDs,
| lmao. But it's nice to see this, it's always amusing to
| stumble upon people with a wildly different point of view
| on things.
| renewiltord wrote:
| No, most people aren't interested in subcomponent
| performance, just in total performance. A trivial example
| is that even a 4-striped U2 NVMe disk array exported over
| Ethernet can deliver a lot more data than 5 Gbps and
| store mucho TiB.
| moralestapia wrote:
| Thanks for +1 what I just said. So, apparently, it's not
| just me and my peers who think like this.
| JoshTriplett wrote:
| We're talking about the _user-visible_ behavior. You
| argued that because Cloudflare's CDN has an obscene
| amount of bandwidth, R2 will be able to saturate user
| bandwidth; that doesn't follow, hence my counterpoint
| that it could be bottlenecked on storage rather than
| network. The question at hand is _what performance R2
| offers_, and that hasn't been answered.
|
| There are any number of ways they _could_ implement R2
| that would allow it to run at full wire speed, but S3
| _doesn't_ run at full wire speed by default (unless you
| make many parallel requests) and I'd be surprised if R2
| does.
| aipatselarom wrote:
| n = 1 aside.
|
| I have some large files stored in R2 and a 50Gbps
| interface to the world.
|
| curl to Linode's speed test is ~200MB/sec.
|
| curl to R2 is also ~200MB/sec.
|
| I'm only getting 1Gbps but given that Linode's speed is
| pretty much the same I would think the bottleneck is
| somewhere else. Either way, R2 gives you at least 1Gbps.
| michaelt wrote:
| I mean, it may be true in _practice_ that most S3 workloads are
| throughput oriented and unconcerned with latency.
|
| But if you look at https://aws.amazon.com/s3/ it says things
| like:
|
| "Object storage built to retrieve any amount of data from
| anywhere"
|
| "any amount of data for virtually any use case"
|
| "S3 delivers the resiliency, flexibility, latency, and
| throughput, to ensure storage never limits performance"
|
| So if S3 is not intended for low-latency applications, the
| marketing team haven't gotten the message :)
| troyvit wrote:
| lol I think the only reason you're being downvoted is because
| the common belief at HN is, "of course marketing is lying
| and/or doesn't know what they're talking about."
|
| Personally I think you have a point.
| mikeshi42 wrote:
| I didn't downvote but s3 does have low latency offerings
| (express). Which has reasonable latency compared to EFS
| iirc. I'd be shocked if it was as popular as the other
| higher latency s3 tiers though.
| vtuulos wrote:
| yes, this. In case you are interested in seeing some numbers
| backing this claim, see here
| https://outerbounds.com/blog/metaflow-fast-data
|
| Source: I used to work at Netflix, building systems that pull
| TBs from S3 hourly
| suryao wrote:
| Great article. Do you have throughput comparisons? I've found r2
| to be highly variable in throughput, especially with concurrent
| downloads. s3 feels very consistent, but I haven't measured the
| difference.
| pier25 wrote:
| I'm also interested in upload speeds.
|
| I've seen complaints from users about R2 having erratic upload
| speeds.
| vlovich123 wrote:
| Very good article and interesting read. I did want to clarify
| some misconceptions I noted while reading (working from memory so
| hopefully I don't get anything wrong myself).
|
| > As explained here, Durable Objects are single threaded and thus
| limited by nature in the throughput they can offer.
|
| R2 bucket operations do not use single-threaded Durable Objects;
| the team did a one-off thing just for R2 to let it run multiple
| instances. That's why the limits were lifted in the open
| beta.
|
| > they mentioned that each zone's assets are sharded across
| multiple R2 buckets to distribute load which may indicate that a
| single R2 bucket was not able to handle the load for user-facing
| traffic. Things may have improved since though.
|
| I would not use this as general advice. Cache Reserve was
| architected to serve an absurd amount of traffic that almost no
| customer or application will see. If you're having that much
| traffic I'd expect you to be an ENT customer working with their
| solutions engineers to design your application.
|
| > First, R2 is not 100% compatible with the S3 API. One notable
| missing feature is data-integrity checks with SHA256 checksums.
|
| This doesn't sound right. I distinctly remember when this was
| implemented for uploading objects. SHA-1 and SHA-256 should be
| supported (I don't remember about CRC). For some reason it's
| missing from the docs though. The trailer version isn't supported,
| and likely won't be for a while, for technical reasons (the
| Workers platform doesn't support HTTP trailers as it uses HTTP/1
| internally). Overall compatibility should be pretty decent.
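|
| Easy enough to check from the outside, though. A sketch with the
| AWS SDK pointed at an R2 endpoint (account ID, keys, and bucket
| names are placeholders); if the checksum is actually verified
| server-side, a tampered body should get rejected:
|
|     import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
|     import { createHash } from "node:crypto";
|
|     const body = Buffer.from("hello r2");
|     const s3 = new S3Client({
|       region: "auto",
|       endpoint: "https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
|       credentials: {
|         accessKeyId: "<KEY>", secretAccessKey: "<SECRET>" },
|     });
|
|     // ChecksumSHA256 is the base64-encoded digest of the body.
|     await s3.send(new PutObjectCommand({
|       Bucket: "my-bucket",
|       Key: "hello.txt",
|       Body: body,
|       ChecksumSHA256: createHash("sha256").update(body)
|         .digest("base64"),
|     }));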
|
| The section on "The problem with cross-datacenter traffic" seems
| to be flawed assumptions rather than data driven. Their own
| graphs only show that while public buckets have some occasional
| weird spikes it's pretty constantly the same performance while
| the S3 API has more spikeness and time of day variability is much
| more muted than the CPU variability. Same with the assumption on
| bandwidth or other limitations of data centers. The more likely
| explanation would be the S3 auth layer and the time of day
| variability experienced matches more closely with how that layer
| works. I don't know enough of the particulars of this author's
| zones to hypothesize but the s3 with layer was always challenging
| from a perf perspective.
|
| > This is really, really, really annoying. For example you know
| that all your compute instances are in Paris, and you know that
| Cloudflare has a big datacenter in Paris, so you want your bucket
| to be in Paris, but you can't. If you are unlucky when creating
| your bucket, it will be placed in Warsaw or some other place far
| away and you will have huge latencies for every request.
|
| I understand the frustration but there are very good technical
| and UX reasons this wasn't done. For example, while you may
| think that "Paris datacenter" is well defined, it isn't for R2:
| your metadata is stored regionally across multiple data centers,
| whereas S3, if I recall correctly, uses what they call a region,
| which is a single location broken up into multiple availability
| zones (basically isolated power and connectivity domains). This
| is an availability tradeoff - us-east-1 will never go offline on
| Cloudflare because it just doesn't exist - the location hint is
| the size of the availability region. This is done at both the
| metadata and storage layers. The location hint should definitely
| be followed when you create the bucket, but maybe there are bugs
| or other issues.
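|
| For what it's worth, the hint can be passed at bucket creation
| through the S3 API too; a sketch (placeholders again, and as I
| recall R2 reads the hint from LocationConstraint, with values
| like "weur" for Western Europe rather than city names - worth
| verifying):
|
|     import { S3Client, CreateBucketCommand } from "@aws-sdk/client-s3";
|
|     const s3 = new S3Client({
|       region: "auto",
|       endpoint: "https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
|       credentials: {
|         accessKeyId: "<KEY>", secretAccessKey: "<SECRET>" },
|     });
|
|     // You can steer toward Western Europe, but not pin "Paris".
|     await s3.send(new CreateBucketCommand({
|       Bucket: "my-bucket",
|       CreateBucketConfiguration: { LocationConstraint: "weur" },
|     }));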
|
| As others noted throughput data would also have been interesting.
| tecleandor wrote:
| > First, R2 is not 100% compatible with the S3 API. One notable
| missing feature is data-integrity checks with SHA256
| checksums.
|
| Maybe it was an old thing? The changelog [0] for 2023-06-16
| says:
|
| "S3 putObject now supports sha256 and sha1 checksums."
| [0]: https://developers.cloudflare.com/r2/platform/changelog/#2
| 023-06-16
| vlovich123 wrote:
| I suspect the author is going by the documentation rather
| than having tried it themselves.
| nickjj wrote:
| One thing to think about with S3 is that there are use cases
| where the price is very low, which the article didn't mention.
|
| For example maybe you have ~500 GB of data across millions of
| objects that has accumulated over 10 years. You don't even know
| how many reads or writes you have on a monthly basis because your
| S3 bill is $11 while your total AWS bill is orders of magnitude
| more.
|
| If you're in a spot like this, moving to R2 to potentially save
| $7 or whatever it ends up being would cost a lot more in
| engineering time than it saves. Plus there are old links that
| might be pointing to a public S3 object - email campaign links,
| etc. - which would break if you moved the objects to another
| location.
| philistine wrote:
| Even simpler: I'm using Glacier Deep Archive for my personal
| backups, and I don't see how R2 would be cheaper for me.
| telgareith wrote:
| Have you priced the retrieval cost? You quickly run into high
| three and then four figures' worth of bandwidth.
| philistine wrote:
| Retrieval? For an external backup? If I need to restore and
| my local backup is completely down, it either means I lost
| two drives (very unlikely) or the house is a calcinated
| husk and at this point I'm insured.
|
| And let's be honest. If the house burns down, the computers
| are the third thing I get out of there after the wife and
| the dog. My external backup is peace of mind, nothing more.
| I don't ever expect to need it in my lifetime.
| seized wrote:
| Yes, but if it's your third location of 3-2-1 then it can
| also make sense to weigh it against data recovery costs on
| damaged hardware.
|
| I backup to Glacier as well. For me to need to pull from it
| (and pay that $90/TB or so) means I've lost more than two
| drives in a historically very reliable RAIDZ2 pool, or lost
| my NAS entirely.
|
| I'll pay $90/TB over unknown $$$$ for a data recovery from
| burned/flooded/fried/failed disks.
| kiwijamo wrote:
| High 3 and 4 figures wouldn't occur for personal backups
| though. I've done a big retrieval once and the cost was
| literally just single digits dollars for me. So the total
| lifetime cost (including retrievals) is cheaper on S3 than
| R2 for my personal backup use case. This is why I struggle
| to take seriously any analysis that says S3 is expensive --
| it is only expensive if you use the most expensive
| (default) S3 product. S3 has more options to offer than
| than R2 or other competitors which is why I stay with S3
| and pay <$1.00 a month for my entire backup. Most
| competitors (including R2) would have me pay significantly
| more than I spend on the appropriate S3 product.
| Dylan16807 wrote:
| I think the most reasonable way to analyze this puts non-
| instant-access Glacier in a separate category from the rest
| of S3. R2 doesn't beat it, but R2 is not a competitor in the
| first place.
| bassp wrote:
| Only tangentially related to the article, but I've never
| understood _how_ R2 offers 11 9s of durability. I trust that S3
| offers 11 9s because Amazon has shown, publicly, that they care a
| ton about designing reliable, fault-tolerant, correct systems
| (e.g. Shardstore and Shuttle).
|
| Cloudflare's documentation just says "we offer 11 9s, same as
| S3", and that's that. It's not that I don't believe them but...
| how can a smaller organization make the same guarantees?
|
| It implies to me that either Amazon is wasting a ton of money on
| their reliability work (possible) or that cloudflare's 11 9s
| guarantee comes with some asterisks.
| rat9988 wrote:
| What makes you think it did cost aws that much moneu at their
| scale to achieve 11 9s that cloudflare cannot afford it?
| bassp wrote:
| Minimally, the two examples I cited: Shardstore and Shuttle.
| The former is a (lightweight) formally verified key value
| store used by S3, and the latter is a model checker for
| concurrent rust code.
|
| Amazon has an entire automated reasoning group (researchers
| who mostly work on formal methods) working specifically on
| S3.
|
| As far as I'm aware, nobody at cloudflare is doing similar
| work for R2. If they are, they're certainly not publishing!
|
| Money might not be the bottleneck for cloudflare though,
| you're totally right
| zild3d wrote:
| S3 has touted 11 9's for many years, so definitely before
| Shardstore.
|
| The 11 9's is for durability, which is really more about
| the redundancy setup, erasure coding, etc.
| (https://cloud.google.com/blog/products/storage-data-
| transfer...)
|
| fwiw availability is 4 9's
| (https://aws.amazon.com/s3/storage-classes/)
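|
| As a toy model of why erasure coding gets you numbers like
| that (illustrative parameters, not anyone's real ones): with
| n shards recoverable from any k, data is lost only if more
| than n - k shards fail at once, and that probability
| collapses fast:
|
|     // Assumes independent shard failures, which real systems
|     // only approximate; p = chance a shard is lost before
|     // the system repairs it.
|     function lossProbability(n: number, k: number, p: number) {
|       const choose = (a: number, b: number): number =>
|         b === 0 ? 1 : (a * choose(a - 1, b - 1)) / b;
|       let loss = 0;
|       for (let failed = n - k + 1; failed <= n; failed++) {
|         loss += choose(n, failed) *
|           p ** failed * (1 - p) ** (n - failed);
|       }
|       return loss;
|     }
|
|     // e.g. 16 shards, any 12 recover, 0.1% shard loss:
|     console.log(lossProbability(16, 12, 0.001));
|     // ~4e-12, i.e. about eleven nines of durability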
| bassp wrote:
| That's a good point!
|
| I think I overstated the case a little; I definitely
| don't think automated reasoning is some "secret
| reliability sauce" that nobody else can replicate. It
| does give me more confidence that Amazon takes
| reliability very seriously, and is less likely to ship a
| terrible bug that messes up my data.
| cube2222 wrote:
| R2 and its pricing are quite fantastic.
|
| We're using it to power the OpenTofu Provider&Modules
| Registry[0][1] and it's honestly been nothing but a great
| experience overall.
|
| [0]: https://registry.opentofu.org
|
| [1]: https://github.com/opentofu/registry
|
| Disclaimer: CloudFlare sponsored us with their business plan, so
| we got access to higher-tier functionality.
| deanCommie wrote:
| The innovator's dilemma is really interesting.
|
| Whenever a newcomer gets on the scene offering the same
| thing as some entrenched leader, only better, faster, and cheaper,
| the standard response is "Yeah but it's less reliable. This may
| be fine for startups but if you're
| <enterprise|government|military|medical|etc>, you gotta stick
| with the tried tested and true <leader>"
|
| You see this in almost every discussion of Cloudflare, which
| seems to be rapidly rebuilding a full cloud, in direct
| competition with AWS specifically. (I guess it wants to be
| evaluated as a fellow leader, not an also-ran like GCP/Azure
| fighting for 2nd place)
|
| The thing is, all the points are right. Cloudflare IS different -
| using exclusively edge networks and tying everything to CDNs is
| both a strength and a weakness. There are dozens of reasons to
| be critical of them and dozens more to explain why you'd trust
| AWS more.
|
| But I can't help but wonder whether the same happened (I
| wasn't on here, or really tech-aware enough) when S3 and EC2 came
| on the scene. I'm sure everyone said they were unreliable and
| uncertain, and there were dozens of reasons why people should
| stick with (I can only presume) VMWare, IBM, Oracle, etc.
|
| This is all a shallow observation though.
|
| Here's my real question, though. How does one go deeper and
| evaluate what is real disruption and what is fluff? Does
| Cloudflare have something that's unique and different that
| demonstrates a new world for cloud services I can't even imagine
| right now, as AWS did before it? Or does AWS have a durable
| advantage and benefits that will allow it to keep being #1
| indefinitely? (GCP and Azure, as I see it, are trying to compete
| on specific slices of merit. GCP is all-in on 'portability',
| that's why they came up with Kubernetes to devalue the idea of
| any one public cloud, and make workloads cross-platform across
| all clouds and on-prem. Azure seems to be competitive because of
| Microsoft's otherwise vertical integration with
| business/windows/office, and now AI services).
|
| Cloudflare is the only one that seems to show up over and over
| again and say "hey you know that thing that you think is the best
| cloud service? We made it cheaper, faster, and with nicer
| developer experience." That feels really hard to ignore. But also
| seems really easy to market only-semi-honestly by hand-waving
| past the hard stuff at scale.
| everfrustrated wrote:
| Cloudflare's architecture is driven purely by their history of
| being a CDN and trying to find new product lines to generate
| new revenue streams to keep the share price up.
|
| You wouldn't build a cloud from scratch in this way.
| youngtaff wrote:
| Maybe Cloudflare will even be profitable in the next year or
| two...
| orf wrote:
| My experience: I put parquet files on R2, but HTTP Range requests
| were failing. 50% of the time it would work, and 50% of the time
| it would return _all_ of the content and not the subset
| requested. That's a nightmare to debug, given that software
| expects it to work consistently or not work at all.
|
| Seems like a bug. Had to crawl through documentation to find out
| the only support is on Discord (??), so I had to sign up.
|
| Go through some more hoops and eventually get to a channel where
| I received a prompt reply: it's not an R2 issue, it's "expected
| behaviour" due to an issue with "the CDN service".
|
| I mean, sure. On a technical level. But I shoved some data into
| your service and basic standard HTTP semantics were
| intermittently not respected: that's a bug in your service, even
| if the root cause is another team.
|
| None of this is documented anywhere, even if it is "expected".
| Searching for [1] "r2 http range" shows I'm not the only one
| surprised
|
| Not impressed, especially as R2 seems ideal for serving Parquet
| data for small projects. This, plus the janky UI and weird
| restrictions, makes the entire product feel distinctly half-
| finished and not a serious competitor.
|
| 1.
| https://www.google.com/search?q=r2+http+range&ie=UTF-8&oe=UT...
| saurik wrote:
| > given that software expects it to work consistently or not
| work at all
|
| I mean... that's wrong? If you come across such software, do
| you at least file a bug?
| orf wrote:
| Of course not, and it's completely correct behaviour: if a
| server advertises it supports Range requests for a given URL,
| it's expected to _support_ it. Garbage in, garbage out.
|
| It's not clear how you'd expect to handle a webserver trying
| to send you 1GB of data after you asked for a specific 10KB
| range, other than aborting.
| saurik wrote:
| "Conversely, a client MUST NOT assume that receiving an
| Accept-Ranges field means that future range requests will
| return partial responses. The content might change, the
| server might only support range requests at certain times
| or under certain conditions, or a different intermediary
| might process the next request." -- RFC 9110
| orf wrote:
| Sure, but that's utterly useless in practice because
| there is no way to handle that gracefully.
|
| To be clear: most software does handle it, because it
| detects this case and aborts.
|
| But to a user who is explicitly asking to read a parquet
| file without buffering the entire file into memory, there
| is no distinction between a server that cannot handle any
| range requests and a server that can occasionally handle
| range requests.
|
| Other than one being much, much more annoying.
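|
| Concretely, the "detect and abort" that well-behaved clients
| do looks like this (a sketch; the parquet URL is a
| placeholder):
|
|     async function readRange(
|       url: string, start: number, end: number): Promise<Uint8Array> {
|       const res = await fetch(url,
|         { headers: { Range: `bytes=${start}-${end}` } });
|       if (res.status !== 206) {
|         // Server ignored the Range; bail before it streams
|         // the whole object at us.
|         res.body?.cancel();
|         throw new Error(`expected 206, got ${res.status}`);
|       }
|       return new Uint8Array(await res.arrayBuffer());
|     }
|
|     // e.g.:
|     // await readRange("https://example.com/data.parquet", 0, 10_239);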
| MobileVet wrote:
| One thing that I haven't seen discussed in the comments is the
| inherent vulnerability of S3 pricing. Like all things AWS, if
| something goes sideways, you are suddenly on the wrong side of a
| very large bill. For instance, someone can easily blow your
| egress charges through the roof by making a massive number of
| requests for your assets hosted there.
|
| While Cloudflare may reach out and say 'you should be on
| enterprise' when that happens on R2, the fact they also handle
| DDoS and similar attacks as part of their offering means the
| likelihood of success is much lower (as is the final bill).
| sroussey wrote:
| Cloudflare has DDoS-mitigation roots, and that plays well here.
| akira2501 wrote:
| Typically you would use S3 with CloudFront for hosting. S3
| provides no protections because it's meant to be a durable and
| global service. CloudFront provides DDoS and other types of
| protection while making it easy to get prepaid bandwidth
| discounts.
| danielheath wrote:
| Just one data point, but adding Cloudflare to our stack (in
| front of "CloudFront with bandwidth discounts") took about
| $30k USD per year off our bandwidth bill.
| lopkeny12ko wrote:
| I'm not really sure what point you're trying to make here. S3
| bills you on, essentially, serving files to your customers. So
| yes if your customers download more files then you get charged
| more. What exactly is the surprising part here?
| zaptheimpaler wrote:
| The surprise is any ne'er-do-well can DDoS your bucket even
| if they aren't a customer. Genuine customer traffic volume
| will probably be known and expected, but putting an S3 bucket
| in the open is something like leaving a blank check on the
| internet.
| lopkeny12ko wrote:
| It's a bit unfair to characterize that as a surprise on how
| much S3 bills you, no? The surprising part here is lack of
| DDoS protection on your end or leaving a bucket public and
| exposed. AWS is just charging you for how much it served;
| it doesn't make sense to hold them at fault here.
| fnikacevic wrote:
| AWS will also forgive bills caused by mistakes or
| negligence - in my case, three times.
| bobthebutcher wrote:
| If you want to hire someone to walk your dog, you probably
| won't put an ad in the New York Times via a head hunter
| whom you pay by the hour with no oversight, and then call
| it totally unfair when that head hunter bills you for the
| time of all those interviews. Yet an infinitely scalable
| service that you somehow can't put immediate, hard limits
| on is somehow fine on the cloud.
| bippihippi1 wrote:
| It loses customers' trust when the simple setup is
| flawed. S3 is rightly built to support as much egress as
| any customer could want, but it's wrong to make it complex
| to set up rules that limit bandwidth and price.
|
| It should be possible to use a service, especially a
| common one like S3, with little knowledge of architecture
| and stuff.
| Dylan16807 wrote:
| > The surprising part here is lack of DDoS protection on
| your end or leaving a bucket public and exposed.
|
| It doesn't take anything near DDoS. If you dare to put up
| a website that serves images from S3, and one guy on one
| normal connection decides to cause you problems, they can
| pull down a hundred terabytes in a month.
|
| Is serving images from S3 a crazy use case? Even if you have
| signed and expiring URLs it's hard to avoid someone
| visiting your site every half hour and then using the URL
| over and over.
|
| > AWS is just charging you for how much it served, it
| doesn't make sense to hold them to a fault here.
|
| Even if it's not their fault, it's still an "inherent
| vulnerability of S3 pricing". But since they charge so
| much per byte with bad controls over it, I think it does
| make sense to hold them to a good chunk of fault.
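|
| Back of the envelope (list pricing, roughly $0.09/GB for
| S3 internet egress, ignoring the small volume discounts):
|
|     const gb = 100 * 1024;   // "a hundred terabytes" in GB
|     const bill = gb * 0.09;  // ~ $9,200 for one hostile month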
| zaptheimpaler wrote:
| I don't know about fair or unfair, but it's just a
| problem you don't have to worry about if there are no
| egress fees.
| karmakaze wrote:
| There was a backlash about being billed for unauthorized
| requests. It's since been updated[0]. I don't know whether all
| affected customers were retroactively refunded.
|
| [0] https://aws.amazon.com/about-aws/whats-
| new/2024/08/amazon-s3...
| jonathannorris wrote:
| Also, once you are on Enterprise, they will not bug/charge you
| for contracted overages very often (like once a year) and will
| forgive significant overages if you resolve them quickly, in my
| experience.
| viraptor wrote:
| IAM gets only a tiny mention as not present, therefore making R2
| simpler. But also... IAM is missing, so a lot of interesting use
| cases are not possible there. No access by path, no 2FA
| enforcement, no easy SSO management, no blast-radius limits -
| just: would you like a token which can write a file, but can
| also delete everything? This is also annoying for their zone
| management for the same reason.
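|
| For contrast, this is the sort of path-level scoping IAM gives
| you on S3 (standard AWS policy shape; the bucket and prefix
| here are made up):
|
|     // A credential limited to reading/writing one prefix -
|     // the blast radius stops at uploads/.
|     const pathScopedPolicy = {
|       Version: "2012-10-17",
|       Statement: [{
|         Effect: "Allow",
|         Action: ["s3:GetObject", "s3:PutObject"],
|         Resource: "arn:aws:s3:::my-bucket/uploads/*",
|       }],
|     };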
| pier25 wrote:
| > _you can't choose the location of your R2 bucket!_
|
| Yeah this is really annoying. That, and the lack of replication
| to multiple regions, is why we're not using R2.
|
| Global replication was a feature announced in 2021 but still
| hasn't happened:
|
| > _R2 will replicate data across multiple regions and support
| jurisdictional restrictions, giving businesses the ability to
| control where their data is stored to meet their local and global
| needs._
|
| https://www.cloudflare.com/press-releases/2021/cloudflare-an...
| tlarkworthy wrote:
| I've benchmarked R2 and S3, and S3 is well ahead in terms of
| latency, _especially_ on ListObject requests. I think R2 has some
| kind of concurrency limit, as concurrent ListObject requests seem
| to have an increased failure rate when serving simultaneous
| requests.
|
| I have a few of the S3-likes wired up live over the internet; you
| can try them yourself in your browser. Backblaze is surprisingly
| performant, which I did not expect (S3 is still king though).
|
| https://observablehq.com/@tomlarkworthy/mps3-vendor-examples
| postatic wrote:
| I do mostly CRUD apps with Laravel and Vue. Nothing too
| complicated. Allows users to post stuff with images and files.
| I've moved ALL of my files from S3 to R2 in the past 2 years.
| It's been slow, as any migration is, but painless.
|
| But most importantly for an indie dev like me the cost became $0.
| snihalani wrote:
| >Generally, R2's user experience is way better and simpler than
| S3. As always with AWS, you need 5 certifications and 3 months to
| securely deploy a bucket.
|
| +1
| asteroidburger wrote:
| "Here's a bunch of great things about CloudFlare R2 - and please
| buy my book about it" leaves a bad taste in my mouth.
|
| Also, has CF improved their stance around hosting hate groups?
| They have strongly resisted pressure to stop hosting/supporting
| hate sites like 8chan and Kiwifarms, and only stopped
| reluctantly.
| gjsman-1000 wrote:
| I don't have to support 8chan or KiwiFarms to say that
| Cloudflare has absolutely no role in policing the internet. The
| job of policing the internet is for the police. If it's
| illegal, let them investigate.
| asteroidburger wrote:
| There is a difference between policing the internet and
| supplying resources and services to known bad actors.
|
| Their job isn't to investigate and punish harassment and
| criminal behavior, but they certainly don't have to condone
| it via their support.
| gjsman-1000 wrote:
| > known bad actors
|
| If they are known bad actors, let the police do the job of
| policing the internet. Otherwise, all bad actors are
| ultimately arbitrarily defined. _Who_ said they are known
| bad actors? What does that even mean? _Why_ does that
| person determining bad actors get their authority? Were
| they duly elected? Or did one of hundreds of partisan NGOs
| claim this? Who elected the NGO? Does PETA get a say on bad
| actors?
|
| Be careful what you wish for. In some US States, I am sure
| the attorney general would send a letter saying to shut
| down the marijuana dispensary - they're known bad actors,
| after all. They might not win a lawsuit, but winning the
| support of private organizations would be just as good.
|
| > they certainly don't have to condone it via their support
|
| Wow, what a great argument. Hacker News supports all
| arguments here by tolerating people speaking and not
| deleting everything they could possibly disagree with.
|
| Or maybe, providing a service to someone, should not be
| seen as condoning all possible uses of the service. Just
| because water can be used to waterboard someone, doesn't
| mean Walmart should be checking IDs for water purchasers.
| Just because YouTube has information on how to pick locks,
| does not mean YouTube should be restricted to adults over
| 21 on a licensed list of people entrusted with lock-picking
| knowledge.
| karmakaze wrote:
| At one company we were uploading videos to S3 and finding a lot
| of errors or stalls in the process. That led to evaluating GCP
| and Azure. I found that Azure had the most consistent upload
| durations (least variance) and better pricing. We ended up
| using GCP for other reasons like resumable uploads (IIRC). AWS
| now supports appending to S3 objects which might have worked to
| avoid upload stalls. CloudFront for us at the time was
| overpriced.
| kansi wrote:
| I have tried to find a CDN provider which would offer access
| control similar to CloudFront's signed cookies, but failed to
| find anything that matches it. This is a major drawback with
| these providers offering S3-style bucket storage, because most
| of the time you want to serve the content from a CDN, and
| offloading access control to the CDN via cookies makes life so
| much easier. You only need to set the cookies once for the
| user's session and they are automatically sent (by the web
| browser) to the CDN with no additional work needed.
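|
| For reference, the CloudFront flow I mean, via the
| @aws-sdk/cloudfront-signer helper (key pair ID, key, and URL
| are placeholders; a wildcard resource needs a custom policy):
|
|     import { getSignedCookies } from "@aws-sdk/cloudfront-signer";
|
|     // Sign once per session; the browser then attaches these
|     // cookies to every matching CDN request automatically.
|     const cookies = getSignedCookies({
|       url: "https://cdn.example.com/private/video.m3u8",
|       keyPairId: "<KEY_PAIR_ID>",
|       privateKey: "<PEM_PRIVATE_KEY>",
|       dateLessThan: new Date(Date.now() + 3600_000).toISOString(),
|     });
|     // Set each key/value as a cookie on your auth response.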
| denysvitali wrote:
| No mention of Backblaze's B2? It's cheaper than these two at just
| $6/TB.
___________________________________________________________________
(page generated 2024-11-27 23:01 UTC)