[HN Gopher] Comparing AWS S3 with Cloudflare R2: Price, Performa...
       ___________________________________________________________________
        
       Comparing AWS S3 with Cloudflare R2: Price, Performance and User
       Experience
        
       Author : randomint64
       Score  : 162 points
       Date   : 2024-11-27 15:26 UTC (7 hours ago)
        
 (HTM) web link (kerkour.com)
 (TXT) w3m dump (kerkour.com)
        
       | theryanteodoro wrote:
       | love a good comparison!
        
       | JOnAgain wrote:
       | I _love_ articles like this. Hacker News peeps, please make more!
        
       | kevlened wrote:
        | It's not mentioned, but it's important to note that R2 lacks
        | object versioning.
       | 
       | https://community.cloudflare.com/t/r2-object-versioning-and-...
        
         | UltraSane wrote:
         | Ouch. Object versioning is one of the best features of object
         | storage. It provides excellent protection from malware and
         | human error. My company makes extensive use of versioning and
         | Object Lock for protection from malware and data retention
         | purposes.
        
           | CharlesW wrote:
           | As _@yawnxyz_ mentioned, versioning is straightforward to do
           | via Workers (untested sample: https://gist.github.com/Charles
           | Wiltgen/84ab145ceda1a972422a8...), and you can also configure
           | things so any deletes and other modifications must happen
           | through Workers.
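            | 
            | For a sense of the shape, a minimal sketch (not the gist
            | itself; the MY_BUCKET binding and "versions/" key scheme
            | are assumptions):
            | 
            |   export default {
            |     async fetch(request, env) {
            |       const key = new URL(request.url).pathname.slice(1);
            |       if (request.method === "PUT") {
            |         // snapshot the current object under a
            |         // timestamped version key before overwriting
            |         const current = await env.MY_BUCKET.get(key);
            |         if (current !== null) {
            |           await env.MY_BUCKET.put(
            |             `versions/${key}/${Date.now()}`,
            |             await current.arrayBuffer());
            |         }
            |         await env.MY_BUCKET.put(key,
            |           await request.arrayBuffer());
            |         return new Response("ok");
            |       }
            |       const obj = await env.MY_BUCKET.get(key);
            |       return obj === null
            |         ? new Response("not found", { status: 404 })
            |         : new Response(obj.body);
            |     }
            |   };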
        
             | UltraSane wrote:
             | Interesting, thanks!
        
         | yawnxyz wrote:
         | I built a thin Cloudflare workers script for object versioning
         | and it works great
        
           | jw1224 wrote:
           | Is this something you'd consider sharing? I know many of us
           | would find it really useful!
        
       | jzelinskie wrote:
       | This is a great comparison and a great step towards pressure to
       | improve cloud service pricing.
       | 
       | The magic that moves the region sounds like a dealbreaker for any
       | use cases that aren't public, internet-facing. I use
       | $CLOUD_PROVIDER because I can be in the same regions as customers
       | and know the latency will (for the most part) remain consistent.
       | Has anyone measured latencies from R2 -> AWS/GCP/Azure regions
       | similar to this[0]?
       | 
        | Also, does anyone know if R2 supports the CAS operations that
        | so many people are hyped about right now?
       | 
       | [0]: https://www.cloudping.co/grid
        
         | xhkkffbf wrote:
         | This really is a good article. My only issue is that it
         | pretends that the only competition is between Cloudflare and
         | AWS. There are several other low rent storage providers that
         | offer an S3 compatible API. It's also worth looking at
         | Backblaze and Wasabi, for instance. But I don't want to take
         | anything away from this article.
        
       | jsheard wrote:
        | Is R2 egress _actually_ free, or is it like CF's CDN egress,
        | which is "free" until they arbitrarily decide you're using it
        | too much or using it for the wrong things, so now you have to
        | pay $undisclosed per GB?
        
         | shivasaxena wrote:
         | I would say don't run a casino on cloudflare
        
           | MortyWaves wrote:
           | I am also surprised that 4chan is using Cloudflare captcha
           | and bot protection
        
             | byyll wrote:
             | What is surprising about that? Cloudflare also provides
             | services to terrorists, CSAM websites and more.
        
               | telgareith wrote:
                | Nice job painting CF as the bad guy. They do NOT
                | provide services to such sites; again and again they
                | have terminated them for breach of ToS and cooperated
                | with the legal system.
        
           | dmd wrote:
           | Good to know. Please make an uncontroversial list of all the
           | human activities that you think shouldn't be allowed on
           | cloudflare (or perhaps in general). Then we can all agree to
           | abide by it, and human conflict will end!
        
             | troyvit wrote:
             | Cloudflare is a company, not a public utility. If they want
             | to disallow any sites that make fun of cuttlefish they get
             | to do that. If you want a CDN that follows the rules of a
             | public utility I think you're out of luck on this planet.
        
               | dmd wrote:
               | I agree with you. I'm saying that _cloudflare_ gets to
               | decide that, not a random HN commenter.
        
               | troyvit wrote:
               | Doh! Sorry I misunderstood you dmd
        
               | neom wrote:
                | In addition to this, if CF's, say, payment provider
                | hated people making fun of cuttlefish, it might make
                | sense for CF to ban marine mollusc mocking there also.
        
           | levifig wrote:
           | You can very well run a casino on Cloudflare:
           | 
           | - https://www.cloudflare.com/case-studies/softswiss/
           | 
           | - https://www.cloudflare.com/case-studies/wa-technology/
        
         | steelbrain wrote:
         | Do you have any examples of the latter? From what I remember
         | reading, the most recent case was a gambling website and
         | cloudflare wanted them to upgrade to a tier where they'd have
         | their own IPs. This makes sense because some countries blanket
         | ban gambling website IPs.
         | 
         | So apart from ToS abuse cases, do you know any other cases? I
         | ask as a genuine curiosity because I'm currently paying for
         | Cloudflare to host a bunch of our websites at work.
        
           | jsheard wrote:
           | Here's some anecdotes I dug up:
           | https://news.ycombinator.com/item?id=38960189
           | 
           | Put another way, if Cloudflare _really_ had free unlimited
           | CDN egress then every ultra-bandwidth-intensive service like
           | Imgur or Steam would use them, but they rarely do, because at
           | their scale they get shunted onto the secret real pricing
           | that often ends up being more expensive than something like
           | Fastly or Akamai. Those competitors would be out of business
           | if CF were really as cheap as they want you to think they
           | are.
           | 
           | The point where it stops being free seems to depend on a few
           | factors, obviously how much data you're moving is one, but
           | also the _type_ of data (1GB of images or other binary data
            | is considered more harshly than 1GB of HTML/JS/CSS) and
           | _where_ the data is served to (1GB of data served to
           | Australia or New Zealand is considered much more harshly than
            | 1GB to EU/NA). And how much the salesperson assigned to your
           | account thinks they can shake you down for, of course.
        
             | hiatus wrote:
             | Their terms specifically address video/images:
             | 
             | > Cloudflare's content delivery network (the "CDN") Service
             | can be used to cache and serve web pages and websites.
             | Unless you are an Enterprise customer, Cloudflare offers
             | specific Paid Services (e.g., the Developer Platform,
             | Images, and Stream) that you must use in order to serve
             | video and other large files via the CDN. Cloudflare
             | reserves the right to disable or limit your access to or
             | use of the CDN, or to limit your End Users' access to
             | certain of your resources through the CDN, if you use or
             | are suspected of using the CDN without such Paid Services
             | to serve video or a disproportionate percentage of
             | pictures, audio files, or other large files. We will use
             | reasonable efforts to provide you with notice of such
             | action.
             | 
             | https://www.cloudflare.com/service-specific-terms-
             | applicatio...
        
               | Aachen wrote:
                | I was going to say that it's odd, then, that reddit
                | doesn't serve all the posts' json via a free account
                | at cloudflare and save a ton of money, but maybe it's
                | actually just peanuts compared to the total costs? So
                | cloudflare is basically only happy to host the peanuts
                | for you to get you on their platform, but once you
                | want to serve things where CDNs (and especially "free"
                | bandwidth) really help, it stops being allowed?
        
           | Aperocky wrote:
           | I think the comment section of that story is a gold mine:
           | https://robindev.substack.com/p/cloudflare-took-down-our-
           | web.... Not necessarily authentic, but apply your own
           | judgement.
        
           | akira2501 wrote:
           | Their ToS enforcement seems weak and/or arbitrary. There are
           | a lot of scummy and criminal sites that use their services
           | without any issues it seems. At least they generally
           | cooperate with law enforcement when requested to do so but
           | they otherwise don't seem to notice on their own.
        
         | machinekob wrote:
          | Happened before, will happen again. CF is a publicly traded
          | company, and when the squeeze comes, they'll just tax your
          | egress as hard as Amazon.
        
         | tshaddox wrote:
         | It's not unreasonable for a service provider to describe their
         | service as "free" even though they will throttle or ban you for
         | excessive use.
        
       | breckognize wrote:
       | To measure performance the author looked at latency, but most S3
       | workloads are throughput oriented. The magic of S3 is that it's
       | cheap because it's built on spinning HDDs, which are slow and
       | unreliable individually, but when you have millions of them, you
       | can mask the tail and deliver multi TBs/sec of throughput.
       | 
        | It's misleading to look at S3 as a CDN. It's fine for that, but
        | its real strength is backing the world's data lakes and cloud
        | data warehouses. Those workloads have a lot of data that's often
        | cold, but S3 can deliver massive throughput when you need it. R2
        | can't do that, and as far as I can tell, isn't trying to.
       | 
       | Source: I used to work on S3
        
         | JoshTriplett wrote:
         | Yeah, I'd be interested in the bandwidth as well. Can R2
         | saturate 10/25/50 gigabit links? Can it do so with single
         | requests, or if not, how many parallel requests does that
         | require?
        
           | moralestapia wrote:
           | Yes, they absolutely can [1].
           | 
           | 1: https://blog.cloudflare.com/how-cloudflare-auto-mitigated-
           | wo...
        
             | fragmede wrote:
              | Cloudflare's paid DDoS protection product being able to
              | soak up insane L3/4 DDoS attacks doesn't answer the
              | question of whether the specific product - R2, which has
              | free egress - is able to saturate a pipe.
             | 
             | Cloudflare has the network to do that, but they charge
             | money to do so with their other offerings, so why would
             | they give that to you for free? R2 is not a CDN.
        
               | moralestapia wrote:
               | >Can do 3.8 Tbps
               | 
               | >Can't do 10 Gbps
               | 
               | k
        
               | fragmede wrote:
               | > can't read CDN
               | 
               | > Can't read R2
               | 
               | k
        
             | bananapub wrote:
             | that's completely unrelated. the way to soak up a ddos at
             | scale is just "have lots of peering and a fucking massive
             | amount of ingress".
             | 
             | neither of these tell you how fast you can serve static
             | data.
        
               | moralestapia wrote:
               | >that's completely unrelated
               | 
               | Yeah, I'm sure they use a completely different network
               | infrastructure to serve R2 requests.
        
             | JoshTriplett wrote:
             | That's unrelated to the performance of (for instance) the
             | R2 storage layer. All the bandwidth in the world won't help
             | you if you're blocked on storage. It isn't clear whether
              | the _overall_ performance of R2 is capable of saturating
              | user bandwidth, or whether it'll be blocked on something.
             | 
             | S3 can't saturate user bandwidth unless you make many
             | parallel requests. I'd be (pleasantly) surprised if R2 can.
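              | 
              | ("Many parallel requests" meaning, roughly, fanning out
              | byte-range GETs and reassembling - a sketch, assuming
              | the server honors Range:)
              | 
              |   async function parallelGet(url, size, parts = 16) {
              |     const chunk = Math.ceil(size / parts);
              |     const reqs = Array.from({ length: parts }, (_, i) => {
              |       const start = i * chunk;
              |       const end = Math.min(start + chunk, size) - 1;
              |       return fetch(url, {
              |         headers: { Range: `bytes=${start}-${end}` },
              |       }).then(r => r.arrayBuffer());
              |     });
              |     // chunks resolve in request order, so this is safe
              |     return new Blob(await Promise.all(reqs));
              |   }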
        
               | moralestapia wrote:
               | I'm confused, I assumed we were talking about the network
               | layer.
               | 
               | If we are talking about storage, well, SATA can't give
               | you more than ~5Gbps so I guess the answer is no? But
               | also no one else can do it, unless they're using super
               | exotic HDD tech (hint: they're not, it's actually the
               | opposite).
               | 
               | What a weird thing to argue about, btw, literally
               | everybody is running a network layer on top of storage
               | that lets you have _much_ higher throughput. When one
                | talks about R2/S3 throughput no one (in my circle, ofc.)
               | would think we are referring to the speed of their HDDs,
               | lmao. But it's nice to see this, it's always amusing to
               | stumble upon people with a wildly different point of view
               | on things.
        
               | renewiltord wrote:
               | No, most people aren't interested in subcomponent
               | performance, just in total performance. A trivial example
                | is that even a 4-striped U.2 NVMe disk array exported over
               | Ethernet can deliver a lot more data than 5 Gbps and
               | store mucho TiB.
        
               | moralestapia wrote:
               | Thanks for +1 what I just said. So, apparently, it's not
               | just me and my peers who think like this.
        
               | JoshTriplett wrote:
                | We're talking about the _user-visible_ behavior. You
                | argued that because Cloudflare's CDN has an obscene
                | amount of bandwidth, R2 will be able to saturate user
                | bandwidth; that doesn't follow, hence my counterpoint
                | that it could be bottlenecked on storage rather than
                | network. The question at hand is _what performance R2
                | offers_, and that hasn't been answered.
                | 
                | There are any number of ways they _could_ implement R2
                | that would allow it to run at full wire speed, but S3
                | _doesn't_ run at full wire speed by default (unless you
                | make many parallel requests) and I'd be surprised if R2
                | does.
        
               | aipatselarom wrote:
               | n = 1 aside.
               | 
               | I have some large files stored in R2 and a 50Gbps
               | interface to the world.
               | 
               | curl to Linode's speed test is ~200MB/sec.
               | 
               | curl to R2 is also ~200MB/sec.
               | 
                | I'm only getting 1Gbps, but given that Linode's speed is
                | pretty much the same, I would think the bottleneck is
                | somewhere else. Either way, R2 gives you at least 1Gbps.
        
         | michaelt wrote:
         | I mean, it may be true in _practice_ that most S3 workloads are
         | throughput oriented and unconcerned with latency.
         | 
         | But if you look at https://aws.amazon.com/s3/ it says things
         | like:
         | 
         | "Object storage built to retrieve any amount of data from
         | anywhere"
         | 
         | "any amount of data for virtually any use case"
         | 
         | "S3 delivers the resiliency, flexibility, latency, and
         | throughput, to ensure storage never limits performance"
         | 
         | So if S3 is not intended for low-latency applications, the
         | marketing team haven't gotten the message :)
        
           | troyvit wrote:
           | lol I think the only reason you're being downvoted is because
           | the common belief at HN is, "of course marketing is lying
           | and/or doesn't know what they're talking about."
           | 
           | Personally I think you have a point.
        
             | mikeshi42 wrote:
             | I didn't downvote but s3 does have low latency offerings
             | (express). Which has reasonable latency compared to EFS
             | iirc. I'd be shocked if it was as popular as the other
             | higher latency s3 tiers though.
        
         | vtuulos wrote:
         | yes, this. In case you are interested in seeing some numbers
         | backing this claim, see here
         | https://outerbounds.com/blog/metaflow-fast-data
         | 
         | Source: I used to work at Netflix, building systems that pull
         | TBs from S3 hourly
        
       | suryao wrote:
       | Great article. Do you have throughput comparisons? I've found r2
       | to be highly variable in throughput, especially with concurrent
       | downloads. s3 feels very consistent, but I haven't measured the
       | difference.
        
         | pier25 wrote:
         | I'm also interested in upload speeds.
         | 
         | I've seen complaints of users about R2 having erratic upload
         | speeds.
        
       | vlovich123 wrote:
       | Very good article and interesting read. I did want to clarify
       | some misconceptions I noted while reading (working from memory so
       | hopefully I don't get anything wrong myself).
       | 
       | > As explained here, Durable Objects are single threaded and thus
       | limited by nature in the throughput they can offer.
       | 
        | R2 bucket operations do not use single-threaded Durable
        | Objects; a one-off change was made just for R2 to let it run
        | multiple instances. That's why the limits were lifted in the
        | open beta.
       | 
        | > they mentioned that each zone's assets are sharded across
        | multiple R2 buckets to distribute load which may indicate that
        | a single R2 bucket was not able to handle the load for user-
        | facing traffic. Things may have improved since then, though.
       | 
       | I would not use this as general advice. Cache Reserve was
       | architected to serve an absurd amount of traffic that almost no
       | customer or application will see. If you're having that much
       | traffic I'd expect you to be an ENT customer working with their
       | solutions engineers to design your application.
       | 
        | > First, R2 is not 100% compatible with the S3 API. One notable
        | missing feature is data-integrity checks with SHA256 checksums.
       | 
        | This doesn't sound right. I distinctly remember when this was
        | implemented for uploading objects. SHA-1 and SHA-256 should be
        | supported (I don't remember about CRC). For some reason it's
        | missing from the docs though. The trailer version isn't
        | supported and likely won't be for a while for technical reasons
        | (the Workers platform doesn't support HTTP trailers as it uses
        | HTTP/1 internally). Overall compatibility should be pretty
        | decent.
       | 
        | The section on "The problem with cross-datacenter traffic"
        | seems to rest on flawed assumptions rather than data. Their own
        | graphs only show that while public buckets have some occasional
        | weird spikes, performance is pretty constantly the same, while
        | the S3 API has more spikiness and the time-of-day variability
        | is much more muted than the CPU variability. Same with the
        | assumption on bandwidth or other limitations of data centers.
        | The more likely explanation would be the S3 auth layer, and the
        | time-of-day variability experienced matches more closely with
        | how that layer works. I don't know enough of the particulars of
        | this author's zones to hypothesize, but the S3 auth layer was
        | always challenging from a perf perspective.
       | 
       | > This is really, really, really annoying. For example you know
       | that all your compute instances are in Paris, and you know that
       | Cloudflare has a big datacenter in Paris, so you want your bucket
       | to be in Paris, but you can't. If you are unlucky when creating
       | your bucket, it will be placed in Warsaw or some other place far
       | away and you will have huge latencies for every request.
       | 
        | I understand the frustration, but there are very good technical
        | and UX reasons this wasn't done. For example, while you may
        | think that "Paris datacenter" is well defined, it isn't for R2:
        | unlike S3, your metadata is stored regionally across multiple
        | data centers, whereas S3, if I recall correctly, uses what they
        | call a region, which is a single location broken up into
        | multiple availability zones - basically isolated power and
        | connectivity domains. This is an availability tradeoff - us-
        | east-1 will never go offline on Cloudflare because it just
        | doesn't exist - the location hint is the size of the
        | availability region. This is done at both the metadata and
        | storage layers too. The location hint should definitely be
        | followed when you create the bucket, but maybe there are bugs
        | or other issues.
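        | 
        | For what it's worth, the hint goes in at creation time; via the
        | HTTP API it's roughly this (a sketch from memory - the
        | locationHint field name, account ID, and token are things to
        | double-check, not gospel):
        | 
        |   const ACCOUNT_ID = "<account-id>";
        |   const API_TOKEN = "<api-token>";
        |   const url = "https://api.cloudflare.com/client/v4" +
        |     `/accounts/${ACCOUNT_ID}/r2/buckets`;
        |   // create a bucket with a Western Europe hint
        |   const resp = await fetch(url, {
        |     method: "POST",
        |     headers: {
        |       "Authorization": `Bearer ${API_TOKEN}`,
        |       "Content-Type": "application/json",
        |     },
        |     body: JSON.stringify({
        |       name: "my-bucket",
        |       locationHint: "weur", // Western Europe
        |     }),
        |   });
        |   console.log(await resp.json());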
       | 
       | As others noted throughput data would also have been interesting.
        
         | tecleandor wrote:
          | > First, R2 is not 100% compatible with the S3 API. One
          | notable missing feature is data-integrity checks with SHA256
          | checksums.
         | 
         | Maybe it was an old thing? The changelog [0] for 2023-06-16
         | says:
         | 
         | "S3 putObject now supports sha256 and sha1 checksums."
         | [0]: https://developers.cloudflare.com/r2/platform/changelog/#2
         | 023-06-16
        
           | vlovich123 wrote:
           | I suspect the author is going by the documentation rather
           | than having tried themselves
        
       | nickjj wrote:
        | One thing to think about with S3 is that there are use cases
        | where the price is very low, which the article didn't mention.
       | 
       | For example maybe you have ~500 GB of data across millions of
       | objects that has accumulated over 10 years. You don't even know
       | how many reads or writes you have on a monthly basis because your
       | S3 bill is $11 while your total AWS bill is orders of magnitude
       | more.
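        | (At S3 Standard's roughly $0.023/GB-month, 500 GB works out to
        | about $11.50/month before request charges, so the math checks
        | out.)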
       | 
       | If you're in a spot like this, moving to R2 to potentially save
       | $7 or whatever it ends up being would end up being a lot more
       | expensive from the engineering costs to do the move. Plus there's
       | old links that might be pointing to a public S3 object which
       | would break if you moved them to another location such as email
       | campaign links, etc..
        
         | philistine wrote:
         | Even simpler: I'm using Glacier Deep Archive for my personal
         | backups, and I don't see how R2 would be cheaper for me.
        
           | telgareith wrote:
           | Have you priced the retrieval cost? You quickly run into high
           | 3 and then 4 figures worth of bandwidth.
        
             | philistine wrote:
             | Retrieval? For an external backup? If I need to restore and
             | my local backup is completely down, it either means I lost
             | two drives (very unlikely) or the house is a calcinated
             | husk and at this point I'm insured.
             | 
             | And let's be honest. If the house burns down, the computers
             | are the third thing I get out of there after the wife and
             | the dog. My external backup is peace of mind, nothing more.
             | I don't ever expect to need it in my lifetime.
        
             | seized wrote:
             | Yes, but if it's your third location of 3-2-1 then it can
             | also make sense to weigh it against data recovery costs on
             | damaged hardware.
             | 
             | I backup to Glacier as well. For me to need to pull from it
             | (and pay that $90/TB or so) means I've lost more than two
             | drives in a historically very reliable RAIDZ2 pool, or lost
             | my NAS entirely.
             | 
             | I'll pay $90/TB over unknown $$$$ for a data recovery from
             | burned/flooded/fried/failed disks.
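              | 
              | (That ~$90/TB is mostly the ~$0.09/GB internet egress;
              | Deep Archive's bulk retrieval fee itself is only a few
              | dollars per TB.)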
        
             | kiwijamo wrote:
             | High 3 and 4 figures wouldn't occur for personal backups
              | though. I've done a big retrieval once and the cost was
              | literally just single-digit dollars for me. So the total
             | lifetime cost (including retrievals) is cheaper on S3 than
             | R2 for my personal backup use case. This is why I struggle
             | to take seriously any analysis that says S3 is expensive --
             | it is only expensive if you use the most expensive
              | (default) S3 product. S3 has more options to offer than
              | R2 or other competitors, which is why I stay with S3 and
              | pay <$1.00 a month for my entire backup. Most
             | competitors (including R2) would have me pay significantly
             | more than I spend on the appropriate S3 product.
        
           | Dylan16807 wrote:
           | I think the most reasonable way to analyze this puts non-
           | instant-access Glacier in a separate category from the rest
           | of S3. R2 doesn't beat it, but R2 is not a competitor in the
           | first place.
        
       | bassp wrote:
       | Only tangentially related to the article, but I've never
       | understood _how_ R2 offers 11 9s of durability. I trust that S3
       | offers 11 9s because Amazon has shown, publicly, that they care a
       | ton about designing reliable, fault tolerant, correct systems (eg
       | Shardstore and Shuttle)
       | 
       | Cloudflare's documentation just says "we offer 11 9s, same as
       | S3", and that's that. It's not that I don't believe them but...
       | how can a smaller organization make the same guarantees?
       | 
       | It implies to me that either Amazon is wasting a ton of money on
       | their reliability work (possible) or that cloudflare's 11 9s
       | guarantee comes with some asterisks.
        
         | rat9988 wrote:
          | What makes you think it cost AWS that much money at their
          | scale to achieve 11 9s, such that Cloudflare cannot afford it?
        
           | bassp wrote:
           | Minimally, the two examples I cited: Shardstore and Shuttle.
           | The former is a (lightweight) formally verified key value
           | store used by S3, and the latter is a model checker for
           | concurrent rust code.
           | 
           | Amazon has an entire automated reasoning group (researchers
           | who mostly work on formal methods) working specifically on
           | S3.
           | 
           | As far as I'm aware, nobody at cloudflare is doing similar
           | work for R2. If they are, they're certainly not publishing!
           | 
           | Money might not be the bottleneck for cloudflare though,
           | you're totally right
        
             | zild3d wrote:
              | S3 has touted 11 9's for many years, so definitely before
              | Shardstore.
             | 
             | The 11 9's is for durability, which is really more about
             | the redundancy setup, erasure coding, etc.
             | (https://cloud.google.com/blog/products/storage-data-
             | transfer...)
             | 
             | fwiw availability is 4 9's
             | (https://aws.amazon.com/s3/storage-classes/)
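              | 
              | For intuition, numbers like that fall out of the
              | erasure-coding math. A toy model (made-up parameters,
              | not either vendor's real ones): with k-of-n coding,
              | data is lost only if more than n - k shards fail
              | within one repair window.
              | 
              |   function lossProb(n: number, k: number, p: number) {
              |     const choose = (a: number, b: number): number =>
              |       b === 0 ? 1 : (a * choose(a - 1, b - 1)) / b;
              |     let total = 0;
              |     // sum P(exactly f of n shards fail) for fatal f
              |     for (let f = n - k + 1; f <= n; f++) {
              |       total += choose(n, f) *
              |         p ** f * (1 - p) ** (n - f);
              |     }
              |     return total;
              |   }
              |   // 16-of-20 coding, 0.1% shard loss per window:
              |   console.log(lossProb(20, 16, 0.001)); // ~1.5e-11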
        
               | bassp wrote:
               | That's a good point!
               | 
               | I think I overstated the case a little, I definitely
               | don't think automated reasoning is some "secret
               | reliability sauce" that nobody else can replicate; it
               | does give me more confidence that Amazon takes
               | reliability very seriously, and is less likely to ship a
               | terrible bug that messes up my data.
        
       | cube2222 wrote:
        | R2 and its pricing are quite fantastic.
       | 
       | We're using it to power the OpenTofu Provider&Modules
       | Registry[0][1] and it's honestly been nothing but a great
       | experience overall.
       | 
       | [0]: https://registry.opentofu.org
       | 
       | [1]: https://github.com/opentofu/registry
       | 
        | Disclaimer: CloudFlare did sponsor us with their business plan,
        | so we got access to higher-tier functionality.
        
       | deanCommie wrote:
       | The innovator's dilemma is really interesting.
       | 
        | Whenever a newcomer gets on the scene offering the same
       | thing as some entrenched leader only better, faster, and cheaper,
       | the standard response is "Yeah but it's less reliable. This may
       | be fine for startups but if you're
       | <enterprise|government|military|medical|etc>, you gotta stick
       | with the tried tested and true <leader>"
       | 
       | You see this in almost every discussion of Cloudflare, which
       | seems to be rapidly rebuilding a full cloud, in direct
       | competition with AWS specifically. (I guess it wants to be
       | evaluated as a fellow leader, not an also-ran like GCP/Azure
       | fighting for 2nd place)
       | 
        | The thing is, all the points are right. Cloudflare IS different
        | - using exclusively edge networks and tying everything to CDNs
        | is both a strength and a weakness. There are dozens of reasons
        | to be critical of them and dozens more to explain why you'd
        | trust AWS more.
       | 
        | But I can't help but think that surely the same happened (I
        | wasn't on here, or really tech-aware enough) when S3 and EC2 came
       | on the scene. I'm sure everyone said it was unreliable,
       | uncertain, and had dozens of reasons why people should stick with
       | (I can only presume - VMWare, IBM, Oracle, etc?)
       | 
       | This is all a shallow observation though.
       | 
        | Here's my real question, though: how does one go deeper and
        | evaluate what is real disruption and what is fluff? Does
        | Cloudflare have something unique and different that
        | demonstrates a new world for cloud services I can't even
        | imagine right now, as AWS did before it? Or does AWS have a
        | durable advantage and benefits that will allow it to keep being
        | #1 indefinitely? (GCP and Azure, as I see it, are trying to
        | compete on specific slices of merit. GCP is all-in on
        | 'portability', that's why they came up with Kubernetes to
        | devalue the idea of any one public cloud, and make workloads
        | cross-platform across all clouds and on-prem. Azure seems to be
        | competitive because of Microsoft's otherwise vertical
        | integration with business/windows/office, and now AI services).
       | 
       | Cloudflare is the only one that seems to show up over and over
       | again and say "hey you know that thing that you think is the best
       | cloud service? We made it cheaper, faster, and with nicer
       | developer experience." That feels really hard to ignore. But also
       | seems really easy to market only-semi-honestly by hand-waving
       | past the hard stuff at scale.
        
         | everfrustrated wrote:
          | Cloudflare's architecture is driven purely by its history of
          | being a CDN and trying to find new product lines to generate
          | new revenue streams to keep the share price up.
         | 
         | You wouldn't build a cloud from scratch in this way.
        
         | youngtaff wrote:
         | Maybe Cloudflare will even be profitable in the next year or
         | two...
        
       | orf wrote:
       | My experience: I put parquet files on R2, but HTTP Range requests
       | were failing. 50% of the time it would work, and 50% of the time
       | it would return _all_ of the content and not the subset
       | requested. That's a nightmare to debug, given that software
       | expects it to work consistently or not work at all.
       | 
       | Seems like a bug. Had to crawl through documentation to find out
       | the only support is on Discord (??), so I had to sign up.
       | 
        | Go through some more hoops and eventually get to a channel where
        | I received a prompt reply: it's not an R2 issue, it's "expected
        | behaviour" due to an issue with "the CDN service".
       | 
        | I mean, sure. On a technical level. But I shoved some data into
        | your service and basic standard HTTP semantics were
        | intermittently not respected: that's a bug in your service, even
        | if the root cause is another team.
       | 
        | None of this is documented anywhere, even if it is "expected".
        | Searching for "r2 http range" [1] shows I'm not the only one
        | surprised.
       | 
        | Not impressed, especially as R2 seems ideal for serving Parquet
        | data for small projects. This, plus the janky UI and weird
        | restrictions, makes the entire product feel distinctly half
        | finished and not a serious competitor.
       | 
       | 1.
       | https://www.google.com/search?q=r2+http+range&ie=UTF-8&oe=UT...
        
         | saurik wrote:
         | > given that software expects it to work consistently or not
         | work at all
         | 
         | I mean... that's wrong? If you come across such software, do
         | you at least file a bug?
        
           | orf wrote:
           | Of course not, and it's completely correct behaviour: if a
           | server advertises it supports Range requests for a given URL,
           | it's expected to _support_ it. Garbage in, garbage out.
           | 
           | It's not clear how you'd expect to handle a webserver trying
           | to send you 1Gb of data after you asked for a specific 10kb
           | range other than aborting.
        
             | saurik wrote:
             | "Conversely, a client MUST NOT assume that receiving an
             | Accept-Ranges field means that future range requests will
             | return partial responses. The content might change, the
             | server might only support range requests at certain times
             | or under certain conditions, or a different intermediary
             | might process the next request." -- RFC 9110
        
               | orf wrote:
               | Sure, but that's utterly useless in practice because
               | there is no way to handle that gracefully.
               | 
               | To be clear: most software does handle it, because it
               | detects this case and aborts.
               | 
               | But to a user who is explicitly asking to read a parquet
               | file without buffering the entire file into memory, there
               | is no distinction between a server that cannot handle any
               | range requests and a server that can occasionally handle
               | range requests.
               | 
               | Other than one being much, much more annoying.
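                | 
                | i.e. the defensive pattern ends up being something
                | like this (sketch):
                | 
                |   async function readRange(url, start, end) {
                |     const res = await fetch(url, {
                |       headers: { Range: `bytes=${start}-${end}` },
                |     });
                |     if (res.status !== 206) {
                |       // server ignored Range; bail out rather
                |       // than buffer the whole object
                |       await res.body?.cancel();
                |       throw new Error(
                |         `expected 206, got ${res.status}`);
                |     }
                |     return new Uint8Array(await res.arrayBuffer());
                |   }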
        
       | MobileVet wrote:
       | One thing that I haven't seen discussed in the comments is the
       | inherent vulnerability of S3 pricing. Like all things AWS, if
       | something goes sideways, you are suddenly on the wrong side of a
       | very large bill. For instance, someone can easily blow your
       | egress charges through the roof by making a massive amount of
       | requests for your assets hosted there.
       | 
       | While Cloudflare may reach out and say 'you should be on
       | enterprise' when that happens on R2, the fact they also handle
       | DDoS and similar attacks as part of their offering means the
       | likelihood of success is much lower (as is the final bill).
        
         | sroussey wrote:
          | Cloudflare has DDoS roots and it plays well here.
        
         | akira2501 wrote:
         | Typically you would use S3 with CloudFront for hosting. S3
         | provides no protections because it's meant to be a durable and
         | global service. CloudFront provides DDoS and other types of
         | protection while making it easy to get prepaid bandwidth
         | discounts.
        
           | danielheath wrote:
           | Just one data point, but adding Cloudflare to our stack (in
           | front of "CloudFront with bandwidth discounts") took about
           | $30k USD per year off our bandwidth bill.
        
         | lopkeny12ko wrote:
         | I'm not really sure what point you're trying to make here. S3
         | bills you on, essentially, serving files to your customers. So
         | yes if your customers download more files then you get charged
          | more. What exactly is the surprising part here?
        
           | zaptheimpaler wrote:
           | The surprise is any ne'er-do-well can DDoS your bucket even
           | if they aren't a customer. Genuine customer traffic volume
           | will probably be known and expected, but putting an S3 bucket
           | in the open is something like leaving a blank check on the
           | internet.
        
             | lopkeny12ko wrote:
              | It's a bit unfair to characterize that as a surprise
              | about how much S3 bills you, no? The surprising part
              | here is lack of
             | DDoS protection on your end or leaving a bucket public and
             | exposed. AWS is just charging you for how much it served,
             | it doesn't make sense to hold them to a fault here.
        
               | fnikacevic wrote:
                | AWS will also forgive mistake- or negligence-based
                | bills; in my case, 3 times.
        
               | bobthebutcher wrote:
               | If you want to hire someone to walk your dog you probably
               | won't put an ad in the New york times to a head hunter
               | that you will pay by the hour with no oversight and it
               | would be totally unfair to that head hunter when you
               | don't want to pay them for the time of all those
               | interviews. But an infinitely scalable service you
               | somehow can't put immediately terminal limits on is
               | somehow fine on the cloud.
        
               | bippihippi1 wrote:
               | it loses trust with customers when the simple setup is
               | flawed. S3 is rightly built to support as much egress as
               | any customer would want, but wrong to make it complex to
               | set up rules to limit the bandwidth and price.
               | 
               | It should be possible to use the service, especially
               | common ones like S3 with little knowledge of architecture
               | and stuff.
        
               | Dylan16807 wrote:
               | > The surprising part here is lack of DDoS protection on
               | your end or leaving a bucket public and exposed.
               | 
               | It doesn't take anything near DDoS. If you dare to put up
               | a website that serves images from S3, and one guy on one
               | normal connection decides to cause you problems, they can
               | pull down a hundred terabytes in a month.
               | 
                | Is serving images from S3 a crazy use case? Even if you
                | have signed and expiring URLs, it's hard to avoid
                | someone visiting your site every half hour and then
                | using the URL over and over.
               | 
               | > AWS is just charging you for how much it served, it
               | doesn't make sense to hold them to a fault here.
               | 
               | Even if it's not their fault, it's still an "inherent
               | vulnerability of S3 pricing". But since they charge so
               | much per byte with bad controls over it, I think it does
               | make sense to hold them to a good chunk of fault.
        
               | zaptheimpaler wrote:
               | I don't know about fair or unfair, but it's just a
               | problem you don't have to worry about if there's no
               | egress fees.
        
           | karmakaze wrote:
            | There was a backlash about being billed for unauthorized
            | requests. It's since been updated[0]. I don't know whether
            | all affected customers were retroactively refunded.
           | 
           | [0] https://aws.amazon.com/about-aws/whats-
           | new/2024/08/amazon-s3...
        
         | jonathannorris wrote:
         | Also, once you are on Enterprise, they will not bug/charge you
         | for contracted overages very often (like once a year) and will
         | forgive significant overages if you resolve them quickly, in my
         | experience.
        
       | viraptor wrote:
        | IAM gets only a tiny mention as not present, therefore making R2
        | simpler. But also... IAM is missing, and a lot of interesting
        | use cases are not possible there. No access by path, no 2FA
        | enforcement, no easy SSO management, no blast-radius limits -
        | just, would you like a token which can write a file, but also
        | delete everything? This is also annoying for their zone
        | management for the same reason.
        
       | pier25 wrote:
        | > _you can't choose the location of your R2 bucket!_
       | 
       | Yeah this is really annoying. That and replication to multiple
       | regions is the reason we're not using R2.
       | 
       | Global replication was a feature announced in 2021 but still
       | hasn't happened:
       | 
       | > _R2 will replicate data across multiple regions and support
       | jurisdictional restrictions, giving businesses the ability to
       | control where their data is stored to meet their local and global
       | needs._
       | 
       | https://www.cloudflare.com/press-releases/2021/cloudflare-an...
        
       | tlarkworthy wrote:
        | I've benchmarked R2 and S3, and S3 is well ahead in terms of
        | latency, _especially_ on ListObject requests. I think R2 has
        | some kind of concurrency limit, as concurrent ListObject
        | requests seem to have an increased failure rate when serving
        | simultaneous requests.
        | 
        | I have a few of the S3-likes wired up live over the internet;
        | you can try them yourself in your browser. Backblaze is
        | surprisingly performant, which I did not expect (S3 is still
        | king though).
       | 
       | https://observablehq.com/@tomlarkworthy/mps3-vendor-examples
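        | 
        | A probe of roughly this shape is enough to try it yourself
        | (sketch; endpoint and credentials are placeholders):
        | 
        |   import { S3Client, ListObjectsV2Command }
        |     from "@aws-sdk/client-s3";
        | 
        |   const s3 = new S3Client({
        |     region: "auto",
        |     endpoint: "<S3-COMPATIBLE-ENDPOINT>",
        |   });
        | 
        |   // fire N concurrent lists; count failures and latency
        |   const N = 16;
        |   const runs = await Promise.allSettled(
        |     Array.from({ length: N }, async () => {
        |       const t0 = Date.now();
        |       await s3.send(
        |         new ListObjectsV2Command({ Bucket: "test" }));
        |       return Date.now() - t0;
        |     }));
        |   console.log(
        |     runs.filter(r => r.status === "rejected").length,
        |     "of", N, "failed");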
        
       | postatic wrote:
       | I do mostly CRUD apps with Laravel and Vue. Nothing too
       | complicated. Allows users to post stuff with images and files.
        | I've moved ALL of my files from S3 to R2 in the past 2 years.
        | It's been slow, as any migration is, but painless.
       | 
       | But most importantly for an indie dev like me the cost became $0.
        
       | snihalani wrote:
       | >Generally, R2's user experience is way better and simpler than
       | S3. As always with AWS, you need 5 certifications and 3 months to
       | securely deploy a bucket.
       | 
       | +1
        
       | asteroidburger wrote:
       | "Here's a bunch of great things about CloudFlare R2 - and please
       | buy my book about it" leaves a bad taste in my mouth.
       | 
       | Also, has CF improved their stance around hosting hate groups?
       | They have strongly resisted pressure to stop hosting/supporting
       | hate sites like 8chan and Kiwifarms, and only stopped
       | reluctantly.
        
         | gjsman-1000 wrote:
         | I don't have to support 8chan or KiwiFarms to say that
         | Cloudflare has absolutely no role in policing the internet. The
         | job of policing the internet is for the police. If it's
         | illegal, let them investigate.
        
           | asteroidburger wrote:
           | There is a difference between policing the internet and
           | supplying resources and services to known bad actors.
           | 
           | Their job isn't to investigate and punish harassment and
           | criminal behavior, but they certainly don't have to condone
           | it via their support.
        
             | gjsman-1000 wrote:
             | > known bad actors
             | 
             | If they are known bad actors, let the police do the job of
             | policing the internet. Otherwise, all bad actors are
             | ultimately arbitrarily defined. _Who_ said they are known
             | bad actors? What does that even mean? _Why_ does that
             | person determining bad actors get their authority? Were
             | they duly elected? Or did one of hundreds of partisan NGOs
             | claim this? Who elected the NGO? Does PETA get a say on bad
             | actors?
             | 
             | Be careful what you wish for. In some US States, I am sure
             | the attorney general would send a letter saying to shut
             | down the marijuana dispensary - they're known bad actors,
             | after all. They might not win a lawsuit, but winning the
             | support of private organizations would be just as good.
             | 
             | > they certainly don't have to condone it via their support
             | 
             | Wow, what a great argument. Hacker News supports all
             | arguments here by tolerating people speaking and not
             | deleting everything they could possibly disagree with.
             | 
             | Or maybe, providing a service to someone, should not be
             | seen as condoning all possible uses of the service. Just
             | because water can be used to waterboard someone, doesn't
             | mean Walmart should be checking IDs for water purchasers.
             | Just because YouTube has information on how to pick locks,
             | does not mean YouTube should be restricted to adults over
             | 21 on a licensed list of people entrusted with lock-picking
             | knowledge.
        
       | karmakaze wrote:
       | At one company we were uploading videos to S3 and finding a lot
       | of errors or stalls in the process. That led to evaluating GCP
        | and Azure. I found that Azure had the most consistent (least
        | variance) upload durations and better pricing. We ended up
       | using GCP for other reasons like resumable uploads (IIRC). AWS
       | now supports appending to S3 objects which might have worked to
       | avoid upload stalls. CloudFront for us at the time was
       | overpriced.
        
       | kansi wrote:
       | I have tried to find a CDN provider which would offer access
       | control similar to Cloudfront's signed cookies but failed to find
       | something that would match it. This is a major drawback with
       | these providers offering S3 style bucket storage because most of
       | time you would want to serve the content from a CDN and
       | offloading access control to CDN via cookies makes life so much
       | easier. You only need to set the cookies for the user's session
       | once and they are automatically sent (by the web browser) to the
       | CDN with no additional work needed
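        | 
        | For reference, the CloudFront flow is basically: sign once per
        | session on the server, set the cookies, and the browser does
        | the rest (sketch using @aws-sdk/cloudfront-signer; the key pair
        | ID, private key, and domain are placeholders):
        | 
        |   import { getSignedCookies }
        |     from "@aws-sdk/cloudfront-signer";
        | 
        |   const cookies = getSignedCookies({
        |     url: "https://cdn.example.com/private/index.html",
        |     keyPairId: "<KEY_PAIR_ID>",
        |     privateKey: process.env.CF_PRIVATE_KEY!,
        |     dateLessThan:
        |       new Date(Date.now() + 3600_000).toISOString(),
        |   });
        |   // returns CloudFront-Expires, CloudFront-Signature and
        |   // CloudFront-Key-Pair-Id values to emit as Set-Cookie
        |   // headers; the browser then sends them with every CDN
        |   // request automatically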
        
       | denysvitali wrote:
        | No mention of Backblaze's B2? It's cheaper than these two at
        | just $6/TB.
        
       ___________________________________________________________________
       (page generated 2024-11-27 23:01 UTC)