[HN Gopher] Cloudflare R2 IA storage tier
___________________________________________________________________
Cloudflare R2 IA storage tier
Author : aofeisheng
Score : 139 points
Date : 2024-04-03 13:08 UTC (9 hours ago)
(HTM) web link (blog.cloudflare.com)
(TXT) w3m dump (blog.cloudflare.com)
| nitsky wrote:
| Is the "data processing fee" any different from an egress fee in
| practice? Seems a little deceptive.
| dannyw wrote:
| Yes. You can process it once to the standard tier, and egress
| as much as you want for free.
|
| The differences stack up for say, a 1GB video that becomes
| viral and triggers terabytes in egress. You pay for 1GB, not
| terabytes.
|
| It's also an optional tier.
| benterix wrote:
| > The differences stack up for say, a 1GB video that becomes
| viral and triggers terabytes in egress. You pay for 1GB, not
| terabytes.
|
| Under the condition that you actively monitor the usage and
| manage to "process it once" on time (and then "process it
| back"). Because otherwise you pay for terabytes - not in
| egress fees, but in processing fees. Or am I missing
| something?
| Guzba wrote:
| The whole point of IA is cheaper storage that is
| infrequently accessed, and there is a price to accessing
| it. If you need / want frequent access just use the regular
| storage class.
|
| All object stores out there have a flavor of IA class with
| an access fee that should be far lower than the storage
| class savings for scenarios where you would even consider
| using this. If you don't want or understand this cost
| optimization you simply don't use it.
| ericpauley wrote:
| Yes, because in a well-designed setup files that are frequently
| accessed would be restored to standard tier. Ideally you'd only
| pay the data processing fee once when files transition from
| infrequently accessed to frequently accessed. There's a
| breakeven point at a data access rate of once every two months.
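|
| A rough sketch of that breakeven, assuming R2 list prices of
| $0.015/GB-month for Standard, $0.01/GB-month for Infrequent
| Access, and a $0.01/GB retrieval fee. These figures are
| assumptions (only the 1c IA rate appears in this thread), and
| request fees and minimum storage durations are ignored:

```python
# Assumed list prices in USD; adjust to the current pricing page.
STANDARD_PER_GB_MONTH = 0.015
IA_PER_GB_MONTH = 0.010
IA_RETRIEVAL_PER_GB = 0.010


def monthly_cost_per_gb(reads_per_month: float, infrequent_access: bool) -> float:
    """Storage plus retrieval cost for one GB over one month."""
    if infrequent_access:
        return IA_PER_GB_MONTH + reads_per_month * IA_RETRIEVAL_PER_GB
    return STANDARD_PER_GB_MONTH


def breakeven_reads_per_month() -> float:
    """Read rate at which IA and Standard cost the same per GB."""
    return (STANDARD_PER_GB_MONTH - IA_PER_GB_MONTH) / IA_RETRIEVAL_PER_GB


if __name__ == "__main__":
    print(f"breakeven: {breakeven_reads_per_month():.2f} full reads per month")
    for reads in (0.1, 0.5, 1.0):
        ia = monthly_cost_per_gb(reads, infrequent_access=True)
        std = monthly_cost_per_gb(reads, infrequent_access=False)
        print(f"{reads:.1f} reads/month: IA ${ia:.4f} vs Standard ${std:.4f}")
```

| Under those assumptions the two classes cost the same at half a
| full read per month, i.e. one read every two months.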
| CharlesW wrote:
| Maybe the cold-to-hot migration "tax" is partially to prevent
| abuse?
|
| > _" Data retrieval is charged per GB when data in the
| Infrequent Access storage class is retrieved and is what allows
| us to provide storage at a lower price. It reflects the
| additional computational resources required to fetch data from
| underlying storage optimized for less frequent access."_
|
| I like the "automatic storage classes" idea as well.
|
| > _" ...you can define an object lifecycle policy to move data
| to Infrequent Access after a period of time goes by and you no
| longer need to access your data as often. In the future, we
| plan to automatically optimize storage classes for data so you
| can avoid manually creating rules and better adapt to changing
| data access patterns."_
| dmw_ng wrote:
| AWS already give you intelligent tiering for this, it's a
| very nice product but it's also just a nice way of hiding the
| same fees. Your $0.004/GB becomes $0.023/GB on first read for
| 1 month then $0.0125/GB for 2 months, so the average cost of
| storing it over those 3 months becomes $0.016/GB, and that's
| before considering monitoring fees
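|
| The averaging above can be sketched as a time-weighted mean,
| using only the per-GB-month rates quoted in the comment. The
| function name is illustrative, and monitoring fees are left
| out, as in the comment:

```python
# Per-GB-month rates quoted in the comment above (USD).
FREQUENT = 0.023
INFREQUENT = 0.0125
ARCHIVE_INSTANT = 0.004


def average_rate(schedule):
    """Time-weighted average of (months, rate) pairs."""
    total_months = sum(months for months, _ in schedule)
    return sum(months * rate for months, rate in schedule) / total_months


# One read bumps the object to the frequent tier for a month,
# then it spends two months in the infrequent tier.
after_one_read = [(1, FREQUENT), (2, INFREQUENT)]
avg = average_rate(after_one_read)
print(f"average over 3 months: ${avg:.4f}/GB-month")
print(f"ratio to the archive rate: {avg / ARCHIVE_INSTANT:.1f}x")
```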
| sborsje wrote:
| You could also implement tiering yourself, depending on
| your workload of course. If you know you're storing objects
| for long-term archival reasons (or backups), you could opt
| for using S3 Glacier Instant Retrieval at $0.004/GB.
| dmw_ng wrote:
| At least magnetic disks are iops constrained, lower iops loads
| conceivably allow higher density, or packing different load
| patterns to the same devices. Say an 8 TB / 100 iops disk
| reserves 90 iops for a 1 TB database service, that's 87% of
| the disk's capacity sitting free but only 10 iops to serve it
| with. Adding what is effectively an iops tax to discourage
| frequent reads is one way to make a mixture like this work (or
| another way to think of it - subtracting an iops discount)
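|
| The contrived disk above, worked through as a small sketch.
| The function and its parameter names are hypothetical; the
| numbers are the ones from the example:

```python
def leftover(capacity_tb, disk_iops, reserved_tb, reserved_iops):
    """Return (free capacity fraction, free iops, free iops per free TB)."""
    free_tb = capacity_tb - reserved_tb
    free_iops = disk_iops - reserved_iops
    return free_tb / capacity_tb, free_iops, free_iops / free_tb


# 8 TB / 100 iops disk, with 1 TB and 90 iops reserved for a database.
free_fraction, free_iops, iops_per_tb = leftover(8, 100, 1, 90)
print(f"free capacity: {free_fraction:.1%}")
print(f"free iops: {free_iops}")
print(f"iops per free TB: {iops_per_tb:.2f}")
```

| Roughly 1.4 iops per free TB is only workable if whatever fills
| that space is read rarely, which is what a per-read fee
| encourages.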
|
| Obviously example above is contrived, but same principle
| applies to a pool of 1000 disks as it would 1. You also don't
| escape this issue with regular hot storage either, there is
| still a (((iops * replication count) / average traffic) / max
| latency) type problem lurking, which would still necessitate
| either limiting density or increasing redundancy according to
| expected IO rate. This is one reason why some S3 alternatives
| with weaker latency bounds (not naming names, they're great but
| it's just not the same service) can often be made substantially
| cheaper, and why at least one of S3's storage classes may be
| implemented entirely as an accounting trick with no data
| movement or hardware changes at all
| rc_mob wrote:
| been waiting for that event triggering for a loooong time. I'll
| give it a go
| aftbit wrote:
| How is data stored in this tier? Is it just on big slow SMR
| disks?
| tills13 wrote:
| Anecdotally, we have found R2 to be nearly half as slow to
| respond as the same request to S3 _proxied through Cloudflare_.
|
| So... something isn't right here. Maybe a mechanical turk where
| a live human is fetching the object using Windows Explorer
| behind the scenes?
| djhn wrote:
| Any chance you could convert 'nearly half as slow' to a
| percentage of the original response time? This reads like a
| reading comprehension puzzle and I don't even know if my
| hunch is correct.
| SahAssar wrote:
| Did you mean "half as fast"? The way you worded it sounds
| like you mean to say that s3 is faster, but it says that r2
| is twice as fast as s3.
| Lucasoato wrote:
| Does anybody know which consistency model Cloudflare offers
| compared to AWS S3?
| mattdeboard wrote:
| Here are Cloudflare's docs on R2 consistency model
| https://developers.cloudflare.com/r2/reference/consistency/
| gavinsyancey wrote:
| Backblaze B2 is cheaper than Cloudflare R2 IA. Hmmm
| iscoelho wrote:
| This is expected. Like cloud providers, Cloudflare is
| intentionally not aggressive on pricing as it is a race to the
| bottom.
|
| There are other ways to compete.
| Guzba wrote:
| B2 can be great but it is missing a lot of features when
| compared to other object stores so it isn't a good solution for
| every scenario.
|
| As an example I investigated, to put a custom domain in front
| of a B2 bucket they suggest using Cloudflare and CNAME-ing a
| bucket subdomain (eg f000.backblazeb2.com)
| https://www.backblaze.com/docs/cloud-storage-deliver-public-...
|
| Well if f000.backblazeb2.com is used for any other people's
| buckets too, which appears to be the case, I guess I am now
| able to serve other people's files from my domain? This seems
| terrible.
| jdmarble wrote:
| I'm not sure I understand all of the nuances here (I'm no
| webmaster), but this is covered in the documentation you
| linked:
|
| > You must configure page rules to allow Cloudflare to fetch
| only your Backblaze B2 bucket from your domain. ...
| Otherwise, someone could use your domain to fetch content
| from another customer's public bucket. To ensure this does
| not happen, Cloudflare lets you use page rules to scope
| requests to your bucket.
| Guzba wrote:
| The example shows leaving your bucket name in the url as a
| way to filter out requests to other bucket names. If you
| want your static site to have
| http://mysite.com/bucketname/index.html then I guess that's
| ok. But again, careful configuration and still not for
| every situation.
|
| I'm sure you can layer more rules to get it exactly right
| but I'd not be eager to layer on complex configuration
| through multiple service providers when it is avoidable,
| unless there is some very compelling overriding reason.
| gfs wrote:
| As far as I know, bucket names must be unique at other
| providers like AWS as well. [0]
|
| I'm no expert but to try and protect my own domain, I use
| a transform rule to match a subdomain and append
| "/file/$MY_BUCKET_NAME" to each request. This should
| return a 404 for anybody who tries to inject their own
| bucket in the path. I could be wrong of course.
|
| [0]: https://docs.aws.amazon.com/AmazonS3/latest/userguid
| e/bucket...
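|
| A minimal sketch of the rewrite described above, in plain
| Python rather than Cloudflare's rule syntax. The bucket name,
| the function, and files.example.com are hypothetical; the
| origin host is the example from upthread:

```python
from urllib.parse import urlsplit, urlunsplit

BUCKET = "my-bucket"  # hypothetical bucket name
ORIGIN_HOST = "f000.backblazeb2.com"  # example host from the thread


def rewrite_to_bucket(url: str) -> str:
    """Prefix the request path with /file/<bucket> and point it at the origin."""
    parts = urlsplit(url)
    path = f"/file/{BUCKET}{parts.path}"
    return urlunsplit(("https", ORIGIN_HOST, path, parts.query, ""))


# A normal request maps into the bucket.
print(rewrite_to_bucket("https://files.example.com/img/a.png"))
# An attempt to name another bucket just becomes a key inside my-bucket.
print(rewrite_to_bucket("https://files.example.com/file/other-bucket/x.png"))
```

| Because the prefix is always added, a path that names another
| bucket is looked up as an object key inside the configured
| bucket, which should not exist.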
| eropple wrote:
| Bolting a CloudFront distribution onto an S3 bucket is
| pretty well-trod territory, though, and doesn't have
| these sharp edges. (Has a couple other ones, but they're
| less common.)
| johnmaguire wrote:
| This is an easily solved problem. Backblaze has an
| example here:
| https://github.com/backblaze-b2-samples/cloudflare-b2
| Guzba wrote:
| Does the solution involve using Cloudflare workers?
| Because, as I said, I'm sure it is possible but maybe
| we've gone off the deep end a bit. Just how crazy of a
| configuration do you want just to serve files from an
| object store?
|
| This looks like an awful lot of setup for "easily
| solved". Easily solved is what S3 does where this isn't
| even a problem.
| silverlyra wrote:
| curl -sL https://github.com/backblaze-b2-samples/cloudflare-b2/raw/main/README.md | head -n 1
| # Cloudflare Worker for Backblaze B2
|
| yes, it's a worker
| nchmy wrote:
| and iDrive e2 is even cheaper.
| https://www.idrive.com/s3-storage-e2/
| jjtheblunt wrote:
| Isn't iDrive a BMW trademark for decades?
|
| https://en.wikipedia.org/wiki/BMW_iDrive
| sp332 wrote:
| Trademarks only apply to specific classes of things.
| https://www.uspto.gov/trademarks/basics/goods-and-services
| jjtheblunt wrote:
| That's useful: thanks.
| jjtheblunt wrote:
| Downvoted for a question? How laughably degenerate.
| asmor wrote:
| iDrive is slow and charges extremely high fees for usage
| above provisioned. They also have a history of increasing
| prices for individual contracts only a few weeks before
| renewal, so you can't possibly have enough time to move your
| data.
|
| Hetzner Storage Boxes (2.50-3 EUR per TB) is probably the
| sweet spot. B2 if you need an object storage API.
| vineyardmike wrote:
| Do storage boxes have any availability risk? "Storage box"
| sounds like you have an actual VM with an attached spinning
| disk, which doesn't seem tolerant to hardware failures. I
| couldn't find any details on their website about this.
| asmor wrote:
| They're on triple redundancy Ceph as far as I know, not
| geo redundant, but at that price you can buy one in
| Germany and one in Finland and still come out cheaper
| than B2.
|
| You also only get a very locked down shell.
| ashconnor wrote:
| Rounded to the nearest TB rather than metered.
| ac29 wrote:
| Backblaze is unprofitable and publicly traded, a combination
| which can't last forever. They raised B2 prices 20% last year, I
| wouldn't be surprised to see more increases if they continue to
| burn through cash.
| internetter wrote:
| > Backblaze is unprofitable and publicly traded
|
| So is Cloudflare
| sophacles wrote:
| The amount of profit isn't always the most important number
| anyway. A lot of companies choose to not be profitable
| while they can spend their bank account growing their
| business (Cloudflare is one, and Backblaze may be another
| but I have no idea about their finances, historically
| Amazon and Salesforce both did this too).
|
| If qoq and yoy revenue keeps going up, and cost of revenue
| stays the same or decreases (as a percentage) in the same
| time period, it makes sense to spend the bank account on
| growth. If the growth stops, that's when you start cutting
| expenses like R&D and operations to get the profit.
| Reasoning being: getting x% of a bigger revenue is better
| than getting x% of a smaller revenue.
| everforward wrote:
| Cloudflare practically has a stranglehold on the modern
| Internet. I would bet money they would be immediately
| profitable if they killed their free tier. There aren't
| that many competitors, and I don't know if they've got the
| spare capacity to absorb all the exodus.
|
| I'm not so sure about Backblaze. I don't even think they're
| the biggest player in that space (AWS is, I would guess). I
| would guess most people could migrate off if Backblaze
| turned south.
| no_wizard wrote:
| if they simply streamlined their sales pipeline and
| created a mid tier (somewhere between $500-$2500 a month
| for example) that unlocks some of the features that are
| behind "contact sales" banner they could boost revenue
| without changing any existing tiers, I'd wager.
|
| I think the platform has a ton of potential and it
| already shows signs of real progress, but much like
| fly.io, its rough edges are incredibly rough.
| tootie wrote:
| I don't work in enterprise sales or nothing, but it seems
| to me that businesses whose only price tag is "call us"
| are the ones with the most revenue. Transparent pricing
| is great for SMBs, but the big bucks are in making
| yourself entrenched in giant enterprises.
| iancarroll wrote:
| This exists but isn't really documented, we pay ~$1k/mo
| for a "light" enterprise version of Cloudflare.
| tootie wrote:
| I don't see how Cloudflare has a stranglehold. They've
| captured the bottom of the market by having low, low
| introductory prices and turnkey security. They have tons
| of huge and small competitors though. They have far less
| revenue than Akamai.
| ndiddy wrote:
| I'm not sure about Cloudflare. Their free tier vastly
| reduces the barrier to entry for setting up a DDoS as a
| service business (without it you'd need to have very
| expensive hardware for circumventing DDoS attacks, as
| otherwise you'd get DDoS'd by your competitors). This in
| turn increases the demand for Cloudflare's services to
| protect against DDoS attacks.
| everforward wrote:
| I'm not saying they're good for the internet, just that
| if I was going to make a bet on which one is more likely
| to survive a decade, it'd be Cloudflare.
| mattsan wrote:
| Cloudflare is profitable. Whales subsidise retail.
| kjkjadksj wrote:
| I can't get over the fact how storage is still so expensive in
| 2024. Lowest you can get is probably $5 per tb a month from any
| of these companies. A new tb hdd is probably $25 today. Where
| does the money go, into c suite car payments?
| fire_lake wrote:
| It's availability that's expensive.
| disillusioned wrote:
| Availability, durability, etc.
| Hamuko wrote:
| All those nines cost extra.
| godelski wrote:
| I really believe there's a missing market. I think there
| would be big business in building servers for homes. Where
| you prioritize low power usage and low noise, and do not need
| blazing speeds or frequent/high access. Cloud is great, but
| some things should still be stored at home. Home NAS systems
| are a bit odd and difficult to expand (nowhere near what a
| rack is).
|
| My argument would be that this would be helpful with the high
| adoption of things like Ring doorbells and other camera
| systems at home. Where people can store their own data and
| provides better security & privacy given you need not rely on
| a data connection to store that footage. It also would be
| extremely worthwhile if we are to see personalized LLMs
| become common and tools like home assistant. You wouldn't
| want that running off-site. In fact, I'd rather call home
| from my mobile LLM than call FAANG (or anyone else with
| teeth).
|
| I just think buying used servers on ebay or trying to throw
| together a home rig is harder than it needs to be. I'm
| confident the demand exists but it is unfortunately a field
| of dreams scenario. Many people will not know they want it
| until it exists (I can say my parents would love this but
| they don't understand the first thing about technology so all
| they can do is complain about Google/Apple having all their
| data rather than express how they want to store their own).
| carlhjerpe wrote:
| The problem is tech illiteracy and CGNAT.
|
| The product must be "a router" so people can access it
| outside of home. Or it doesn't have to, but then you'll
| have to proxy traffic through you and charge for it.
|
| And your "router server" must have a decent AP, because the
| likelihood people know how to bridge their "routeraps" is
| pretty low.
|
| IPv6 would help for sure, but there's still "allow 443 to
| this box", static registrations.
|
| This is before even building the product
| godelski wrote:
| I don't think you're exactly wrong, but I think it is too
| narrow.
|
| You're perfectly right that there is far too little tech
| literacy. Even with the example of my parents. But
| they're an example of someone who I think would
| especially benefit from this. Because they wouldn't get
| it out of their own desire, but because I, their child,
| would install it for them. Because I don't want to build
| and piece together everything. Because I'm used to the
| general tech support of them calling me up, and having to
| figure out literally everything on the fly because the
| only time I touch a Windows system is my yearly Christmas
| visit.
|
| I've run a NAS in their home before and the reason they
| stopped is because they got a new router and "it broke."
| Prior to that I was able to ssh into their network
| because I had a pi laying around.
|
| But the problem you specify is not the problem you think
| it is. It is UI/UX. Many of these things can be set up
| automatically. The reason PGP is a disaster is because
| it's cumbersome to use. Google making it default and not
| having to think about it solved that. Signal, iMessage,
| and WhatsApp made encryption trivial for people who
| wouldn't have done it before because "it is too hard."
| I'm unconvinced this is anything different. Where if you
| take a family member only basically tech literate, can
| help them do the initial setup, and away you go. You just
| have to make it as easy as WhatsApp (or even a lot less),
| and I believe you could.
|
| I say this as someone who is a researcher and does a lot
| of backend programming. I know we give UI/UX people a lot
| of shit (and quite often they do deserve it. There are a
| lot of annoying useless changes), but they do also play a
| huge role in making technology accessible. Really, that
| is their main role. And truth be told, the environment
| has dramatically changed where nowadays there's many
| custom distros that make things easier and even these
| days my Grandma can use Linux. There's definitely a
| hardware and backend problem here, but I'm actually
| convinced the biggest issue is design. Which, let's be
| real, is what made computers prolific in the first place.
| Springtime wrote:
| Edit: misunderstood your premise. You meant bespoke solely
| single household servers at home. Like homelabs but without
| the hassle.
|
| Wuala[1][2] did something similar more than a decade ago,
| in that users become distributed storage for other users
| which made the service free for those participating
| (otherwise was a paid subscription). They were then
| acquired and stopped their most unique feature before
| closing for good.
|
| [1] https://en.wikipedia.org/wiki/Wuala
|
| [2] https://arstechnica.com/uncategorized/2008/08/first-
| look-wua...
| aseipp wrote:
| Well, spinning rust HDDs stuck in your server have no actual
| parallelism and aren't highly available, replicated, etc.
|
| How much do you think 1TB of storage should cost?
| sophacles wrote:
| You're comparing apples and oranges here....
|
| A $25/TB drive is not the only expense that $5 goes towards:
|
| * there's actually probably 2 or more HDs holding that TB,
| since the business is promising that the data won't be lost
|
| * there's the computer(s) that hold that HD.
|
| * there's the electricity, bandwidth and space rental costs for
| those computers
|
| * there's the cost of employees to make sure that the
| computers keep running.
|
| * there's the cost of the marketing so that you know that the
| service is available
|
| * there's all the book-keeping, taxes, cc fees, etc that need
| to be paid on the recurring charge
|
| * there's (hopefully) profit for the investors/owners
|
| and so on.
|
| Also, on your side you should consider several of those
| factors yourself to do the comparison:
|
| * how much do you consider the time spent managing your hdds
| to be worth? (if you're a business this is employee-hrs, if
| you're talking about for yourself privately, there's still a
| value you should attach to your own time)
|
| * do you have backups? If so, what does it cost to put them
| offsite? (In terms of space rental or favors traded, and your
| time)
|
| * electricity, etc
|
| * how much is it going to cost you to learn to reliably store
| your data (in terms of up-front cost, time spent, etc)
|
| * and of course hard drive costs
| RantyDave wrote:
| Backblaze have written many blog posts on how they go from a
| few thousand hard drives to a business. For me, the most
| interesting part was they went from six generations of self
| designed storage pods to "fuck it, just buy Dell". Long story
| short: these businesses are surprisingly complicated.
|
| https://www.backblaze.com/blog/next-backblaze-storage-pod/
| smileybarry wrote:
| IIRC Backblaze B2 charges for egress, while Cloudflare does
| not.
| vineyardmike wrote:
| They do not charge for egress either.
|
| https://www.backblaze.com/cloud-storage/landing/ad/use-
| cases...
| rattt wrote:
| *Up to 3x of average monthly data stored, then $0.01/GB for
| additional egress.
|
| Or free if you go through Cloudflare since they have the
| bandwidth alliance.
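|
| The free allowance described above can be sketched like this.
| The function and its parameter names are hypothetical; the 3x
| multiple and the $0.01/GB overage are the figures quoted in
| the comment:

```python
def b2_egress_cost(avg_stored_gb, egress_gb,
                   free_multiple=3.0, overage_per_gb=0.01):
    """Egress charge in USD: free up to a multiple of average stored data."""
    free_gb = free_multiple * avg_stored_gb
    billable_gb = max(0.0, egress_gb - free_gb)
    return billable_gb * overage_per_gb


# 1 TB stored on average: 2.5 TB of egress is inside the allowance,
# 5 TB is 2 TB over it.
print(f"${b2_egress_cost(1000, 2500):.2f}")
print(f"${b2_egress_cost(1000, 5000):.2f}")
```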
| historynops wrote:
| A fairly unrelated point, but it's so strange how companies that
| underpin a lot of the internet struggle in the stock market.
| While we all wish we had sold our tech stocks in 2021, Cloudflare
| still hasn't recovered.
| Guzba wrote:
| I believe Cloudflare (and many other cos like it) have never
| produced operating income. They are growing and obviously
| important and potentially very profitable in the future, but
| when discount rates are much higher and you add in some
| uncertainty, one could argue they don't look as hot as they
| used to.
| simfree wrote:
| Cloudflare has a very dysfunctional sales pipeline. Their free,
| premium and self-serve offerings might underpin the internet,
| but the highly profitable offerings that are gated behind their
| sales teams are not getting sold to many of the clients that
| they should be selling to.
|
| Magic Transit (bring your own ASN), classic website DDoS
| protection (above the Business $200 tier, which has low,
| undisclosed data limits in regions like New Zealand) and ilk
| all require interacting with the sales rep, and unless you're
| paying 5 figures a month they are disinterested.
|
| There is a whole market out there between $300 to $2000 a month
| that Cloudflare could tap without making new infrastructure but
| is actively being ignored.
| alphabettsy wrote:
| This.
|
| They lock a lot of features behind an Enterprise plan where
| they could allow them to be added to a lower plan.
|
| In general, I just hate working with sales reps and would
| rather avoid a company altogether if I can't sign up without
| talking to them.
| lijok wrote:
| Hit the nail on the head.
|
| Wanted to buy their SASE DLP & Remote Browser Isolation as a
| startup. Sales wouldn't even talk to us
| teruakohatu wrote:
| > undisclosed data limits in regions like New Zealand
|
| Can you please explain what this means?
| no_wizard wrote:
| Not to mention they have on multiple occasions made
| significant internal changes (including layoffs) to their
| sales organization. I have a feeling if the public were to
| get an introspection into their sales pipeline it would be
| eye opening, and not in a good way
| kjkjadksj wrote:
| It is bizarre. All the old guard foundations of society type
| companies that the world relies on for modernity have stocks
| that barely budge but pay out decent dividends. Maybe tech
| stocks that have grown to such a position should consider
| paying out dividends instead of failing to chase exponential
| stock price growth while still clearly doing a lot of
| productive things. I expect the shareholder boards prefer the
| chance of exponential wealth over steady returns and prevent
| this mindset from emerging.
| tschellenbach wrote:
| does anyone know which products of cloudflare have the most
| revenue?
| rc_mob wrote:
| probably the CFO of Cloudflare inc knows
| ricopags wrote:
| We're onboarding to Cloudflare Magic WAN and want to use them for
| logging, which they do to 's3-compliant' buckets.... on Google or
| Amazon.
|
| I was pretty surprised at the lack of dogfooding, wondered if
| it's an oversight, on somebody's Gantt, or just not something R2
| can handle for some reason.
| viraptor wrote:
| Yeah, the integration and production readiness of their non-
| core offerings is not perfect. I'm dealing with R2 and another
| service and you can tell they feel more like... specifically
| integrated features, rather than fully modular services you can
| choose to use as you want. Like the workers have possible R2
| bindings, but you can't use those in a fetch() call - you have
| to use S3 compatible endpoint instead.
|
| AWS has its own issues, but the push to have everything talking
| over API did wonders for the ability to use them as you want.
| jakubadamw wrote:
| > Like the workers have possible R2 bindings, but you can't
| use those in a fetch() call - you have to use S3 compatible
| endpoint instead.
|
| Sorry, could you please elaborate? Why can you not use a
| binding to an R2 bucket - and perform operations on its
| objects - in a `fetch()` handler of a worker? Or did I
| misunderstand this statement?
| thrixton wrote:
| So pricing is 1c / GB-month, compared to S3 IA at 1.25c / GB-
| month, a decent saving but not massive, no archive or deep
| archive options though, I wonder if / when these will come.
|
| What sort of negotiated rates can you get from AWS for bandwidth
| I wonder, at the moment, that seems like the only real benefit
| from CF I think.
___________________________________________________________________
(page generated 2024-04-03 23:01 UTC)