[HN Gopher] Backblaze Drive Stats for Q1 2024
___________________________________________________________________
Backblaze Drive Stats for Q1 2024
Author : TangerineDream
Score : 211 points
Date : 2024-05-02 13:25 UTC (9 hours ago)
(HTM) web link (www.backblaze.com)
(TXT) w3m dump (www.backblaze.com)
| GGO wrote:
| I buy hard drives based on these reports. Thank you Backblaze.
| Scene_Cast2 wrote:
| Where do you buy your drives? Last time I was in the market, I
| couldn't find a reputable seller selling the exact models in
| the report. I'm afraid that the less reputable sellers (random
| 3rd party sellers on Amazon) are selling refurbished drives.
|
| I ended up buying a similar-sounding but not identical model
| from CDW.
| cm2187 wrote:
| In Europe, lambda tek is my go-to for enterprise hardware as a
| retail customer.
| secabeen wrote:
| These are useful data points, but I've found that at my risk
| tolerance level, I get a lot more TB/$ buying refurbished
| drives. Amazon has a couple of sellers that specialize in
| server pulls from datacenters; even after 3 years of minimal
| use, the vendors provide 5 years of additional warranty to
| you.
| pronoiac wrote:
| Buying refurbished also makes it much easier to avoid
| having the same brand/model/batch/uptime, for firmware and
| hardware issues. I do carefully test for bad sectors and
| verify capacity, just in case.
| WarOnPrivacy wrote:
| > even after 3 years of minimal use, the vendors provide 5
| years of additional warranty to you.
|
| The Amazon refurb drives (in this class) typically come
| with 40k-43k hours of data center use. Generally they're
| well used for 4.5-5 yrs. Price is ~30% of new.
|
| I think refurb DC drives have their place (replaceable
| data). I've bought them - but I followed other buyers'
| steps to maximize my odds.
|
| I chose my HGST model carefully, put it through an
| intensive 24h test, and checked SMART stats afterward.
|
| As far as the 5yr warranty goes, it's from the seller and
| they don't all stick around for 5 years. But they are
| around for a while -> heavily test that drive after purchase.
| dehrmann wrote:
| I think you're better off buying used and using the savings
| for either mirroring or off-site backup. I'd take two
| mirrored used drives from different vendors over one new
| drive any day.
| ethbr1 wrote:
| There was a Backblaze report a while ago that said,
| essentially, that most individual drives are either
| immediate lemons or run to warranty.
|
| If you buy used, you're avoiding the first form of
| failure.
| malfist wrote:
| A lot of those resellers do not disclose that the drive
| isn't new, even labeling the item as new.
|
| GoHardDrive is notorious for selling "new" drives with
| years of power-on time. Neither Newegg nor Amazon seems to
| do anything about those sellers.
| 2OEH8eoCRo0 wrote:
| Indeed- RAID used to stand for Redundant Array of
| _Inexpensive_ Disks. The point was to throw a bunch of
| disks together and with redundancy it didn't matter how
| unreliable they were. Using blingy drives w/ RAID feels
| counter-intuitive- at least as a hobbyist.
| havaloc wrote:
| B&H has quite a few
| bee_rider wrote:
| I guess it isn't that surprising given the path their
| development took, but it is always funny to me that one of
| the most reputable consumer tech companies is a photography
| place.
| fallingsquirrel wrote:
| Similar to how the most popular online retailer is a
| bookstore. Successful businesses are able to expand and I
| wish B&H the best of luck on that path, we need more
| companies like them.
| squigz wrote:
| I'd rather companies stick to one thing and do it well,
| rather than expand into every industry out there and
| slowly creep into every facet of society.
|
| Like that bookstore that just happens to retail some
| stuff too.
| ssl-3 wrote:
| B&H seems to be pretty focused on techy things (and
| cameras of all sorts have always been techy things,
| though that corner of the tech market has been
| declining for a long time now).
|
| When they branch out to selling everything including
| fresh vegetables, motor oil, and computing services, then
| maybe they might be more comparable to the overgrown
| bookstore.
| Modified3019 wrote:
| I definitely lean towards B&H for electronic things.
| It's quite a bit less of an "internet flea market" than
| Amazon often is.
| philistine wrote:
| _B&H alone is a Fortune 500 company_
| ghaff wrote:
| There used to be a much more distinct market for cameras--and
| all the ancillary gear and consumables--than there is now.
| Though B&H still sells a ton of lighting and audio gear
| as well as printers and consumables for same.
|
| They sell other stuff too but they're still pretty photo
| and video-centric, laptops notwithstanding.
| Wistar wrote:
| I buy most, but not all, of my tech at B&H, and have for
| more than a decade now. Especially peripherals.
| SoftTalker wrote:
| And for stuff like this, many companies will have an approved
| vendor, and you have to buy what they offer or go through a
| justification for an exception.
| nikisweeting wrote:
| Lots of good options here: https://diskprices.com/
| dsr_ wrote:
| Note that they list at least one vendor as selling "New"
| drives when they are not even close to being new.
| user_7832 wrote:
| What's the risk of buying from Amazon & running a
| SMART/crystaldisk test?
| speedgoose wrote:
| I don't buy hard drives based on these reports. I buy SSDs and
| let my cloud providers deal with hard drives.
| bluedino wrote:
| > The 4TB Toshiba (model: MD04ABA400V) are not in the Q1 2024
| Drive Stats tables. This was not an oversight. The last of these
| drives became a migration target early in Q1 and their data was
| securely transferred to pristine 16TB Toshiba drives.
|
| That's a milestone. Imagine the racks that were eliminated
| bombcar wrote:
| > That's a milestone. Imagine the racks that were eliminated
|
| I'm imagining about 3/4ths ;)
| djbusby wrote:
| I'm imagining 4x capacity
| seabrookmx wrote:
| 3/4ths of the racks that had 4TB drives, assuming they didn't
| also expand capacity as part of this.
|
| But they run many drive types.
| toomuchtodo wrote:
| Perhaps not eliminated, but repurposed with fresh 16TB drives.
| And the power savings per byte stored!
| Dylan16807 wrote:
| Yeah, but just thinking about it reminds me how annoyed I am
| that they increased the B2 pricing by 20% last year.
|
| Right after launching B2, in late 2015, they made their post
| about storage pod 5.0, saying it "enabled" B2 at the $5/TB
| price, at 44 cents per gigabyte and a raw 45TB per rack unit.
|
| In late 2022 they posted about supermicro servers costing 20
| cents per gigabyte and fitting a raw 240TB per rack unit.
|
| So as they migrate or get new data, that's 1/5 as many servers
| to manage, costing about half as much per TB.
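|
| (The same arithmetic as a quick Python sketch, using the
| per-gigabyte and per-rack-unit figures from those two posts:)
|
|     old_tb_per_u, new_tb_per_u = 45, 240        # raw TB per rack unit
|     old_cents_per_gb, new_cents_per_gb = 44, 20
|     print(new_tb_per_u / old_tb_per_u)          # ~5.3x denser -> ~1/5 the servers
|     print(new_cents_per_gb / old_cents_per_gb)  # ~0.45 -> about half the cost per TB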
|
| It's hard to figure out how the profit margin wasn't _much_
| better, despite the various price increases they surely had to
| deal with.
|
| The free egress based on data stored was nice, but the change
| still stings.
|
| Maybe I'm overlooking something but I'm not sure what it would
| be.
|
| In contrast the price increases they've had for their unlimited
| backup product have always felt fine to me. Personal data keeps
| growing, and hard drive prices haven't been dropping fast. Easy
| enough. But B2 has always been per byte.
|
| And don't think I'm being unfair and only blaming them because
| they release a lot of information. I saw hard drives go from
| 4TB to 16TB myself, and I would have done a similar analysis
| even if they were secretive.
| philistine wrote:
| Inflation. At the rate it went up the last couple of years, a
| 20% price increase to put them back on the right side of
| profits is more than probable.
| Moru wrote:
| Also storage inflation on the users' side. People have
| more data on bigger drives that wants a backup.
| Dylan16807 wrote:
| This is B2, the service that charges per byte. More data
| makes it _easier_ for them to profit.
| Dylan16807 wrote:
| Maybe I wasn't clear, but the hardware costs and the
| operation costs should all have dropped between 2x and 5x
| as a baseline before price increases.
|
| Inflation is not even close to that level.
|
| And those hardware costs already take into account
| inflation up through the end of 2022.
| justsomehnguy wrote:
| > the hardware costs and the operation costs should all
| have dropped between 2x and 5x
|
| That would work if they fully recouped the costs of
| obtaining _and running_ the drives, including racks,
| PSUs, cases, _drive and PSU replacements_, control
| boards, datacenter/whatever costs, electricity, HVAC etc.
| _and_ generated a solid profit not only to buy all the
| new hardware but a new yacht for the owners too.
|
| But usually that is not how it works, because nobody
| sane buys the hardware with cash. And even if they
| have fancy new 240TB/rack units, that doesn't mean they
| just migrated outright and threw out the old ones ASAP.
|
| So while there is a 5x lower cost per U for the new rack
| units, it doesn't translate to a 5x lower cost of storage
| _for the seller_.
| rokkamokka wrote:
| I always click these every time they come up. Can't tell you how
| much I appreciate them releasing stats like this!
| sdwr wrote:
| Says the annual failure rate is 1.5%, but average time to failure
| is 2.5 years? Those numbers don't line up.
|
| Are most drives retired without failing?
| bombcar wrote:
| Obviously yes. At an AFR of 1.5% they'd have to have the drives
| run for (about) 67 years to have them all retire from failure.
|
| (in reality they'd probably have failure rates spike at some
| point, but the idea stands. And they explicitly said they
| retired a bunch of 4TBs)
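|
| (A quick sketch of that arithmetic: under a constant 1.5% AFR
| the mean time to failure is 1/0.015, and even after five years
| the large majority of drives are still alive:)
|
|     afr = 0.015
|     print(1 / afr)             # ~66.7 years mean time to failure
|     print((1 - afr) ** 5)      # ~0.93 of drives still alive after 5 years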
| scottlamb wrote:
| > Are most drives retired without failing?
|
| I'd expect so, given that HDDs are still seeing significant
| density advancements. After a while old drives aren't worth the
| power and sled/rack space that could be used for a higher
| capacity drive. And, yeah, it makes these statistics make more
| sense together.
|
| Edit: plus they are just increasing drive count so most drives
| haven't hit the time when they would fail or be retired...
| rovr138 wrote:
| They just retired 4TB ones.
|
| While they seem to get retired, it's not as quick as we'd
| think.
| codemac wrote:
| Drives have warranties, after which the manufacturer doesn't
| make any claims about their durability. This could put your
| fleet at wild and significant risk if things start hitting a
| wall and failing en masse. You may not be able to repair
| your way out if, as you're repairing, the data you're copying
| lands on yet another dying drive.
|
| So you usually have lifetime drive throughput and start/stop
| values you want to stay under, and depending on how accurate
| your data is for each drive you may push beyond the drive
| warranties. But you will generally stop before the drive
| actually fails.
| rsync wrote:
| "Are most drives retired without failing?"
|
| Yes, certainly.
|
| One can watch both SMART indicators as well as certain ZFS
| stats and catch a problem drive before it actually fails.
|
| I like to remove drives from zpools early because there is a
| common intermediate state they can fall into where they have
| not failed out but dramatically impact ZFS performance as they
| timeout/retry certain operations _thousands and thousands of
| times_.
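|
| A minimal sketch of the kind of periodic check I mean (it
| assumes the stock `zpool` and `smartctl` CLIs are installed;
| the device names are placeholders):
|
|     import subprocess
|
|     # "zpool status -x" prints "all pools are healthy" when there is
|     # nothing to report, otherwise it prints only the troubled pools.
|     zpool = subprocess.run(["zpool", "status", "-x"],
|                            capture_output=True, text=True).stdout
|     if "all pools are healthy" not in zpool:
|         print("ZFS reports a degraded or erroring pool:\n" + zpool)
|
|     # SMART overall health for each member disk.
|     for dev in ("/dev/sda", "/dev/sdb"):     # substitute your own devices
|         out = subprocess.run(["smartctl", "-H", dev],
|                              capture_output=True, text=True).stdout
|         if "PASSED" not in out:
|             print(dev + ": SMART health check did not pass")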
| favorited wrote:
| What's the best way to monitor those ZFS stats? I just rely
| on scheduled ZFS scrubs, and the occasional `zpool status
| -v`...
| mnw21cam wrote:
| Yes, and because of that the numbers on the average time to
| failure are completely meaningless. The drives that don't ever
| fail skew the numbers completely. If a fantastically reliable
| drive were to have 5/5000 drives fail, but they all failed in
| the first month and then the rest carried on forever, then that
| would show here as having a lower "reliability" than a dire
| drive where 4000/5000 drives fail after a year.
|
| I'd like to see instead something like mean time until 2% of
| the drives fail. That'd actually be comparable between drives.
| And yes, it would also mean that some drive types haven't
| reached 2% failure yet, so they'd be shown as ">X months".
|
| This is what a Kaplan-Meier survival curve was meant for [0].
| Please use it.
|
| Also, it'd be great to see the confidence intervals on the
| annualised failure rates.
|
| [0]
| https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator
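|
| Backblaze does publish the raw per-drive data, so anyone can
| build this curve themselves; the estimator is only a few lines.
| A minimal sketch (the duration/failed input format below is
| assumed, not their actual file layout):
|
|     def kaplan_meier(durations, failed):
|         """durations: observed days per drive; failed: True if the
|         drive failed, False if it was retired/censored at that age."""
|         pairs = sorted(zip(durations, failed))
|         at_risk = len(pairs)
|         surv, curve, i = 1.0, [], 0
|         while i < len(pairs):
|             t = pairs[i][0]
|             deaths = censored = 0
|             while i < len(pairs) and pairs[i][0] == t:
|                 if pairs[i][1]:
|                     deaths += 1
|                 else:
|                     censored += 1
|                 i += 1
|             if deaths:
|                 surv *= 1 - deaths / at_risk
|                 curve.append((t, surv))
|             at_risk -= deaths + censored
|         return curve  # first t where surv <= 0.98 is "time until 2% failed"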
| kayson wrote:
| Does Backblaze ever buy refurbs? I'm guessing not, but I'd be
| curious to see any data on how failure rates compare after
| manufacturers recertify.
| jjeaff wrote:
| I can't think of any reason why the lifetime would be any
| different for a refurb. Of course, you need to start from when
| the drive was originally put into use. And there is probably
| also some additional wear and tear just due to the removal,
| handling, and additional shipping of the drives.
| from-nibly wrote:
| In some ways that would be incredibly noisy to test. However it
| could be a good way to measure the practicality of S.M.A.R.T.
| metrics. Finding out how accurate they are at predicting HDD
| lifespan would be a great finding.
| SoftTalker wrote:
| Does anyone find value in SMART metrics?
|
| In my experience, the drives report "healthy" until they
| fail, then they report "failed".
|
| I've personally never tracked the detailed metrics to see if
| anything is predictive of impending failure, but I've never
| seen the overall status be anything but "healthy" unless the
| drive had already failed.
| Sohcahtoa82 wrote:
| The SMART metrics aren't binary, and any application that
| is presenting them as binary (either HEALTHY or FAILED) is
| doing you a disservice.
|
| > I've personally never tracked the detailed metrics to see
| if anything is predictive of impending failure
|
| Backblaze has!
|
| https://www.backblaze.com/blog/hard-drive-smart-stats/
| SoftTalker wrote:
| From that link:
|
| From experience, we have found the following five SMART
| metrics indicate impending disk drive failure:
|   SMART 5: Reallocated_Sector_Count
|   SMART 187: Reported_Uncorrectable_Errors
|   SMART 188: Command_Timeout
|   SMART 197: Current_Pending_Sector_Count
|   SMART 198: Offline_Uncorrectable
|
| That's good to know, I might start tracking that. I
| manage several clusters of servers and hard drive
| failures just seem pretty random.
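|
| Something like this is enough to start collecting them (a rough
| sketch assuming smartmontools 7+ for `smartctl -j`; the JSON
| field names are from memory, so check them against your own
| output):
|
|     import json, subprocess
|
|     WATCH = {5, 187, 188, 197, 198}   # the attributes from the post
|
|     def smart_counters(dev):
|         out = subprocess.run(["smartctl", "-A", "-j", dev],
|                              capture_output=True, text=True).stdout
|         table = json.loads(out).get("ata_smart_attributes", {}).get("table", [])
|         return {a["id"]: (a["name"], a["raw"]["value"])
|                 for a in table if a["id"] in WATCH}
|
|     for attr_id, (name, raw) in smart_counters("/dev/sda").items():
|         if raw > 0:
|             print(f"attribute {attr_id} ({name}) has raw value {raw}")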
| vel0city wrote:
| I've had several hard drives that started gradually
| increasing a reallocated sector count, then start getting
| reported uncorrectable errors, then eventually just give
| up the ghost. Usually whenever reallocated sectors start
| climbing, a drive is nearing death and should be replaced
| as soon as possible. You might not have had corruption
| _yet_, but it's coming. Once you get UREs, you've lost
| some data.
|
| However, one time a drive got a burst of reallocated
| sectors, it stabilized, then didn't have any problems for
| a long time. Eventually it wouldn't power on years later.
| favorited wrote:
| I've had an M.2 NVMe drive start reporting bad blocks via
| SMART. I kept using it for non-critical storage, but
| replaced it as my boot drive. Obviously not the same
| failure pattern as spinning rust, but I was glad for the
| early warning anyway.
| Sakos wrote:
| Absolutely. I've looked at the SMART data of easily over
| 1000 drives. Many of them ok, many of them with
| questionable health, many failing and many failed. The
| SMART data has always been a valuable indicator as to
| what's going on. You need to look at the actual values
| given by tools like smartctl or CrystalDiskInfo. Everything
| you need to evaluate the state of your drives is there.
|
| I've never seen an HDD fail overnight without any
| indication at all.
| jpgvm wrote:
| Amazing that these have continued. I base my NAS purchase
| decisions on them, and so far they haven't led me astray.
| objektif wrote:
| Which specific ones do you like so far?
| Marsymars wrote:
| How _would_ they lead you astray? I wouldn't consider a drive
| failure in a home NAS to indicate that - even their most
| statistically reliable drives still require redundancy/backup -
| if you haven't experienced a drive failure yet, that's just
| chance.
| MarkG509 wrote:
| I, too, love Backblaze's reports. But they provide no information
| regarding drive endurance. While I became aware of this with
| SSDs, HDD manufacturers are reporting this too, usually as a
| warranty item, and with much lower numbers than I would
| have expected.
|
| For example, in the Pro-sumer space, both WD's Red Pro and Gold
| HDDs report[1] their endurance limit as 550TB/year of total
| bytes "transferred to or from the hard drive", regardless of
| drive size.
|
| [1] See Specifications, and especially their footnote 1 at the
| bottom of the page:
| https://www.westerndigital.com/products/internal-drives/wd-r...
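|
| (To put 550TB/year in perspective, a quick sketch with a
| hypothetical 22TB drive as the example size:)
|
|     capacity_tb, workload_tb_per_year = 22, 550
|     print(workload_tb_per_year / capacity_tb)                  # ~25 full-drive passes/year
|     print(workload_tb_per_year * 1e12 / (365 * 86400) / 1e6)   # ~17 MB/s sustained average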
| wtallis wrote:
| The endurance figures for hard drives are probably derived from
| the rated number of seek operations for the heads, which is why
| it doesn't matter whether the operations are for reading or
| writing data. But that bakes in some assumptions about the mix
| of random vs sequential IO. And of course the figures are
| subject to de-rating when the company doesn't want the warranty
| to cover anything close to the real expected lifespan,
| especially for products further down the lineup.
| WarOnPrivacy wrote:
| > _A Few Good Zeroes: In Q1 2024, three drive models had zero
| failures_
|
| They go on to list 3 Seagate models that share one common factor:
| Sharply lower drive counts. Backblaze had a lot fewer of these
| drives.
|
| _All of their <5 failures_ are from low quantity drives.
|
| I have confidence in the rest of their report - but not in the
| inference that those 3 Seagate models are more reliable.
| matmatmatmat wrote:
| This uncertainty should be accounted for in the confidence
| intervals of their stats.
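|
| The "rule of three" gives a quick feel for it: with zero
| failures observed over N drive-days, the ~95% upper bound on
| the daily failure probability is roughly 3/N. A sketch with
| made-up numbers (not Backblaze's actual counts):
|
|     drives, days = 1500, 90               # hypothetical small fleet over a quarter
|     drive_days = drives * days
|     daily_upper = 3 / drive_days          # ~95% upper bound given 0 failures
|     afr_upper = 1 - (1 - daily_upper) ** 365
|     print(f"AFR upper bound ~{afr_upper:.2%}")   # ~0.81% for this example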
| dehrmann wrote:
| I find the stats interesting, but it's hard to actually inform
| any decisions because by the time the stats come out, who knows
| what's actually shipping.
| gangstead wrote:
| I wonder how the pricing works out. I look at the failure rates
| and my general take away is "buy Western Digital" for my qty 1
| purchases. But if you look within a category, say 14TB drives,
| they've purchased 4 times as many Toshiba drives as WD. Are the
| vendors pricing these such that it's worth a slightly higher
| failure rate to get the $/TB down?
| mijamo wrote:
| If you are a large company owning hundreds of thousands of them
| and knowing you will have disk failures regardless, maybe. If
| you own just a few hundred and a failure costs you money the
| logic may be completely different.
| Marsymars wrote:
| I'd assume so. Also consider that if a drive fails under
| warranty, and you're already dealing with a bunch of failing
| drives on a regular basis, the marginal cost to get a warranty
| replacement is close to zero.
| Whatarethese wrote:
| And people will still say they don't trust Seagate because of the
| 3TB drives that failed over a decade ago.
| kstrauser wrote:
| Anecdata is such a weird thing. In my own NAS, I've had 3 out
| of 3 WD Red drives, each a different size, die in an identical
| manner well before their warranty expired over the last several
| years. SMART says everything is fine, but the drive's
| utilization creeps up to a constant 100% and its write IOPS
| decrease until the whole array is slow as frozen molasses.
| That's in a constantly comfortable operating environment that's
| never too hot, cold, or otherwise challenging. And yet it looks
| like I'm the statistical outlier. Other people -- like
| Backblaze here -- have decent luck with the same drives that
| have a 100% failure rate here.
|
| Probability is a strange thing, yo. The odds of a specific
| person winning the lottery are effectively 0, but someone's
| going to. Looks like I've won the "WD means Waiting Death"
| sweepstakes.
| freedomben wrote:
| Indeed, and anecdata is weighted so heavily by our minds,
| even when we are aware of it and consciously look at the
| numbers. That's what evolution gives us though. The best
| brains at survival are the ones that learned from their
| observations, so we're battling our nature by trying to
| disregard that. I'll never buy another Seagate because of
| that one piece of shit I got :-D
| vel0city wrote:
| Sounds like you're a victim of WD selling Reds with Shingled
| Magnetic Recording (SMR). Quite a scandal a few years ago.
|
| SMR takes advantage of the fact that read heads are often smaller
| than write heads, so it "shingles" the tracks to get better
| density. However, if you need to rewrite in between tracks
| that are full, you need to shuffle the data around so it can
| re-shingle the tracks. This means as your array gets full or
| even just fragmented, your drives can start to need to
| shuffle data all over the place to rewrite a random sector.
| This plays hell with drives in an array, since a lot of
| controllers have no knowledge of this shingling behavior.
|
| Shingled drives are OK when you're just constantly writing a
| stream of data and not going to do a lot of rewriting of data
| in between. Think security cameras and database backups and
| what not. They're complete hell if you're doing lots of
| random files that get a lot of modifications.
|
| https://www.servethehome.com/wd-red-smr-vs-cmr-tested-avoid-...
| kstrauser wrote:
| No, these were 100% CMR drives. I checked them very closely
| when the scandal broke and confirmed that mine were not
| shingled.
| vel0city wrote:
| Huh, weird, because that's 100% the failure mode friends
| of mine who _did_ have shingled drives experienced. Maybe
| your drives were shingled despite labeling suggesting
| otherwise, or maybe they hit whatever different failure you
| got, without it being the SMR that killed the arrays in the
| end.
|
| Either way it made me never want to use WD for drives in
| arrays and not trust their labeling anymore. "WD Red"
| drives lost all meaning to me; who knows what they're
| doing inside.
| kstrauser wrote:
| > Maybe your drives were shingled despite labeling
| suggesting otherwise
|
| I'm not ruling that out. The whole debacle was so
| amazingly tone-deaf that I wouldn't be surprised if they
| did that behind the scenes. I wrote this at the time:
| https://honeypot.net/2020/04/15/staying-away-from.html
| redox99 wrote:
| I've had so many Seagate drives fail that I won't buy Seagate
| again.
|
| If a brand sells bad drives, they should be aware of the
| reputational damage it causes. Otherwise there is no downside
| to selling bad drives.
| pcurve wrote:
| Looks like WDC reliability has improved a lot in the past decade.
|
| Seagate continues to trail behind competitors.
|
| I guess they're basically competing on price? Because with data
| like this, I don't know why anyone running a data center would
| buy Seagate over WD?
| formerly_proven wrote:
| The WDC models which are only somewhat more expensive than
| Toshiba or Seagate tend to perform quite a lot worse than
| those. Models with the same performance are significantly more
| expensive.
| louwrentius wrote:
| If you buy drives based on their reports, make sure your drives
| are operating within the same environmental parameters, or these
| stats may not apply.
| fencepost wrote:
| As with every time these come out, _Remember that Backblaze's
| usage pattern is different from yours!_
|
| Well, unless you're putting large numbers of consumer SATA drives
| into massive storage arrays with proper power and cooling in a
| data center.
___________________________________________________________________
(page generated 2024-05-02 23:01 UTC)