[HN Gopher] Backblaze Drive Stats for 2022
___________________________________________________________________
Backblaze Drive Stats for 2022
Author : TangerineDream
Score : 175 points
Date : 2023-01-31 14:13 UTC (8 hours ago)
(HTM) web link (www.backblaze.com)
(TXT) w3m dump (www.backblaze.com)
| controversial97 wrote:
| Backblaze has stated in a blog post that they purchase drives in
| large quantities direct from manufacturers and negotiate for good
| prices. Backblaze says that a model of drive which is
| significantly cheaper and somewhat less reliable can be ok for
| them.
|
| This means that these stats may not be a very meaningful
| indication when purchasing a few drives from a retailer.
|
| Nevertheless, the world is full of situations where you have to
| make choices with insufficient data. I'm still going to prefer to
| avoid Seagate hard drives where practical.
| itchyouch wrote:
| Even when they were shucking retail drives in the 1-4TB era,
| Seagate didn't have the greatest stats, with about a 5-30%
| failure rate (IIRC, over ~3 years or so), especially their 3T
| drives.
|
| I ran an array with 1.5T and 3T Seagates, and within 1-3
| years, I replaced at least 1/3 of my Seagates, but I made sure
| to replace the Seagates with Hitachis and Western Digitals,
| even though they were slightly more expensive.
| metadat wrote:
| 100% with you.
|
| Seagate is the manufacturer with drives that fail at 10x the
| rate of the competition. I'll avoid, because drive failures are
| annoying.
| neilv wrote:
| > _they purchase drives in large quantities direct from
| manufacturers_
|
| Given that the manufacturer knows that Backblaze publishes its
| influential drive stats... do we know that the drive units that
| the manufacturer ships to Backblaze are representative of the
| quality of the same models available at retail?
| zh3 wrote:
| If we assume that Backblaze gets special treatment from every
| manufacturer who supplies it, that would suggest we're seeing
| stats that are the best the manufacturer can manage. Which
| seems a reasonable proxy for what they sell (lots of
| statistical quibbling about this view aside).
| nekoashide wrote:
| Any more special treatment than they give AWS, GCP or
| Azure? I feel they just manufacture them all the same
| because there's no point in doing anyone any favors.
| didgetmaster wrote:
| I don't think anyone imagines that the hard drive
| manufacturers are making special drives for Backblaze.
| The question is 'Are they getting the cream of the crop?'
| out of all the drives that are manufactured.
|
| Every drive must go through a series of tests before it
| is judged suitable for sale to the consumer. Drives that
| fall below a certain threshold are judged 'failed' but
| there can be a variety of drives that fall within the
| 'passed' range. Some are just barely above the threshold,
| others are way above.
|
| It is kind of like binning for CPUs. The very best ones
| can have their base clocks increased and be sold at a
| premium. The same kind of differentiation could be done
| for hard drives.
| brianwski wrote:
| Disclaimer: I work at Backblaze.
|
| > Do we know if these drives are the same as you would
| purchase in a retail outlet
|
| Seagate (for instance) won't sell ANYBODY drives directly, so
| they force us to get bids from various resellers and
| distributors. So when we pick the lowest price, I don't think
| Seagate knows who the drives are going to, but there might be
| a trick in there somewhere I am not aware of.
| toomuchtodo wrote:
| Have you considered becoming a distributor yourself? I
| would assume at a certain purchasing scale, the cost
| benefit is apparent, but perhaps not.
| Kye wrote:
| This is why every major tech company eventually becomes a
| domain registrar.
| jjeaff wrote:
| Do all major tech companies really register that many
| domains? I assumed that becoming a registrar was more about
| pushing your control up the pipeline to reduce risks like
| losing your domain name, rather than about saving money on
| registrations.
| tjoff wrote:
| The implication being that they might knowingly buy a subpar
| batch or that a manufacturer would offer rebates for a subpar
| batch, or what?
| Macha wrote:
| If drive A has a 3% failure/year rate and drive B a 1%
| failure/year rate, but you can get 12 of drive A for the
| price of 10 of drive B, and you're going to have stock on
| hand to replace drives anyway, and you're confident in your
| backup strategy, then for a business it may still make sense
| to buy drive A.
|
| As a consumer, you're unlikely to keep an in-box spare, so
| even if you have a good backup system, a drive failure is
| still going to be an inconvenience, and leave you without
| redundancy until your replacement is shipped to you, so you
| might pay 20% more for drive B.
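|
| A back-of-the-envelope sketch of that tradeoff in Python
| (prices here are made up; only the 12-for-the-price-of-10
| ratio and the failure rates come from above, and it ignores
| labor, downtime, and discounting):
|
|     # Upfront price plus expected replacement purchases,
|     # amortized over the service period.
|     def cost_per_drive_year(price, afr, years=5):
|         return price * (1 + afr * years) / years
|
|     print(cost_per_drive_year(100, 0.03))  # drive A: ~23.0
|     print(cost_per_drive_year(120, 0.01))  # drive B: ~25.2
|
| So even at 3x the failure rate, drive A can come out cheaper
| per drive-year for a business that absorbs replacements.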
| tjoff wrote:
| I don't follow. The statistics Backblaze posts don't imply
| that you should buy the same drives they do.
|
| It is because of these statistics that you have (at least
| an inkling of) a fighting chance to pick the drive that
| better suits your use case. Or perhaps a notion of brand
| performance.
|
| Of course, you won't get statistics about drives that they
| don't buy. And if you then assume that more expensive
| drives have better failure rates then that might skew the
| data. But therein lie a lot of assumptions you really
| can't make. For one, that an expensive drive for a consumer
| is also an expensive drive for Backblaze. And also, that
| there is a correlation between end-user price and failure
| rates.
|
| Which, generally, I don't really think there is, at least
| within the alternatives that the consumer has access to
| (and within reasonable price ranges). And it probably will
| vary between different markets as well.
| Macha wrote:
| Backblaze's article points out that sometimes they will
| buy drives that are less reliable but cheaper, as that
| makes sense for their business model.
|
| The top level commenter points out that this strategy may
| not make as much sense for the individual consumer where
| you don't have backblaze levels of drives to amortise the
| failure probability and deal with it.
|
| That's it, that's the point. It's not a rebuttal of the
| backblaze post at all, nor is it intended to be. The top
| level commenter was simply pointing out "don't buy mostly
| seagate drives because backblaze does" as backblaze
| explained their strategy in drive selection and some
| people may not consider the differences between their use
| case and backblaze's.
| tjoff wrote:
| The whole point of the statistics is that you can pick
| what you want. Why would you dismiss the statistics and
| buy what backblaze does? Defeats the entire point of the
| statistics and article in the first place.
|
| Top level comment: > _This means that these stats may not
| be a very meaningful indication when purchasing a few
| drives from a retailer._
|
| That doesn't follow.
| Dalewyn wrote:
| >Why would you dismiss the statistics and buy what
| backblaze does?
|
| You assume people read the fine print and consider fine
| details.
|
| The vast majority of Joes are going to ask who/what
| Backblaze is and what they're buying most and buy that,
| screw the details.
| mardifoufs wrote:
| Not sure if that's what OP meant, but it could be that they
| get better batches from manufacturers, while retail gets the
| rest?
|
| The other way you could read the comment would be that the
| stats don't mean much since Backblaze can tolerate higher
| defect rates if they negotiate a low enough price. But to me,
| that doesn't make sense since it wouldn't affect the failure
| rate statistics at all anyway.
| iot_devs wrote:
| I really struggle to follow this logic.
|
| You could argue for excluding the drive models that are
| represented fewer than 100 times (and even that would be a
| stretch).
|
| But once you have this many samples, there is no reason not
| to believe that your drive will follow the same distribution.
| cm2187 wrote:
| You converge to the mean fairly quickly. Build a NAS of 12
| drives and those stats become meaningful.
|
| That being said, with 50TB drives just around the corner [1]
| and SSD caching becoming the norm, 12-bay consumer NASes will
| become rare.
|
| [1] https://www.anandtech.com/show/18733/seagate-
| confirms-30tb-h...
| aidenn0 wrote:
| At what point do increased rebuild times mean that the
| increased redundancy requirements overwhelm the increased
| capacity? If you're running large parity groups, the
| redundancy required is rather small, but when you move to
| smaller raid groups allowing for a second (or third) drive
| failure significantly impacts total storage.
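|
| A rough sketch of this in Python, assuming independent
| failures at a constant annual rate (a strong simplification;
| real failures correlate): the chance that another drive in
| the parity group fails while the rebuild is still running.
|
|     def p_extra_failure(n_disks, afr, capacity_tb, mbps=200):
|         # Best-case sequential rebuild time, in years
|         rebuild_h = capacity_tb * 1e12 / (mbps * 1e6) / 3600
|         rebuild_y = rebuild_h / (24 * 365)
|         # P(a given surviving disk fails in that window)
|         p_one = 1 - (1 - afr) ** rebuild_y
|         # P(at least one of the n-1 survivors fails)
|         return 1 - (1 - p_one) ** (n_disks - 1)
|
|     print(p_extra_failure(12, 0.015, 4))   # 4TB:  ~0.01%
|     print(p_extra_failure(12, 0.015, 50))  # 50TB: ~0.13%
|
| Even with toy numbers, the risk grows roughly linearly with
| capacity, which is what pushes larger groups toward RAID6.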
| cm2187 wrote:
| Agree, but that's what backups are for. RAID5 isn't a
| backup, it's a way to stay online during the rebuild. If
| you are using RAID5 but don't backup your files, you
| probably don't care about those files that much.
| tux3 wrote:
| RAID or not, if you lose an active drive you will need to
| 1) restore from backup, 2) fill a new backup drive
|
| If it takes days to fill your new 50 TB backup drive, you
| have the same problem RAID5 has with rebuild times. The
| drive might fail in the middle.
|
| RAID isn't a backup mostly because if you overwrite RAID
| data, you wrote over every copy at once. Non RAID offsite
| backups don't solve any problem related to drive size,
| they just make it a lot less likely for a single event to
| take everything out in the same minute.
| cm2187 wrote:
| Absolutely, you want RAID + backup, not one or the other.
|
| 50TB at 200MB/s is about 72h. Doesn't seem to be a
| particularly problematic rebuild time (and that's
| assuming 100% filled). Of course you need to do regular
| data scrubbing. If your rebuild is the first time the
| data is being read in 5y, that might not go so well.
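|
| A quick sanity check of that estimate (sequential best case;
| a rebuild under live load will run slower):
|
|     hours = 50e12 / 200e6 / 3600
|     print(round(hours))  # ~69 h, roughly the three days above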
| sn0wf1re wrote:
| > 50TB at 200MB/s is about 72h.
|
| Assuming the device is offline for users during that
| time, otherwise there may be reduced throughput.
| sliken wrote:
| > RAID or not, if you lose an active drive you will need
| to 1) restore from backup
|
| Er what? I've had production machines lose a drive, have
| it replaced, and never leave production. Why would you
| need to restore from backups?
| ThePowerOfFuet wrote:
| > RAID or not, if you lose an active drive you will need
| to 1) restore from backup
|
| _RAID 1 joins the chat_
| aidenn0 wrote:
| If you want <X% chance of having to restore from backups,
| and rebuild times go up, you'll need to switch from RAID5
| to RAID6 in order to maintain that.
| Teever wrote:
| Why do you think that 50TB drives will make 12 bay NAS become
| rare?
| Hamuko wrote:
| We'll just hoard bigger files, like we have thus far.
|
| My iPhone shoots 90 megabyte photos.
| cm2187 wrote:
| I don't know; my own experience is that drive capacities
| increase exponentially but my storage needs are fairly
| linear: bigger files, but not that many more of them.
| sn0wf1re wrote:
| I've noticed that media files seem to look "good enough"
| at 720p for cartoons and 1080p for live action. I know
| some people obsess over the highest quality of everything
| (4K 10-bit HDR), but even those came out in 2016[1] and
| aren't that popular. Higher definitions like 8K are
| basically nonexistent: 8K can't run on old HDMI cables
| and needs both a beefy decoder (blasting heat) and a
| powerful display driver as well.
|
| [1] https://en.wikipedia.org/wiki/Ultra_HD_Blu-ray
| crazygringo wrote:
| > _Higher definition like 8k is basically non existent_
|
| You're forgetting about VR. 8K in VR is very popular now.
| If mainstream 2D HD is 720p and 1080p, the approximate
| mainstream quality equivalents in 3D VR are 5K and 8K.
| And you can watch it on a cheap Quest 2 headset.
| tke248 wrote:
| I have a theory that solar flares are the cause of some hard
| drive failures; it would be interesting to see if a few
| lead-shielded cases would reduce the number of failures. I
| used to manage a large fleet of computers, and anytime we got
| radio interference from solar flares we would have 3-5 hard
| drive failures that day.
| rom-antics wrote:
| What were the failure modes? Was it corrupted data or were the
| drives permanently fried?
| tke248 wrote:
| Corrupt data that would compromise the operating system.
| These were Dell computers with multiple different brands of
| hard drives; Dell would have us run their hardware diagnostic
| tool, which would put the drives in the range to receive free
| replacements. We didn't have to send back the old ones under
| the contract we had with them; they would still work when
| reformatted, but were less reliable after that.
| maccam94 wrote:
| If you didn't disable the write cache on those drives,
| flares could have caused bit flips in the cache memory
| before it was flushed to disk.
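|
| For what it's worth, on Linux the drive's volatile write
| cache can be turned off with hdparm's -W flag. A minimal
| sketch (the device path is just an example; needs root):
|
|     import subprocess
|
|     # Trade some write performance for less data sitting in
|     # the drive's cache RAM before it reaches the platters.
|     subprocess.run(["hdparm", "-W", "0", "/dev/sda"],
|                    check=True)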
| GTP wrote:
| So far I always assumed that, when talking about HDD
| failure rates, they were considering the typical
| mechanical failure. I never considered that they could
| declare a failure due to some corrupted data, although it
| would be reasonable for a datacenter to do so.
| tke248 wrote:
| Mechanical failures were more prevalent in my experience
| in the 90s; most of the recent failures I see are
| controller failures. I rarely hear the head-crash
| clicking like in the old days.
| dekhn wrote:
| Any sufficiently large cluster is effectively a cosmic ray
| detector with terrible sensitivity.
| lazide wrote:
| Out of what fleet size?
| tke248 wrote:
| Around 1,000.
| ck2 wrote:
| I don't trust any of these helium-filled drives to last more
| than a decade.
|
| The helium atom is too small and leaks through everything,
| eventually.
|
| Really hope I am proven wrong.
|
| Unfortunately what backblaze is excellently documenting is not
| archival use.
|
| I've got half-TB and 1TB WDC drives that are over a decade
| old and still spin up fine: single-platter, running cool and
| quiet, even though they're air-filled.
|
| I think 4TB is the cutoff for air-filled but not sure anymore.
| tzs wrote:
| > Helium atom is too small and leaks through everything,
| eventually.
|
| Suppose the drive were encased in solid hydrogen? Hydrogen
| freezes at 14 K and helium boils at 4 K so there's a 10 K range
| where you could have both solid hydrogen and gaseous helium.
|
| Hydrogen atoms are bigger than helium atoms, but what matters
| is the gaps between the hydrogen in the solid. I was unable
| with a bit of Googling to find out anything about the physical
| structure of solid hydrogen.
| geerlingguy wrote:
| Many 8 TB and, I believe, even some 12 TB drives don't have
| helium. It depends on the model and manufacturer, so you have
| to dig into the spec sheets to figure it out.
| aequitas wrote:
| But is the helium really required for the drive to function or
| is it just an efficiency thing? So that if all the helium leaks
| out the drive would just be slower or consume more power?
| dannyw wrote:
| There are helium sensors in the drives. It's reported via
| SMART. It's monitorable.
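|
| A minimal sketch of reading it via smartctl's JSON output
| (smartmontools >= 7.0). On many HGST/WD helium drives this
| is SMART attribute 22, "Helium_Level", but the id and name
| vary by vendor, so treat that as an assumption:
|
|     import json, subprocess
|
|     out = subprocess.run(
|         ["smartctl", "-A", "--json", "/dev/sda"],
|         capture_output=True, text=True, check=True,
|     ).stdout
|     # "ata_smart_attributes" assumes an ATA drive
|     table = json.loads(out)["ata_smart_attributes"]["table"]
|     for attr in table:
|         if attr["id"] == 22:  # Helium_Level on many models
|             print(attr["name"], attr["value"])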
| dehrmann wrote:
| I don't trust any spinning rust for a decade, but I suppose a
| pre-helium model stored in good, stable conditions should
| still spin up after a decade.
| RDaneel0livaw wrote:
| Looks like HGST continues to rule the pack, similar to prior
| years. Though, I thought the entire HGST brand was set to be
| discontinued, with Western Digital taking over all naming /
| marketing / etc... I even see, when I search for "ultrastar"
| or similar, that it's all currently branded Western Digital.
| Does
| backblaze get some special enterprise gear?
| deeesstoronto wrote:
| Western Digital got most of HGST - however HGST's 3.5in drive
| business was sold to Toshiba. The HGST branding went to WD.
|
| I believe these new HGST drives are WDs.
|
| The HGST drives in the older backblaze stats (which showed good
| reliability) continue to be manufactured by/evolved into
| Toshibas. There are several HGST model numbers that remained in
| Toshiba's lineup.
|
| https://www.anandtech.com/show/5635/western-digital-to-sell-...
| segfaultbuserr wrote:
| Some (all?) rebranded "WD Ultrastar" drives still report
| themselves as "HGST" in their SMART device models. I'm not
| sure how Backblaze counts them. Perhaps they're (correctly)
| counted via the SMART model regardless of branding?
|
| Update: This issue was covered in the Backblaze 2020 report.
| They apparently existed in parallel: the original HGST
| drives, while still in service as of 2020, were being
| gradually phased out...
|
| > _These drives obviously share their lineage with the HGST
| drives, but they report their manufacturer as WDC versus HGST.
| The model numbers are similar with the first three characters
| changing from HUH to WUH and the last three characters changing
| from 604, for example, to 6L4. We don't know the significance
| of that change, perhaps it is the factory location, a firmware
| version, or some other designation. If you know, let everyone
| know in the comments. As with all of the major drive
| manufacturers, the model number carries patterned information
| relating to each drive model and is not randomly generated, so
| the 6L4 string would appear to mean something useful._
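|
| If you want to see what a drive reports itself as, a minimal
| sketch using smartctl -i (the device path is an example):
|
|     import re, subprocess
|
|     info = subprocess.run(
|         ["smartctl", "-i", "/dev/sda"],
|         capture_output=True, text=True,
|     ).stdout
|     for field in ("Model Family", "Device Model"):
|         m = re.search(rf"^{field}:\s+(.*)$", info, re.M)
|         if m:
|             print(field, "->", m.group(1))
|
| A WUH-prefixed model with HGST lineage would show up here
| regardless of what the sticker says.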
| andrewmunsell wrote:
| I tend to buy my HDDs from Provantage nowadays. I've never
| bought an HGST drive (I have a combo of WD and Toshiba drives),
| but the HGST branding seems more prominent at Provantage
| (https://www.provantage.com/hgst-0f30146~7HGST0H9.htm) versus
| elsewhere (ex. Amazon: https://www.amazon.com/HGST-Ultrastar-
| HUH721212ALE600-3-5-In..., though I'd never buy an HDD from
| Amazon today due to their shipping issues)
| dspillett wrote:
| _> though I'd never buy an HDD from Amazon today due to
| their shipping issues_
|
| I'm wary of Amazon due to co-mingling issues. I can't be sure
| I'm not getting something incorrectly packaged/labelled by
| another seller so I get a reconditioned (or simply
| counterfeit) unit instead of a new one.
| numpad0 wrote:
| I'm a total consumer in this topic, but I believe WD tried
| to do that and externally communicated it as the plan, and
| drives are marked as such, but due to antitrust rulings(?)
| HGST remains somewhat independent and the heritages remain
| somewhat separate from each other. Also, there seem to be
| just a handful of "actual" drive models in the HDD market -
| like rebadged Ladas and VW Beetles. Perhaps that makes
| calling them by the old names make sense.
| bluedino wrote:
| I wonder what the numbers need to look like to replace the
| remaining 4TB pods with 16TB pods.
| atYevP wrote:
| Yev from Backblaze here -> That's a factor of time and power!
| The 16TB drives consume more power, so that's part of the
| calculation when we swap from the 4TBs, but we are monitoring
| them b/c we certainly want to avoid cluster failures. We have
| what we call "rolling migration" projects going on all the
| time, where we perpetually migrate hard drives and hardware
| based on durability projections, power balancing, physical
| capacity, etc...
| 2c2c2c wrote:
| Anyone have experience using their S3 drop-in replacement at
| scale?
| hedora wrote:
| I have single digit TB in it, and my synology HyperBackups fail
| extremely infrequently. (I'm not willing to attribute the
| failures to their S3 implementation, since our ISP kind of
| sucks.)
|
| Anyway, I haven't seen any apparent data loss over the last few
| years. I'm considering doing a full restore, just for the heck
| of it. (Especially since this article convinced me I should
| start replacing the aging drives in my NAS!)
| a10c wrote:
| HyperBackup is extremely disappointing.
|
| At no point has it risen above 1MB/s when backing up to S3
| (Backblaze), when other methods routinely saturate my upload
| (40Mb/s).
| dylan604 wrote:
| My only experience was moving data from a b2 to an s3. While
| there were some inefficiencies with the workflow, it went as
| well as could be hoped for when cross chatting between cloud
| vendors.
|
| Installing their linux app was out of bounds of normal
| procedures, but it be what it b2 <hangsHeadInShame>
| james_in_the_uk wrote:
| b2 or not b2, is that the question?
| jms703 wrote:
| If you're struggling to pick a drive, maybe the best approach is
| to find a drive that meets your needs from a maker that has a
| solid warranty program. With a good backup strategy, the maker of
| the drive may not matter as much.
| Hamuko wrote:
| I'll never be able to understand how Seagate manages to make so
| many failing hard drives. A 5.7% failure rate with an average age
| of 2 years? The only one that comes close is the 5.3% HGST with
| over twice the average age and overall low sample size.
|
| Out of the many hard drives that I've owned in my life, the only
| ones to die on me have both been Seagates, with the first one
| being the infamous ST3000DM001, a hard drive so shit it has its
| own Wikipedia article
| (https://en.wikipedia.org/wiki/ST3000DM001).
| the_third_wave wrote:
| Either you're lucky or you just have not had that many drives
| around...
|
| The first to fail was a Miniscribe 5.25" 20MB; it died after a
| week. I replaced it with a 3.5" Seagate connected to an
| Adaptec RLL controller - this being before IDE was a thing, the
| controller decided the encoding scheme and RLL gave you 50%
| extra storage space. This Seagate never died on me, I probably
| have it around somewhere still. Then came the Western Digital
| IDE drives, two of those died and took ~250MB of data with them
| to data heaven. They were followed by another WD, this one
| 1.2GB - it died. When I built a system around the famous Abit
| BP-6 motherboard I put in two Maxtor drives - which I should
| not have done, one of them died within 2 years. Meanwhile I'd
| helped my father get a new drive, one of those fancy IBM
| Deskstars. Within a year it had turned itself into a Deathstar
| like most of its brethren seem to have done. Then there was the
| 10 GB Travelstar which died, then the 20GB Travelstar which
| also died - but survived the canoe expedition over the Yukon
| safely tucked away into the water- and probably bulletproof
| solar-powered Virgin Webplayer I had made to document the 3.5-
| month trip - and the 20GB Toshiba. The 2TB WD Green, dead. One
| of the 1TB WD Greens, dead - but its mate still running strong
| at 120,000+ running hours, with its 2TB brand mates coming in
| second at 100,000+ hours. In the DS4243 array I have replaced 5
| 15K 600GB drives, good that these were cheap as dirt (but take
| quite a bit of power to run, hence the low price).
|
| Notice that I have not had a Seagate drive die on me yet. There
| are a few in an array somewhere here but those still do their
| job even though they're quite old by now. Maybe I'm just lucky
| in that I never bought any of the failing types since these
| problems seem to be related to specific types.
| pooloo wrote:
| They also have a drive on that list with an average age of 92
| months, and they have significantly more Seagate drives than
| any other brand. Additionally, most of these drives are
| consumer drives, which are not intended to be run at 100% for
| months on end.
| paulmd wrote:
| having high failure rates _despite_ having lots of drives is
| actually worse - there is the chance the HGST failures are
| just random chance due to small numbers of drives. But this
| possibility does not exist for Seagate with much larger
| statistical samples.
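|
| A minimal sketch of the sample-size point (counts below are
| made up; normal/Poisson approximation to the rate interval):
|
|     import math
|
|     def afr_interval(failures, drive_years, z=1.96):
|         rate = failures / drive_years
|         half = z * math.sqrt(rate / drive_years)
|         return rate - half, rate + half
|
|     print(afr_interval(25, 500))       # small: ~(3.0%, 7.0%)
|     print(afr_interval(5000, 100000))  # large: ~(4.9%, 5.1%)
|
| Same 5% point estimate, but the small fleet's interval is
| wide enough to be chance; the large fleet's is not.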
|
| AFAIK there hasn't been much of a difference shown between
| consumer workloads and enterprise workloads for HDD lifespan
| either, it's just cope and theorycrafting from people who are
| emotionally attached to the idea of Seagate not being shit
| for some reason.
|
| No other drives seem to have such problems with being used in
| this fashion: what is your theory for why Seagate drives are
| _uniquely_ affected by being in the pods in some fashion that
| would not also affect WD drives or HGST drives or whoever
| else? Are WD drives not affected by vibration for some
| reason? For a while Seagate Deniers latched onto the first-
| gen pods as maybe being the answer, but they're all long gone
| at this point; this failure-rate anomaly is continuing even
| in the newer pods.
|
| The only reasonable possibility would be that Seagate drives
| are simply constructed in an entirely different, less
| resilient fashion, which (a) is not factually supported in
| any way afaik, and (b) would still be very relevant for
| consumers to know! It's not like a home PC is vibration-free
| either after all.
|
| There comes a point when it's not "steelmanning" it's just
| denial of reality in the face of consistent evidence. Like
| you're not "steelmanning" climate change you're just a
| denier.
|
| The data has pretty consistently shown the same thing for
| 10+ years. It's not a "random sampling bias" that uniformly
| affects everyone except Seagate in the exact same way almost
| every single survey, it's not some magical factor that makes
| Seagate drives uniquely unsuited to storage-array usage but
| magically resilient when used in a home PC, it's not first-
| gen backblaze pods being bad, it's just Seagate putting out
| shitty drives, period the end. All these extremely complex
| theories to get around the very simple conclusion that
| Seagate has shitty parts or shitty QC and the failure rates
| are slightly higher as a result.
|
| A lot of the Seagate models are relatively OK, but almost all
| of the "outlier" drives with really high failure rates are
| Seagate. It is the old Bayesian probability thing: get rid of
| Seagate and you've gotten rid of almost all of the models
| with high failure rates.
|
| edit: sorry Samsung on the brain since they have another wave
| of SSD failures too /laugh
___________________________________________________________________
(page generated 2023-01-31 23:02 UTC)