[HN Gopher] Backblaze Drive Stats for 2024
___________________________________________________________________
Backblaze Drive Stats for 2024
Author : TangerineDream
Score : 604 points
Date : 2025-02-11 14:55 UTC (1 day ago)
(HTM) web link (www.backblaze.com)
(TXT) w3m dump (www.backblaze.com)
| comboy wrote:
| Huh, what happened to HGST?
| antithesis-nl wrote:
| They, ironically, got acquired by Western Digital. But the
| 'Ultrastar' line name is still alive, if that's what you're
| looking for. 'Deskstar' seems to be gone, though.
| betaby wrote:
| From that table I understand that WD are the most reliable
| nowadays. Especially 16TB models. Is my understanding
| correct?
| antithesis-nl wrote:
| It's a wash. Modern mechanical HDDs are so reliable that
| the vendor basically doesn't matter. Especially if you
| stick with 'Enterprise'-tier drives (preferably with a SAS
| interface), you should be good.
|
| Aside from some mishaps (that don't necessarily impact
| reliability) with vendors failing to disclose the SMR
| nature of some consumer HDDs, I don't think there have been
| any truly disastrous series in the past 10-15 years or so.
|
| You're more likely to get bitten by supply-chain
| substitutions (and get used drives instead of new ones)
| these days, even though that won't necessarily lead to data
| loss.
| userbinator wrote:
| _I don't think there have been any truly disastrous
| series in the past 10-15 years or so_
|
| ST3000DM001.
| nubinetwork wrote:
| > 'Deskstar' seems to be gone, though.
|
| Considering we used to call them deathstars, I'm surprised
| they didn't get rid of the line sooner...
| LargoLasskhyfv wrote:
| I still have some of those in working condition, somewhere.
|
| If I can be bothered to power on the antiques they're
| built into.
|
| Which is rare, but it happens.
|
| I abused them really hard when they weren't antique.
|
| Still working every time, so far...
| neilv wrote:
| Every year, this seems like great brand promotion for Backblaze,
| to technical prospective customers, and a nice service to the
| field.
|
| What are some other examples from other companies of this,
| besides open source code?
| ajross wrote:
| Benson Leung's USB cable crusade comes to mind. Also Jim
| Gettys' coming out of seeming retirement to educate us all
| about Bufferbloat.
| bityard wrote:
| This is called "content marketing" and there are usually at
| least a handful of them on the HN front page at any given time.
|
| Although I will say that the Backblaze drive stats articles are
| of a much higher standard of effort and quality than you
| typically see with this tactic.
| CTDOCodebases wrote:
| It's worth noting that this type of marketing can also
| improve page rankings.
| devrand wrote:
| Puget Systems has similar publications covering their
| experience building client systems, though not always in the
| same level of detail. They also have PugetBench to benchmark
| systems in real-world applications/workflows.
| samch wrote:
| A company called TechEmpower used to run periodic web framework
| benchmarks and share the results in a nice dashboard. Not
| sure why they stopped doing these.
|
| https://www.techempower.com/benchmarks/#hw=ph&test=fortune&s...
|
| Edit: Adding a shoutout to the iFixIt teardowns that are also
| quite informative content:
|
| https://www.ifixit.com/Teardown
|
| Edit 2: Also Lumafield CT scans:
|
| https://www.scanofthemonth.com/
| KomoD wrote:
| TechEmpower still does them.
| https://github.com/TechEmpower/FrameworkBenchmarks/issues/95...
| nerdponx wrote:
| Jepsen (of database benchmark fame) does paid consulting work.
| zX41ZdbW wrote:
| Some examples from me:
|
| Database benchmarks: https://github.com/ClickHouse/ClickBench
| (containing 30+ databases) and the new JSON analytics
| benchmark, https://github.com/ClickHouse/JSONBench/
|
| Plus, the hardware benchmark:
| https://benchmark.clickhouse.com/hardware/ (also used by
| Phoronix).
| DecentShoes wrote:
| ...Spotify wrapped kinda?
| atYevP wrote:
| Yev from Backblaze here -> When we started this, we did so with
| the intent of sharing data and hoping that others would do the
| same. We see glimmers of that here and there, but it's still so
| fun for us to do, and we're expanding on it with Networking
| Stats and some additional content coming soon, including how
| these stats inform our infrastructure deployment.
| Fun stuff :)
| theandrewbailey wrote:
| > I have been authoring the various Drive Stats reports for the
| past ten years and this will be my last one. I am retiring, or
| perhaps in Drive Stats vernacular, it would be "migrating."
|
| Thank you for all these reports over the years.
| ganoushoreilly wrote:
| They really have been great, the bar was set high!
| fyrabanks wrote:
| I almost cannot believe I've been reading these for 10 years
| now.
| mhh__ wrote:
| So long and thanks for all the disks!
| atYevP wrote:
| Yev here -> That's what I said!!! Great minds :P
| stego-tech wrote:
| Seriously. In addition to helping inform my purchasing
| decisions, these reports also taught me the most valuable
| lesson of data: that it can only ever inform most likely
| patterns, but never guarantee a specific outcome. Until you act
| upon it, the data has no intrinsic value, and once you act upon
| it, it cannot guarantee the outcome you desire.
|
| Thanks for a decade of amazing statistics and lessons. Enjoy
| resilvering into retirement!
| ddmf wrote:
| They've been very informative, thanks.
| gosub100 wrote:
| Thanks for dutifully providing us this info on (many a) silver
| platter.
| Mistletoe wrote:
| Remember when Seagate sponsored the /r/datahoarder subreddit
| instead of making better hard drives?
| gruez wrote:
| Source?
| KomoD wrote:
| Nope
| louwrentius wrote:
| When I started my current 24-bay NAS more than 10 years ago, I
| specifically looked at the Backblaze drive stats (which were a
| new thing at that time) to determine which drives to buy (I chose
| 4TB 7200rpm HGST drives).
|
| My Louwrentius stats are: zero drive failures over 10+ years.
|
| Meanwhile, the author (Andy Klein) of Backblaze Drive Stats
| mentions he is retiring; I wish him well, and thanks!
|
| PS. The data on my 24-drive NAS would fit on two modern 32TB
| drives. Crazy.
| sys32768 wrote:
| I had five Seagates fail in my Synology NAS in less than a year.
| Somebody suggested it was a "bad" firmware on that model, but I
| switched to WD and haven't had a single failure since.
| BonoboIO wrote:
| Exos Series?
|
| I never had problems with Seagate Exos or WD Red or even the WD
| shucked White Reds.
|
| It's interesting how different the experiences are; some swear
| by a specific brand.
| buckle8017 wrote:
| Unfortunately, using all the same type of drive in any kind of
| system is a recipe for disaster.
|
| Incompatibilities between the drive firmware and the device
| they're in can cause problems.
|
| Subtle harmonic issues with how the drives are mounted, which
| might be fine for some drives and disastrous for others.
|
| I've always found the best strategy with mechanical hard drives
| is to have various brands and models in the same device on
| RAID.
| zie wrote:
| This. I don't care about brand or model or anything. I care
| about interface/speed requirements and then $/size.
|
| Drives are interchangeable for a reason. :)
| ganoushoreilly wrote:
| Did you purchase them all at the same time from the same store?
| I've had a batch of SSDs fail from the same vendor / mfg
| timeframe. I started ordering a couple here and there from
| different vendors where possible. So far I've been lucky to get
| drives that aren't from the same batches. I tend to buy Exos
| from seagate and WD gold though so there's a bit of a premium
| tacked on.
| sys32768 wrote:
| No, that's the weird thing. Even the RMA models were failing.
| And it wasn't just some incompatibility with the NAS: I
| tested them on PCs to confirm, and sure enough they were
| failing there too.
| Bayaz wrote:
| I had a similar experience. I ordered four EXOS drives
| three years ago and one of them came DOA. They had to send
| me three more drives before I got a working one. I'm amazed
| they're all still happily humming away in a Synology.
| emmelaich wrote:
| What models? There's a big difference between the cheapest and
| the more pro models.
|
| That said, my four 2TB Barracudas are still going fine after
| many years (10+). One failed and was replaced with a green; big
| mistake, that one failed quickly and I went back to standard
| Barracudas.
|
| They don't get used intensely though.
| sys32768 wrote:
| 8TB Ironwolf NAS ST8000VN004
| esskay wrote:
| I've had terrible luck with those drives, out of 12, 10
| failed within a couple of years. Not a same batch issue as
| they were purchased over the period of about 6-8 months and
| not even from the same place.
|
| Yet I've got Toshibas that run hot and are loud as heck
| that seem to keep going forever.
| KPGv2 wrote:
| This will probably jinx me, but I've had so many drives, many
| purchased on the cheap from Fry's Black Friday sales when I was
| a poor university student, and the two drives I've ever had
| fail since I started buying over twenty years ago were
|
| 1. catastrophic flood in my apartment when drive was on the
| ground
|
| 2. a drive in an external enclosure on the ground that I kicked
| by mistake while it was spinning
|
| I'm glad I've never had y'all's problems.
| jdhawk wrote:
| I wish there was a way to underspin (RPM) some of these drives to
| lower noise for non-datacenter use - the quest for the Largest
| "Quiet" drive - is a hard one. It would be cool if these could
| downshift into <5000RPM mode and run much quieter.
| zootboy wrote:
| I wonder if that's even technically possible these days. Given
| the fact that the heads have to float on the moving air (or
| helium) produced by the spinning platter, coupled with modern
| data densities probably making the float distance tolerance
| quite small, there might be a very narrow band of rotation
| speeds that the heads require to correctly operate.
| jdhawk wrote:
| yeah - valid point. it seems like they all moved past 5400RPM
| at the 14TB level.
| ecliptik wrote:
| It's not a best practice, but the last 10 years I've run my home
| server with a smaller faster drive for the OS and a single larger
| disk for bulk storage that I choose using Backblaze Drive Stats.
| None have failed yet (fingers crossed). I really trust their
| methodology and it's an extremely valuable resource for me as a
| consumer.
|
| My most recent drive is a WDC WUH722222ALE6L4 22TiB, and looking
| at the stats (albeit only a few months of data), and overall
| trend of WDC, in this report gives me peace of mind that it
| should be fine for the next few years until it's time for the
| cycle to repeat.
| kridsdale1 wrote:
| No RAID 0 for the bulk storage? What's your disaster plan?
| ecliptik wrote:
| restic + rclone to cloud storage for data I care about, the
| majority of the data can easily be replaced if needed.
| manosyja wrote:
| That's exactly how I do it.
| SteveNuts wrote:
| Surely you mean RAID 1? Or 5, 6, 10 perhaps?
| BSDobelix wrote:
| A disaster plan always means backup (away from your location)
| or out-of-house replication. RAID is NOT a backup but part of
| a system to keep uptime high and hands-on work low (like
| redundant power supplies).
|
| Disaster = Your DC or Cellar is flooded or burned down ;)
| qskousen wrote:
| I'm sure you're aware, but consider putting another drive in
| for some flavor of RAID; it's usually a lot easier to rebuild
| a RAID than to rebuild data!
|
| Edit: By "some flavor" I mean hardware or software.
| walrus01 wrote:
| RAID doesn't cover all of the same scenarios as offsite backup,
| such as massive electrical power surge, fire, flood, theft or
| other things causing total destruction of the RAID array.
| Ideally you'd want a setup that has local storage redundancy
| in some form of RAID _and_ offsite backup.
| bombcar wrote:
| In fact for home users backup is WAY more important than
| RAID, because your NAS being down for a (restore time) is
| not that important, but data loss is forever.
| didntcheck wrote:
| For essential personal data you're right, but a very
| common use case for a home NAS is a media server. The
| library is usually non-essential data - _annoying_ to
| lose, but not critical. Combined with its large size, it's
| usually hard to justify a full offsite backup. RAID
| offers a cost-effective way to give it _some_ protection,
| when the alternative is nothing.
| walrus01 wrote:
| For a number of people I know, they don't do any offsite
| backup of their home media server. If a bunch of movies and
| music disappeared overnight, it wouldn't cause any catastrophic
| personal or financial hassle, or any real data loss.
|
| The amount of personally generated sensitive data that
| doesn't fit on a laptop's onboard storage (which should
| all be backed up offsite as well) will usually fit on
| like a 12TB RAID-1 pair, which is easier to back up than
| 40TB+ of movies.
| t0mas88 wrote:
| Same here, I use raid 1 with offsite backups for my
| documents and things like family pictures. I don't back up
| downloaded or ripped movies and TV shows; I just redownload
| them or dig the Blu-ray out of the attic if needed.
| dharmab wrote:
| Having to restore my media server without a backup would
| cost me around a dozen hours of my time. 2 bucks a month
| to back up to Glacier with rclone's crypt backend is
| easily worth it.
| code_biologist wrote:
| How are you hitting that pricing? S3 "Glacier Deep
| Archive"?
|
| Standard S3 is $23/TB/mo. Backblaze B2 is $6/TB/mo. S3
| Glacier Instant or Flexible Retrieval is about $4/TB/mo.
| S3 Glacier Deep Archive is about $1/TB/mo.
|
| I take it you have ~2TB in deep archive? I have 5TB in
| Backblaze and I've been meaning to prune it way down.
|
| Edit: these are raw storage costs and I neglected
| transfer. Very curious as my sibling comment mentioned
| it.
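|
| For anyone checking the arithmetic, a rough sketch in Python
| (list prices as quoted above, raw storage only, transfer
| excluded):
|
|     # Back-of-envelope monthly storage cost, list prices only
|     PRICE_PER_TB_MONTH = {
|         "s3_standard": 23.0,
|         "b2": 6.0,
|         "glacier_flexible": 4.0,
|         "glacier_deep_archive": 1.0,
|     }
|
|     def monthly_cost(tb: float, tier: str) -> float:
|         return tb * PRICE_PER_TB_MONTH[tier]
|
|     print(monthly_cost(2, "glacier_deep_archive"))  # ~$2/mo
|     print(monthly_cost(5, "b2"))                    # ~$30/mo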
| dharmab wrote:
| Yup, deep archive on <2TB, which is more content than
| most people watch in a lifetime. I mostly store content
| in 1080p as my vision is not good enough to notice the
| improvement at 4K.
| crazygringo wrote:
| Have you checked the costs for _restoring_ from Glacier?
|
| It's not the backing up part that's expensive.
|
| I would not be surprised if you decided to spend the
| dozen hours of your time after all.
| t0mas88 wrote:
| AWS Glacier removed the retrieval pricing issue for most
| configurations, but the bandwidth costs are still there.
| You pay $90 to retrieve 1 TB.
| dharmab wrote:
| The retrieval cost is less than 1 hour of my time and I
| expect less than 10% chance I'll ever need it.
| wing-_-nuts wrote:
| I think there's a very strong case to be made for
| breaking up your computing needs into separate devices
| that specialize in their respective niche. Last year I
| followed the 'PCMR' advice and dropped thousands of
| dollars on a beefy AI/ML/Gaming machine, and it's been
| great, but I'd be lying to you if I didn't admit that I'd
| have been better served taking that money and buying a
| lightweight laptop, a NAS, and gaming console. I'd have
| enough money left over to rent whatever I needed on
| runpod for AI/ML stuff.
| numpad0 wrote:
| That assumes disks never age out and arrays always
| rebuild fine. That's not guaranteed at all.
| ipsento606 wrote:
| For the home user backing up their own data, I honestly
| think that RAID has limited utility.
|
| If I have 3 disks to devote to backup, I'd rather have 1
| local copy and two remote copies, vs 1 local copy with RAID
| and 1 remote copy without.
| dgemm wrote:
| It's super useful for maintenance, for example you can
| replace and upgrade the drives in place without
| reinstalling the system.
| t0mas88 wrote:
| If it's infrequently accessed data then yes, but for a
| machine that you use every day it's nice if things keep
| working after a failure and you only need to plug in a
| replacement disk. I use the same machine for data storage
| and for home automation for example.
|
| The third copy is in the cloud, write/append only. More
| work and bandwidth cost to restore, but it protects
| against malware or fire. So it's for a different
| (unlikely) scenario.
| gruez wrote:
| >It's not a best practice, but the last 10 years I've run my
| home server with a smaller faster drive for the OS and a single
| larger disk for bulk storage that I choose using Backblaze
| Drive Stats. None have failed yet (fingers crossed). I
| really trust their methodology and it's an extremely valuable
| resource for me as a consumer.
|
| I've also had multiple drives in operation over the past decade
| and didn't experience any failures. However, unlike you, I
| didn't use Backblaze's drive stats to inform my purchase. I just
| bought whatever was cheapest, knowing that any TCO reduction
| from higher reliability (at best, around 10%) would be eaten up
| by the lack of discounts on the "best" drive. That's the problem with
| n=1 anecdotes. You don't know whether nothing bad happened
| because you followed "the right advice", or you just got lucky.
| eatbitseveryday wrote:
| > WDC WUH722222ALE6L4 22TiB
|
| Careful... that is 22 TB, not 22 TiB. Disk marketing still uses
| base 10. TiB is base 2.
|
| 22 TB = 20 TiB
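|
| The conversion, as a quick Python check:
|
|     tb = 22 * 10**12         # marketing terabytes (10^12 bytes)
|     tib = tb / 2**40         # binary tebibytes (2^40 bytes)
|     print(f"{tib:.1f} TiB")  # -> 20.0 TiB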
| CTDOCodebases wrote:
| Take these stats with a grain of salt.
|
| I am becoming more and more convinced that hard drive
| reliability is linked to the batch more than to the individual
| drive models themselves. Often you will read online of people
| experiencing multiple failures from drives purchased from the
| same batch.
|
| I cannot prove this because I have no idea about Backblaze's
| procurement patterns but I bought one of the better drives in
| this list (ST16000NM001G) and it failed within a year.
|
| When it comes to hard drives, or storage more generally, a
| better approach is to protect yourself against downtime with
| software RAID and backups, and pray that if a drive does fail
| it does so within the warranty period.
| bragr wrote:
| >Often you will read online of people experiencing multiple
| failures from drives purchased from the same batch
|
| I'll toss in on that anecdata. This has happened to me
| several times. In all these cases we were dealing with drives
| with more or less sequential serial numbers. In two instances
| they were just cache drives for our CDN nodes. Not a big
| deal, but I sure kept the remote hands busy those weeks
| trying to keep enough nodes online. In a prior job, it was
| our primary storage array. You'd think that RAID6+hot spare
| would be pretty robust, but 3 near simultaneous drive
| failures made a mockery of that. That was a bad day. The hot
| spare started doing its thing with the first failure, and if
| it had finished rebuilding before the subsequent failures,
| we'd have been ok, but alas.
| tharkun__ wrote:
| This has been the "conventional wisdom" for a very long
| time. Is this one of those things that get "lost with time"
| and every generation has to rediscover it?
|
| Like, 25+ years ago I would've bought hard drives for just
| my personal usage in a software RAID, making sure I _don't_
| get consecutive serial numbers, but ones that are very
| different. I'd go to my local hardware shop and ask them
| specifically for that. They'd show me the drives / serial
| numbers before I ever even bought them for real.
|
| I even used different manufacturers at some point when they
| didn't have non consecutive serials. I lost some storage
| because the drives weren't exactly the same size even
| though the advertised size matched, but better than having
| the RAID and extra cost be for nothing.
|
| I can't fathom how anyone that is running drives in actual
| production wouldn't have been doing that.
| itchyouch wrote:
| Exactly this.
|
| I mostly just buy multiple brands from multiple vendors.
| And size the partitions for mdadm a bit smaller.
|
| But even with the same model, getting 2 each from Best Buy,
| Amazon, Newegg, and Micro Center seems to get me a nice
| assortment.
| lazide wrote:
| It's inconvenient compared to just ordering 10x or
| however many of the same thing and not caring. The issue
| with variety too is different performance characteristics
| can make the array unpredictable.
|
| Of course, learned experience has value in the long term
| for a reason.
| Aachen wrote:
| I had to re-learn this as well. Nobody told me. Ordered
| two drives, worked great in tandem until their
| simultaneous demise. Same symptoms at the same time
|
| I rescued what could be rescued at a few KB/s read speed
| and then checked the serial numbers...
| tempest_ wrote:
| I personally like to get 1 of every animal if I can.
|
| I just get 1/3 Toshiba, 1/3 WD, 1/3 Seagate.
| FuriouslyAdrift wrote:
| Nearly every storage failure I've dealt with has been
| because of a failed RAID card (except for thousands of
| bad Quantum Bigfoot hard drives at IUPUI).
|
| Moving to software storage systems (ZFS, Storage Spaces,
| etc.) has saved my butt so many times.
| wil421 wrote:
| Same thing I did except I only wanted WD Red drives. I
| bought them from Amazon, Newegg, and Micro center.
| Thankfully none of them were those nasty SMR drives, not
| sure how I lucked out.
| dapperdrake wrote:
| My server survived multiple drive failures. ZFS on FreeBSD
| with mirroring. Simple. Robust. Effective. Zero downtime.
|
| Don't know about disk batches, though. Took used old second
| hand drives. (Many different batches due to procurement
| timelines.) Half of them were thrown out because they were
| clicky. All were tested with S.M.A.R.T. Took about a week.
| The ones that worked are mostly still around. Only a third of
| the ones that survived S.M.A.R.T. have failed so far.
| CTDOCodebases wrote:
| I didn't discover ZFS until recently. I played around with
| it on my HP Microserver around 2010/2011 but ultimately
| turned away from it because I wasn't confident I could
| recover the raw files from the drives if everything went
| belly up.
|
| What's funny is that about a year ago I ended up installing
| FreeBSD onto the same Microserver and ran a 5 x 500GB
| mirror for my most precious data. The drives were ancient
| but not a single failure.
|
| As someone who never played with hardware raid ZFS blows my
| mind. The drive that failed was a non-issue because it
| belonged to a pool with a single vdev (a 4-disk
| mirror). Due to the location of the server I had to shut
| down the system to pull the drive but yeah I think that was
| 2 weeks later. If this was the old days I would have had to
| source another drive and copy the data over.
| tempest_ wrote:
| ZFS is like magic.
|
| Every time I think I might need a feature in a file
| system it seems to have it.
| Kelvin506 wrote:
| IME heat is a significant factor with spindle drives. People
| will buy enterprise-class drives, then stick them in
| enclosures and computer cases that don't flow much air over
| them, leading to the motor and logic board getting much warmer
| than they should.
| MrDrMcCoy wrote:
| Heat is also a problem for flash. If you care about your
| data, you have to keep it cool and redundant.
| chuckledog wrote:
| This. My new Samsung T7 SSD overheated and took 4T of
| kinda priceless family photos with it. Thank you
| Backblaze for storing those backups for us! I missed the
| return window on the SSD so now have a little fan running
| to keep the thing from overheating again
| CTDOCodebases wrote:
| I have four of those drives mentioned and the one that did
| fail had the highest maximum temperature according to the
| SMART data. It was still within the specs though by about 6
| degrees Celsius.
|
| The drives are spaced apart by empty drive slots and have a
| 12cm case fan cranked to max blowing over them at all times.
|
| It is in a tower though so maybe it was bumped at some time
| and that caused the issue. Being in the top slot this would
| have had the greatest effect on the drive. I doubt it
| though.
|
| Usage is low and the drives are spinning 24/7.
|
| Still I think I am cursed when it comes to Seagate.
| 10729287 wrote:
| This is why it's best practice to buy your drives from
| different dealers when setting up RAID.
| cm2187 wrote:
| Well to me the report is mostly useful to illustrate the
| volatility of hard drive failure. It isn't a particular
| manufacturer or line of disks, it's all over the place.
|
| By the time Backblaze has a sufficient number of a particular
| model and sufficient time lapsed to measure failures, the
| drive is an obsolete model, so the report cannot really
| inform my decision for buying new drives. These are new drive
| stats, so not sure it is that useful for buying a used drive
| either, because of the bathtub shaped failure rate curve.
|
| So the conclusion I take from this report is that when a new
| drive comes out, you have no way to tell if it's going to be
| a good model or a good batch, so it's better to stop worrying
| and plan for failure instead, because you could get a
| bad/damaged batch of even the best models.
| deelowe wrote:
| > I am becoming more and more convinced that hard drive
| reliability is linked to the batch more than to the
| individual drive models themselves.
|
| Worked in a component test role for many years. It's all of
| the above. We definitely saw significant differences in AFR
| across various models, even within the same product line,
| which were not specific to a batch. Sometimes simply having
| more or less platters can be enough to skew the failure rate.
| We didn't do in-depth forensics on models with higher AFRs, as
| we'd just disqualify them and move on, but I always assumed
| it probably had something to do with electrical, mechanical
| (vibration/harmonics) or thermal differences.
| whalesalad wrote:
| hopefully you have 2x of these drives in some kind of raid
| mirror such that if one fails, you can simply replace it and
| re-mirror. not having something like this is risky.
| hypothesis wrote:
| Wasn't the issue with large drives that the remaining drive
| has a high chance of failure during resilvering?
| fc417fc802 wrote:
| If you're doing statistics to plan the configuration of a
| large cluster with high availability, then yes. For home
| use where failures are extremely rare, no.
|
| Home use is also much more likely to suffer from unexpected
| adverse conditions that impact all the drives in the array
| simultaneously.
| dapperdrake wrote:
| Just triple mirror with cheap drives from different
| manufacturers.
| HankB99 wrote:
| That may be true for pools that never get scrubbed. Or for
| management that doesn't watch SMART stats in order to catch
| a situation before it degrades to the point where one drive
| fails and another is on its last legs.
|
| With ZFS on Debian the default is to scrub monthly (second
| Sunday) and resilvering is not more stressful than that.
| Only the allocated space (not the entire drive contents) has
| to be read to resilver.
|
| Also define "high chance." Is 10% high? 60%? I've replaced
| failed drives or just ones I wanted to swap to a larger
| size at least a dozen times and never had a concurrent
| failure.
| loeg wrote:
| I switched to TLC flash last time around and no regrets. With
| QLC the situations where HDDs are cheaper, including the cost
| of power, are growing narrower and narrower.
| MrDrMcCoy wrote:
| It really depends on your usage patterns. Write-heavy
| workloads are still better cases for spinning rust due to how
| much harder they are on flash, especially at greater layer
| depths.
| Aachen wrote:
| Plus that SSDs apparently have a very dirty manufacturing
| process, worse than the battery or screen in your laptop. I
| recently learned this because the EU is starting to require
| reporting CO2e for products (mentioned on a Dutch podcast:
| https://tweakers.net/geek/230852/tweakers-podcast-356-switch...).
| I don't know how a hard drive
| stacks up but if the SSD is the worst of all of a laptop's
| components, odds are that it's better and so one could make
| the decision to use one or the other based on whether an
| SSD is needed rather than just tossing it in because it's
| cheap
|
| Probably it also matters if you get a bulky 3.5" HDD when
| all you need is a small flash chip with a few GB of
| persistent storage -- the devil is in the details but I
| simply didn't realise this could be a part of the decision
| process
| mercutio2 wrote:
| If this is really a significant concern for you, are you
| accounting for the CO2e of the (very significant)
| difference in energy consumption over the lifetime of the
| device?
|
| It seems unlikely to me that in a full lifecycle
| accounting the spinning rust would come out ahead.
| Aachen wrote:
| The figure already includes the lifetime energy
| consumption and it's comparatively insignificant. The
| calculation even includes expected disposal and
| recycling!
|
| It sounded really comprehensive besides having to make
| assumptions about standard usage patterns, but then the
| usage is like 10% of the lifetime emissions so it makes a
| comparatively small difference if I'm a heavy gamer or
| leave it to sit and collect dust: 90% remains the same
|
| > If this is really a significant concern for you
|
| It literally affects everyone I'm afraid and simply not
| knowing about it (until now) doesn't stop warming either.
| Yes, this concerns everyone, although not everyone has
| the means to do something about it (like to buy the
| cleaner product)
| pjdesno wrote:
| Um, no. Not unless you're still running ancient sub-1TB
| enterprise drives.
|
| It turns out that modern hard drives have a specified
| workload limit [1] - this is an artifact of heads being
| positioned at a low height (<1nm) over the platter during
| read and write operations, and a "safe" height (10nm?
| more?) when not transferring data.
|
| For an 18TB Exos X18 drive with a specified workload of
| 550TB read+write per year, assuming a lifetime of 5
| years[2] and that you never actually read back the data you
| wrote, this would be at max about 150 drive overwrites, or
| a total of 2.75PB transferred.
|
| In contrast the 15TB Solidigm D5-P5316, a read-optimized
| enterprise QLC drive, is rated for 10PB of random 64K
| writes, and 51PB of sequential writes.
|
| [1] https://products.wdc.com/library/other/2579-772003.pdf
|
| [2] the warranty is 5 years, so I assume "<550TB/yr" means
| "bad things might happen after 2.75PB". It's quite possible
| that "bad things" are a lot less bad than what happens
| after 51PB of writes to the Solidigm drive, but if you
| exceed the spec by 18x to give you 51PB written, I would
| assume it would be quite bad.
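|
| Spelled out, with the same numbers as above (Python):
|
|     capacity_tb = 18            # Exos X18
|     workload_tb_year = 550      # rated read+write workload
|     years = 5                   # warranty period, per [2]
|
|     lifetime_tb = workload_tb_year * years   # 2750 TB = 2.75 PB
|     overwrites = lifetime_tb / capacity_tb   # ~153 full passes
|     print(lifetime_tb, round(overwrites))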
| pjdesno wrote:
| ps: the white paper is old, I think head heights were 2nm
| back then. I'm pretty sure <1nm requires helium-filled
| drives, as the diameter of a nitrogen molecule is about
| 0.3nm
| Dylan16807 wrote:
| Nobody should ever have peace of mind about a single drive. You
| probably have odds around 5% that the storage drive fails each
| cycle, and another 5% for the OS drive. That's significant.
|
| And in your particular situation, 3 refurbished WUH721414ALE6L4
| are the same total price. If you put those in RAIDZ1 then
| that's 28TB with about as much reliability as you can hope to
| have in a single device. (With backups still being important
| but that's a separate topic.)
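|
| For anyone wondering where a ballpark like 5% comes from, a
| sketch (my own illustrative assumptions: ~1.3% AFR and a
| 4-year replacement cycle):
|
|     afr = 0.013    # assumed annualized failure rate
|     years = 4      # assumed length of one upgrade cycle
|
|     p_drive = 1 - (1 - afr) ** years       # ~5.1% per drive
|     p_either = 1 - (1 - p_drive) ** 2      # ~10% for OS or data
|     print(f"{p_drive:.1%} {p_either:.1%}")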
| meindnoch wrote:
| >You probably have odds around 5% that the storage drive
| fails each cycle
|
| What do you mean by cycle?
| Dylan16807 wrote:
| "My most recent drive [...] it should be fine for the next
| few years until it's time for the cycle to repeat."
|
| The amount of time they stay on a single drive.
| deelowe wrote:
| Drive manufacturers often publish the AFR. From there you
| can do the math to figure out what sort of redundancy you
| need. Rule of thumb is that the AFR should be in the 1-2%
| range. I haven't looked at BB's data, but I'm sure it
| supports this.
|
| Note, disk failure rates and RAID or similar solutions
| should be used when establishing an availability target,
| not for protecting against data loss. If data loss is a
| concern, the approach should be to use backups.
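|
| A sketch of that math (illustrative numbers, not Backblaze's):
|
|     afr = 0.015          # assumed 1.5% annualized failure rate
|     rebuild_days = 2     # assumed resilver time after a failure
|
|     # Chance the second copy dies during the rebuild window:
|     p_second = afr * rebuild_days / 365
|     # Annual chance a 2-way mirror loses both copies
|     # (ignores correlated/batch failures, which the thread
|     # above suggests matter a lot in practice):
|     p_mirror_loss = 2 * afr * p_second
|     print(f"{p_mirror_loss:.6%}")   # tiny vs ~1.5% for one drive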
| Dylan16807 wrote:
| You picked a weird place to reply, because that comment
| is just saying what "cycle" means.
|
| But yes, I've done the math. I'm just going with the BB
| numbers here, and after a few years it adds up. The way I
| understand "peace of mind", you can't have it with a
| single drive. Nice and simple.
| Macha wrote:
| My understanding is that with the read error rate and
| capacity of modern hard drives, statistically you can't
| reliably rebuild a raid5/raidz1
| turtletontine wrote:
| Not an expert but I've heard this too. However - if this IS
| true, it's definitely only true for the biggest drives,
| operating in huge arrays. I've been running a btrfs raid10
| array of 4TB drives as a personal media and backup server
| for over a year, and it's been going just fine. Recently
| one of the cheaper drives failed, and I replaced it with a
| higher quality NAS-grade drive. Took about 2 days to rebuild
| the array, but it's been smooth sailing.
| Dylan16807 wrote:
| The bit error rates on spec sheets don't make much sense,
| and those analyses are wrong. You'd be unable to do a
| single full drive write and read without error, and with
| normal RAID you'd be feeding errors to your programs all
| the time even when no drives have failed.
|
| If you're regularly testing your drive's ability to be
| heavily loaded for a few hours, you don't have much chance
| of failure during a rebuild.
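|
| The arithmetic those analyses take literally, for a 12TB drive
| and the common 1-per-1e14-bits spec (sketch):
|
|     drive_bytes = 12 * 10**12
|     bits_read = drive_bytes * 8      # ~9.6e13 bits per full pass
|     ure_per_bit = 1e-14              # spec-sheet URE rate
|     print(bits_read * ure_per_bit)   # ~0.96 expected errors
|
| Read literally, nearly every full-drive scrub would hit an
| error, which isn't what healthy drives actually do; the spec
| number is better read as a pessimistic bound.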
| _huayra_ wrote:
| I end up doing this too, but ensure that the "single data disk"
| is regularly backed up offsite too (several times a day, zfs
| send makes it easy). One needs an offsite backup anyway, and as
| long as your home server data workload isn't too high and you
| know how to restore (which should be practiced every so often),
| this can definitely work.
| Macha wrote:
| My home NAS drives are currently hitting the 5 years mark. So far
| I'm at no failures, but I'm considering if it's time to
| upgrade/replace. What I have is 5 x 4TB pre-SMR WD Reds (which
| are now called the WD Red Pro line I guess). Capacity wise I've
| got them setup in a RAID 6, which gives me 12TB of usable
| capacity, of which I currently use about 7.5TB.
|
| I'm basically mulling between going as-is to SSDs in a similar
| 5x4TB configuration, or just going for 20TB hard drives in a RAID
| 1 configuration and a pair of 4TB SATA SSDs in a RAID 1 for use
| cases that need better-than-HDD performance.
|
| These figures indicate Seagate is improving in reliability, which
| might be worth considering this time given WD's actions in the
| time since my last purchase, but on the other hand I'd basically
| sworn off Seagate after a wave of drives in the mid-2010s with a
| near 100% failure rate within 5 years.
| vednig wrote:
| Backblaze is one of the most respected services in the storage
| industry; they've kept gaining my respect even after I launched
| my own cloud storage solution.
| atYevP wrote:
| Yev from Backblaze here -> thank you so much!
| bloopernova wrote:
| Google sells 2TB of space on Google drive for $10/month. I'm
| looking to move my data elsewhere.
|
| Can anyone recommend a European based alternative with a roughly
| similar cost?
| pranaysy wrote:
| Hetzner!
| staindk wrote:
| OneDrive space (through MS365 single or family licence) works
| out much cheaper in my country. I'm sure in the EU it is GDPR-
| compliant.
|
| YMMV but OneDrive has been improving a lot. Their web photos
| browsing is comparable to Google Photos these days.
| homarp wrote:
| linux sync works?
| MrDrMcCoy wrote:
| It's supported by rclone.
| guerby wrote:
| hetzner storage box $4 per month for 1 TB and $13 for 5 TB.
| bloopernova wrote:
| Good lord it even supports BorgBackup.
|
| Thank you very much!
| lukaslalinsky wrote:
| Be aware that it's just a single server. It's not
| replicated across multiple hosts like in the case of Google
| drive. So you definitely want a backup of that if it's your
| primary copy.
| bloopernova wrote:
| Good point, thank you.
|
| It may actually be a good thing that it's not replicated.
| That forces me to really make sure I have a separate
| backup elsewhere.
| anotherhue wrote:
| US -> DE latency hurts though.
|
| I used them when I was in Europe but migrated away after I
| came stateside.
|
| Not a problem for cold-storage/batch jobs of course.
| bloopernova wrote:
| Good to know, thank you!
| echoangle wrote:
| I'm assuming bloopernova is based in Europe, so latency
| should be fine. At least they asked for a European-based
| hoster (although that could also theoretically be for
| privacy reasons).
| jillyboel wrote:
| you should still be able to saturate your bandwidth with
| poor latency
| anotherhue wrote:
| Not unless the protocol you use accounts for that. SMB,
| for instance, is tragic.
| jillyboel wrote:
| True, I was thinking of backup tools which tend to
| consider poor latency in their design.
| loeg wrote:
| Hard to argue with those WDC/Toshiba numbers. Seagate's are just
| embarrassing in contrast.
|
| (HGST drives -- now WDC -- were great, but those are legacy
| models. HGST has been part of WD for some time; the new models
| are WDC-branded.)
| RachelF wrote:
| ...and many used Seagate drives have been resold as new in the
| last 3 years. They were used for crypto mining and then had
| their SMART parameters wiped back to "new" 0 hours usage.
|
| https://www.heise.de/en/news/Hard-disk-fraud-Increasing-evid...
| loeg wrote:
| Sure, but it's hard to blame that on Seagate. The AFR is
| their fault.
| michaelt wrote:
| It's not Seagate's fault, but it would behove them to clamp
| down on such activity by authorised resellers.
|
| After all, it's not just the buyer getting ripped off; it's
| also Seagate. A customer paid for a brand new Seagate drive
| and Seagate didn't see a penny of it.
| loeg wrote:
| Yes. Nothing would lead me to believe Seagate _isn't_
| fervently working to shut down fraudulent resellers
| behind the scenes.
| numpad0 wrote:
| Seagate has always been the "you get what you pay for" &&
| high replacement availability option, at least since the Thai
| flood and ST3000DM001 days - they kept shipping drives. It
| was always HGST > Toshiba > Seagate in both price and MTBF,
| with WD somewhere in between.
| rogerrogerr wrote:
| Is crypto mining a high storage IO operation? I always
| thought it was hard on CPU and RAM, but not on disk IO.
| userbinator wrote:
| Seagate seems to be very much hit-or-miss.
| bhouston wrote:
| Based on the data, it seems they have 4.4 petabytes of storage
| under management. Neat.
|
| https://docs.google.com/spreadsheets/d/1E4MS84SbSwWILVPAgeIi...
| selectodude wrote:
| Exabytes. 4.4 exabytes.
| userbinator wrote:
| An amazing amount if you consider that 16EB is the amount of
| data a 64-bit quantity can address, and this is over a
| quarter of that.
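|
| As a quick check (Python):
|
|     address_space = 2**64       # 16 EiB, ~18.4 EB
|     fleet_bytes = 4.4e18        # ~4.4 EB from the spreadsheet
|     print(fleet_bytes / address_space)  # ~0.24 of the space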
| dekhn wrote:
| There was a dashboard where the total storage at Google was
| tracked and they had to update it from 64 bits for this
| reason... about a decade or more ago.
| echoangle wrote:
| Wow, that's a cool stat. I wonder if people will ever
| seriously use 16EB of memory in a single system and will
| need to change to a more-than-64-bit architecture or if 64
| bit is truly enough. This has "640k ought to be enough for
| anybody" potential (and I know he didn't say that).
| rwmj wrote:
| From 2011:
| https://rwmj.wordpress.com/2011/10/03/when-will-disk-sizes-g...
|
| nbdkit can emulate disks up to 2^63-1 which is also the
| same maximum size that the Linux kernel currently
| supports:
| https://rwmj.wordpress.com/2018/09/05/nbdkit-for-loopback-pt...
| https://rwmj.wordpress.com/2018/09/06/nbdkit-for-loopback-pt...
| fc417fc802 wrote:
| > need to change to a more-than-64-bit architecture
|
| Perhaps we don't need a single flat address space with
| byte-addressable granularity at those sizes?
|
| I wonder how an 8 bit byte, 48 bit word system would have
| fared. 2^32 is easy to exhaust in routine tasks; 2^48 not
| so much.
| wtallis wrote:
| Until Intel's _Ice Lake_ server processors introduced in
| 2019, x86-64 essentially _was_ a 48-bit address
| architecture: addresses are stored in 64-bit registers,
| but were only valid if the top two bytes were sign-
| extended from the last bit of the 48-bit address. Now
| they support 57 bit addressing.
| zipy124 wrote:
| I believe google surpass this metric, specifically
| because of things like android phone back-ups, google
| photos, drive, youtube etc....
| pabs3 wrote:
| RISC-V has got us covered with the RV128 variant.
| m3nu wrote:
| 4,414,142 TB = 4.4 Exabyte
| theandrewbailey wrote:
| A petabyte is 1,000 terabytes. 4.4 petabytes wouldn't come
| anywhere near Backblaze's storage needs.
| remram wrote:
| Nowadays you can get a petabyte in a single machine (50 drives
| 20TB each).
| malfist wrote:
| Out of curiosity, what server cases can actually accommodate
| 50 drives?
| walrus01 wrote:
| If you google "supermicro 72 drive server" it's definitely
| a thing that exists, but these use double-length drive
| trays where each tray contains two drives. Meaning that you
| need a "whole machine can go down" software architecture of
| redundancy at a very large scale to make these useful,
| since pulling one tray to replace a drive will take two
| drives offline. More realistically the normal version of
| the same supermicro chassis which has 1 drive per tray is
| 36 drives in 1 server.
|
| There are other less publicly well known things with 72 to
| 96 drive trays in a single 'server' which are manufactured
| by Taiwanese OEMs for large-scale operators. The Supermicro
| is just the best visual example I can think of right now
| with a well laid out marketing webpage.
|
| edit: some photos
|
| https://www.servethehome.com/supermicro-ssg-6047r-e1r72l-72x...
| Lammy wrote:
| > Meaning that you need a "whole machine can go down"
| software architecture of redundancy at a very large scale
| to make these useful
|
| Also some serious cooling to avoid the drives in the
| front cooking the drives in the back (assuming front-to-
| back airflow).
| numpad0 wrote:
| You don't LEGO-assemble rackmount servers. Chassis come
| with a figurative array of jet engines: 12V/0.84A-ish
| fans that generate a characteristic ecstatic harmony.
| They're designed, supposedly, to take 35C air to keep
| drives in front at 40C and GPUs at back <95C.
| Lammy wrote:
| > You don't LEGO assemble rackmount servers.
|
| You may not, but plenty of people do.
| remram wrote:
| Those are specialized NAS chassis. We have a number of
| them, 4U size, too heavy to move when the drives are in.
|
| edit: They look like this:
| https://knowledgebase.45drives.com/wp-content/uploads/2019/0...
| (image from ddg)
| fc417fc802 wrote:
| > too heavy to move when the drives are in
|
| Careful not to drop it.
| manquer wrote:
| https://www.backblaze.com/cloud-storage/resources/storage-po...
|
| This is what Backblaze themselves currently use; it can
| hold 60 drives.
| Lammy wrote:
| iStarUSA are my go-to for my whitebox server builds, and
| they sell a 9U 50-drive hot-swap enclosure:
| https://www.istarusa.com/en/istarusa/products.php?model=E9M5...
| justsomehnguy wrote:
| aic j4108 for 108 drives.
|
| Not a server per se, but you just take one 1U server and
| daisy chain a lot of those JBOD chassis for the needed
| capacity. You can have 1080 disks in a 42U rack.
| manquer wrote:
| Backblaze custom designed what they call a Storage Pod,
| holding 60 units, and has used it for a long time now.
|
| https://www.backblaze.com/cloud-storage/resources/storage-po...
| quintin wrote:
| It continues to surprise me that Backblaze still trades at a
| fraction of its peak COVID share price. A well-managed company
| with solid fundamentals, strong IP, and growing.
| devoutsalsa wrote:
| Because they are bleeding money and they must sell stock to
| stay in business. Cool product, but I personally don't want to
| buy something that doesn't turn a profit and has negative free
| cash flow.
| crazygringo wrote:
| I feel very confident that in 30 years AWS, Azure and Google
| Cloud will still be operating and profitable.
|
| I think there's a very small chance that Backblaze will be.
|
| Nothing against them, but it's virtually impossible to compete
| long-term with the economies of scale, bundling and network
| effects of the major cloud providers.
| manquer wrote:
| Cloud providers, AWS in particular, use storage and transfer
| pricing as a means of lock-in to their other products; they
| can never be cost-competitive with Backblaze, which has a
| thriving prosumer business.
|
| I don't think it is going anywhere.
| ww520 wrote:
| After a couple of failed hard disks in my old NVR, I've come to
| realize heat is the biggest enemy of hard disks. The NVR had to
| provide power to the POE cameras, ran video transcoding, and
| constantly wrote to the disk. It generated a lot of heat. The
| disks were probably warped due to the heat and the disk heads
| crashed onto the surface, causing data loss.
|
| For my new NVR, the POE power supply is separated out to a
| powered switch, the newer CPU can do hardware video encoding, and
| I used SSD for first stage writing and hard disks as secondary
| backup. The heat has gone way down. So far things have run well.
| I know constant rewriting on SSD is bad, but the MTBF of SSD
| indicates it will be a number of years before failing. It's an
| acceptable risk.
| walrus01 wrote:
| That seems like a very poor chassis design on the part of the
| NVR manufacturer. The average modern 3.5" high capacity HDD
| doesn't generate that much heat. Even 'datacenter' HGST drives
| average around 5.5W and will top out at 7.8W TDP under maximum
| stress. Designing a case that uses relatively low rpm, quiet,
| 120 or 140mm 12VDC fans to pull air through it and cool six or
| eight hard drives isn't that difficult. In a midtower desktop
| PC case set up as a NAS with a low-wattage CPU, a single 140mm
| fan at the rear sucking air from front to back is often quite
| enough to cool eight 3.5" HDDs.
|
| But equipment designers keep trying to stuff things into spaces
| that are too small and use inadequate ventilation.
| ww520 wrote:
| It was a combination of heat sources: the POE cameras drawing
| quite a bit of power, the video transcoding, and then the
| constant disk writes, all in a small slim case. It ran very
| hot during summer.
| userbinator wrote:
| That ST12000NM0007 is a little worrying. Looks like there are
| still pretty significant differences between manufacturers.
| pinoy420 wrote:
| This is such a fantastic piece of research. Thank you if you are
| reading. I wish Amazon and Microsoft did the same.
| atYevP wrote:
| Yev from Backblaze here -> you're welcome! Glad you like it, and we
| also wish they did! That's one of the reasons we started doing
| it, we wanted to know :D
| Melatonic wrote:
| True enterprise drives ftw - even Seagate usually makes some very
| reliable ones. They also tend to be a little faster. Some people
| have complained about noise but I have never noticed.
|
| They are noticeably much heavier in hand (and supposedly most use
| dual bearings).
|
| Combined with selecting based on Backblaze's statistics, I have
| had no HDD failures in years.
| SirMaster wrote:
| One of these blogs literally told us that enterprise drives
| were no better.
|
| https://www.backblaze.com/blog/enterprise-drive-reliability/
| Melatonic wrote:
| Seems outdated - a lot of the drives in the 2024 statistics
| are enterprise drives - so they are using them
| tredre3 wrote:
| I'm not sure I follow you, are you really saying that your
| choice of Seagate was based on Backblaze's statistics? Maybe
| I'm missing something but aren't they the overall least
| reliable brand in their tables?
| Melatonic wrote:
| The drive with the lowest fail rate at the above link looks
| like it is a Seagate enterprise drive (ST16000NM002J)
| bigtimesink wrote:
| I used to think these were interesting and used them to inform my
| next HDD purchase. But I realized I only used them to pick a
| recently reliable brand; we're down to three brands, the stats
| are mostly old models, and so the main use is if you're buying
| a used drive from the same batch that Backblaze happens to have
| also used.
|
| Buy two from different vendors and RAID or do regular off-site
| backups.
| Kab1r wrote:
| > RAID or do regular off-site backups.
|
| RAID is not a backup! Do both.
| lotharcable2 wrote:
| Mirrored raid is good. Other raid levels are of dubious value
| nowadays.
|
| Ideally you use "software RAID" or a file system with the
| capability to do scrubbing and repair to detect bitrot. Or have
| some sort of hardware solution that can do the same and notify
| the OS of the error correction.
|
| And, as always, Raid-type solutions mostly exist to improve
| availability.
|
| Backups are something else entirely. Nothing beats having lots
| of copies in different places.
| mastax wrote:
| I bought a bunch of _28_ TB Seagate Exos drives refurbished for
| not that much money. I still can't believe that 28TB drives are
| even possible.
| code_biologist wrote:
| Saw this recently: "Seagate: 'new' hard drives used for tens of
| thousands of hours":
| https://news.ycombinator.com/item?id=42864788
|
| Check your FARM logs. It sounds like people who were using the
| drives to mine the Chia cryptocurrency are dumping large
| capacity drives as Chia's value has fallen.
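|
| If you want to script the comparison, a sketch (assumes
| smartmontools 7.4+, which added a FARM log reader, and a
| hypothetical device path; adjust for your system):
|
|     import json
|     import subprocess
|
|     def smartctl(*args):
|         out = subprocess.run(["smartctl", "--json", *args],
|                              capture_output=True, text=True)
|         return json.loads(out.stdout)
|
|     dev = "/dev/sda"  # hypothetical device path
|     smart = smartctl("-A", dev)
|     print("SMART power-on hours:",
|           smart["power_on_time"]["hours"])
|     # FARM output layout varies by drive/firmware, so just dump
|     # it and eyeball the power-on-hours field for a mismatch:
|     print(smartctl("-l", "farm", dev))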
| textlapse wrote:
| Great to see this every year.
|
| Although a minor pet peeve (knowing this is free): I would have
| loved to see an 'in-use meter' in addition to just 'the drive was
| kept powered on'. AFR doesn't make sense for a HDD unless we know
| how long and how frequently the drives were being used (# of
| reads/writes or bytes/s).
|
| If all of them had a 99% usage through the entire year - then
| sure (really?).
| MrDrMcCoy wrote:
| Probably can't say too much, but I know that the I/O on these
| drives stays pretty consistently high. Enough so that Backblaze
| has to consider staying on smaller drives due to rebuild times
| and the fact that denser drives really don't stand up to as
| much abuse.
| frontierkodiak wrote:
| I've owned 17 Seagate ST12000NM001G (12TB SATA) drives over the
| last 24mos in a big raidz3 pool. My personal stats, grouping by
| the first 3-4 SN characters:
|
| - 5/8 ZLW2s failed
| - 1/4 ZL2s
| - 1/2 ZS80
| - 0/2 ZTN
| - 0/1 ZLW0
|
| All drives were refurbs. Two from the Seagate eBay store, all
| others from ServerPartDeals. 7/15 of the drives I purchased
| from ServerPartDeals have failed; at least four of those
| failures were within 6 weeks of installation.
|
| I originally used the Backblaze stats when selecting the drive I'd
| build my storage pool around. Every time the updated stats pop up
| in my inbox, I check out the table and double-check that my
| drives are in fact the 001Gs... the drives that Backblaze
| reports as having a 0.99% AFR. I guess the lesson is that YMMV.
| MathiasPius wrote:
| I think impact can have a big influence on mechanical hard
| drive longevity, so it could be that the way the
| ServerPartDeals drives were sourced, handled or shipped
| compromised them.
| lofaszvanitt wrote:
| "All drives were refurbs."
|
| refurbs have terrible reliability... They offer 5 year
| warranties, yet the replacements they send back have terrible
| quality...
| amelius wrote:
| Considering the bathtub curve, does this table mark a drive as
| bad if it fails in the first (e.g.) week?
|
| https://en.wikipedia.org/wiki/Bathtub_curve
| ChrisMarshallNY wrote:
| It's a bit odd. HGST always fares very well, in Backblaze stats,
| but I have actually had issues, over the years, in my own setup
| (Synology frames). Seagate has usually fared better.
|
| Might be the model of the drives. I use 4TB ones.
| tanelpoder wrote:
| Related - about a year ago or so, I read about a firmware related
| problem with some vendors SSDs. It was triggered by some uptime
| counter reaching (overflowing?) some threshold and the SSD just
| bricked itself. It's interesting because you could carefully
| spread out disks from the same batch across many different
| servers, but if you deployed & started up all these new servers
| around the same time, the buggy disks in them later _all_ failed
| around the same time too, when their time was up...
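|
| If it was the bug class I'm thinking of (a signed 16-bit
| power-on-hours counter wrapping; an assumption on my part),
| the lockstep timing falls out of the arithmetic:
|
|     hours_at_overflow = 2**15            # 32,768 hours
|     years = hours_at_overflow / (24 * 365)
|     print(f"{years:.2f} years")          # ~3.74 years of uptime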
| viggity wrote:
| Polite data viz recommendation: don't use black gridlines in your
| tables. Make them a light gray. The gridlines do provide
| information (the organization of the data), but the more
| important information is the values. I'd also right align the
| drive failures so you can scan/compare consistently.
___________________________________________________________________
(page generated 2025-02-12 23:02 UTC)