[HN Gopher] Backblaze Drive Stats for Q3 2023
___________________________________________________________________
Backblaze Drive Stats for Q3 2023
Author : caution
Score : 235 points
Date : 2023-11-14 14:07 UTC (8 hours ago)
(HTM) web link (www.backblaze.com)
(TXT) w3m dump (www.backblaze.com)
| hiatus wrote:
| Surprised by the number of drives with 0 failures, though it
| seems not all of the drives were run for the required time to
| qualify for the rating.
|
| > In Q3, six different drive models managed to have zero drive
| failures during the quarter. But only the 6TB Seagate, noted
| above, had over 50,000 drive days, our minimum standard for
| ensuring we have enough data to make the AFR plausible.
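|
| A minimal sketch of the AFR arithmetic behind that 50,000
| drive-day threshold (the formula is the one Backblaze
| publishes; the numbers below are illustrative, not from the
| report):
|
|     # AFR = failures / (drive_days / 365) * 100
|     def afr(failures: int, drive_days: int) -> float:
|         """Annualized failure rate, in percent."""
|         return failures / (drive_days / 365) * 100
|
|     # One failure in 50,000 drive days is ~0.73% AFR:
|     print(afr(1, 50_000))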
| kqr wrote:
| I have always wondered why they don't use techniques from
| survival analysis to draw conclusions even from sets with
| lower failure rates, or, for that matter, to avoid slight
| bias even for the drives they do report.
| andy4blaze wrote:
| Andy Klein from Backblaze here. We have done some survival
| analysis (Kaplan-Meier curves). In our case, we need to have
| a reasonable number of failures over the observation period
| to get decent results. You can take a look at some of our
| work here: https://www.backblaze.com/blog/hard-drive-life-expectancy/
| and see if that is what you were expecting.
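|
| A minimal Kaplan-Meier sketch using the Python `lifelines`
| package (the data here is made up for illustration, not
| Backblaze's):
|
|     from lifelines import KaplanMeierFitter
|
|     # Drive ages in days; event=1 means the drive failed,
|     # event=0 means it was still running (right-censored).
|     ages_days = [400, 900, 1500, 2200, 3100, 3650]
|     failed    = [1,   0,   1,    0,    0,    1]
|
|     kmf = KaplanMeierFitter()
|     kmf.fit(ages_days, event_observed=failed)
|     print(kmf.survival_function_)  # P(drive survives past age t)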
| nicolaslem wrote:
| The article discusses the impact of high absolute temperature
| on the longevity of drives; however, from my amateur
| knowledge, the range of temperature during a day is also an
| important factor.
|
| I always assumed that having a stable 40°C is better than a
| drive constantly swinging between 20°C and 40°C, so I am
| surprised that the article only mentions alerts on reaching a
| high threshold.
| andy4blaze wrote:
| Andy Klein from Backblaze here. Your point is a good one in
| that temperature fluctuation can be an important factor. We
| actually sample SMART stats, which contain the temperature
| attribute, multiple times a day looking for such changes. The
| Drive Stats data is captured once a day, so it looks static,
| but behind the scenes the monitoring is more dynamic.
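|
| A hypothetical sketch of that kind of intra-day sampling (not
| Backblaze's actual tooling), using smartctl from
| smartmontools; attribute 194 (Temperature_Celsius) is present
| on most drives, and the command needs root:
|
|     import re
|     import subprocess
|
|     def drive_temp_c(device: str = "/dev/sda") -> int:
|         out = subprocess.run(["smartctl", "-A", device],
|                              capture_output=True, text=True,
|                              check=True).stdout
|         # Raw-value column of the Temperature_Celsius row.
|         m = re.search(r"Temperature_Celsius.*?-\s+(\d+)", out)
|         if m is None:
|             raise RuntimeError("no temperature attribute found")
|         return int(m.group(1))
|
|     # Run this from cron several times a day and log the
|     # deltas to watch for fluctuation.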
| rft wrote:
| Do you have plans to look at whether higher fluctuations
| translate into higher failure rates? Not sure whether you
| have historical data on this, but I would be really
| interested in this aspect, even if you can only run the stats
| on a smaller number of drives or shorter time periods.
|
| Maybe dividing the drives roughly into "higher than average
| variability" and "low variability" groups and then looking
| at the AFR for each subset could show some relation (a rough
| sketch follows below). Of course, as the AFR for many drives
| is already quite low, the effect might be too small to
| distinguish from noise.
|
| On the topic of temperatures: Have you run an analysis of
| whether a drive increasing in temperature (or maybe even
| decreasing) compared to its baseline and "neighbors" results
| in a higher chance of failure?
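|
| A rough sketch of that split, with made-up data: bucket
| drives by the spread of their temperature readings, then
| compare the AFR across buckets.
|
|     from statistics import median, stdev
|
|     # drive -> (temperature readings, failed?, drive days)
|     drives = {
|         "d1": ([28, 29, 30, 28], False, 365),
|         "d2": ([22, 38, 25, 41], True, 200),
|         "d3": ([30, 31, 30, 29], False, 365),
|         "d4": ([20, 40, 23, 39], False, 365),
|     }
|
|     spread = {d: stdev(t) for d, (t, _, _) in drives.items()}
|     cutoff = median(spread.values())
|
|     def afr(group):
|         failures = sum(drives[d][1] for d in group)
|         days = sum(drives[d][2] for d in group)
|         return failures / (days / 365) * 100
|
|     high = [d for d in drives if spread[d] >= cutoff]
|     low = [d for d in drives if spread[d] < cutoff]
|     print(afr(high), afr(low))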
| hardware2win wrote:
| Kinda low sample sizes for both the zero-failure models and
| the highest AFRs.
| dist-epoch wrote:
| Is there any backup software which continuously uploads your
| files to an AWS/GCP/Azure long term storage account that you
| control and pay for? Something like CrashPlan, which from time to
| time performs automatic maintenance and deletes old versions of
| files.
| kasperset wrote:
| Arq Backup. I used it in the past (2-3 years ago) but am not
| using it currently.
| aeries wrote:
| Arq is the best set-and-forget option I know of for macOS.
| southernplaces7 wrote:
| My experience with Arq has been terrible so far. The
| interface keeps glitching; for a while it failed repeatedly
| until I gave it a unique permission in Windows services; and
| the backup process is far from intuitive, with multiple
| backups showing up as restore options, each with different
| file sizes and specific files saved. Overall, just too many
| ambiguities for it to be reliable.
|
| Oddly, SpiderOak has been a background go-to for years and
| always worked smoothly for backing up everything I select in
| a wholesale way, keeping a fairly clear record of what was
| saved, removed or moved, and adjusting immediately to any
| file changes or deletes I do. The SO interface is shitty and
| often freezes, and lacks many basic features like being able
| to see file sizes or scroll through long file lists
| easily, but at least overall, I can quickly and easily see
| when backups are happening, how they're being done and what's
| being saved. Also, for restoring files, it's surprisingly
| fast despite a reputation held by many that it's slow.
| eli wrote:
| Tons of choices: Restic, Borg, Duplicity, Kopia
|
| Though Backblaze is pretty good at what it does and you can set
| your own encryption key, if that's the concern.
| aeries wrote:
| Unfortunately Backblaze requires[1] you to provide them your
| private key to restore.
|
| [1] https://help.backblaze.com/hc/en-
| us/articles/360038171794-Wh...
| ecwilson wrote:
| What does restore mean in this context? I've downloaded files
| from their online portal without providing a key. Perhaps
| restoring in this context means having them mail you a hard
| drive.
| eli wrote:
| Really? I thought the file metadata is encrypted so it
| needs the key to even identify what files are available
| to restore.
| eli wrote:
| Yeah I mean you gotta trust them on some level. Backblaze
| could also push a client update that nerfs the encryption.
| If you've got really sensitive data I'd probably pick
| something else.
| davrosthedalek wrote:
| You can always encrypt yourself and rclone to b2. Likely
| cheaper if you are not a data hoarder.
| danbtl wrote:
| I'm using rclone to sync with Backblaze nightly, executed
| directly from a cronjob.
| freedomben wrote:
| Same. Rclone is wonderful because it supports a ton of
| different backends, which makes it super easy to mirror. It's
| also got some great features like crypt, where you can
| encrypt everything locally and send all the data as
| ciphertext (a rough sketch follows below).
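|
| A hypothetical nightly-job sketch wrapping rclone (the remote
| name is a placeholder; "b2crypt" would be an rclone crypt
| remote layered over a B2 bucket, set up beforehand with
| `rclone config`):
|
|     import subprocess
|
|     def nightly_backup(src="/home/me/data",
|                        dst="b2crypt:backup"):
|         # For crypt remotes, rclone encrypts locally, so
|         # only ciphertext leaves the machine.
|         subprocess.run(["rclone", "sync", src, dst],
|                        check=True)
|
|     if __name__ == "__main__":
|         nightly_backup()  # e.g. from a nightly cron entry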
| briffle wrote:
| Backblaze software is pretty reasonable. But I have a Linux
| machine, so restic and their B2 storage is a buck or two a
| month to back up a few computers in my house (around 185GB of
| photos, etc.).
| encom wrote:
| >Backblaze software is pretty reasonable.
|
| Hard disagree. The jankiness of their software was what made
| me cancel my sub after several years. Firstly it doesn't
| follow the OS date and number formatting. It's a minor thing,
| but it's so annoying having to parse the dumb M/D/Y format
| and comma thousands separator, etc. (being da-DK). It's not a
| deal breaker, but on the other hand, it's such a low-hanging
| thing that most other software gets right immediately.
|
| But far more importantly, the BB client would sometimes just
| decide to re-upload several hundred gigabytes of data that I
| know for sure didn't change, which makes me wonder if it's
| just the client misbehaving or if the data got lost
| server-side. It also takes absolutely forever to detect USB
| hard drives being plugged in. And its log files will grow to
| absurd sizes, and you're not allowed to purge them or the
| client will break. And one time I needed to do a restore, it
| took literal days for BB to prepare it, and I had to get
| support involved. I feel I just can't trust the BB stack,
| the client being the weakest link by far, and a backup I
| can't trust is worthless.
| briffle wrote:
| You are probably correct. But there is a reason I put the
| Backblaze client on my mom's laptop, rather than a scheduled
| PowerShell task to run restic against cloud storage, plus
| another scheduled task to run the prune weekly.
| notRobot wrote:
| Perhaps this can help https://rclone.org/
| thebiss wrote:
| Define "continuously". Does the data need to be mirrored
| immediately upon write?
| dehrmann wrote:
| > The average age of the retired drives was just over eight years
|
| Didn't realize they can last so long.
| margalabargala wrote:
| They can last much longer. I have operational IDE drives from
| the early aughts (not with anything important on them).
| thebiss wrote:
| Under my desk right now is a too-underutilized-to-upgrade NAS
| that has been spinning 1TB Western Digitals since 2010. Between
| RAID-Z2 and cloud backups, there's almost no reason to get rid
| of them except for performance, which doesn't matter here.
| toast0 wrote:
| If all of your disks are about the same age, from the same
| vendor, it might be reasonable to replace them over time to
| mitigate the risk that a firmware or manufacturing issue
| results in them all failing around the same time. Many
| vendors have had firmware errors where counters rolled over
| and the drive becomes inaccessible (generally much sooner
| than 13 years though).
| pathartl wrote:
| I just retired some drives out of my home array. 3TB WD Reds
| with 10.5 years of power on time, no logged errors. Ran them
| through a full block check and had no errors.
| Lorin wrote:
| I wonder if they have special hardware recycling arrangements
| with their vendors for decommissioned drives to reduce their
| footprint. How many magnets are lying around the office? :)
| 1-6 wrote:
| My simple rule still stands even after 20 years: Avoid Seagate.
| Mistletoe wrote:
| I always found it hilarious when they were sponsoring the
| datahoarder subreddit. I'd like to meet that marketer and shake
| his or her hand.
| 1-6 wrote:
| I'm surprised that Seagate has consistently kept its
| 'slightly less reliable' crown after all these years.
|
| It's like the student who copies the 'A' student's exam.
| They'll purposefully get a couple of the answers wrong to
| avoid suspicion.
| kstrauser wrote:
| Conversely, I've had nothing but bad luck with WD the last few
| years, and my Seagates have been flawless.
|
| My simple rule is that all drives suck, and always have good
| backups.
| pcurve wrote:
| Yup. All drives kick the bucket at some point. That's why I
| use BB. I don't trust myself with NAS.
|
| The ones that never failed me were the Quantum drives with a
| SCSI interface: 5 drives, zero failures over 10+ years. But
| those were slower and cooler-running units.
| kstrauser wrote:
| I've got a NAS backed up to Backblaze. It's a nice setup. I
| can quickly recover from local data loss, or replace a RAID
| disk when needed, but if the NAS gets hit by a bus then I
| still haven't lost everything.
| Netcob wrote:
| My WD drives failed pretty consistently, so I'm now giving
| Seagate a try.
|
| Well, my main reason was that WD decided that just failing
| "naturally" after a few years wasn't enough, but that a drive
| having been on for 3 years should be considered the same as
| "failing" (communicated through WDDA), which led to Synology
| adopting that for a while. Not sure what the current state of
| that is, but I intend to swap drives when they fail, not when
| they turn 3.
| ksec wrote:
| Not quite as long as 20 years, but for the past 15 years: Just
| buy HGST.
| AnonC wrote:
| HGST was bought by WD long ago. Has that had any impact on
| the types of drives sold or the quality?
| ksec wrote:
| >has that had any impact on the types of drives sold or the
| quality?
|
| No. To the point that when WD tried to rebrand those drives
| away from HGST, the market demanded HGST and they brought
| the brand back a year later.
| HankB99 wrote:
| At what point do they rebrand crappy WD drives as HGST?
|
| They already have a reputation for obfuscating information
| (SMR vs. CMR).
| AnonC wrote:
| My simple rule over the same period: avoid WD at all costs,
| prefer HGST (which later became part of WD), and mostly use
| Seagate. I've seen variations of these personal rules, and
| back before Backblaze existed (and published better
| measurements at a larger scale), it was like one of those
| holy wars between tabs and spaces (or vi and emacs).
| Dalewyn wrote:
| I used to swear by WD HDDs, but when I literally couldn't tell
| which of their NAS drives were CMR and SMR a few years ago I
| wrote them off and went to Seagate who clearly labeled their
| drives.
|
| Combine that with their lackluster reputation in solid state as
| of late and I probably won't buy their HDDs until Seagate one
| day gives me the "WTF are you even selling?" rigmarole too.
| gosub100 wrote:
| They had a couple of bad runs a few years back. If you keep
| following your simple rule, you'll eventually get a bad WD
| and not be able to use spinning platters at all. A better
| simple rule would be to never cheap out by buying
| out-of-warranty drives, and to keep a proper backup regimen.
| I've had bad Seagates back in ~2008, but my current NAS has
| 5x 16TB Exos and they work fine.
| icelancer wrote:
| The best blog series going. Great technical writeups that I wish
| more companies would do - we've been doing it at our small
| business and customers really get a lot out of it. Also helps
| with marketing.
| vouaobrasil wrote:
| Backblaze is interesting but it's not very easy to use. Its
| interface is rather basic and it was difficult to select
| which drives to back up and which not to. It kept trying to
| back up directories on my computer that I specifically told
| it not to, and there was no way to efficiently update the
| program's behaviour from the "what to back up" list. It
| might be nice if you just want to back up your computer and
| all your drives, but the moment you want only parts of your
| computer backed up, it's frustrating.
| cvccvroomvroom wrote:
| I run a bunch of WUH721414ALE6L4 and WUH721414ALE604 in numerous
| RAID10 volumes. Haven't had a failure yet.
|
| L = without power disable, 0 = with
| BOOSTERHIDROGEN wrote:
| > this chart is the confidence interval, which is the difference
| between the low and high AFR confidence levels calculated at 95%.
|
| How do you calculate the low and high AFR?
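|
| The post doesn't spell out the method, but one standard
| approach (not necessarily Backblaze's) treats failures as a
| Poisson count and uses the exact chi-square interval,
| annualized by drive years:
|
|     from scipy.stats import chi2
|
|     def afr_ci(failures, drive_days, conf=0.95):
|         drive_years = drive_days / 365
|         a = 1 - conf
|         lo = chi2.ppf(a / 2, 2 * failures) / 2 if failures else 0.0
|         hi = chi2.ppf(1 - a / 2, 2 * (failures + 1)) / 2
|         return (lo / drive_years * 100, hi / drive_years * 100)
|
|     # One failure in 50,000 drive days:
|     print(afr_ci(1, 50_000))  # roughly (0.02%, 4.07%)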
| Titan2189 wrote:
| > ambient temperature within a data center often increases during
| the summer months
|
| What? Isn't your data center supposed to be temperature
| controlled, with the A/C holding a setpoint that keeps the
| entire environment within a degree?
|
| Being able to tell what season it is based on your HDD SMART
| temperature (armchair expert here) sounds bad.
| brnt wrote:
| Is moving the setpoint with the seasons, within tolerance of
| course, not a common energy-saving method?
| ska wrote:
| That makes more sense when you are housing people than
| servers, no?
___________________________________________________________________
(page generated 2023-11-14 23:01 UTC)