[HN Gopher] Backblaze Drive Stats for Q3 2023
       ___________________________________________________________________
        
       Backblaze Drive Stats for Q3 2023
        
       Author : caution
       Score  : 235 points
       Date   : 2023-11-14 14:07 UTC (8 hours ago)
        
 (HTM) web link (www.backblaze.com)
        
       | hiatus wrote:
       | Surprised by the number of drives with 0 failures, though it
       | seems not all of the drives were run for the required time to
       | qualify for the rating.
       | 
       | > In Q3, six different drive models managed to have zero drive
       | failures during the quarter. But only the 6TB Seagate, noted
       | above, had over 50,000 drive days, our minimum standard for
       | ensuring we have enough data to make the AFR plausible.
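
For reference, the AFR in the quote above is usually described in the Drive Stats posts as failures per drive-year, expressed as a percentage. A minimal Python sketch of that arithmetic follows; the function names and the `MIN_DRIVE_DAYS` constant are illustrative assumptions, and the 50,000 figure is the threshold quoted above.

```python
MIN_DRIVE_DAYS = 50_000  # minimum drive days quoted in the comment above


def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the AFR in percent: failures per drive-year, times 100."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


def has_enough_data(drive_days: int) -> bool:
    """True when a model has over the quoted minimum number of drive days."""
    return drive_days > MIN_DRIVE_DAYS


if __name__ == "__main__":
    # Hypothetical model: 10 failures over 365,000 drive days (1,000 drive-years).
    print(annualized_failure_rate(10, 365_000))
    print(has_enough_data(365_000))
```

A model with zero failures always reports an AFR of 0%, whatever its drive days, which is why the drive-day threshold matters for the zero-failure models.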
        
         | kqr wrote:
         | I have always wondered why they don't use techniques from
         | survival analysis to be able to draw conclusions even from sets
         | with lower failure rates. Or for that matter to avoid slight
         | bias even for drives they do report.
        
           | andy4blaze wrote:
           | Andy Klein from Backblaze here. We have done some survival
           | analysis (Kaplan-Meier curves). In our case, we need to have
           | a reasonable number of failures over the observation period
           | to get decent results. You can take a look at some of our
           | work here:
           | https://www.backblaze.com/blog/hard-drive-life-expectancy/
           | and see if that is what you were expecting.
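
To illustrate the Kaplan-Meier curves mentioned above, here is a minimal pure-Python sketch of the estimator. It is not Backblaze's code; the function name and the input layout (one observed duration per drive plus a flag saying whether the drive failed or was merely still running when observation ended) are assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(durations, failed):
    """Return [(t, S(t))] for each time t at which at least one failure occurred.

    durations: days each drive was observed
    failed:    True if the drive failed at that time, False if it was censored
               (still running, or removed for another reason)
    """
    deaths = Counter(t for t, f in zip(durations, failed) if f)
    exits = Counter(durations)  # failures and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk drives that survived time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


if __name__ == "__main__":
    # Hypothetical data: five drives, two failures, three censored.
    print(kaplan_meier([100, 200, 200, 300, 400],
                       [True, False, True, False, False]))
```

With no failures at all the returned curve is empty, which matches the point above: the estimator only moves when failures are observed, so it needs a reasonable number of them to say much.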
        
       | nicolaslem wrote:
       | The article discusses the impact of high absolute temperature on
       | the longevity of drives, however from my amateur knowledge the
       | range of temperature during a day is also an important factor.
       | 
       | I always assumed that having a stable 40°C is better than a
       | drive constantly swinging between 20°C and 40°C, so I am
       | surprised that the article only mentions alerts on reaching a
       | high threshold.
        
         | andy4blaze wrote:
         | Andy Klein from Backblaze here. Your point is a good one in
         | that temperature fluctuation can be an important factor. We
         | actually sample SMART stats, which contain the temperature
         | attribute, multiple times a day looking for such changes. The
         | Drive Stats data is captured once a day, so it looks static,
         | but behind the scenes the monitoring is more dynamic.
        
           | rft wrote:
           | Do you have plans to look at whether higher fluctuations
           | translate into higher failure rates? Not sure whether you
           | have historical data on this, but I would be really
           | interested in this aspect, even if you can only run the stats
           | on a smaller number of drives or shorter time periods.
           | 
           | Maybe dividing the drives roughly in "higher than average
           | variability" and "low variability" and then looking at the
           | AFR for this subset can show some relation. Of course as the
           | AFR for many drives is already quite low, the effect might be
           | too small to distinguish from noise.
           | 
           | On the topic of temperatures: Have you run an analysis
           | whether a drive increasing in temperature (or maybe even
           | decreasing) compared to its base line and "neighbors" results
           | in a higher chance of failure?
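
The split proposed above (drives with higher-than-average temperature variability versus the rest, then an AFR per group) can be sketched as follows. This is only an illustration: the dictionary keys `temps`, `drive_days` and `failures` are hypothetical, the standard deviation is one possible measure of variability, and the sample data is made up.

```python
from statistics import mean, pstdev


def afr_by_temperature_variability(drives):
    """Split drives at the mean temperature spread and return AFR (%) per group.

    drives: list of dicts with hypothetical keys
            'temps' (readings in degrees C), 'drive_days', 'failures'
    """
    spreads = [pstdev(d["temps"]) for d in drives]
    cutoff = mean(spreads)
    totals = {"low": [0, 0], "high": [0, 0]}  # [failures, drive_days]
    for drive, spread in zip(drives, spreads):
        group = "high" if spread > cutoff else "low"
        totals[group][0] += drive["failures"]
        totals[group][1] += drive["drive_days"]
    # AFR = failures per drive-year, as a percentage; None if a group is empty.
    return {
        group: (f / (days / 365) * 100 if days else None)
        for group, (f, days) in totals.items()
    }


if __name__ == "__main__":
    sample = [
        {"temps": [30, 30, 30], "drive_days": 365, "failures": 0},
        {"temps": [20, 40, 20, 40], "drive_days": 365, "failures": 1},
        {"temps": [31, 29], "drive_days": 365, "failures": 0},
    ]
    print(afr_by_temperature_variability(sample))
```

As the comment notes, with AFRs already low the difference between the two groups may not be distinguishable from noise, so any real analysis would also need confidence intervals per group.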
        
       | hardware2win wrote:
       | Kinda low sample sizes for 0 and highest AFRs
        
       | dist-epoch wrote:
       | Is there any backup software which continuously uploads your
       | files to an AWS/GCP/Azure long term storage account that you
       | control and pay for? Something like CrashPlan, which from time to
       | time performs automatic maintenance and deletes old versions of
       | files.
        
         | kasperset wrote:
         | Arq backup. I used it in the past (2-3 years ago) but am not
         | using it currently.
        
           | aeries wrote:
           | Arq is the best set-and-forget option I know of for macOS.
        
           | southernplaces7 wrote:
           | My experience with Arq has been terrible so far. The
           | interface keeps glitching up, for a while it failed
           | repeatedly until I gave it a unique permission in Windows
           | services, and the backup process is far from intuitive, with
           | multiple backups showing up as restore options, but each with
           | different file sizes and specific files saved. Overall, just
           | too many ambiguities for it to be reliable.
           | 
           | Oddly, SpiderOak has been a background go-to for years and
           | always worked smoothly for backing up everything I select in
           | a wholesale way, keeping a fairly clear record of what was
           | saved, removed or moved, and adjusting immediately to any
           | file changes or deletes I do. The SO interface is shitty and
           | often freezes, and lacks many basic features like being able
           | to see file sizes or scrolling through long file lists
           | easily, but at least overall, I can quickly and easily see
           | when backups are happening, how they're being done and what's
           | being saved. Also, for restoring files, it's surprisingly
           | fast despite a reputation held by many that it's slow.
        
         | eli wrote:
         | Tons of choices: Restic, Borg, Duplicity, Kopia
         | 
         | Though Backblaze is pretty good at what it does and you can set
         | your own encryption key, if that's the concern.
        
           | aeries wrote:
           | Unfortunately Backblaze requires[1] you to provide them this
           | private key to restore.
           | 
           | [1] https://help.backblaze.com/hc/en-us/articles/360038171794-Wh...
        
             | ecwilson wrote:
             | What does restore mean in this context? I've downloaded files
             | from their online portal without providing a key. Perhaps
             | restoring in this context means having them mail you a hard
             | drive.
        
               | eli wrote:
               | Really? I thought the file metadata is encrypted so it
               | needs the key to even identify what files are available
               | to restore.
        
             | eli wrote:
             | Yeah I mean you gotta trust them on some level. Backblaze
             | could also push a client update that nerfs the encryption.
             | If you've got really sensitive data I'd probably pick
             | something else.
        
             | davrosthedalek wrote:
             | You can always encrypt yourself and rclone to b2. Likely
             | cheaper if you are not a data hoarder.
        
         | danbtl wrote:
         | I'm using rclone to sync with Backblaze nightly, executed
         | directly from a cronjob.
        
           | freedomben wrote:
           | Same. Rclone is wonderful because it supports a ton of
           | different backends which makes it super easy to mirror. It's
           | also got some great features like crypt where you can encrypt
           | everything locally, thus sending the data all as ciphertext.
        
         | briffle wrote:
         | Backblaze software is pretty reasonable. But I have a Linux
         | machine, so restic and their B2 storage is a buck or two a
         | month to back up a few computers in my house. (around 185GB of
         | photos, etc)
        
           | encom wrote:
           | >Backblaze software is pretty reasonable.
           | 
           | Hard disagree. The jankiness of their software was what made
           | me cancel my sub after several years. Firstly it doesn't
           | follow the OS date and number formatting. It's a minor thing,
           | but it's so annoying having to parse the dumb M/D/Y format
           | and comma thousands separator etc (being da-DK). It's not a
           | deal breaker, but on the other hand, it's such a low-hanging
           | thing that most other software gets right immediately.
           | 
           | But far more importantly, the BB client would sometimes just
           | decide to re-upload several hundred gigabytes of data that I
           | know for sure didn't change, which makes me wonder if it's
           | just the client being retarded or if the data got lost
           | server-side. And it takes absolutely forever to detect USB
           | harddrives being plugged in. And its log files will grow to
           | absurd sizes, and you're not allowed to purge them or the
           | client will become brain damaged. And one time I needed to do
           | a restore, it took literal days for BB to prepare it and I
           | had to get support involved. I feel I just can't trust the BB
           | stack, the client being the weakest link by far, and a backup
           | I can't trust is worthless.
        
             | briffle wrote:
             | You are probably correct. But there is a reason I put
             | Backblaze client on my mom's laptop, rather than a
             | scheduled powershell task to run restic to a cloud storage.
             | And a scheduled task to run the prune weekly.
        
         | notRobot wrote:
         | Perhaps this can help https://rclone.org/
        
         | thebiss wrote:
         | Define "continuously". Does the data need to be mirrored
         | immediately upon write?
        
       | dehrmann wrote:
       | > The average age of the retired drives was just over eight years
       | 
       | Didn't realize they can last so long.
        
         | margalabargala wrote:
         | They can last much longer. I have operational IDE drives from
         | the early aughts (not with anything important on them)
        
         | thebiss wrote:
         | Under my desk right now is a too-underutilized-to-upgrade-NAS
         | that has been spinning 1TB Western Digitals since 2010. Between
         | RAID-Z2 and cloud backups, there's almost no reason to get rid
         | of them except for performance, which doesn't matter here.
        
           | toast0 wrote:
           | If all of your disks are about the same age, from the same
           | vendor, it might be reasonable to replace them over time to
           | mitigate the risk that a firmware or manufacturing issue
           | results in them all failing around the same time. Many
           | vendors have had firmware errors where counters rolled over
           | and the drive becomes inaccessible (generally much sooner
           | than 13 years though).
        
         | pathartl wrote:
         | I just retired some drives out of my home array. 3TB WD Reds
         | with 10.5 years of power on time, no logged errors. Ran them
         | through a full block check and had no errors.
        
       | Lorin wrote:
       | I wonder if they have special hardware recycling arrangements
       | with their vendors for decommissioned drives to reduce their
       | footprint. How many magnets are laying around the office? :)
        
       | 1-6 wrote:
       | My simple rule still stands even after 20 years: Avoid Seagate.
        
         | Mistletoe wrote:
         | I always found it hilarious when they were sponsoring the
         | datahoarder subreddit. I'd like to meet that marketer and shake
         | his or her hand.
        
           | 1-6 wrote:
           | I'm surprised that Seagate has consistently kept its
           | 'slightly less reliable' crown after all these years.
           | 
           | It's like the student who copies the 'A' student's exam.
           | They'll purposefully get a couple of the answers wrong to
           | avoid suspicion.
        
         | kstrauser wrote:
         | Conversely, I've had nothing but bad luck with WD the last few
         | years, and my Seagates have been flawless.
         | 
         | My simple rule is that all drives suck, and always have good
         | backups.
        
           | pcurve wrote:
           | Yup. All drives kick the bucket at some point. That's why I
           | use BB. I don't trust myself with NAS.
           | 
           | The ones that never failed me were any drives made by Quantum
           | using SCSI interface. 5 drives, zero failure over 10+ years.
           | But those were slower and cooler running units.
        
             | kstrauser wrote:
             | I've got a NAS backed up to Backblaze. It's a nice setup. I
             | can quickly recover from local data loss, or replace a RAID
             | disk when needed, but if the NAS gets hit by a bus then I
             | still haven't lost everything.
        
           | Netcob wrote:
           | My WD drives failed pretty consistently, so I'm now giving
           | Seagate a try.
           | 
           | Well, my main reason was that WD decided that just failing
           | "naturally" after a few years wasn't enough, but that a drive
           | having been on for 3 years should be considered the same as
           | "failing" (communicated through WDDA), which led to Synology
           | adopting that for a while. Not sure what the current state is
           | of that, but I intend to swap drives when they fail, not when
           | they turn 3.
        
         | ksec wrote:
         | Not quite as long as 20 years, but for the past 15 years: Just
         | buy HGST.
        
           | AnonC wrote:
           | HGST was bought by WD long ago. Has that had any impact on
           | the types of drives sold or the quality?
        
             | ksec wrote:
             | >has that had any impact on the types of drives sold or the
             | quality?
             | 
             | No. To the point that when WD tried to rebrand those drives
             | away from HGST, the market demanded HGST and they brought it
             | back a year later.
        
               | HankB99 wrote:
               | At what point do they rebrand crappy WD drives as HGST?
               | 
               | They already have a reputation of obfuscating information
               | (SMR vs. CMR)
        
         | AnonC wrote:
         | My simple rule in the same period: avoid WD at all costs,
         | prefer HGST (which later became part of WD) and use Seagate
         | mostly. I've seen variations of these personal rules, and
         | during a time when Backblaze didn't exist yet (and came up with
         | better measurements on a larger scale), it was like one of
         | those holy wars between tabs and spaces (or vi and emacs).
        
         | Dalewyn wrote:
         | I used to swear by WD HDDs, but when I literally couldn't tell
         | which of their NAS drives were CMR and SMR a few years ago I
         | wrote them off and went to Seagate who clearly labeled their
         | drives.
         | 
         | Combine that with their lackluster reputation in solid state as
         | of late and I probably won't buy their HDDs until Seagate one
         | day gives me the _"WTF are you even selling?"_ rigmarole too.
        
         | gosub100 wrote:
         | They had a couple bad runs a few years back. If you keep
         | following your simple rule, you'll eventually get a bad WD and
         | not be able to use spinning platters at all. A better simple
         | rule would be to never skimp on buying out-of-warranty drives,
         | as well as having a proper backup regimen. I've had bad
         | Seagates back in ~2008 but my current NAS has 5x 16TB Exos and
         | they work fine.
        
       | icelancer wrote:
       | The best blog series going. Great technical writeups that I wish
       | more companies would do - we've been doing it at our small
       | business and customers really get a lot out of it. Also helps
       | with marketing.
        
       | vouaobrasil wrote:
       | Backblaze is interesting but it's not very easy to use. Its
       | interface is rather basic and it was difficult to select which
       | drives to back up and which not to. It kept trying to back up
       | directories on my computer that I specifically told it not to,
       | and there was no way to efficiently update the program's behaviour
       | from the "what to back up" list. It might be nice if you just
       | want to back up your computer and all your drives but the moment
       | you want only parts of your computer backed up, it's frustrating.
        
       | cvccvroomvroom wrote:
       | I run a bunch of WUH721414ALE6L4 and WUH721414ALE604 in numerous
       | RAID10 volumes. Haven't had a failure yet.
       | 
       | L = without power disable, 0 = with
        
       | BOOSTERHIDROGEN wrote:
       | > this chart is the confidence interval, which is the difference
       | between the low and high AFR confidence levels calculated at 95%.
       | 
       | How are the low and high AFR values calculated?
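
One common way to get a low and high AFR at 95% is to treat the failure count as Poisson-distributed and compute an exact interval on the count, then convert both ends to an annualized rate. The thread does not say which method Backblaze actually uses, so the sketch below is an assumption, and all function names are illustrative.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed in log space to avoid underflow."""
    if k < 0:
        return 0.0
    if lam == 0:
        return 1.0
    log_lam = math.log(lam)
    terms = (math.exp(i * log_lam - lam - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, math.fsum(terms))


def bisect_increasing(f, lo: float, hi: float, iters: int = 100) -> float:
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def afr_confidence_interval(failures: int, drive_days: int, confidence: float = 0.95):
    """Return (low, high) AFR in percent using an exact Poisson interval."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    tail = (1 - confidence) / 2
    k = failures
    # Lower bound: the rate at which seeing k or more failures has probability `tail`.
    low = 0.0 if k == 0 else bisect_increasing(
        lambda lam: (1 - poisson_cdf(k - 1, lam)) - tail, 0.0, float(k))
    # Upper bound: the rate at which seeing k or fewer failures has probability `tail`.
    high = bisect_increasing(
        lambda lam: tail - poisson_cdf(k, lam), float(k), k + 10 * math.sqrt(k + 1) + 10)
    drive_years = drive_days / 365
    return low / drive_years * 100, high / drive_years * 100


if __name__ == "__main__":
    # Hypothetical inputs: zero failures at exactly 50,000 drive days.
    print(afr_confidence_interval(0, 50_000))
```

Under this assumption, even a model with zero failures gets a nonzero upper bound, and the interval narrows as drive days accumulate.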
        
       | Titan2189 wrote:
       | > ambient temperature within a data center often increases during
       | the summer months
       | 
       | What? Isn't your Data Center supposed to be temperature
       | controlled, where the A/C has a setpoint which it's keeping the
       | entire environment to within a degree?
       | 
       | Being able to tell what season it is based on your HDD SMART
       | temperature (armchair expert here) sounds bad.
        
         | brnt wrote:
         | Is moving the setpoint with the seasons, within tolerance of
         | course, not a common energy savings method?
        
           | ska wrote:
           | That makes more sense when you are housing people than
           | servers, no?
        
       ___________________________________________________________________
       (page generated 2023-11-14 23:01 UTC)