[HN Gopher] Poor Disk Performance
       ___________________________________________________________________
        
       Poor Disk Performance
        
       Author : latch
       Score  : 176 points
       Date   : 2021-05-10 08:55 UTC (14 hours ago)
        
 (HTM) web link (www.brendangregg.com)
 (TXT) w3m dump (www.brendangregg.com)
        
       | YourMeds wrote:
       | Can we talk about how Microsoft made Windows 10 unusable on hard
       | drives?
        
       | h2odragon wrote:
       | somewhere i have a stack of platters from a 5in SCSI drive with a
       | circle in the middle where the heads crashed and peeled the
       | coating off the disk.
        
       | londons_explore wrote:
       | Perhaps the dust particle causes the head to jump up and back
       | down again, but the disruption to the data stream being read
       | isn't sufficient to prevent ecc returning good data?
        
       | kstrauser wrote:
       | One time an acquaintance sold me a 50 MB SCSI drive filled with,
       | umm, a selection of Amiga games (all PD, honest, officer!). When
       | I got it home and installed, the drive howled like a banshee and
       | a benchmark program said I was getting about 100KB/s reads from
       | it. Figuring I had absolutely nothing to lose, I flipped it over
       | and squirted a little 3-in-1 oil on the spindle bearing. The
       | whine's pitch started increasing and quieting as the drive spun
       | up to its full operating speed, and I watched the little graph
       | slowly work its way up to a more reasonable 1MB/s. I made backups
       | of the software on it then turned the system off, pulled the
       | drive, and threw it away.
       | 
       | I have never before or since oiled a piece of computer hardware
       | to improve its IO, but this one time it worked.
        
         | wiredfool wrote:
         | I remember putting an Apple 20MB hard drive on a heater to warm
         | up the lubricants so that it would spin up.
        
       | flakiness wrote:
       | I didn't expect Brendan Gregg talking about anything but Cloud
       | anymore, but here he is! I appreciate his curiosity-chasing
       | storytelling.
       | 
       | I wonder how to "read over 99.9999% of disk sectors
       | successfully". Is there any handy script to do this without harm?
       | Then I can try these tools locally on my Ubuntu laptop to see the
       | numbers.
        
         | aidenn0 wrote:
         | ddrescue is pretty good; it strides over the disk reading in
         | big chunks, and writes out a map file of which chunks failed.
         | Then it goes sector-by-sector over the failed chunks to fill-in
         | the holes.
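For readers wondering what the chunk-then-sector strategy described above looks like, here is a minimal Python sketch. It is not ddrescue's actual implementation: the names `scan`, `read_at`, `CHUNK`, and `SECTOR`, the 512-byte sector, and the 64-sector chunk are all assumptions chosen for illustration. It only reads from the source and never writes to it.

```python
from typing import Callable, List, Tuple

SECTOR = 512          # assumed sector size in bytes
CHUNK = 64 * SECTOR   # assumed "big chunk" size for the first pass


def scan(read_at: Callable[[int, int], bytes], size: int,
         chunk: int = CHUNK, sector: int = SECTOR
         ) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Two-pass read-only scan of `size` bytes.

    `read_at(offset, length)` is a hypothetical callable that returns the
    bytes at that range or raises OSError on a read failure.
    Returns (failed_chunks, bad_sectors): the (offset, length) of every
    chunk that failed in pass 1, and the offset of every sector that
    still failed when retried individually in pass 2.
    """
    # Pass 1: stride over the whole range in big chunks, noting failures.
    failed_chunks: List[Tuple[int, int]] = []
    for offset in range(0, size, chunk):
        length = min(chunk, size - offset)
        try:
            read_at(offset, length)
        except OSError:
            failed_chunks.append((offset, length))

    # Pass 2: revisit only the failed chunks, one sector at a time.
    bad_sectors: List[int] = []
    for start, length in failed_chunks:
        end = start + length
        for offset in range(start, end, sector):
            try:
                read_at(offset, min(sector, end - offset))
            except OSError:
                bad_sectors.append(offset)
    return failed_chunks, bad_sectors
```

To try it on a real device, one could pass something like `lambda off, n: os.pread(fd, n, off)` with `fd` obtained from `os.open(path, os.O_RDONLY)`. Ordinary buffered reads may be served from the page cache, so treat any numbers from this as a rough approximation.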
        
           | flakiness wrote:
           | It seems a scary option to try casually :-/
           | 
           | > Never try to rescue a r/w mounted partition. The resulting
           | copy may be useless. It is best that the device or partition
           | to be rescued is not mounted at all, not even read-only.
           | 
           | https://www.gnu.org/software/ddrescue/manual/ddrescue_manual.
           | ..
        
             | aidenn0 wrote:
             | Yes, if you make a raw copy of a disk you are also
             | concurrently writing to, you won't have a snapshot of the
             | disk. Nothing surprising there, right?
             | 
             | [edit]
             | 
             | Also that section isn't talking about damaging the original
             | disk, but rather ending up with a useless copy.
             | 
             | If you have a malfunctioning device then using it in any
             | way may cause further malfunctions, but in general running
             | something like ddrescue isn't going to destroy something
             | that wasn't already about to self-destruct anyways.
        
       | CraigJPerry wrote:
       | >> What's good for one user may be bad for another
       | 
       | I bought 3 "identical" old Dell T3610 workstations off eBay for a
       | home lab project. They have "identical" striped HDDs in them.
       | 
       | Kicking off a Fedora CoreOS install on all 3 simultaneously
       | (which i've had to do a few times as i learned how ignition
       | works) results in the same ordering of the hosts finishing their
       | rebuilds -
       | 
       | Machine 2 always finishes first by around 3 minutes of a 14 min
       | wipe & reinstall process.
       | 
       | Machine 3 finishes ahead of machine 1 by around 45 seconds or so
       | usually.
       | 
       | Almost 5 minutes in a <20 mins process, that's huge! I still
       | don't actually know the root cause. Benchmarking disk I/O has
       | them within a few percent of each other. It's not that they're
       | contending to remotely load the OS installer - that gets cached
       | in memory at the start. There's a few seconds difference in the
       | UEFI bios startup timings but none of them are particularly
       | consistent which is weird, i would have thought UEFI init time
       | would be the same on a given host each boot, but there's a few
       | seconds in it each time.
        
         | userbinator wrote:
         | _There's a few seconds difference in the UEFI bios startup
         | timings_
         | 
         | Could that be anything to do with the Intel ME or similar
         | "management" spyware/etc. trying to phone home or do something?
         | You may have to reflash the BIOS to "clean" that completely.
         | 
         | The other thing I can think of is that the CPU heatsinks aren't
         | all clogged with the same amount of dust, causing one machine to
         | thermally throttle more than the others.
        
           | xen2xen1 wrote:
           | Thermals are probably the right answer.
        
         | bityard wrote:
         | I used to work at a web hosting company and one of my
         | responsibilities was managing all the VPS backups. We had a few
         | racks for "backup servers" which were all identical boxes. We
         | would provision CentOS on them two or three at a time and I
         | observed the exact same thing. Some servers were always just a
         | little faster than others. Never figured out why.
        
         | bentcorner wrote:
         | That's interesting. If you're curious, swapping disks might
         | tell you more. If it's thermals like other commenters have
         | suggested then you won't see a difference. (unless you knock
         | some loose!)
        
       | userbinator wrote:
       | No HDD on the market has ever been variable-speed. What you may
       | be perceiving as a change in RPM could just be the difference in
       | harmonics of the spindle motor as you change the damping.
       | 
       | As for particles on the platters, the keywords to search for are
       | "thermal asperity". An excess of them, or ones that are actually
       | stuck to the platter, can cause damage but as you have noticed,
       | what usually happens is the head just knocks them out of the way
       | and heats up slightly, causing a misread and subsequent retry
       | (hence the slow speed). If you power on a working drive and then
       | remove the lid, the air currents will keep any new particles from
       | sticking to the platters.
       | 
       |  _By pushing down on the lid, however, (simulating screws) it
       | sped up and down a few times before failing. The harder I pushed
       | the less it vibrated and the more it worked, until I finally had
       | it returning I/O, albeit slowly._
       | 
       | It's more likely that you were adjusting the actuator angle. See
       | the bottom picture in this article for comparison (also a WD
       | drive from around the same era):
       | 
       | https://hddguru.com/articles/2006.02.17-Changing-headstack-Q...
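The "misread and subsequent retry, hence the slow speed" reasoning above can be put into rough numbers. The sketch below assumes that re-reading a sector costs one full platter revolution, which is a simplification. The function names, the 7200 RPM figure, the 50 MB/s baseline, and the 1% retry rate are illustrative assumptions, not measurements from the article.

```python
def revolution_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def effective_mb_s(base_mb_s: float, sector_bytes: int,
                   retry_fraction: float, retries_per_bad_read: int,
                   rpm: float) -> float:
    """Estimated throughput when a fraction of sector reads need retries.

    Simplified model: every retry waits one full revolution for the
    sector to come back under the head.
    """
    base_s = sector_bytes / (base_mb_s * 1e6)             # clean read time
    extra_s = (retry_fraction * retries_per_bad_read
               * revolution_ms(rpm) / 1000.0)             # average retry cost
    return sector_bytes / (base_s + extra_s) / 1e6


if __name__ == "__main__":
    # Assumed example: 7200 RPM, 50 MB/s baseline, 1% of reads retried once.
    print(f"one revolution: {revolution_ms(7200):.2f} ms")
    print(f"estimated: {effective_mb_s(50, 512, 0.01, 1, 7200):.2f} MB/s")
```

Under this model even a small retry rate dominates, because one revolution takes milliseconds while a clean sector read takes microseconds.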
        
         | williesleg wrote:
         | The outside of the disk spins faster than the inside of the
         | disk asshole.
        
         | Teknoman117 wrote:
         | I was about to call you out, but turns out the thought that WD
         | green drives were variable RPM was just people misinterpreting
         | crappy documentation from WD on how fast they operate. They
         | didn't publish speeds because they reserved the right to sell
         | any speed of drive under that label without telling you...
        
         | rsync wrote:
         | "No HDD on the market has ever been variable-speed."
         | 
         | I'm not sure that's true ...
         | 
         | I remember from one of the Sun Performance Tuning manuals there
         | is a chapter on disk performance and there was a throwaway line
         | about "non commodity disks". Specifically, that there were, in
         | the past, disk drives with heads that moved independently.
         | 
         | I don't know much more about "non commodity disks" - I think
         | they must have been prevalent in the 70s (?) - but variable
         | speed doesn't sound much weirder than independent head movement
         | ...
        
           | rzzzt wrote:
           | Dual head stacks were once a thing, and there was a recent
           | thread about them as well:
           | https://news.ycombinator.com/item?id=26502216
        
           | wmf wrote:
           | I've only heard about one model of dynamic RPM (DRPM) hard
           | disk and only prototypes were made.
        
           | userbinator wrote:
           | Multiple heads moving independently have many advantages,
           | varying the spindle speed doesn't; it takes a lot of energy
           | to change the speed of the platters, and doesn't really
           | provide any benefits. Zoned bit recording already takes
           | advantage of the varying linear density, and doesn't require
           | changing the platter speed.
           | 
           | In the 70s, disk drive motors would likely be mains powered
           | synchronous induction motors, which are constant speed and
           | where the traditional speeds of 3600, 5400, and 7200
           | originated.
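To make the constant-RPM point concrete: the angular speed is the same everywhere on the platter, but the linear speed under the head grows with radius, which is what zoned bit recording exploits. A small worked calculation follows; the 20 mm inner and 46 mm outer radii are assumed values for a 3.5" platter and the function name is hypothetical.

```python
import math


def linear_speed_m_s(rpm: float, radius_mm: float) -> float:
    """Linear speed of the platter surface at a given radius: v = 2*pi*r*f."""
    revolutions_per_second = rpm / 60.0
    return 2.0 * math.pi * (radius_mm / 1000.0) * revolutions_per_second


if __name__ == "__main__":
    rpm = 7200                        # constant spindle speed
    inner_mm, outer_mm = 20.0, 46.0   # assumed usable radii, for illustration
    v_in = linear_speed_m_s(rpm, inner_mm)
    v_out = linear_speed_m_s(rpm, outer_mm)
    print(f"inner track: {v_in:.1f} m/s")
    print(f"outer track: {v_out:.1f} m/s")
    print(f"ratio: {v_out / v_in:.2f}")
```

With the bit density along the track held roughly constant, the outer zones pass more bits under the head per revolution, so they can deliver higher sequential throughput without any change in spindle speed.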
        
           | bradknowles wrote:
           | Sony 3.5" floppy disks used to be variable speed. I knew
           | about that at the time I interviewed for a position as an
           | intern at Imprimis in 1989, just before they got bought by
           | Seagate. The guy I was interviewing with smiled smugly and
           | said "we've got a better solution -- we vary the speed at
           | which we read and write data!"
           | 
           | As a result of that job, I ended up writing what I believe to
           | be the first FAQ for hard drives and stiction -- in 1989. I
           | haven't been able to find a copy of it, however.
           | 
           | Dunno if there would be any useful information in that FAQ
           | for the OP, but it would be interesting to check it out and
           | see. That is, if anyone can find a copy.
        
       | gvb wrote:
       | _(When I was a sysadmin, I heard a story of how old VAX drives
       | would stall, so holes had been drilled in them with tape over the
       | holes. When stalled, the sysadmin would peel back the tape and
       | use their finger to spin-start them. Those even older drives must
       | have been more tolerant of dust!)_
       | 
       | More than once I had a hard drive fail to start up after a power
       | cycle (back then the drives only spun down when power was
       | removed). First thing we tried was to remove the drive and give
       | the whole drive a sharp spin on the axis of the platter. Due to
       | inertia of the platter, this would tend to get the platter to
       | move a bit and "unstick" it.
       | 
       | My recollection is that it worked every time I had to do this. Of
       | course, we would back up that drive and replace it as soon as
       | possible.
        
         | toss1 wrote:
         | Stiction. Good tip on how you were successfully eliminating the
         | stiction -- likely useful to remember in many such situations!
         | 
         | [1] https://en.wikipedia.org/wiki/Stiction
        
         | wazoox wrote:
         | It still works on relatively modern drives. If the spindle or
         | actuator arm is stuck, hitting the drive on its side (for
         | instance by hitting a table with it) can free stuck movable
         | parts. Worked well, well into the 500GB era.
        
         | anyfoo wrote:
         | I went one step further: We had an old HP-UX machine with a
         | failed hard drive that wouldn't spin up. The data on it wasn't
         | supercritical but still nice to keep, so I was free to
         | experiment. I removed the housing and pushed the platter by
         | hand. It spun up, and while still open, I immediately took a
         | full disk image with dd.
         | 
         | A more similar story to yours was with my 120MB hard disk on my
         | PC when I was a, which inconsistently exhibited similar
         | symptoms. I had no money, so always had to do with what I had
         | (many stories sprang out of that). The hard disk was in a
         | removable caddy, and when it refused spinning up, I simply took
         | it out of the PC and gently bounced it on my bed right next to
         | the desk. Put back into the PC, it then worked every time as I
         | recall.
        
       | Neil44 wrote:
       | He found an old drive with the lid already removed. The drive may
       | well have been faulty beforehand, which would explain why the lid
       | was removed. You would want to performance test a known good drive
       | before removing the lid to compare properly.
        
       | nix23 wrote:
       | Shout on it ;)
        
         | hs86 wrote:
         | For the uninitiated:
         | https://www.youtube.com/watch?v=tDacjrSCeq4
        
           | geerlingguy wrote:
           | Incidentally, this post was by the same person as that video
           | :)
        
           | kowlo wrote:
           | Thank you - that is the highlight of my week
        
       | gotbeans wrote:
       | He is packing house and the disk is only 80gb.
       | 
       | Kneejerk, half-"/jk" reaction when I started reading was that
       | Brendan was now all in with the chia fever.
        
       | bostonsre wrote:
       | I think his book on systems performance is the best computer
       | science book I've ever read.
        
         | john-tells-all wrote:
         | thanks for the reminder! I'll go buy it now, Gregg is always an
         | inspiration.
         | 
         | Here's the link => http://www.brendangregg.com/systems-
         | performance-2nd-edition-...
        
       | teddyh wrote:
       | > _couldn't resist seeing if the disk was readable despite the
       | dust, and finding out what was on it (I'd forgotten)._
       | 
       | He proceeds to successfully read the disk, but doesn't say what
       | was on it.
        
         | lazide wrote:
         | I'm guessing he forgot to tell us, hah.
        
       | [deleted]
        
       | intc wrote:
       | Perhaps 6 - 7 years ago we provided a 1U server for a
       | client. Initially the server worked fine. Then we upgraded its
       | disks (originally 80 gigs or so) to 1TB each (in RAID1
       | configuration).
       | 
       | After the new disks were installed the server started to have
       | multitudes of issues with disk performance and random read /
       | write errors.
       | 
       | I think it took us several days to understand that the new
       | spinners were much more sensitive than the original ones - And
       | the large (very powerful) fan (located between the hdds) emitted
       | too much vibration for the disks to operate properly.
       | 
       | We ended up swapping the chassis to one with radial fans. Problem
       | solved.
        
       ___________________________________________________________________
       (page generated 2021-05-10 23:01 UTC)