[HN Gopher] Poor Disk Performance
___________________________________________________________________
Poor Disk Performance
Author : latch
Score : 176 points
Date : 2021-05-10 08:55 UTC (14 hours ago)
(HTM) web link (www.brendangregg.com)
(TXT) w3m dump (www.brendangregg.com)
| YourMeds wrote:
| Can we talk about how Microsoft made Windows 10 unusable on hard
| drives?
| h2odragon wrote:
| somewhere i have a stack of platters from a 5in SCSI drive with a
| circle in the middle where the heads crashed and peeled the
| coating off the disk.
| londons_explore wrote:
| Perhaps the dust particle causes the head to jump up and back
| down again, but the disruption to the data stream being read
| isn't sufficient to prevent ecc returning good data?
| kstrauser wrote:
| One time an acquaintance sold me a 50 MB SCSI drive filled with,
| umm, a selection of Amiga games (all PD, honest, officer!). When
| I got it home and installed, the drive howled like a banshee and
| a benchmark program said I was getting about 100KB/s reads from
| it. Figuring I had absolutely nothing to lose, I flipped it over
| and squirted a little 3-in-1 oil on the spindle bearing. The
| whine's pitch started increasing and quieting as the drive spun
| up to its full operating speed, and I watched the little graph
| slowly work its way up to a more reasonable 1MB/s. I made backups
| of the software on it then turned the system off, pulled the
| drive, and threw it away.
|
| I have never before or since oiled a piece of computer hardware
| to improve its IO, but this one time it worked.
| wiredfool wrote:
| I remember putting an Apple 20MB hard drive on a heater to warm
| up the lubricants so that it would spin up.
| flakiness wrote:
| I didn't expect Brendan Gregg talking about anything but Cloud
| anymore, but here he is! I appreciate his curiosity-chasing
| storytelling.
|
| I wonder how to "read over 99.9999% of disk sectors
| successfully". Is there any handy script to do this without harm?
| Then I can try these tools locally on my Ubuntu laptop to see the
| numbers.
| aidenn0 wrote:
| ddrescue is pretty good; it strides over the disk reading in
| big chunks, and writes out a map file of which chunks failed.
| Then it goes sector-by-sector over the failed chunks to fill-in
| the holes.
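A minimal read-only sketch of the "read in big chunks and note which chunks failed" idea described above. This is not ddrescue's implementation: the function name `scan_readable` and the 1 MiB chunk size are assumptions for illustration, it does no sector-by-sector retry pass, and it only reads (it never writes to the target). Reads go through the page cache, so a second run may be served from memory rather than from the disk.

```python
import os


def scan_readable(path, chunk_size=1024 * 1024):
    """Read `path` front to back in chunks.

    Returns (bytes_ok, bad_ranges), where bad_ranges is a list of
    (offset, length) tuples for chunks that raised an I/O error.
    `scan_readable` is a hypothetical helper, not part of any tool.
    """
    bytes_ok = 0
    bad_ranges = []
    fd = os.open(path, os.O_RDONLY)  # read-only: the target is never modified
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # total size in bytes
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError:
                # Record the failed chunk and skip past it.
                bad_ranges.append((offset, want))
                offset += want
                continue
            if not data:
                break  # unexpected end of file or device
            bytes_ok += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_ok, bad_ranges
```

On Linux this could be pointed at a block device (for example a hypothetical `/dev/sdX`, which normally needs root), and `bytes_ok` divided by the device size gives the fraction read successfully.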
| flakiness wrote:
| It seems a scary option to try casually :-/
|
| > Never try to rescue a r/w mounted partition. The resulting
| copy may be useless. It is best that the device or partition
| to be rescued is not mounted at all, not even read-only.
|
| https://www.gnu.org/software/ddrescue/manual/ddrescue_manual.
| ..
| aidenn0 wrote:
| Yes, if you make a raw copy of a disk you are also
| concurrently writing to, you won't have a snapshot of the
| disk. Nothing surprising there, right?
|
| [edit]
|
| Also that section isn't talking about damaging the original
| disk, but rather ending up with a useless copy.
|
| If you have a malfunctioning device then using it in any
| way may cause further malfunctions, but in general running
| something like ddrescue isn't going to destroy something
| that wasn't already about to self-destruct anyways.
| CraigJPerry wrote:
| >> What's good for one user may be bad for another
|
| I bought 3 "identical" old Dell T3610 workstations off eBay for a
| home lab project. They have "identical" striped HDDs in them.
|
| Kicking off a Fedora Core OS install on all 3 simultaneously
| (which i've had to do a few times as i learned how ignition
| works) results in the same ordering of the hosts finishing their
| rebuilds -
|
| Machine 2 always finishes first by around 3 minutes of a 14 min
| wipe & reinstall process.
|
| Machine 3 finishes ahead of machine 1 by around 45 seconds or so
| usually.
|
| Almost 5 minutes in a <20 mins process, that's huge! I still
| don't actually know the root cause. Benchmarking disk I/O has
| them within a few percent of each other. It's not that they're
| contending to remotely load the OS installer - that gets cached
| in memory at the start. There's a few seconds difference in the
| UEFI bios startup timings but none of them are particularly
| consistent which is weird, i would have thought UEFI init time
| would be the same on a given host each boot, but there's a few
| seconds in it each time.
| userbinator wrote:
| _There's a few seconds difference in the UEFI bios startup
| timings_
|
| Could that be anything to do with the Intel ME or similar
| "management" spyware/etc. trying to phone home or do something?
| You may have to reflash the BIOS to "clean" that completely.
|
| The other thing I can think of is that the CPU heatsinks aren't
| clogged with the same amount of dust, causing one machine to
| thermally throttle more than the others.
| xen2xen1 wrote:
| Thermals are probably the right answer.
| bityard wrote:
| I used to work at a web hosting company and one of my
| responsibilities was managing all the VPS backups. We had a few
| racks for "backup servers" which were all identical boxes. We
| would provision CentOS on them two or three at a time and I
| observed the exact same thing. Some servers were always just a
| little faster than others. Never figured out why.
| bentcorner wrote:
| That's interesting. If you're curious, swapping disks might
| tell you more. If it's thermals like other commenters have
| suggested then you won't see a difference. (unless you knock
| some loose!)
| userbinator wrote:
| No HDD on the market has ever been variable-speed. What you may
| be perceiving as a change in RPM could just be the difference in
| harmonics of the spindle motor as you change the damping.
|
| As for particles on the platters, the keywords to search for are
| "thermal asperity". An excess of them, or ones that are actually
| stuck to the platter, can cause damage but as you have noticed,
| what usually happens is the head just knocks them out of the way
| and heats up slightly, causing a misread and subsequent retry
| (hence the slow speed). If you power on a working drive and then
| remove the lid, the air currents will keep any new particles from
| sticking to the platters.
|
| _By pushing down on the lid, however, (simulating screws) it
| sped up and down a few times before failing. The harder I pushed
| the less it vibrated and the more it worked, until I finally had
| it returning I/O, albeit slowly._
|
| It's more likely that you were adjusting the actuator angle. See
| the bottom picture in this article for comparison (also a WD
| drive from around the same era):
|
| https://hddguru.com/articles/2006.02.17-Changing-headstack-Q...
| williesleg wrote:
| The outside of the disk spins faster than the inside of the
| disk asshole.
| Teknoman117 wrote:
| I was about to call you out, but turns out the thought that WD
| green drives were variable RPM was just people misinterpreting
| crappy documentation from WD on how fast they operate. They
| didn't publish speeds because they reserved the right to sell
| any speed of drive under that label without telling you...
| rsync wrote:
| "No HDD on the market has ever been variable-speed."
|
| I'm not sure that's true ...
|
| I remember from one of the Sun Performance Tuning manuals there
| is a chapter on disk performance and there was a throwaway line
| about "non commodity disks". Specifically, that there were, in
| the past, disk drives with heads that moved independently.
|
| I don't know much more about "non commodity disks" - I think
| they must have been prevalent in the 70s (?) - but variable
| speed doesn't sound much weirder than independent head movement
| ...
| rzzzt wrote:
| Dual head stacks were once a thing, and there was a recent
| thread about them as well:
| https://news.ycombinator.com/item?id=26502216
| wmf wrote:
| I've only heard about one model of dynamic RPM (DRPM) hard
| disk and only prototypes were made.
| userbinator wrote:
| Multiple heads moving independently have many advantages,
| varying the spindle speed doesn't; it takes a lot of energy
| to change the speed of the platters, and doesn't really
| provide any benefits. Zoned bit recording already takes
| advantage of the varying linear density, and doesn't require
| changing the platter speed.
|
| In the 70s, disk drive motors would likely be mains powered
| synchronous induction motors, which are constant speed and
| where the traditional speeds of 3600, 5400, and 7200
| originated.
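A small sketch of the arithmetic behind the points above: at a constant spindle speed the surface under the head moves faster at the outer tracks than at the inner ones, which is the varying linear density that zoned bit recording exploits. The function names and the 20 mm and 46 mm radii in the note below are assumptions for illustration, not measurements of any particular drive. The second helper uses the standard synchronous-speed relation (120 x frequency / poles), which gives 3600 RPM for a 2-pole motor on 60 Hz mains.

```python
import math


def linear_speed_m_per_s(rpm, radius_mm):
    """Speed of the platter surface passing under the head at a radius."""
    revs_per_second = rpm / 60.0
    circumference_m = 2.0 * math.pi * (radius_mm / 1000.0)
    return revs_per_second * circumference_m


def synchronous_rpm(mains_hz, poles):
    """Synchronous speed of an AC motor in RPM: 120 * f / poles."""
    return 120.0 * mains_hz / poles
```

With the assumed radii, `linear_speed_m_per_s(7200, 46.0) / linear_speed_m_per_s(7200, 20.0)` is 2.3, so the outer track passes under the head 2.3 times faster than the inner one without any change in RPM.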
| bradknowles wrote:
| Sony 3.5" floppy disks used to be variable speed. I knew
| about that at the time I interviewed for a position as an
| intern at Imprimis in 1989, just before they got bought by
| Seagate. The guy I was interviewing with smiled smugly and
| said "we've got a better solution -- we vary the speed at
| which we read and write data!"
|
| As a result of that job, I ended up writing what I believe to
| be the first FAQ for hard drives and stiction -- in 1989. I
| haven't been able to find a copy of it, however.
|
| Dunno if there would be any useful information in that FAQ
| for the OP, but it would be interesting to check it out and
| see. That is, if anyone can find a copy.
| gvb wrote:
| _(When I was a sysadmin, I heard a story of how old VAX drives
| would stall, so holes had been drilled in them with tape over the
| holes. When stalled, the sysadmin would peel back the tape and
| use their finger to spin-start them. Those even older drives must
| have been more tolerant of dust!)_
|
| More than once I had a hard drive fail to start up after a power
| cycle (back then the drives only spun down when power was
| removed). First thing we tried was to remove the drive and give
| the whole drive a sharp spin on the axis of the platter. Due to
| inertia of the platter, this would tend to get the platter to
| move a bit and "unstick" it.
|
| My recollection is that it worked every time I had to do this. Of
| course, we would back up that drive and replace it as soon as
| possible.
| toss1 wrote:
| Stiction. Good tip on how you were successfully eliminating the
| stiction -- likely useful to remember in many such situations!
|
| [1] https://en.wikipedia.org/wiki/Stiction
| wazoox wrote:
| It still works on relatively modern drives. If the spindle or
| actuator arm is stuck, hitting the drive on its side (for
| instance by hitting a table with it) can free stuck movable
| parts. Worked well, well into the 500GB era.
| anyfoo wrote:
| I went one step further: We had an old HP-UX machine with a
| failed hard drive that wouldn't spin up. The data on it wasn't
| supercritical but still nice to keep, so I was free to
| experiment. I removed the housing and pushed the platter by
| hand. It spun up, and while still open, I immediately took a
| full disk image with dd.
|
| A more similar story to yours was with my 120MB hard disk on my
| PC when I was a, which inconsistently exhibited similar
| symptoms. I had no money, so always had to do with what I had
| (many stories sprang out of that). The hard disk was in a
| removable caddy, and when it refused spinning up, I simply took
| it out of the PC and gently bounced it on my bed right next to
| the desk. Put back into the PC, it then worked every time as I
| recall.
| Neil44 wrote:
| He found an old drive with the lid already removed, the drive was
| potentially faulty beforehand then, hence why the lid was
| removed. You would want to performance test a known good drive
| before removing the lid to compare properly.
| nix23 wrote:
| Shout on it ;)
| hs86 wrote:
| For the uninitiated:
| https://www.youtube.com/watch?v=tDacjrSCeq4
| geerlingguy wrote:
| Incidentally, this post was by the same person as that video
| :)
| kowlo wrote:
| Thank you - that is the highlight of my week
| gotbeans wrote:
| He is packing house and the disk is only 80gb.
|
| Kneejerk, half-"/jk" reaction when I started reading: Brendan was
| now all in with the chia fever.
| bostonsre wrote:
| I think his book on systems performance is the best computer
| science book I've ever read.
| john-tells-all wrote:
| thanks for the reminder! I'll go buy it now, Gregg is always an
| inspiration.
|
| Here's the link =>
| http://www.brendangregg.com/systems-performance-2nd-edition-...
| teddyh wrote:
| > _couldn't resist seeing if the disk was readable despite the
| dust, and finding out what was on it (I'd forgotten)._
|
| He proceeds to successfully read the disk, but doesn't say what
| was on it.
| lazide wrote:
| I'm guessing he forgot to tell us, hah.
| [deleted]
| intc wrote:
| Perhaps 6 - 7 years ago we provided a 1U server for a
| client. Initially the server worked fine. Then we upgraded its
| disks (originally 80 gigs or so) to 1TB each (in RAID1
| configuration).
|
| After the new disks were installed the server started to have
| multitudes of issues with disk performance and random read /
| write errors.
|
| I think it took us several days to understand that the new
| spinners were much more sensitive than the original ones, and
| the large (very powerful) fan (located between the hdds) emitted
| too much vibration for the disks to operate properly.
|
| We ended up swapping the chassis to one with radial fans. Problem
| solved.
___________________________________________________________________
(page generated 2021-05-10 23:01 UTC)