[HN Gopher] SSD death, tricky read-only filesystems, and systemd...
___________________________________________________________________
SSD death, tricky read-only filesystems, and systemd magic?
Author : ingve
Score : 23 points
Date : 2024-05-15 21:02 UTC (1 hours ago)
(HTM) web link (rachelbythebay.com)
(TXT) w3m dump (rachelbythebay.com)
| db48x wrote:
| Spooky.
| ajb wrote:
| journald also logs to /run/log/journal/ (a tmpfs) so until reboot
| you can still extract logs from it even if it didn't manage to
| write them to disk. See the 'Storage=' option in `man
| journald.conf`
| drycabinet wrote:
| That part of the man page sounds confusing. Is it unconditional
| and enabled by default?
| ciupicri wrote:
| WTF, it's 2024. systemd uses tmpfs for /tmp
| Arnavion wrote:
| It's the default, but TFA might've masked it since they're
| dealing with an old laptop where they can upgrade the drive but
| not the RAM.
| ciupicri wrote:
| You're right, but this was introduced around 2012 [1] when
| that old laptop was manufactured. Also no one is forcing you
| to store GBs of data on it and indeed in practice most
| programs use only a couple of MBs.
|
| [1]: https://fedoraproject.org/wiki/Features/tmp-on-tmpfs
| Arnavion wrote:
| >Even though I was using tar (and not dd, thank you very much)
|
| And from that linked article (
| https://rachelbythebay.com/w/2011/12/11/cloning/ ):
|
| >Even if you tell it to continue after read errors ("noerror"),
| you've just silently corrupted something on the target disk.
|
| The idea of using `dd` / `ddrescue` is to get the contents of the
| drive off it ASAP before it fails. You don't want to mount it.
| You don't even know if the filesystem on it is sane. The priority
| right now is to get all the bytes off that you can, then analyze
| it at your leisure to recover what you can. For the same reason
| you would also not `dd` directly to the new disk. You'd do it to
| a storage area where you can inspect and manipulate it first.
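|
| A minimal Python sketch of that block-level evacuation, for
| illustration only; in practice you would reach for ddrescue. The
| names image_source and read_block are hypothetical. It copies the
| source block by block, zero-fills any block that raises a read
| error so later offsets stay aligned (roughly what dd's
| conv=noerror,sync does), and returns the bad ranges so they can be
| retried later.
|
```python
import os


def default_read(fd, offset, size):
    # os.pread reads at an absolute offset without moving the file position.
    return os.pread(fd, size, offset)


def image_source(src_path, dst_path, block_size=4096, read_block=default_read):
    """Copy src_path to dst_path block by block.

    Unreadable blocks are zero-filled in the image and returned as
    (offset, length) pairs so they can be retried later.
    """
    bad_ranges = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file or block device.
        total = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < total:
                want = min(block_size, total - offset)
                try:
                    data = read_block(src, offset, want)
                    if len(data) != want:
                        raise OSError("short read")
                except OSError:
                    # Placeholder zeros keep every later offset aligned.
                    data = bytes(want)
                    bad_ranges.append((offset, want))
                dst.write(data)
                offset += want
    finally:
        os.close(src)
    return bad_ranges
```
|
| The source is only ever opened read-only and never mounted; all
| later inspection happens on the image file.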
|
| And:
|
| >Finally, there is the whole matter of geometry. [...] Will the
| partition table wind up in the right place? What about the boot
| sector data and secondary stuff like the locations where your
| boot loader might be looking?
|
| ... is not a problem for the same reason. Once you have the disk
| image off the failing drive, you're welcome to inspect it and
| copy individual files off it to the target drive at the
| filesystem layer (rsync, cp, etc).
|
| And so:
|
| >Second, you're dealing with individual files. If your tar or
| rsync (or whatever) fails to read a file due to some disk issue,
| you'll know about it.
|
| This is the right idea. It's just not optimal to do it from the
| failing drive itself.
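|
| A hedged sketch of that file-level step, assuming the image has
| already been mounted read-only somewhere. The function name
| copy_tree_reporting and both paths are placeholders. It copies
| each file individually and collects the ones that fail, which is
| the "you'll know about it" property the article wants, without
| touching the failing drive again.
|
```python
import os
import shutil


def copy_tree_reporting(src_root, dst_root):
    """Copy every file under src_root into dst_root.

    Returns a list of (source path, error text) for anything that
    could not be read or copied, instead of stopping at the first error.
    """
    failures = []

    def on_walk_error(exc):
        # os.walk ignores unreadable directories unless given a handler.
        failures.append((str(exc.filename), str(exc)))

    for dirpath, _dirnames, filenames in os.walk(src_root, onerror=on_walk_error):
        rel_dir = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                shutil.copy2(src, dst)  # copies data plus timestamps/mode
            except OSError as exc:
                failures.append((src, str(exc)))
    return failures
```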
| Sakos wrote:
| My god, that article is actually filled with terrible advice. I
| hope they don't do this professionally. It's bad enough they're
| misinforming people and directing them to do things in a worse,
| more fraught, failure-prone way.
|
| > First of all, 'dd' is going to try to read the entire drive,
| including the parts which currently do not contain any useful
| data. This happens because you have probably told it to read
| /dev/hdc. That refers to the entire readable portion of the
| disk, including the partition table and the parts which are not
| associated with living files.
|
| Make a sparse image with dd. Then you have a 1:1 copy while
| skipping all the unused sectors to save space. You can then
| mount the image read-only or write it to a new disk for
| recovery. That means that you have all the time in the world
| _and_ you can start over if need be without affecting the
| source data.
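|
| To show what "sparse" means here, a small Python sketch (the name
| write_sparse is hypothetical; GNU dd offers conv=sparse for the
| same purpose). It seeks over blocks that read back as all zeros
| instead of writing them, so the image keeps its full logical size
| while using less space on filesystems that support holes. Note
| that this only skips zero-filled blocks; whether unused sectors
| actually read as zeros depends on the drive and filesystem.
|
```python
import os


def write_sparse(src_path, dst_path, block_size=4096):
    """Copy src_path to dst_path, leaving holes where blocks are all zeros."""
    zero = bytes(block_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero[:len(block)]:
                # Skip forward instead of writing; this leaves a hole.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
        # If the data ends in a hole, extend the file to its full length.
        dst.truncate(dst.tell())
```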
|
| Doing all this on a possibly actively failing drive,
| particularly an SSD which can literally die 100% into an
| unrecoverable state without access to a lab in the blink of an
| eye? That's insanity. And it's even worse SUGGESTING that to
| random people who don't know any better. You want to read
| everything off the SSD or HDD _once_. As fast as possible. As
| completely as possible.
|
| In fact, make sure you do a non-trim pass first, so you can be
| sure you have as much as possible before you try retrying
| sectors which might actually kill the drive dead.
|
| Once all the recoverable data is off the failing drive as an
| image, you're free to do whatever you want however you want
| without fear of any data loss caused by possible mistakes,
| oversights, hardware issues, etc.
| 1992spacemovie wrote:
| > My god, that article is actually filled with terrible
| advice. I hope they don't do this professionally.
|
| That's pretty par for Rachel by the bay. Her blog is
| basically just HN-endorsed ramblings at this point.
| derefr wrote:
| > You want to read everything off the SSD or HDD once.
|
| I mean, no, not really. You want to read everything off _at
| least_ once. Because a "corrupt sector" is actually often in
| a _non-deterministic_ state, reading differently with each
| read -- but that state may still be floating just far enough
| toward logical 0 or 1, that you _can_ get a bit-pattern out
| of it through _statistical analysis_, over multiple reads.
|
| You _do_ want to capture the whole disk once _first_, in
| case the disk spindle motor is about to die. (This isn't
| _usually_ the problem with a dying disk... save for certain
| old known-faulty disk models -- "DeathStar" and so forth.
| But it's good to be safe.)
|
| But after that, you want to read it again, and again, and
| again. And save the outputs of those reads. And then throw
| them into a tool like ddrescue that will put the results
| together.
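|
| The "statistical analysis" idea can be illustrated with a per-bit
| majority vote over several saved reads of the same region. This
| is only a sketch of the principle under a strong assumption: that
| the drive returns data for the weak sector at all (many drives
| report a read error instead). It is not a description of how
| ddrescue works, and majority_merge is a hypothetical name.
|
```python
def majority_merge(reads):
    """Combine equal-length reads of one region by per-bit majority vote.

    With an even number of reads, a tied bit resolves to 0.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")

    threshold = len(reads) / 2
    merged = bytearray(length)
    for i in range(length):
        value = 0
        for bit in range(8):
            # Count how many reads saw this bit set.
            ones = sum((r[i] >> bit) & 1 for r in reads)
            if ones > threshold:
                value |= 1 << bit
        merged[i] = value
    return bytes(merged)
```
|
| An odd number of reads avoids ties; more reads give a more stable
| vote but also put more stress on the failing drive.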
|
| Basically, it's the same principle that applies to archival
| preservation of floppy disks (see e.g.
| https://www.youtube.com/watch?v=UxsRpMdmlGo). Magnetic flux
| is magnetic flux! (And even NAND cells can end up with
| approximately the same "weak bits" problems.)
|
| ---
|
| Mind you, it'd be even better if we could get several
| _analogue_ dumps of the disk, and average _those_ -- like in
| the video above.
|
| Sadly, unlike with floppy preservation, it'd be very
| difficult to get an analogue dump of an HDD or SSD. With
| floppies, the flux is right there for the reading; you just
| need a special drive. HDDs and SSDs, meanwhile, are far more
| self-contained. For an HDD, there might be a path to doing
| it, at least on a disk-specific basis... it would likely
| involve tapping some signal lines between the disk controller
| and the read head. But for an SSD, I don't think it would be
| possible -- IIRC with most NAND/NOR flash packages, the
| signal is already digitized by the time it comes off the
| chip.
| djbusby wrote:
| I generally dd to an image file. (dd of=/data/bad-drive.dump)
| Then try tools to inspect and fix the partition and FS, mount that
| and then rsync.
|
| That's what you're saying too, right?
| Arnavion wrote:
| Yes, exactly.
| MadnessASAP wrote:
| Never mind trying to copy off of a failed drive that's still
| in use.
|
| Madness! Madness I say!
| bravetraveler wrote:
| I know they mean well, but if you have a filesystem you want to
| preserve on questionable media... don't interact with it until
| you've made your copy.
|
| With journaling and the like... mounting read only can induce
| more writes than you expect. Given you're copying, that's
| probably not what you want. Consistency is key.
|
| Just use _"dd"_, _"ddrescue"_, or simply _pv/cat/shell_ in+out
| redirects _(some limitations apply)_ to evacuate it first. Then
| you have infinite tries.
|
| Stop thinking in terms of devices. You can write this drive to a
| file. You can write files to drives. It's all made up, yet it
| works. Magic.
|
| By skipping past the block level to the filesystem, one is
| limited in what they can recover. You're at the mercy of it being
| more sane than truly necessary.
|
| Perhaps they get into this, but wow. I know I'm not The One, and
| I hate to pretend to be, but this is day one data preservation
| stuff.
|
| I had to post when I got to _"LOL it went RO, it's probably
| fine"_... when no, the kernel interjecting to protect the data is
| not a reasonable first line of defense. Now back to reading.
___________________________________________________________________
(page generated 2024-05-15 23:00 UTC)