[HN Gopher] Iron Mountain: It's Time to Talk About Hard Drives
___________________________________________________________________
Iron Mountain: It's Time to Talk About Hard Drives
Author : severine
Score : 150 points
Date : 2024-09-10 19:04 UTC (1 days ago)
(HTM) web link (www.mixonline.com)
(TXT) w3m dump (www.mixonline.com)
| alchemist1e9 wrote:
| Does it mean LTO tape for the win then?
|
| We're about to start a project to build an LTO-9 based in-house
| backup system. Any suggestions for DIY Linux based operation
| doing it "correctly" would be appreciated. Preliminary planning
| is to have one drive system in our primary data center and
| another offsite at an office center where tapes are verified
| before storage in locked fireproof storage cabinet. Tips on good
| small business suppliers and gear models would be great help.
| antisthenes wrote:
| What's your budget?
| alchemist1e9 wrote:
| $50K if needed but it doesn't look to need that. 2PB initial
| data. predicted 1PB/year with around 10-30%/year rate of
| growth of rate of growth (acceleration?)
| akira2501 wrote:
| > fireproof storage cabinet.
|
| Nothing is fire proof. Is the cabinet "fire suppression system
| liquid" proof?
|
| > Tips on good small business suppliers and gear models would
| be great help.
|
| Hire an auditor would be my advice. Every business is
| different.
|
| I am, just now, having flashbacks of when I was in a SOX
| environment and had to regularly contract with them... and
| while the experience can be somewhat unpleasant I've often
| found good auditors to be extremely knowledgeable about
| solutions and their practical implementation considerations.
| hinkley wrote:
| Pretty sure the pyramids at Giza count as fireproof boxes.
| But that's out of most people's price range.
| burnished wrote:
| Not if the fire is big enough.
| hinkley wrote:
| When a catastrophic event is catastrophic enough, your
| other problems cease to be your problem.
| akira2501 wrote:
| They're not fully sealed. There are two shafts which
| connect the King's chamber to the exterior of the pyramid.
| The lower ritual congregation area is not fully sealed off
| from the upper chambers either. Which means bats are a
| constant problem in pyramids.
|
| You could easily get a fire going inside one.
| hinkley wrote:
| The advice I got long ago from an IT guy was: if you wait long
| enough, tape will be on top again.
|
| That was a long time ago but I've peeked in at backup systems
| in the intervening years and it does seem to hold true over
| time.
|
| But it really depends how much data you have. My ex dropped a
| single HDD in a safety deposit box at CoB, N times per week and
| fetched back the oldest disk. I don't think she ever said how
| many were in there but I doubt it was more than three. I think
| the CTO took one home with him once per week.
|
| The silly thing about most of this set up is that the office,
| the bank, and the data center were all within half a kilometer
| of each other. If something bad happened to that part of town
| they only had the infrequent offsite backup.
| katbyte wrote:
| 20tb and under is easy. 500tb is hard
| exe34 wrote:
| the sort of bad that would render all three unreadable would
| probably melt the rest of the infrastructure the company
| relies on for business anyway.
| marcosdumay wrote:
| Every time I've looked, tapes were on top "again" for large
| scale archival. And I've been looking for ~20 years by now.
|
| I don't get where people get the impression that X was at the
| top right before tapes got that last innovation (where X here
| is most often HDDs, but not always). But that's always the
| impression, and tapes are always on top.
|
| People also have been working with 3D phase change drives
| since the 90s. Those always promise to replace tapes. But
| nobody ever got them robust enough to leave the labs.
| WaitWaitWha wrote:
| Q: Why not archival M-Disc?
| actionfromafar wrote:
| A: Are there any "real" M-Discs for purchase anymore? (I.e.
| not just rebranded regular dye-discs.)
| netrap wrote:
| Possibly in DVDR.
| alchemist1e9 wrote:
| Amount of data makes it less realistic. We have around 2PB
| data currently and expect to grow around 1PB next year with
| maybe 10-30% annual growth rate.
| bluedino wrote:
| Tapes are fun. You can fit a petabyte of data in a bankers box!
|
| The problem quickly becomes:
|
| - Do we have a drive that can read this tape?
|
| - Do we have a server we can connect it to?
|
| - Do we have storage we can extract it to? (go ask your
| internal IT team for 10TB of drivespace...)
|
| - What program did we create this tape with? Backup Exec,
| Veritas, ArcServe, SureStore
|
| - You have the encryption keys, right?
|
| - How much of this data already exists on the previous month's
| backup?
|
| - Who's going to pay for the storage to move it to Glacier/etc?
|
| - How long is it going to take to upload?
| aftbit wrote:
| If you are having trouble getting 10TB of disk space from IT,
| you have bad IT. Not saying that's uncommon or anything, but
| 10TB fits on one external hard drive for $300 from Best Buy,
| or less than $1000/mo using EBS on AWS if you need some
| better guarantees and are all-in on the cloud.
| DaiPlusPlus wrote:
| > Tapes are fun. You can fit a petabyte of data in a bankers
| box!
|
| Yes, though those ultra-thin M.2 NVMe drives could probably
| top that now.
|
| > Do we have a drive that can read this tape?
|
| Don't let this be a problem in the first place: buy 4 tape
| drives and keep 2 of them in your cold/offsite/airgapped
| storage site (2 in case 1 fails, so you can use the remaining
| drive to transfer everything to a newer format).
|
| > Do we have a server we can connect it to?
|
| Significant hardware should not be (and is not) necessary: the
| LTO-7/8/9 drives I see for sale right now seem to be using
| either USB 3.0, Thunderbolt, or SAS connections; USB and
| Thunderbolt can be handled by any computer you can find at a
| PC recycler today; while any old desktop can handle SAS with
| a $80 HBA card.
|
| > Do we have storage we can extract it to? (go ask your
| internal IT team for 10TB of drivespace...)
|
| 10TB isn't a good example (case-in-point: I have a 3-year-old
| stack of unused 12TB WD drives less than 3 feet away from
| me).
|
| That said, if you're enterprise-y enough for a $4000 LTO-9
| drive, then you probably also have a SAN that's chock full of
| drives, so being able to provision a 10TB+ LUN should be
| implicit.
|
| > What program did we create this tape with? Backup Exec,
| Veritas, ArcServe, SureStore
|
| Ideally, none of those; instead, good ol' Perl and `dd`.
|
| > You have the encryption keys, right?
|
| I don't encrypt my backups to avoid this problem. My old data
| archives have little exploitable value for any potential
| attacker; and I imagine I'd store backup tapes this important
| in a fireproof safe in my parents' house or something. I'd
| only encrypt the entire tape if the tape were to leave my
| custody.
|
| I appreciate that this is not for everyone, and it's probably
| illegal for some people/orgs to _not_ encrypt backups too
| anyway (HIPAA, etc).
|
| > How much of this data already exists on the previous month's
| backup?
|
| Incremental/Differential backups are still a thing.
|
| > Who's going to pay for the storage to move it to
| Glacier/etc?
|
| No-one should. Cold data backups should/must always be in the
| custody of a designated responsible officer of the company.
|
| BUUUT, I guess there's nothing wrong with storing an
| encrypted copy in Cloud storage (as in S3/Glacier/AzureBlobs
| - not OneDrive...). I actually do this right now thanks to
| the smooth and painless integration in my Synology NAS. It
| costs me about $15/mo to store all these TBs in S3.
|
| > How long is it going to take to upload?
|
| Consider that it's 2024 - a company with LTO-9 and SAN is
| probably going to have a metro-ethernet IP connection at
| 10Gbps or even faster. At home I have a 10Gbps symmetric
| connection from Ziply (it's $300/mo and they give you an SFP
| module, which I put into my Ubiquiti UDM): so the limiting
| factor here is not my upload speed, but my drive read speed
| (LTO-9 drives seem to read at about 2-3Gbps
| raw/uncompressed?)
| gosub100 wrote:
| Make sure the bandwidth exists to keep up with the write speed
| of the LTO drive. For instance, the write speed for LTO-6
| (which I own as a hobbyist) is around 300MB/s, but line speed
| of gigabit Ethernet is about 100MB/s. Translate those numbers
| to LTO-9 and make sure that the NAS, network, or local storage
| can keep up. It's not a deal-breaker to underflow the drive,
| but it causes the tape to stop, rewind, and re-buffer (called
| shoe-shining) which takes more time and causes unnecessary wear
| on the drive and cartridges.
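|
| A rough sketch of that check, assuming you already know the
| sustained rate your source can deliver. The function names are
| illustrative, and the 100 MB/s and 300 MB/s figures are simply
| the ones quoted in the comment above, not measurements.
|
```python
def can_stream(source_mb_per_s: float, drive_mb_per_s: float) -> bool:
    """Return True if the source can feed the drive without starving it."""
    return source_mb_per_s >= drive_mb_per_s


def shortfall_mb_per_s(source_mb_per_s: float, drive_mb_per_s: float) -> float:
    """How far the source falls short of the drive's streaming rate (0 if none)."""
    return max(0.0, drive_mb_per_s - source_mb_per_s)


# Figures quoted in the comment above: roughly 100 MB/s over gigabit
# Ethernet against a drive that wants roughly 300 MB/s.
GIGABIT_MB_PER_S = 100.0
DRIVE_MB_PER_S = 300.0

print(can_stream(GIGABIT_MB_PER_S, DRIVE_MB_PER_S))
print(shortfall_mb_per_s(GIGABIT_MB_PER_S, DRIVE_MB_PER_S))
```
|
| If the check fails, staging the data on faster local storage
| before writing is one way to avoid shoe-shining.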
| adrian_b wrote:
| LTO-9 tapes can be easily found on Amazon in many countries,
| made by IBM, HP, Quantum or Fuji.
|
| The vendor does not matter, whichever happens to be cheaper at
| the moment is fine.
|
| For the tape drives, the internal drives can be cheaper by
| around 10%, but I prefer the tabletop drives, because they are
| less prone to accumulate dust, especially if you switch them on
| only when doing a backup or a retrieval. The tape drives have
| usually very noisy fans, because they are expected to be used
| in isolated server rooms.
|
| I believe that the cheapest tape drives from a reputable
| manufacturer are those from Quantum. I have been using a
| Quantum LTO-7 tape drive for about 7 or 8 years and I have been
| content with it. Looking now at the prices, it should be
| possible to find a tabletop LTO-9 drive for no more than $5000.
| Unfortunately, the prices for tape drives have been increasing.
| When I bought an LTO-7 tabletop drive many years ago it
| was only slightly more than $3000.
|
| The tapes are much cheaper and much more reliable than hard
| disks, but because of the very expensive tape drive you need to
| store a few hundred TB to begin to save money over hard disks.
| You should normally make at least two copies of any tape that
| is intended for long-term archiving (to be stored in different
| places), which will shorten the time until reaching the
| threshold of breaking even with HDDs.
|
| Even if there are applications that simulate the existence of a
| file system on a tape, which can be used even by a naive user
| to just copy files on a tape, like copying files between disks,
| they are quite slow and inefficient in comparison to just using
| raw tape commands with the traditional UNIX utility "mt".
|
| It is possible to write some very simple scripts that use "mt"
| and which allow the appending of a number of files to a tape or
| the reading of a number of consecutive files from a tape,
| starting from the nth file since the beginning of a tape. So if
| you are using only raw "mt" commands, you can identify the
| archived files only by their ordinal number since the beginning
| of the tape.
|
| This is enough for me, because I prepare the files for backup
| by copying them in some directory, making an index of that
| directory, then compressing it and encrypting it. I send to the
| tape only encrypted and compressed archive files, so I disable
| the internal compression of the tape drive, which would be
| useless.
|
| I store the information about the content of the archives
| stored on tapes (which includes all relevant file metadata for
| each file contained in the compressed archives, including file
| name, path name, file length, modification time, a hash of the
| file content) in a database. Whenever I need archived data, I
| search the database, to determine that it can be found, for
| instance in tape 63, file 102. Then I can insert the
| corresponding cartridge in the drive and I give the command to
| retrieve file 102.
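|
| A minimal sketch of such a catalog, assuming SQLite is an
| acceptable database. The table and column names are
| hypothetical; the comment does not say which database or schema
| is actually used.
|
```python
import sqlite3


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Create (if needed) a catalog of archived files and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS archived_files (
            path      TEXT NOT NULL,     -- original path name
            size      INTEGER NOT NULL,  -- file length in bytes
            mtime     INTEGER NOT NULL,  -- modification time (Unix seconds)
            digest    TEXT NOT NULL,     -- hash of the file content
            tape      INTEGER NOT NULL,  -- cartridge number
            tape_file INTEGER NOT NULL   -- ordinal number of the archive on that tape
        )
        """
    )
    return conn


def record(conn: sqlite3.Connection, path: str, size: int, mtime: int,
           digest: str, tape: int, tape_file: int) -> None:
    """Remember that one file was written inside a given archive on a given tape."""
    conn.execute(
        "INSERT INTO archived_files VALUES (?, ?, ?, ?, ?, ?)",
        (path, size, mtime, digest, tape, tape_file),
    )
    conn.commit()


def locate(conn: sqlite3.Connection, path: str) -> list:
    """Return (tape, tape_file) pairs for every archive that holds this path."""
    rows = conn.execute(
        "SELECT tape, tape_file FROM archived_files "
        "WHERE path = ? ORDER BY tape, tape_file",
        (path,),
    )
    return rows.fetchall()
```
|
| With this, a lookup such as locate(conn, "/data/a.bin") would
| answer "tape 63, file 102" in the sense described above.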
|
| I consider the "mt" utility of FreeBSD much better than that of
| Linux. The Linux magnetic drive utilities have seen little
| maintenance for many years.
|
| Because of that, when I make backups or retrievals they go to a
| server that runs FreeBSD, on which the SAS HBA card is
| installed. When a tabletop drive is used, the SAS HBA card must
| have external SAS connectors, to allow the use of an
| appropriate cable. I actually reboot that server into FreeBSD
| for doing backups or retrievals, which is easy because I boot
| it from Ethernet with PXE, so I can select remotely what OS to
| be booted. One could also use a FreeBSD VM on a Linux server,
| with pass-through of the SAS HBA card, but I have not tried to
| do this.
|
| My servers are connected with 10 Gb/s Ethernet links, which
| does not differ much from the SAS speed, so they do not slow
| much the backup/retrieval speed. I transfer the archive files
| with rsync over ssh. On slow computers and internal networks
| one can use rsync without ssh. I give the commands for the tape
| drive from the computer that is backed up, as one line commands
| executed remotely by ssh.
|
| The archive that is transferred is stored in a RAMdisk before
| being written on the tape, to ensure that the tape is written
| at the maximum speed. I write to the tape archive files that
| have usually a size of up to about 60 GB (I split any files
| bigger than that; e.g. there are BluRay movies of up to 100
| GB). The server has a memory of 128 GB, so I can configure on
| it a RAMdisk of up to 80 GB without problems. This method can
| be used even with a slow 1 Gb/s or 2.5 Gb/s network, but then
| uploading a file through Ethernet would take much more time
| than writing or reading the tape.
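|
| As a sketch of the splitting step only: the helper below
| computes (offset, length) pieces no larger than a chosen limit.
| The function name is hypothetical and the 60 GB limit is the
| figure from the paragraph above; how the pieces are actually cut
| and staged is not specified in the comment.
|
```python
GB = 10**9


def split_ranges(total_size: int, max_chunk: int) -> list[tuple[int, int]]:
    """Return (offset, length) pairs covering total_size in pieces <= max_chunk."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    ranges = []
    offset = 0
    while offset < total_size:
        length = min(max_chunk, total_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


# A 100 GB file with a 60 GB limit becomes two pieces: 60 GB and 40 GB.
print(split_ranges(100 * GB, 60 * GB))
```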
|
| There is one weird feature of the raw "mt" commands, which is
| poorly documented, so it took me some time to discover it,
| during which I have wasted some tape space.
|
| When you append files to a partially written tape, you first
| give a command to go to the end of the written part of the
| tape. However, you must not start writing, because the head is
| not positioned correctly. You must go 2 file marks backwards,
| then 1 file mark forwards. Only then is the head positioned
| correctly and you can write the next archived file. Otherwise
| there would be 1 empty file intercalated at each point where
| you have finished appending a number of files and then you have
| rewound the tape and then you have appended again other files
| at the end.
| netrap wrote:
| Only Sony or Fuji actually make tapes. The rest are
| rebranded.
| adrian_b wrote:
| True, but the rebranded tapes are frequently cheaper than
| Fuji or Sony.
| alchemist1e9 wrote:
| A lot of very interesting details in your reply - thanks. I have
| this question:
|
| If you aren't budget constrained today and had to set it all
| up again. What would you do?
|
| While I'm a Linux guy, I'll happily run BSDs when
| appropriate, like for pfSense, and if it really has better mt
| tools or driver for LTO-9 drives due to the
| culture/contributors being more old school, then I'd just
| grab a 1U server to dedicate to it, run a BSD and attach the
| drive to that.
|
| You seem to have extensive practical hands on experience and
| while I was doing tapes 20 years ago this will be first time
| I'm hands on again with it since then. So I need to research
| most reliable drive vendors and state of kernel drivers and
| tools, just as you are alluding to.
|
| Pretend you have $50K if needed (doubt it). 2PB existing
| data, 1PB/year targeted rate, probably 10-20%/year
| acceleration on that rate. with a data center rack location,
| 20Gb/s interconnect via bonded 10Gb NICs to storage servers
| (45drives storinators) and then an office center
| cabinet/rack/desk (your choice) and will put a tape drive
| holding at least 8 tapes in data center, planning for worst
| case of 100TB a month and data center visits to swap in new
| tapes shouldn't be too frequent. Any details on what you
| would do would be interesting.
| adrian_b wrote:
| Like I have said, it is not necessary to dedicate a full-
| time FreeBSD server for this, you can use either a Linux
| server that is rebooted temporarily in FreeBSD or a FreeBSD
| virtual machine on the Linux server.
|
| Around $5000 to $6000 should be enough for a LTO-9 tabletop
| tape drive plus a suitable SAS HBA card and SAS cable. The
| card must have matching SAS connectors and SAS speed with
| the tape drive.
|
| More money will not bring anything extra until a much
| higher amount is reached, which would be enough to buy a
| tape autoloader/library, which would eliminate the
| necessity for a human to insert and remove the cartridges
| into the tape drive when needed. I am not sure if $50K is
| enough for a tape autoloader.
|
| Tape autoloaders/libraries are worthwhile only for very big
| organizations where the amount of data that is continuously
| written or read to or from the tapes is very large. For a
| small business or for an individual a tape autoloader is
| certainly not worthwhile, because the tape drive will be in
| use at most a small fraction of every day.
|
| 1 PB/year is less than 3 TB/day. This can be written on a
| single tape in a little more than 2 hours. Even with a
| simple non-pipelined implementation of the file uploading
| with the writing on the tape, the backup can be done in
| less than 4 hours. Even writing 2 copies can be done in
| less than 8 hours. The backup can be done mostly or
| completely overnight.
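|
| The arithmetic above can be reproduced with a short sketch. The
| 400 MB/s write rate is an assumption (the commonly quoted native
| LTO-9 rate); the comment itself only gives the resulting times.
|
```python
TB = 10**12
PB = 10**15

# Assumption: native (uncompressed) LTO-9 streaming rate in bytes per second.
ASSUMED_DRIVE_BYTES_PER_S = 400 * 10**6


def daily_volume_bytes(bytes_per_year: float, days_per_year: int = 365) -> float:
    """Average number of bytes that must be written per day."""
    return bytes_per_year / days_per_year


def write_hours(num_bytes: float,
                bytes_per_s: float = ASSUMED_DRIVE_BYTES_PER_S) -> float:
    """Hours needed to stream num_bytes to tape at a constant rate."""
    return num_bytes / bytes_per_s / 3600


print(daily_volume_bytes(1 * PB) / TB)  # TB per day for 1 PB/year
print(write_hours(3 * TB))              # hours to write 3 TB once
```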
|
| For a much bigger amount of data one could buy several tape
| drives, before starting to think about an autoloader. Also
| it is possible to pipeline the network transfers with the
| tape writing, for a backup speed higher by around 50%.
|
| If money would not be a problem and if the data needs to be
| archived for a long term, so that multiple copies are
| desirable, I would buy 2 tape drives, to be able to write 2
| copies simultaneously.
|
| This would also halve the time for archiving the initial 2
| PB of existing data, which will take several months, so a
| speed-up would be desirable. Having 2 drives will also
| increase the reliability, as the system will continue to
| work if one becomes defective.
|
| With only 3 TB written per day, a LTO-9 tape, which has a
| capacity of 18 TB, will be enough for 6 days.
|
| So unless a backup must be restored, the operator would
| need to change the tape only once per week.
|
| This is a moderate amount of data, easy to handle with a
| single drive, even if two are preferable for redundancy and
| for higher speed.
|
| I do not understand your reference to a "a tape drive
| holding at least 8 tapes in data center". If you mean an
| autoloader, from what you describe it does not seem that
| the very big expense for an autoloader would be justified.
|
| The LTO tapes are best stored in suitcases that can contain
| 20 cartridges, i.e. when using LTO-9 that is 360 TB.
| Therefore 3 suitcases store more than 1 PB, i.e. a year of
| data according to your example. The suitcases should be
| stored in a secure safe or cabinet. They are usually made
| to be stackable.
|
| I have assumed that your 1 PB is of already compressed
| data. If the data is compressible then the requirements for
| the usage time of the drives and for the storage volume
| would be much smaller.
|
| I have forgotten to mention that after I compress and
| encrypt the archived files, I add redundancy with a Reed-
| Solomon code, e.g. with the par2 program. If I choose e.g.
| a redundancy of 5%, then a file retrieved from the magnetic
| tape could have defects of up to 5% of its size, while the
| original data could still be extracted from it.
| alchemist1e9 wrote:
| Excellent help. To clarify a few items: - yes I mean
| drives with autoloader. for example:
| https://www.backupworks.com/qualstar-Q24-LTO-9-SAS-
| Library.a... it's basically a hard requirement as we
| don't have staff time to enter data centers frequently.
| we are a bit unusual in being certainly not big, but not
| really a small business either when looking at budgets
| available. unless there is something wrong with qualstar
| product linked above perhaps autoloaders are cheaper than
| you believed?
|
| - understood your rebooting trick. however being fully
| automated (apart from blank tape rotations) is a
| requirement also. it's a production infrastructure. if
| FreeBSD provides significant value it seems safer to spec
| a dedicated 1U server to use for backups. there is a
| management node currently that might work though that has
| to run Linux as it currently does and I need to check if
| the SAS on it can be used. It has a bunch of SAS ssd
| drives currently and I would have assumed there is a way
| to cable up the qualstar drive ... but again I'm still
| early in researching. and the SAS compatibility issue you
| raise is perfect example of stuff I need to figure out.
|
| - love par2cmdline and our burner with mdisc for IP
| backup uses that on git repo files and then seqbox as an
| outer container for data to guard against potential fs
| metadata corruption issues. there was a newer low level
| tool (rust rewrite I think) with many bitrot protection
| features that I can't recall its name currently and
| isn't immediately coming up in my notes, but I know it
| exists and have been meaning to look into it. it has a
| newer erasure encoding like raptorq and also block
| metadata like seqbox, I think can replace the par2 seqbox
| combo we are currently using on MDISC physical backup for
| IP. I don't trust a 100% cloud as one can imagine somehow
| getting all accounts hacked and deleted.
|
| - yes on compressed. the 2PB is already highly highly
| compressed. so it means 18TB/tape.
|
| Do you have any vendor/distributors you can recommend? I
| always recommend 45drives to people and I was planning to
| ask them about LTO when we order next storinator which is
| coming up soon also.
|
| There is this interesting blog post from a couple of
| years ago that probably was the seed of my plan to embark
| on LTO. Our monthly backblaze invoice is totally out of
| control. But we need a full backup of our data as it's
| simply not replaceable and at the heart of the business.
|
| https://blog.benjojo.co.uk/post/lto-tape-backups-for-
| linux-n...
|
| they talked me into it. not out of it given our specific
| situation.
| adrian_b wrote:
| That is indeed a cheap autoloader.
|
| If you would use the full configuration with 2 tape
| drives, the cost of the system might be around $15k,
| which is very reasonable for a tape library with
| autoloader.
|
| I think that this autoloader is a good choice, especially
| if the price includes "1 x IBM LTO-9 SAS Tape Drive
| Installed".
|
| As I have said, I believe that it is better to choose the
| option of also including the second tape drive.
|
| For the tapes, there is no reason to worry about specific
| distributors. I have always bought them from Amazon, but
| shops that are specialized in storage products should be
| OK, unless they charge a premium price over what can be
| found at Amazon or Newegg. While the tapes are made by
| Fuji or Sony, they are usually easier to find and at
| lower prices as IBM, HP or Quantum branded tapes.
|
| The prices vary, so whichever vendor is cheaper when you
| buy a batch of tapes should be fine. An LTO-9 cartridge
| should be only slightly over $100. In time the prices of
| LTO-9 cartridges should drop. For now they are more
| expensive than the older cartridges, because they are
| still relatively new.
|
| I store the tapes in Turtle cases:
|
| https://turtlecase.com/collections/lto
|
| You must check the tape drive requirements for the SAS
| HBA PCIe card that must be installed in the server, which
| must have compatible connectors, and you must buy an
| appropriate SAS cable. I believe that the LTO-9 drives
| require the newer 12 Gb/s SAS standard and also the newer
| variant of the external SAS connectors (perhaps SAS HD
| SFF-8644 connectors).
|
| If you already have a 12 Gb/s SAS HBA that has only
| internal connectors for SSDs, it is possible to reuse it
| by buying a SAS internal to external adapter of the
| appropriate connector types, which must occupy one of the
| empty expansion slots of the server case and which plugs
| into the internal connectors, while providing external
| connectors. Such adapters can also be used with server
| motherboards that have on-board SAS controllers. If you
| have a SAS HBA card that has external connectors, but
| different from those on the tape drive, e.g. SAS
| SFF-8088, there are cables with mixed SAS connectors that
| can connect the tape drives. The HBA cards usually have
| at least 2 external SAS connectors, suitable for 2 tape
| drives.
|
| With the autoloader, it should be easy to make the backup
| or retrieval process completely automatic, so that an
| operator should not have to visit the tape autoloader
| more often than at a few months interval, except for the
| initial phase when you would have to write 2 PB on almost
| 120 tapes (or a double number for improved redundancy,
| beyond the redundancy added per each archive file; 2
| copies can be stored in 2 different geographic locations,
| to avoid the catastrophic loss of all tapes), so you
| would want to keep the tape autoloader in an easily
| accessible place for that time.
|
| The initial cost for writing 2 copies of 2 PB of data,
| i.e. 4 PB of data, would be not much less than $30k for
| the tapes. This, together with the autoloader with 2 tape
| drives, HBA card, cases, cables and maybe adapters, would
| be in the range of $45k to $50k, so within your estimated
| budget.
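|
| A sketch of the tape-count and media-cost estimate, assuming
| 18 TB per LTO-9 cartridge as stated earlier in the thread. The
| $130 price is a hypothetical placeholder; the comment only says
| a cartridge costs slightly over $100.
|
```python
TB = 10**12
PB = 10**15
LTO9_CAPACITY_BYTES = 18 * TB


def tapes_needed(data_bytes: int, copies: int = 1,
                 capacity_bytes: int = LTO9_CAPACITY_BYTES) -> int:
    """Cartridges needed for `copies` full copies, rounding up per copy."""
    per_copy = -(-data_bytes // capacity_bytes)  # integer ceiling division
    return copies * per_copy


def media_cost(tape_count: int, price_per_tape: float) -> float:
    """Total cartridge cost at a flat price per tape."""
    return tape_count * price_per_tape


HYPOTHETICAL_PRICE = 130  # dollars per cartridge, illustration only

count = tapes_needed(2 * PB, copies=2)
print(count, media_cost(count, HYPOTHETICAL_PRICE))
```
|
| This ignores per-archive redundancy and partially filled tapes,
| so a real count would be somewhat higher.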
|
| As I have said, it is convenient to have a database with
| the metadata (including content hashes, made e.g. with
| BLAKE2b-512 or with BLAKE3-256) of all the files that
| have ever been archived, which shall be used whenever
| information must be retrieved and which can also be used
| for deduplication (for which the content hashes are
| handy), to check whether a file is already present in
| some earlier archive, so there is no need for its backup.
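|
| A minimal sketch of the hash-based deduplication check, using
| BLAKE2b-512 from the Python standard library. The function names
| and the in-memory set of known digests are assumptions for
| illustration; in practice the digests would come from the
| metadata database.
|
```python
import hashlib
from typing import Iterable, Iterator


def file_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield a file's content in fixed-size chunks so large files fit in memory."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return
            yield chunk


def content_digest(chunks: Iterable[bytes]) -> str:
    """Return the BLAKE2b-512 hex digest of the concatenated chunks."""
    hasher = hashlib.blake2b(digest_size=64)  # 64 bytes = 512 bits
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()


def needs_backup(digest: str, archived_digests: set[str]) -> bool:
    """True if no previously archived file has this content hash."""
    return digest not in archived_digests
```
|
| Usage would look like
| needs_backup(content_digest(file_chunks(path)), known_digests).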
| adrian_b wrote:
| I want to add that when you start testing the tape
| drives, one of the first things that you need to do is to
| measure the exact capacity of an 18 TB LTO-9 tape
| cartridge.
|
| For instance, I write the tapes with "dd bs=131072
| if="$file_name" of=/dev/nsa0". This means that I am using
| 128 kB blocks. I have measured that a 6 TB LTO-7 tape
| cartridge has a capacity of 45905860 such 128 kB blocks.
|
| The position of the read/write head, measured in blocks
| from the beginning of the tape, can be obtained with "mt
| rdspos". After you choose some block size, e.g. 128 kB,
| you should forever stick with it in all your write
| commands and on all your tapes, so that you will always
| get consistent information about the position of the
| read/write head.
|
| The tape capacity can be measured by writing files,
| preferably of the same size that you will typically use
| for archives (in order to write a similar number of file
| marks), until you get a write error.
|
| With the capacity of the tape known exactly, after any
| writing of a new file you get the current position and
| you compute the remaining free space on the tape, to know
| whether you can still append data or you must change the
| tape.
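|
| The free-space bookkeeping can be sketched as below. The block
| size and the LTO-7 capacity are the figures quoted above; the
| function names are hypothetical, and the sketch ignores the
| space taken by file marks, so keep some margin.
|
```python
BLOCK_SIZE = 131072              # 128 kB blocks, as in the dd command above
LTO7_CAPACITY_BLOCKS = 45905860  # measured capacity quoted above for LTO-7


def blocks_for(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Blocks needed for a file; the last block may be only partly filled."""
    return -(-file_size // block_size)  # integer ceiling division


def remaining_blocks(capacity_blocks: int, position_blocks: int) -> int:
    """Free blocks left after the current head position."""
    return max(0, capacity_blocks - position_blocks)


def fits(file_size: int, capacity_blocks: int, position_blocks: int,
         block_size: int = BLOCK_SIZE) -> bool:
    """True if the file can still be appended to this tape."""
    needed = blocks_for(file_size, block_size)
    return needed <= remaining_blocks(capacity_blocks, position_blocks)


# The quoted LTO-7 figure corresponds to a little over 6 TB.
print(LTO7_CAPACITY_BLOCKS * BLOCK_SIZE)
```
|
| Here position_blocks is the value reported after the last write
| by the position command mentioned above.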
|
| The position in blocks can also be used to verify that
| the tape drive works OK. For example when after rewinding
| the tape you go to the end of the written part, to append
| new files, you must see the same position as after your
| last write. Or when writing a copy of a tape, you must
| see the same positions on both tapes for any file.
|
| For retrieving files, the position in blocks does not
| matter, but only the ordinal number of a file. You
| position the read/write head to the beginning of a file
| with "mt rewind; mt fsf $file_number". Then you read the
| file, possibly in a loop if you want to read multiple
| consecutive files.
|
| For going to the end of the written part of a tape, to
| append new files, you must use "mt locate -e; mt bsf 2;
| mt fsf", as I have mentioned in a previous posting. The
| explanation of why this is needed is buried in the
| documentation about how tape marks and head positioning
| really work.
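|
| The two positioning sequences above can be captured as a small
| sketch that only builds the command lines and does not run
| them. The helper names are hypothetical; the mt subcommands and
| the /dev/nsa0 device are the ones quoted in this comment, and
| passing the device with "-f" is an explicit variant of the bare
| commands shown.
|
```python
import shlex

NON_REWINDING_DEVICE = "/dev/nsa0"  # device name used in the dd example above


def seek_to_file(file_number: int,
                 device: str = NON_REWINDING_DEVICE) -> list[list[str]]:
    """Commands that position the head at the start of the given file."""
    return [
        ["mt", "-f", device, "rewind"],
        ["mt", "-f", device, "fsf", str(file_number)],
    ]


def seek_to_append(device: str = NON_REWINDING_DEVICE) -> list[list[str]]:
    """Commands that position the head for appending, per the comment above."""
    return [
        ["mt", "-f", device, "locate", "-e"],  # go to the end of written data
        ["mt", "-f", device, "bsf", "2"],      # back over two file marks
        ["mt", "-f", device, "fsf"],           # forward over one file mark
    ]


def render(commands: list[list[str]]) -> str:
    """Join the commands into one shell line for review or for ssh."""
    return "; ".join(shlex.join(command) for command in commands)


print(render(seek_to_file(102)))
print(render(seek_to_append()))
```
|
| Each inner list could be handed to subprocess.run on the machine
| that has the drive attached; that step is left out here.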
|
| Whenever I start using the tape drive, I use "mt comp
| off; mt status" and I check the status output to be as
| expected.
|
| The tape is ejected with "mt -f /dev/esa0 rewind".
| alchemist1e9 wrote:
| I really appreciate all this information you have given.
| It's extremely useful for me and details I understand and
| can use.
|
| https://www.cdw.com/product/quantum-superloader-3-with-
| model...
|
| Perhaps I should just get equipment from CDW and do my
| own research.
|
| What I wish I had was a vendor who knows this stuff well
| and had pre-tested Linux/FreeBSD configurations.
| gosub100 wrote:
| Great post. You might be able to drop the RAM disk in favor
| of the "mbuffer" command. My script uses a combination of dd
| | pv | mbuffer | mt. I omitted the options because I don't
| remember any of them. I personally use dd of an ext4
| filesystem-on-file that is exactly the size of what will fit
| on tape. This was simply because I couldn't figure out how to
| reliably advance the tape head or how to continue a write
| from one tape to another.
| wazoox wrote:
| Regarding Linux' "mt", there are two versions : the horrible,
| primitive version that comes with cpio and is almost
| certainly the one that's installed as default : and "mt-st",
| the actually usable one.
| wazoox wrote:
| Archival is more of a process than only a question of media.
| First you must create a proper database of your archived data.
| Maybe you want to do 3 copies, not two. Maybe you want to use
| two different archival formats such as tar and LTFS, just in
| case. Maybe you want to source your media from both available
| producers (Sony and Fuji) because in the long run, maybe one or
| the other may grow some funky error mode or corruption problem.
| Etc.
|
| Also check my tape management primer:
| https://blogs.intellique.com/tech/2022/01/27#TapeCLI
| antisthenes wrote:
| What's the scenario where you cannot take the old 1990's hard
| drive and back up its data in multiple cloud service providers
| cold storage (Azure/AWS/GCP) and have to keep the obsolete
| physical media on hand?
|
| I'm struggling to understand why these miles of shelves filled
| with essentially hardware junk haven't been digitized at the time
| when this media worked and didn't experience read issues.
|
| The article doesn't really provide an explanation for this other
| than incompetence and the business biting off more than it can
| reasonably chew. I'd be furious if I paid for a service that
| promised to archive my data, and 10-15 years later told me 25% of
| it was unreadable. I mean it's not like it was a surprise either.
| These workflows became digital 2-3 decades ago. There was plenty
| of time to prepare and convert this.
|
| That's kind of what I'm paying you for.
|
| As always, seems like the simple folk of /r/datahoarder and other
| archivist communities are more competent than a legacy industry
| behemoth.
| akira2501 wrote:
| > the business biting off more than it can reasonably chew
|
| It's hoarding behavior. They paid "a lot" of money for it, have
| no idea how to further exploit it, but can't shake the feeling
| that it might be massively valuable one day.
|
| The only difference is they pay someone to hold their hoard for
| them.
| eitally wrote:
| Alternatively ... they are forced to maintain it for
| compliance reasons, especially when it comes to healthcare,
| finance and other regulated industries (defense, in
| particular). This even applies to manufacturing companies
| assembling medical devices and defense products: all the data
| about the supply chain, the engineering designs & changes,
| the manufacturing and quality testing, and shipment needs to
| be kept for XX years and is subject to both audit by
| regulatory agencies and to legal discovery.
| akira2501 wrote:
| Those are all things which can be printed out and stored in
| alternative forms and possibly even recreated from other
| data. It's also the case that much of that data will never
| be permanently at rest and so several archive copies of the
| data exist.
|
| Recordings of performances are an entirely different
| category of artefact.
| surgical_fire wrote:
| I mean, even if by contract they were supposed to store
| physical media with the backups, it is still horrible
| incompetence to not have the same data backed up twice, and
| from time to time test the disks for failure to rebuild the
| backup from one of the copies.
|
| It would be extremely unlikely for both disks to fail together.
|
| What I'm describing is the bare minimum. This is their job, by
| all accounts. Amazing.
| 0cf8612b2e1e wrote:
| It depends on what specifically Iron Mountain is selling you. A
| place to store your physical data device or are they promising
| to keep your data available? The former sounds cheaper and
| easier for Iron Mountain. Given Iron Mountain started in the
| 1950s, redundantly backing up customer data was infeasibly
| expensive for most of the company's lifetime.
| Cheer2171 wrote:
| > I'd be furious if I paid for a service that promised to
| archive my data, and 10-15 years later told me 25% of it was
| unreadable.
|
| The article is very vague on this, but I thought this company
| was first doing something like a bank safety deposit box. Send
| us your media in whatever format and we will keep it secure in
| a climate controlled vault. They don't offer to archive your
| data, they offer to store your media. Now it seems they pivoted
| to archiving data. This is an ad for their existing media
| storage clients to buy their data archive service:
|
| > Iron Mountain would like to alert the music industry at large
| to the fact that, even though you may have followed recommended
| best practices at the time, those archived drives may now be no
| more easily playable than a 40-year-old reel of Ampex 456 tape.
| eitally wrote:
| They did make this pivot several years ago, with big upsell
| and a huge internal product advancement to offer housebuilt
| eDiscovery for lots of data types. I was at Google Cloud when
| they did the first big deal around this a few years ago:
| https://www.ironmountain.com/resources/blogs-and-
| articles/f/...
| tecleandor wrote:
| Also where you don't render the tracks pre and post processing
| and leave them aside to the ProTools project files. I don't
| know who expects to open a ProTools project with a bunch of
| unknown plugins after some years have passed...
| andrewf wrote:
| Back in the 2000s the Australian government provided software
| that ran on Windows to prepare and submit your personal tax
| return. I used to archive my tax return by preparing it in a
| Windows VM, then storing the whole VM image.
| tecleandor wrote:
| Spanish tax filing software was terrible back in the day.
| The good part is you would just print it to paper or PDF
| and archive that :D
| chuckadams wrote:
| > I'm struggling to understand why these miles of shelves
| filled with essentially hardware junk haven't been digitized at
| the time when this media worked and didn't experience read
| issues.
|
| Because it's time consuming and expensive and the format you
| digitize it into is also in danger of decaying into oblivion.
| See also: TFA.
| bob1029 wrote:
| Iron mountain also provides services like source code escrow.
|
| With 2 parties involved in the data, you may want to impose
| additional restrictions regarding how and when it can be
| replicated. The party requesting escrow clearly has interest in
| the source being as durable as possible, but the party
| providing the source may not want it to be made available
| across an array of dropbox-style online/networked systems just
| to accommodate an unlikely black swan event.
|
| A compromise could be to require that the source reside on the
| original backup media with multiple copies and media types
| available.
| spydum wrote:
| Not for a little while now. The escrow business was sold to
| NCC: https://www.nccgroup.com/us/newsroom/ncc-group-launches-
| esco...
| kmeisthax wrote:
| It's not a matter of incompetence, it's a matter of being very,
| very cheap.
|
| Artistic endeavors are a unique blend of "extremely chaotic
| workflows nobody bothers to remember the moment the work is
| 'done'", "90% of our output doesn't recoup costs so we don't
| want to burn cash on data storage", and "that one thing you
| made 20 years ago is now an indie darling and we want to
| remaster it". A lot of creatives and publishers were sold on
| the promise of digital 30-odd years ago. They recorded their
| masters - their "source code" - onto formats they believed
| would be still in use today. Then they paid Iron Mountain to
| store that media.
|
| Iron Mountain is a safe deposit box on steroids, they use
| underground vaults to store physical media. You store media in
| Iron Mountain if you want that specific media to remain safe in
| any circumstance[0], but that's a strategy that doesn't make
| sense for electronic media. There is no electronic format that
| is shelf-stable and guaranteed to be economically readable 30
| years out.
|
| What you already know works is periodic remigration and
| verification[1], but that's an active measure that costs money
| to do. Publishers don't want to pay that cost, it breaks their
| business model, 90% of what they make will never be profitable.
| So now they're paying Iron Mountain even more for data recovery
| on the small fraction of data they care about. The key thing to
| remember is that _they don't know what they need to recover at
| the time the data is being stored_. If they did, publishers
| wouldn't be spending money on risky projects, they'd have a
| scientific formula to create a perfect movie or album or TV
| show that would recoup costs all the time.
|
| [0] The original sales pitch being that these vaults were nuke-
| proof.
|
| [1] Your cloud provider does this automatically and that's
| built into the monthly fees you would pay. People who are
| DIYing their storage setup and using BTRFS or ZFS are using
| filesystems that automate that for online disks, but you still
| pay for keeping the disks online.
| akira2501 wrote:
| > It may sound like a sales pitch, but it's not; it's a call for
| action
|
| Your entire article sounds like a sales pitch. Your solution is,
| well, it's bad, but trust us, we can maybe recover it anyways.
| Otherwise your article fails to convey anything meaningful.
| derefr wrote:
| No, the call-to-action being referenced in the article is "stop
| archiving to hard drives" (and use tape instead, every other
| industry does.)
| ganoushoreilly wrote:
| I think it is or at least tries to be more than that. Not
| only stop archiving to HD's but understand the dependency
| requirements which is a whole secondary problem.
| natch wrote:
| The last time I used a tape it broke right off the reel the
| first time I used it. Good name brand tape drive, good name
| brand tape. I felt pretty burned by that experience after the
| money spent and the result.
| shiroiushi wrote:
| Tape drives aren't economical. If you're a big company, sure,
| they make sense, but for individuals and small companies,
| they really don't. Hard drives are absolutely the only
| realistic and affordable way to back up data. They're not
| bulletproof though, so you need multiple hard drives, and you
| need a backup strategy that rotates them, so even if one
| fails, you haven't lost too much.
| nxobject wrote:
| As someone who only needs to backup 10TB, I looked once at
| getting into tape - the cost of getting a tape drive and a
| way to connect it to my computer was eye-watering. The very
| long-term prospects were even worse: I'd have to choose
| between buying multiple LTO-n tape drives for redundancy,
| or keep upgrading every 7ish years or so, which entails
| buying more tape drives.
|
| I'll stick with my 10 2TB hard drives, zfs, and biannual
| swaps to new hard drives, sadly. At least I won't ever have
| to deal with more than 10 hard drives at time, assuming
| $/GB never increases.
|
| At some point I should start sticking a Windows hard disk
| image on there; .vhd will still probably be readable and
| bootable in a VM 10 years from now.
| kevin_thibedeau wrote:
| Older generations of LTO are economical if your storage
| requirements are modest.
| kmeisthax wrote:
| Tape has its own problems. LTO drives only have 2
| generations of backwards compatibility[0] and nobody makes
| new drives for old formats. So if you have a whole library of
| tapes you'll need to copy tapes over periodically to newer
| formats just to retain access to them, which is expensive.
|
| And once you start doing that, you've just quashed THE
| advantage tape had over disk. LTO doesn't provide any more
| reliability, it just shifts the failure points around.
| Instead of 20 year old sealed hard drives with bearings that
| will seize up and render your data unreadable, it'll be
| perfectly stable 20 year old tapes that no drive in the world
| can read. I'm also skeptical of the cost savings from cheap
| media once periodic remigrations are priced in, but it might
| still win out over disk for absolutely enormous libraries
| (e.g. entire Hollywood studio productions).
|
| And no, there isn't some other tape format that has better
| long-term support. Oracle stopped upgrading T10000 around
| 2017, and IBM 3592 has an even worse backwards compatibility
| story than LTO.
|
| [0] LTO-8 drives only have 1 generation of backwards
| compatibility because TMR heads get trashed by metal
| particulate tapes
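|
| The compatibility rule above can be written down as a small lookup.
| This is a minimal sketch, not vendor documentation: the function
| names are hypothetical, and it assumes that generations 1 through 7
| read two generations back while LTO-8 and later read only one back
| (the LTO-9 part is an assumption beyond what the footnote states).
|
```python
def readable_generations(drive_gen: int) -> range:
    """Tape generations that a drive of generation `drive_gen` can read.

    Assumption: LTO-1 to LTO-7 drives read two generations back;
    LTO-8 and newer drives read only one generation back.
    """
    if drive_gen < 1:
        raise ValueError("LTO generations start at 1")
    generations_back = 2 if drive_gen <= 7 else 1
    oldest = max(1, drive_gen - generations_back)
    return range(oldest, drive_gen + 1)


def can_read(drive_gen: int, tape_gen: int) -> bool:
    """True if a drive of `drive_gen` can read a tape of `tape_gen`."""
    return tape_gen in readable_generations(drive_gen)


if __name__ == "__main__":
    for drive in (3, 7, 8, 9):
        print(f"LTO-{drive} drive reads:", list(readable_generations(drive)))
```
|
| Under these assumptions a library that spans more than two or three
| generations needs either retained old drives or a migration pass.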
| MisterTea wrote:
| Makes me wish we didn't stop advancing optical media technology
| to where we have cheap and reliable archival quality 1TB discs
| for a few bucks each. I guess LTO is the best option for
| personally controlled archival.
| 0cf8612b2e1e wrote:
| We haven't, but sadly the technology is locked to big tech.
|
| Microsoft has demoed some cool technology where they store data
| in glass, Project Silica. Sadly, it seems unlikely this will
| ever be available to consumers. One neat aspect of the design
| is that writing data is significantly higher power than
| reading. So you can keep your writing devices physically
| separated from the readers and have no fear that malicious code
| could ever overwrite existing data plates.
|
| Some blurbs:
|
| > Project Silica is developing the world's first storage
| technology designed and built from the media up to address
| humanity's need for a long-term, sustainable storage technology.
| We store data in quartz glass: a low-cost, durable WORM media that
| is EMF-proof, and offers lifetimes of tens to hundreds of thousands
| of years. This has huge consequences for sustainability, as it
| means we can leave data in situ, and eliminate the costly cycle of
| periodically copying data to a new media generation.
|
| > We're re-thinking how large-scale storage systems are built in
| order to fully exploit the properties of the glass media and create
| a sustainable and secure storage system to support archival storage
| for decades to come! We are co-designing the hardware and software
| stacks from scratch, from the media all the way up to the cloud
| user API. This includes a novel, low-power design for the media
| library that challenges what the robotics and mechanics of archival
| storage systems look like.
|
| https://www.microsoft.com/en-us/research/project/project-sil...
| actionfromafar wrote:
| Now this is proper Sci-Fi tech! Data crystals, like 1960s
| Star Trek!
| yellow_postit wrote:
| I'm partial to their DNA work: https://www.microsoft.com/en-
| us/research/project/dna-storage...
| adrian_b wrote:
| That would be great, but 7 years after the initial
| announcement it does not seem to be any closer to a
| commercial product.
| 0cf8612b2e1e wrote:
| Why would they sell it directly? Works better if they can
| advertise their one of a kind, super stable, cloud specific
| data archival solution that nobody else can replicate. Or
| not even advertise it, but maintain lower storage costs per
| byte relative to AWS or Google.
|
| As far as I know, the technology behind Amazon Glacier has
| never been shared. Glass disks could eventually be backing
| the Microsoft equivalent.
| knowaveragejoe wrote:
| I doubt the decisions on the product came down along that
| logic.
|
| Surely they could make more money by selling it in some
| form or another. If the economics actually gave them a
| storage cost advantage over AWS/GCP, then profitability
| must be possible.
|
| In reality it's probably incredibly expensive, and the
| ROI could not be obtained without even further investment
| to drive the costs down.
| shiroiushi wrote:
| >Why would they sell it directly? Works better if they
| can advertise their one of a kind, super stable, cloud
| specific data archival solution that nobody else can
| replicate.
|
| Because network speeds aren't high enough to back up
| terabytes of data remotely on a regular basis. This would
| only work if you already store all your data with this
| vendor, which is probably a stupid move.
| Twirrim wrote:
| Optical media is neat, but has a number of drawbacks when it
| comes to large scale operations.
|
| What you're talking about already sort of exists, albeit media
| hadn't reached "cheap" yet, because the manufacturing scale
| wasn't there. People weren't interested enough in it. Archival
| Disc was a standard that Sony and Panasonic produced,
| https://en.wikipedia.org/wiki/Archival_Disc. Before the
| standard was retired you could buy gen3 ones with 5.5TB of
| capacity, https://pro.sony/ue_US/products/optical-disc-archive-
| cartrid...
|
| LTO tape was already at 15TB by the time their 300GB Discs came
| out, and reached 45TB capacity 3 years ago. Tape is still leaps
| and bounds ahead of anything achievable in optical media _and_
| isn't write-once.
| (https://en.wikipedia.org/wiki/Linear_Tape-Open)
|
| Part of the problem is you can't just store and forget, you
| have to carry out fixity checks on a regular basis
| (https://blogs.loc.gov/thesignal/2014/02/check-yourself-
| how-a...). Same thing as with your backups, backups that don't
| have restores tested aren't really backups, they're just
| bitrot. You want to know that when you go to get something
| archived, it's actually there. That means you're having to load
| and validate every bit of media on a very regular basis,
| because you have to catch degradation before it's an issue.
| That's probably fine when you're talking a handful of discs,
| but it doesn't scale that well at all.
|
| The amount of space that it takes for the drives to read the
| optical disc, the machinery to handle the physical automation
| of shuffling discs around etc. combined with the costs of it,
| just make no sense compared to the pre-existing solutions in
| the space. You don't get the effective data density (GB/sq
| meter) you'd need to make it make sense, nor do the drives come
| at any kind of a price point that could possibly overcome those
| costs.
|
| To top it all off, the storage environment requirements for
| optical media aren't really any different from those for tape,
| except maybe slightly less sensitive to magnetic interference.
| shiroiushi wrote:
| >LTO tape was already at 15TB by the time their 300GB Discs
| came out, and reached 45TB capacity 3 years ago.
|
| No, they didn't. The largest LTO tape is only 18TB; your
| numbers are bogus. Those are BS advertised numbers with
| compression. If you're storing a bunch of movies or photos,
| for instance, you can't compress that data any further. The
| actual amount of data that the medium can physically store is
| the only useful number when discussing data storage media.
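|
| The gap between advertised and native capacity is simple
| arithmetic. A rough sketch, assuming the 2.5:1 compression ratio
| that recent LTO generations are marketed with (the ratio and the
| function names are assumptions, not something stated in the thread)
| and using the 2PB archive mentioned near the top of the thread as
| the sizing example:
|
```python
import math

# Assumed marketing compression ratio for recent LTO generations.
COMPRESSION_RATIO = 2.5


def native_tb(advertised_compressed_tb: float) -> float:
    """Convert an advertised 'compressed' capacity to native capacity."""
    return advertised_compressed_tb / COMPRESSION_RATIO


def tapes_needed(data_tb: float, native_capacity_tb: float) -> int:
    """Tapes required for already-compressed data (video, photos, audio)."""
    return math.ceil(data_tb / native_capacity_tb)


if __name__ == "__main__":
    print(native_tb(45))           # native capacity behind a "45TB" label
    print(tapes_needed(2000, 18))  # 2PB of incompressible data on 18TB tapes
```
|
| Sizing with the native figure is the conservative choice when the
| data is media that will not compress further.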
| Twirrim wrote:
| That's fair. Even so, LTO capacity was still significantly
| larger than Archival Disc's at every stage of Archival Disc's
| life cycle.
|
| Both Sony and Panasonic completely failed to demonstrate
| actual value from the format. Smaller capacity, for the
| same kinds of environmental constraints, similar size
| drives etc. There was just no reason to actually use it.
| shiroiushi wrote:
| Yeah, it's really too bad someone hasn't made a
| reasonably-priced archival format that consumers and
| small businesses can use, because LTO isn't it. The
| closest they have is MDISC, but the storage capacity is
| small, and from what I'm reading, discs advertised with
| this aren't necessarily all that long-lived anyway (if
| they're using dye).
|
| What we need is a cheap, write-once format that can hold
| at least 1TB, similar to how we used to use CD-Rs 20-25
| years ago, but without organic dye like those discs and
| with a far longer shelf life.
| everfrustrated wrote:
| It's called the cloud or Backblaze. Encrypt with your own
| keys and let them worry about moving the data to new
| drives every so often.
| kiririn wrote:
| LTO specifying compressed capacity is a little forgivable
| when you consider it is transparent and computationally
| free. Nothing else quite like it other than filesystem
| level compression
| netrap wrote:
| Unfortunately recordable optical is on its way out. Sony
| recently slashed the staff at the Japan plant that makes BD-R's
| (BD-R XL's). Still CMC makes CDR, DVDR, BDR though.
| kevin_thibedeau wrote:
| 25 years ago, Kodak was working on a silver halide tape media
| that was supposed to be good for 100+ years. It sadly withered
| and never came to market.
|
| https://group47.com/Introduction_to_DOTS_WEB_11-23.pdf
| nayuki wrote:
| According to their video (
| https://vimeo.com/502475794/ffbfb82b15 ), the company
| patented bit plane image storage. What the heck? That is so
| obvious and shouldn't be patentable.
|
| On a side note, they keep touting how robust their data
| archival solution is. But I have my doubts. For example, if
| an image has a big patch of 0 or 1 bits, then it might be
| impossible to accurately align the bit positions
| ("reclocking"); this is the same issue with QR codes and why
| they have a masking (scrambling) technique. Another problem
| is that their format doesn't seem to mention error correction
| codes; adding Reed-Solomon ECC is an essential technique in
| many, many popular formats already.
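|
| To illustrate the reclocking concern: a decoder needs bit
| transitions to stay aligned, so formats often XOR the payload with
| a known pattern before writing it. The sketch below uses a 16-bit
| LFSR keystream as that pattern. This is a swapped-in technique
| chosen for brevity, not the fixed mask patterns QR codes use and not
| anything the DOTS format is known to do; the seed and tap positions
| are illustrative assumptions.
|
```python
def lfsr_keystream(n_bytes: int, state: int = 0xACE1) -> bytes:
    """Generate n_bytes of pseudo-random bytes from a 16-bit LFSR."""
    if state & 0xFFFF == 0:
        raise ValueError("LFSR state must be non-zero")
    out = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for _ in range(8):
            # Feedback from bits 0, 2, 3 and 5 (right-shifting register).
            bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            byte = (byte << 1) | (state & 1)
        out.append(byte)
    return bytes(out)


def scramble(data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    keystream = lfsr_keystream(len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


def longest_run(data: bytes) -> int:
    """Length of the longest run of identical bits in data."""
    longest = current = 0
    previous = None
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            current = current + 1 if bit == previous else 1
            previous = bit
            longest = max(longest, current)
    return longest


if __name__ == "__main__":
    blank = bytes(64)  # a "big patch of 0 bits"
    print(longest_run(blank), longest_run(scramble(blank)))
```
|
| Scrambling only helps alignment; it does not detect or repair
| damage, which is what an error correction code is for.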
| shiroiushi wrote:
| No, LTO isn't a viable option at all for most people: it's
| simply far too expensive. The drives themselves cost thousands
| of dollars each.
| simonw wrote:
| My understanding is that the only reliable way of long-term
| digital archival storage is to refresh the media you are storing
| things on every few years, copying the previous archives to the
| fresh storage.
|
| Since storage constantly gets cheaper, 100GB first stored in 2001
| can be stored on updated media for a fraction of that original
| cost in 2024.
| abracadaniel wrote:
| Pretty much. You see hobbyists getting data off of 30+ year old
| hard drives for the novelty of it, but I can't imagine relying
| on that as a preservation copy. Optical media rots, magnetic
| media rots and loses magnetic charge, bearings seize, flash
| storage loses charge, etc. Entropy wins, sometimes much faster
| than you'd expect.
| animal531 wrote:
| Sometimes they fail for other reasons as well, such as
| improper storage.
|
| Back in the 90's to 00's a friend had a collection of cd's
| that he'd written, but he stored them in a big sleeved folder
| container. The container itself caused them to warp slightly,
| which made them unusable.
|
| I took a few for testing and managed to unbend them after
| some time, which turned them back into a working state.
|
| [Note: That's the most apostrophes I've ever used in a
| sentence, it feels dirty]
| aimor wrote:
| A little dirty according to CMOS. Ought to be '90s not
| 90's, cds not cd's.
|
| https://www.chicagomanualofstyle.org/book/ed18/part2/ch09/p
| s...
|
| https://www.chicagomanualofstyle.org/qanda/data/faq/topics/
| P...
| NikkiA wrote:
| Paper remains the most effective long term storage.
| elzbardico wrote:
| ergo, my suggestion of archiving stuff in punch cards.
| dathinab wrote:
| Interestingly, this is more or less how long-term cold tape storage
| works (with tapes you have slightly different failure
| characteristics, so it's more like a "check read" at least every
| "some_time", and on checksum errors a rewrite to new tape, restoring
| from "raid" duplicates, but conceptually it's kinda the same idea).
| hooli42 wrote:
| If it doesn't have to be offline for long durations, software
| raid + adding a new drive every once in a while, and discarding
| failing drives is pretty foolproof.
|
| AFAIK large data centers automate something like this.
| sam_goody wrote:
| For a while we were being sold on CDs as a more permanent
| medium, such as M-Discs.
|
| Assuming you store your own players, and have a convertor from
| USB to whatever exists in fifty years, is that a real solution?
| Loic wrote:
| Long term archival is successive short/middle term archival.
|
| I think I read this quote on Tim Bray's blog[0], but I am not
| sure anymore. This is now my approach, my short/middle term
| archival is designed to be easily transferred to the next
| short/middle term store on a regular basis. I started with
| 500GB drives, now I am at 14TB.
|
| [0]: https://www.tbray.org/ongoing/
| zh3 wrote:
| My first hard drive was 5Mb, and I had to write my own driver
| for it (PDP11, c. 1982). It was a hell of a step up from 8"
| floppies, enough so I partitioned it into 8 separate areas.
|
| Even the floppies were a step up from paper tape - the older
| guys used to have a cupboard of paper tapes on coathangers,
| and linked their code by feeding the tapes through the reader
| in the right order.
|
| Kids these days etc :)
| cherrycherry98 wrote:
| M-Disc is a digital optical medium for archival storage. Not
| indestructible but more resilient to degradation than the
| typical BD/DVD-R.
| kevin_thibedeau wrote:
| All BD-R use the same inorganic recording as M-disc. It's
| only worthwhile for DVD and CDROM.
| chrisco255 wrote:
| Are you suggesting Blu-ray will last as long as M-disc?
| RajT88 wrote:
| I'm not sure that's true. I would love it if so!
|
| https://www.canada.ca/en/conservation-
| institute/services/con...
|
| Just one article discussing it. Do you have a source to
| back this up? M-DISCs are getting hard to purchase these
| days, and I have a lot of stuff I want to put on them which
| I likely will want to look at in 30 years.
| mercurialuser wrote:
| There are several articles about film preservation in digital
| format. Every X years all the data is "upgraded", from LTOn to
| LTOn+1 or +2.
|
| So it may sound like a sales pitch but I consider it more a
| warning notice
| esafak wrote:
| That means the hardware _and_ the file format.
| Clamchop wrote:
| They have lots of problems:
|
| 1. Incomplete copies with missing dependencies.
| 2. Old software and their file formats with a poor virtualization
| story.
| 3. Poor cataloging.
| 4. Obsolete physical interfaces, file systems, etc.
| 5. Long-term cold storage on media neither proven nor marketed for
| the task.
|
| Managing archives is just a cost center until it isn't, and it's
| hard to predict what will have value. The worst part of this is
| that TFA discusses mostly music industry materials. Outside
| parties and the public would have a huge interest in preserving
| all this, but of course it's impossible. All private,
| proprietary, copyrighted, and likely doomed to be lost one way or
| another.
|
| Oh well.
| cookiengineer wrote:
| Related documentary that comes to mind: Digital Amnesia (2014)
| [1]
|
| It broke my heart seeing those librarians in disbelief when
| their national library was sold off to the highest bidder. When
| they said "It seems our country does not value our own culture
| anymore".
|
| Books lasted hundreds of years. Good luck trying to read a
| floppy from the 90s, or even DVDs that are already beyond their
| lifetime and are a very recent medium.
|
| It gets worse when you read the fine print of the SSD
| specifications, wherein they state that an SSD may lose all its
| data after 2 weeks without power, and data retention rates are
| at less than 99%, meaning they will degrade after the first
| year of use. And don't get me started on SMR HDDs, I lost
| enough drives already :D
|
| Humanity has a backup problem. We surely live in Orwellian
| times because of it.
|
| [1] https://youtube.com/watch?v=NdZxI3nFVJs
| h4ck_th3_pl4n3t wrote:
| I was curious whether these claims were true or not, so I
| looked it up.
|
| https://images.samsung.com/is/content/samsung/assets/pl/memo.
| ..
|
| Damn. 3 months for my SSD.
| stogot wrote:
| I've had SSDs sitting for years and thought it was safe.
| Should I be worried or is this a liability warning?
| justinclift wrote:
| Be worried. :(
| hooli42 wrote:
| Sitting - worrisome.
|
| Being used without too much writing -- just fine.
| elzbardico wrote:
| The data is gone by now.
| cookiengineer wrote:
| > 3 months for my SSD
|
| Younger me thought I was smart repurposing my SSDs as
| shockproof 2.5" external backup drives. Suffice it to say
| that I was a year abroad, and came back to losing all my
| data because of it. I was able to recover some parts, but
| most of it was gone.
|
| Only buying CMR surveillance-class HDDs now for my backups.
| They're limited to 8TB for a 3.5" sized HDD, but that's far
| better as a compromise than the nightmare of losing all
| digital copies of tax documents that you have to keep -
| mandated by law - for at least 10 years.
|
| When that happened I had to renew my ID, and had to find
| the original birth certificate in paper form in the
| hospital's paper archive to get a signed copy, and had to
| go there with multiple relatives to prove my identity. Just
| to get my ID renewed. That incident surely made me realize
| how important backups are.
| shiroiushi wrote:
| >Good luck trying to read a floppy from the 90s
|
| The way I remember it, if you tried to read a floppy from the
| very early 90s, or from the 80s, you'd probably have no
| trouble at all, even many years later. You can probably still
| read floppies from the 80s without issue.
|
| However, if you tried to read a floppy from the _late_ 90s, or
| 2000s, _even when the floppy was new_, good luck! The
| quality of floppy disks and drives took a steep nose-dive
| sometime in the 90s, so even brand-new ones failed.
| ikari_pl wrote:
| This. I have a few hundred 80s floppies (especially the
| less popular 3 inch CF2 format), and some from the 90s.
| They read well, at least as long as you don't leave them in
| the drive when idle (the magnetic head may affect them!).
| But the last decades of floppies were of horrible quality.
| I remember them failing after a month.
| dghughes wrote:
| >Good luck trying to read a floppy from the 90s
|
| Also good luck trying to find a floppy drive. Yes, I'm sure
| you can buy one now but five or ten years from now? I'd say
| manufacture of the drives isn't exactly a booming business.
| Tor3 wrote:
| The vast majority of my 5 1/4" floppies (including "HD"
| 1.2MB ones) read just fine still. The vast majority (just
| about 100%) of my old 3.5" HD (1.44MB) floppies are
| unreadable. The 3.5" 720kB ones are mostly ok. Stored under
| the same conditions.
| lizknope wrote:
| Tape doesn't last forever either.
|
| https://en.wikipedia.org/wiki/Linear_Tape-Open#Generations
|
| LTO-1 started in 2000 and the current LTO-9 spec is from 2021.
| But it only has backwards compatibility for 1 to 2 generations.
| You can't read an LTO-6 tape in an LTO-9 drive.
|
| https://en.wikipedia.org/wiki/Sticky-shed_syndrome
|
| > Sticky-shed syndrome is a condition created by the
| deterioration of the binders in a magnetic tape, which hold the
| ferric oxide magnetizable coating to its plastic carrier, or
| which hold the thinner back-coating on the outside of the
| tape.[1] This deterioration renders the tape unusable.
|
| Stiction Reversal Treatment for Magnetic Tape Media
|
| https://katalystdm.com/digital-transformation/tape-transcrip...
|
| > Stiction can, in many cases, be reversed to a sufficient
| degree, allowing data to be recovered from previously unreadable
| tapes. This stiction reversal method involves heating tapes over
| a period of 24 or more hours at specific temperatures (depending
| on the brand of tape involved). This process hardens the binder
| and will provide a window of opportunity during which data
| recovery can be performed. The process is by no means a permanent
| cure nor is it effective on all brands of tape. Certain brands of
| tape (eg. Memorex Green- see picture below) respond very well to
| this treatment. Others such as Mira 1000 appear to be largely
| unaffected by it.
|
| Data migration and periodic verification is the answer but it
| requires more money to hire people to actually do it.
|
| I've got files from 1992 but I didn't just leave them on a 3.5"
| floppy disk. They have migrated from floppy disk -> hard drive ->
| PD phase change optical disk -> CD-R -> DVD-R -> back to hard
| drive
|
| I verify all checksums twice a year and have 2 independent
| backups.
| adrian_b wrote:
| While tape does not last forever, the LTO tapes are specified
| for at least 30 years.
|
| The more serious problem is as you say that the older drives
| become obsolete. Even so, if you start using an up-to-date LTO
| format you can expect that suitable new tape drives will be
| available for buying at least 10 years in the future.
|
| For HDDs, the most that you can hope is a lifetime of 5 years,
| if you buy the HDDs with the longest warranties.
| lizknope wrote:
| I absolutely agree that the average tape will last longer
| than the average hard drive.
|
| I've got 30 hard drives in use right now and at least 10 are
| older than 5 years. A few are over 7 years old. I've also had
| hard drives die in less than a week.
|
| Even if the data is on tape I want to emphasize that the tape
| needs to be periodically read to verify that the data is
| still readable and correct. Assuming the data is stable for
| 30 years and you can just leave it there is a dumb idea
| unless you didn't care about the data in the first place.
| adrian_b wrote:
| The same is true for HDDs or for any other data storage
| devices.
|
| I have stored data for more than 5 years on HDDs, but
| fortunately I have been careful to make a duplicate for all
| HDDs and I did not trust the error-correction codes used by
| the HDDs, so all files were stored with hashes of the
| content, for error detection.
|
| On the HDDs on which data had been stored for many years,
| I only seldom saw no errors at all. Nevertheless, I have
| not lost any data, because the few errors never happened in
| the same place on both HDDs. The errors have been sometimes
| reported by the HDDs, but other times no errors were
| reported and nonetheless the files were corrupted, as
| detected by the content hashes and by the comparison with
| the corresponding good file that was on the other HDD.
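|
| The recovery step described above (two copies plus a recorded
| content hash) can be sketched as follows. SHA-256 and the function
| names are assumptions for illustration; the comment does not say
| which hash was used.
|
```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest recorded alongside the file when it was archived."""
    return hashlib.sha256(data).hexdigest()


def pick_good_copy(expected_hash: str, copy_a: bytes, copy_b: bytes) -> bytes:
    """Return whichever copy still matches the recorded hash.

    Raises ValueError if both copies are damaged, which is the case
    that two independent drives are meant to make unlikely.
    """
    for candidate in (copy_a, copy_b):
        if sha256_hex(candidate) == expected_hash:
            return candidate
    raise ValueError("both copies fail verification")


if __name__ == "__main__":
    original = b"session master, take 7"
    recorded = sha256_hex(original)
    damaged = b"session master, take 8"  # silent corruption on one drive
    print(pick_good_copy(recorded, damaged, original) == original)
```
|
| The recorded hash is what makes silent corruption visible: without
| it, two differing copies give no hint about which one is correct.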
| lizknope wrote:
| I also make checksums for every file and verify twice a
| year. Over millions of files totalling 450TB I end up
| getting about 1 failed checksum every 2 years. If you are
| having more frequent checksum failures I would check for
| RAM errors first.
|
| zfs and btrfs would do this automatically and have built
| in data scrub commands.
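|
| For filesystems without a built-in scrub, a periodic checksum pass
| like the one described above might look like this. It is a minimal
| sketch with hypothetical helper names, assuming SHA-256 and an
| in-memory manifest; a real setup would store the manifest separately
| from the data it describes.
|
```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(path.relative_to(root)): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or no longer match."""
    failed = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or file_sha256(target) != expected:
            failed.append(relative)
    return failed
```
|
| Running build_manifest once at archive time and verify on a schedule
| gives the "1 failed checksum every 2 years" kind of signal; any path
| it returns is a candidate for restoring from the other backup.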
| at_a_remove wrote:
| I have a checksum, uh, dream for lack of a better word,
| but I fear I lack the talent to pull it off. The problem
| with a checksum is that it only tells you that an error
| exists (hopefully), but it does not tell you where.
|
| Imagine writing out your stream of bytes in an m x n grid.
| You could then make a checksum for each of the m columns
| and each of the n rows. This results in an
| additional storage of (m + n) checksum bytes. A single
| error is localized as an intersection of the row and
| column with bad checksums. One could simply iterate
| through the other two hundred fifty-five possibilities
| and _correct_ the issue. Two errors could give two
| situations. The most likely is four bad checksums (two
| columns, two rows) and you could again iterate. The less
| likely is three bad checksums because the two errors are
| in the same row or column.
|
| I ran the math out for the data being rewritten as a
| volume and a hypervolume (four dimensions). I think the
| hypervolume was "too much" checksum, but the three-d
| version looked ... doable.
|
| Someone smarter than I has probably already done this.
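|
| A rough sketch of the two-dimensional version of this idea,
| assuming XOR as the one-byte checksum (the comment does not
| name one). With XOR the bad byte can be corrected directly, so
| the search over the other 255 values is not needed. The function
| names are hypothetical, and only the single-error case is
| handled.

```python
from functools import reduce
from operator import xor


def grid_checksums(data: bytes, width: int) -> tuple[list[int], list[int]]:
    """Return (row_sums, col_sums) for data laid out in rows of `width` bytes."""
    if width <= 0 or len(data) % width != 0:
        raise ValueError("data length must be a positive multiple of width")
    rows = [data[i:i + width] for i in range(0, len(data), width)]
    row_sums = [reduce(xor, row, 0) for row in rows]
    col_sums = [reduce(xor, (row[c] for row in rows), 0) for c in range(width)]
    return row_sums, col_sums


def repair_single_error(data: bytes, width: int,
                        row_sums: list[int], col_sums: list[int]) -> bytes:
    """Fix at most one corrupted byte using the stored row/column checksums."""
    new_rows, new_cols = grid_checksums(data, width)
    bad_rows = [r for r, (a, b) in enumerate(zip(new_rows, row_sums)) if a != b]
    bad_cols = [c for c, (a, b) in enumerate(zip(new_cols, col_sums)) if a != b]
    if not bad_rows and not bad_cols:
        return data  # every checksum matches, nothing to fix
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("more than one error; this sketch cannot repair it")
    r, c = bad_rows[0], bad_cols[0]
    fixed = bytearray(data)
    # Stored XOR recomputed row checksum is exactly the bits that flipped.
    fixed[r * width + c] ^= new_rows[r] ^ row_sums[r]
    return bytes(fixed)
```

| The sketch assumes the stored checksums themselves are intact.
| A damaged checksum byte shows up as a lone bad row or a lone
| bad column, which is reported as unrepairable here.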
| aspenmayer wrote:
| This sounds like par2 with extra/different steps but I
| don't know much about this space.
|
| https://en.wikipedia.org/wiki/Parchive
| justsomehnguy wrote:
| > and correct the issue
|
| Just use RAR with recovery record.
| Spooky23 wrote:
| I did some work for a place that had 30 year retention
| requirements for lots of data. (And indefinite for others)
|
| Iirc, the goal was to turn over to new tape media every 8-10
| years.
| wazoox wrote:
| I have restored a few hundred LTO-1 and 2 tapes using an LTO-3
| drive a few years ago. If you keep the drives around and run
| Linux (which supports obsolete hardware better), keeping LTO
| tapes 10, 15, 20 years is not a problem at all.
|
| A few weeks ago I wrote a restore utility for a customer, for
| LTO-4/5/6 tapes made with a now-defunct archival system from a
| defunct software company. Most of these tapes are up to 16
| years old, have been kept in ordinary office cupboards, and
| work perfectly fine.
|
| But you're right that archival isn't so much about the media as
| about the process. "Archive and forget" isn't the way.
| bogwog wrote:
| This article is too vague. It sounds like they're talking about
| the physical drive not working, but they're giving examples where
| you can't playback because you need to install the correct old
| software, plugins, etc... Which doesn't have anything to do with
| hard drives.
|
| So what's actually wrong with hard drives for archival? Do they
| deteriorate? Do they "rot" like DVDs/blurays/etc have been known
| to do? Or is this just an ad for their archival service?
| wmf wrote:
| Hard drives are known to suffer "stiction" where the heads get
| stuck to the platter and either the drive won't spin up or it
| spins up and the heads damage the platters. I imagine hard
| drives could also have bad capacitors but I haven't heard of
| that happening.
| shiroiushi wrote:
| >I imagine hard drives could also have bad capacitors but I
| haven't heard of that happening.
|
| That's very unlikely. If you're thinking of the "capacitor
| plague" of the 2000s, that only affected electrolytic
| capacitors, since it was caused by the Chinese poorly copying
| the formula for capacitor electrolyte. I don't believe hard
| drives used electrolytic capacitors in that time period,
| simply due to their size, though I could be wrong.
| bogwog wrote:
| That seems like it could be solved by (carefully)
| disassembling the drive for long term storage, adding a thin
| piece of paper or tape under the heads, etc.
| wmf wrote:
| Disassembling a drive just allows dust to get in and cause
| more damage. Come to think of it, in recent drives the
| heads fully unload onto a ramp so they're probably less
| likely to stick.
| zh3 wrote:
| In the old days of 5" HDDs the head stepper motor shaft was
| external, and if a drive got stuck a slight twist of the
| stepper shaft would unstick the heads after which the drive
| would spin up (well, as long as it didn't rip heads off the
| HDA so it was always a calculated risk).
|
| Happened to me when I got a call out to a large UK outfit
| who'd had an extended power cut and knew recovery was going
| to be fun. First stop was a particularly critical PC which
| had exactly this problem, so open the case, touch the HDD
| just right and off it went - happy with that, and to the next
| item.
|
| Anyway, the recovery operation went well, and this particular
| incident came floating back by way of a hushed comment from a
| manager a few years later about this tech who'd come in to
| help with the recovery, and who'd "...laid his hands on the
| PC, and it came back to life!" :)
| mystified5016 wrote:
| Magnetic HDDs do suffer bit rot, yes. But perhaps more
| importantly the mechanisms suffer physical failure over time.
| You can't just pop the platters into a new drive, even if you
| had an identical model.
|
| That's really the main disadvantage of hard drives: the media
| is permanently coupled to the drive. If your tape drive fails,
| you can just pop the tape into a working drive and still get
| your data back.
| quesera wrote:
| > _You can't just pop the platters into a new drive, even if
| you had an identical model_
|
| It's certainly inconvenient, but this is my untested
| understanding of how drive recovery services can work.
| qingcharles wrote:
| With older drives, pre-about-2010 I think, you can, as I
| understand it.
|
| After that they added little NVRAM chips to the boards
| which hold data about the disk, so you need to make sure
| they match. I just fried a HDD controller with a bad SATA
| cable, so I'm having to switch the chip from one board to
| another to try to recover the data.
| kkfx wrote:
| People should learn one thing: data are not tied to the physical
| media hosting them the way words are tied to paper, and the only
| way to preserve data is to migrate them regularly from one
| physical medium to another, sometimes converting their formats
| too, because things change and an old format could end up
| unreadable in the future.
|
| We can't preserve bits like books.
| Thrymr wrote:
| > We can't preserve bits like books.
|
| The only reason we have any copies of "books" (i.e. long
| written works) from the ancient world is that they were
| painstakingly copied over centuries from one medium to another,
| by hand for most of that time.
| kkfx wrote:
| Definitely right. What I meant above is that we can't preserve
| bits simply by collecting them in libraries of physical media,
| like we do with shelves full of books or folders of archived
| sheets. With bits there is no "original" and "copy": any copy
| is still an "original", and we do not lose information or
| introduce changes by copying.
| hulitu wrote:
| > Of the thousands and thousands of archived hard disk drives
| from the 1990s that clients ask the company to work on, around
| one-fifth are unreadable
|
| Some 25 years ago, the hardest part of booting some Apollo
| workstations was getting the hard drives to spin.
| bell-cot wrote:
| Policy at $Job - all _important_ data is backed up to a rotation
| of high-quality hard drives. Which are stored off-site, powered
| down. Every N weeks, each one of them is powered up (in an
| off-line system) and checked - both with the SMART long test and
| `zpool scrub` (which verifies ZFS's additional anti-bit-rot
| checksums for the data).
|
| Yes, it's a bit of a PITA. OTOH, modern HD's are huge, so a
| relative few are needed. And we've lost 0 bits of our off-site
| data in our >25 years of using that system.
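|
| A hedged sketch of what one such periodic check might run. The
| SMART long test is started with `smartctl -t long`, and ZFS
| checksums are verified with `zpool scrub`. The device path,
| pool name and function names are placeholders, not details
| from the policy described above.

```python
import subprocess


def check_commands(device: str, pool: str) -> list[list[str]]:
    """Commands for one offline check of a backup drive, in order."""
    return [
        ["smartctl", "-t", "long", device],  # start the SMART extended self-test
        ["zpool", "import", pool],           # attach the pool on the offline host
        ["zpool", "scrub", pool],            # read every block, verify checksums
        ["zpool", "status", "-v", pool],     # report scrub progress and errors
    ]


def run_checks(device: str, pool: str, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in check_commands(device, pool):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

| Both the self-test and the scrub run in the background, so the
| final status call only shows progress. A real script would poll
| `zpool status` and `smartctl -a` until both have finished before
| powering the drive down again.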
| nayuki wrote:
| I had to do a double take on this, as I think of Iron Mountain
| as the brand whose most common service is shredding papers and
| hard drives.
| RobRivera wrote:
| The second M doesn't help.
|
| Almost like the title was purposely crafted to mislead you to
| draw eyeballs.
| 1oooqooq wrote:
| Well, this one is storing spinners POWERED DOWN... So it's
| pretty much a slower data wiping service :shrug_emoji
| thomassmith65 wrote:
| Relevant: _"Sony Is Killing the Blu-ray"_
| https://news.ycombinator.com/item?id=40880077
|
| Ah, this stupid industry.
| sgarland wrote:
| Related, CD-Rs. When I left my submarine in 2013, they (by which
| I mean the entire Virginia class) were still using them to store
| archived logs, despite my explanation that they'd be lucky to get
| a decade out of them. The first chosen storage location was
| literally the hottest part of the engine room, right in between
| the main engines. Easily 120+ F at all times. After protest, we
| moved ours to a somewhat cooler location. Still hot, and still
| with atmospheric oil and other fun chemicals floating around.
|
| I look forward to the first time logs from a few decades ago are
| required, and the media is absolutely dead.
|
| EDIT: they weren't even Azo dye, they were phthalocyanine. A
| decade was probably generous.
| hunter-gatherer wrote:
| Knowing nothing of submarines or seafaring, I'm genuinely
| curious as to what is logged on a ship that may be necessary a
| decade later?
| Mountain_Skies wrote:
| No idea what they actually log but it seems like performance
| data from the propulsion system under a wide range of
| conditions would be useful when designing the next generation
| of such systems.
| sgarland wrote:
| Everything. Someone could bring a lawsuit up years later, and
| logs would be necessary to determine if they had standing.
|
| The aforementioned optical media storage was specific to the
| nuclear reactor and electric plant; I think everyone else's
| were stored differently. Not positive.
|
| EDIT: sibling comment below mentions performance data. Yes,
| that too. I graphed (nuclear) fuel consumption on one
| underway, and was surprised to find it didn't match expected.
| My Captain was also surprised, and thrilled, because it meant
| he got to be more important (fair enough; who doesn't want to
| be listened to by their boss?)
| bushbaba wrote:
| If it's lawsuit-related, maybe they wanted the backups to fail?
| tivert wrote:
| > Knowing nothing of submarines or seafaring, I'm genuinely
| curious as to what is logged on a ship that may be necessary
| a decade later?
|
| I noted in another comment that the National Archives say
| only "deck logs" are retained permanently, and it looks like
| this site lists what they contain:
| https://www.history.navy.mil/content/history/nhhc/research/a...,
| which includes all kinds of things.
|
| Stuff like "Actions [combat]", "Appearances of
| Sea/Atmosphere/Unusual Objects", "Incidents at Sea",
| "Movement Orders", "Ship's Behavior [under different
| weather/sea conditions]", "Sightings [other ships; landfall;
| dangers to navigation]" seem like they'd be useful for
| history and other kinds of research.
|
| Stuff like "Arrests/Suspensions", "Courts-Martial/Captain's
| Masts", "Deaths" seem like the kind of legal records that are
| typically kept permanently.
|
| Stuff like "Soundings [depth of water]" were probably
| historically useful for map-making.
| mrandish wrote:
| Good question. As far as I'm aware, outside of a few special
| circumstances (like birth records), the vast majority of
| legal record preservation requirements are seven years with
| some being as long as ten years. Of course, with the service
| life of some military ships and aircraft now being stretched
| >30-40 years, I can imagine it might be useful to have
| records of component failures and replacements, especially
| for statistical modeling.
|
| In the case of a ship (or sub), I'd assume that they'd rotate
| optical media archives off the vessel every year or two and
| transfer them to some central database. After all, a vessel
| can be lost and the data is also useful in the aggregate.
| 1oooqooq wrote:
| That sounds like it was very much by design and nobody wanted
| those logs to survive
| tedivm wrote:
| The purpose of a system is what it does.
|
| https://en.wikipedia.org/wiki/The_purpose_of_a_system_is_wha...
| tivert wrote:
| What kind of logs were they? According to
| https://www.archives.gov/research/military/logbooks/other-lo...,
| "Only a ship's Deck Log is retained as a permanent
| record." Stuff like engineering logs are only retained for
| three years.
| 6510 wrote:
| I was curious how some of the more wealthy yacht owners solved
| the marine puzzle. What kind of computer would they use? What
| kind of parts would go in? What would a basic system cost? So I
| asked one, he opened up a compartment with a stack of cheap
| Acer laptops vacuum sealed in bags. They last 2 to 6 months,
| when they stop working he throws them away. The sealed one has
| everything installed, a full battery and will sync as soon as
| internet becomes available. When plugged into something the new
| laptop is never the problem. He spent a small fortune arriving
| at this solution.
| crazygringo wrote:
| As counterintuitive as it may be, it seems to me like the only
| reliable long-term storage for data is with commercial cloud
| providers.
|
| Any time you're physically warehousing old hard drives and
| whatnot, they're going to be turning into bricks.
|
| Whereas with cloud providers, they're keeping highly redundant
| copies and every time a hard drive fails, data gets copied to
| another one. And you can achieve extreme redundancy and guard
| against engineering errors by archiving data simultaneously with
| two cloud providers.
|
| Is there any situation where it makes sense to be physically
| hosting backups yourself, for long-term archival purposes? Purely
| from the perspective of preserving data, it seems worse in every
| way.
| thadt wrote:
| ^ This. Physical media is continuously degrading. Large storage
| systems work by regularly reading, verifying, and replicating
| data - it is always doing backups and restores. If this isn't
| happening actively and regularly, your data will cease to exist
| at some point in time.
|
| Whether we collectively _need_ to store all these things is
| another question entirely. But if we want to keep it - we'll
| have to do the work to keep it maintained.
| otabdeveloper4 wrote:
| > they're keeping highly redundant copies and every time a hard
| drive fails, data gets copied to another one
|
| Or so they say. It's not like you can double-check.
|
| > Is there any situation where it makes sense to be physically
| hosting backups yourself, for long-term archival purposes?
|
| Yes, political and legal risks. There's no guarantee your cloud
| won't terminate your account for any of a thousand reasons in
| the future.
| crazygringo wrote:
| > _It's not like you can double-check._
|
| Don't they publish SLA numbers? The reliability of the major
| cloud providers seems quite well-established.
|
| > _There's no guarantee your cloud won't terminate your
| account_
|
| Two cloud providers pretty much guarantees against that --
| the idea that two would terminate it simultaneously is
| vanishingly small.
| throwway120385 wrote:
| Cloud provider stats are based on aggregate numbers. If
| they claim 99.999% of all data is retained, and they have
| 100,000 TB of data collectively, then if they lose your
| entire 1 TB of data then they can still claim that they
| maintain 99.999% of data as long as they don't lose anyone
| else's data.
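|
| The arithmetic behind that example, as a small sketch. The
| function name is hypothetical; the figures are the ones quoted
| in the comment above.

```python
from fractions import Fraction


def allowed_loss_tb(total_tb: int, retained: Fraction) -> Fraction:
    """Data (in TB) a provider may lose while still meeting the aggregate figure."""
    return total_tb * (1 - retained)


# Figures from the comment above: 99.999% retained across 100,000 TB.
loss = allowed_loss_tb(100_000, Fraction(99_999, 100_000))
print(loss)  # 1
```

| Exact fractions are used so the result is exactly 1 TB, with no
| floating-point rounding. An aggregate figure says nothing about
| how that 1 TB is spread across customers.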
| telotortium wrote:
| > Two cloud providers pretty much guarantees against that
| -- the idea that two would terminate it simultaneously is
| vanishingly small.
|
| Ask Julian Assange about that. Sure, the US government
| claimed he had committed a crime, but he disagreed. You
| really need to store your data in at least two of (a) NATO
| and Western-aligned countries (b) Russia and aligned
| countries (c) Mainland China, and that's assuming the prior
| probability of you being at risk in those blocs is low.
| It's hard to avail yourself of this if you're a legitimate
| company, but plausible if you're a private citizen.
| TZubiri wrote:
| I'm launching a competitor for Iron Mountain. It's called DevNull
| LLC. Just send us your files! We'll take care of it don't worry.
| jwsteigerwalt wrote:
| I grew up in Pittsburgh. When I was flying in and out of the
| Pittsburgh airport (usually to Atlanta) during and after college,
| I would often see uniformed Iron Mountain employees waiting for
| standby seats, carrying their little Pelican cases...
| steve3055 wrote:
| It sounds like the only reliable backup media is punched paper
| tape.
| steve3055 wrote:
| I once pressed my boss into having off-premises storage of
| documents so we could still manufacture product in a new facility
| if the current facility burnt down. Unfortunately, someone
| started the habit of sending the primary documents to the same
| facility if the product was deprecated. One day, that
| off-premises facility burnt down and all the contents were lost. I
| think it was a regular self-storage space.
|
| That aside, this sounds extremely old-fashioned, but it seems to
| me that the only media that is acceptable for long-term storage
| is going to be punched paper tape. How long does paper last? How
| long do the holes in it remain readable? Can it be spliced and
| repaired?
| elzbardico wrote:
| The solution is metallic punch cards.
___________________________________________________________________
(page generated 2024-09-11 23:01 UTC)