[HN Gopher] The long road to recover Frogger 2 source from tape ...
___________________________________________________________________
The long road to recover Frogger 2 source from tape drives
Author : WhiteDawn
Score : 249 points
Date : 2023-05-24 17:41 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| FearNotDaniel wrote:
| > the ADR-50e drive was advertised as compatible, but there was a
| cave-at
|
| I'm assuming the use of "cave-at" means the author has inferred
| an etymology of "caveat" being made up of "cave" and "at", as in:
| this guarantee has a limit beyond which we cannot keep our
| promises, if we ever find ourselves AT that point then we're
| going to CAVE. (As in cave in, meaning give up.) I can't think of
| any other explanation of the odd punctuation. Really quite
| charming, I'm sure I've made similar inferences in the past and
| ended up spelling or pronouncing a word completely wrong until I
| found out where it really comes from. There's an introverted
| cosiness to this kind of usage, like someone who has gained a
| whole load of knowledge and vocabulary from quietly reading books
| without having someone else around to speak things out loud.
| nocoiner wrote:
| I thought it might have been a transcription error of "carve
| out," but your theory is more logical.
| huehehue wrote:
| Fascinating read that unlocked some childhood memories.
|
| I'm secondhand pissed at the recovery company, I have a couple of
| ancient SD cards laying around and this just reinforces my fear
| that if I send them away for recovery they'll be destroyed (the
| cards aren't recognized/readable by the readers built into
| MacBooks, at least)
| ryanjshaw wrote:
| Painful lesson I've learned myself the hard way - don't rush
| something that doesn't need to be rushed.
| ogurechny wrote:
| Modern backup would simply state "API keys and settings are
| here:", and a link to collaboration platform closed after 3 years
| of existence.
| jandrese wrote:
| Hey, it's the cloud. Backups are "someone else's problem". That
| is until they are your problem, then you're up a creek.
| tivert wrote:
| > Hey, it's the cloud. Backups are "someone else's problem".
| That is until they are your problem, then you're up a creek.
|
| The FSF used to sell these wonderful stickers that said
| "There is no cloud. It's just someone else's computer."
| isaidthis wrote:
| The sticker:
| https://static.fsf.org/nosvn/stickers/thereisnocloud.svg
|
| "Stickers from various FSF campaigns - Print out copies of
| our stickers for your own uses, local conferences and
| more." https://www.fsf.org/resources/stickers
| ilyt wrote:
| Honestly backup space is weirdly sparse for anything on
| enterprise scale.
|
| For anything more than a few machines there is bacula/bareos
| (that pretends everything is tape with mostly miserable
| results), backuppc (that pretends tapes are not a thing, with
| miserable results), and that's about it, everything else seems
| to be point-to-point backups only with no real central
| management.
| dllthomas wrote:
| On the topic of Froggers, I enjoyed
| https://www.youtube.com/watch?v=FCnjMWhCOcA
| smokel wrote:
| Heh, I remember playing .mp3 files directly from QIC-80 tapes,
| somewhere around 1996. One tape could store about 120 MB, which
| is equal to about two compact discs' worth of audio. The noise of
| the tape drive was slightly annoying, though. And it made me
| appreciate what the 't' in 'tar' stands for.
| mjaniczek wrote:
| Did you mean 1200 MB? That would make sense wrt. 2x CD
| capacity.
| smokel wrote:
| No, it was really only 120 MB. I was referring to the length
| of an audio compact disc, not the capacity of a CD-ROM. At
| 128 kbps, you'd get about 2 hours of play time.
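|
| The arithmetic behind that figure can be checked with a short
| sketch. It assumes decimal megabytes and a constant 128 kbps
| bitrate; the function name is made up for illustration.

```python
def play_time_minutes(capacity_mb, bitrate_kbps):
    """Minutes of audio that fit in capacity_mb at a constant bitrate."""
    total_bits = capacity_mb * 1_000_000 * 8      # decimal megabytes to bits
    seconds = total_bits / (bitrate_kbps * 1000)  # kbps to bits per second
    return seconds / 60


print(play_time_minutes(120, 128))  # 125.0, i.e. just over two hours
```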
|
| Of course it didn't really make sense to use digital tapes
| for that use case, even back then. It was just for fun, and
| the article sparked some nostalgic joy, which felt worth
| sharing :)
| [deleted]
| [deleted]
| stewarts wrote:
| They reference MP3, and a CD ripped down to MP3 probably fits
| in the 50-100MB envelope for size. It has been a very long
| time since I last ripped an album, but that size jibes with
| my memory.
| crazygringo wrote:
| Wow, this part makes my blood boil, emphasis mine:
|
| > This issue doesn't affect tapes written with the ADR-50 drive,
| but all the tapes I have tested written with the OnStream SC-50
| do NOT restore from tape _unless the PC which wrote the tape is
| the PC which restores the tape._ This is because the PC which
| writes the tape stores a catalog of tape information such as tape
| file listing locally, which the ARCserve is supposed to be able
| to restore without the catalog because it's something which only
| the PC which wrote the backup has, _defeating the purpose of a
| backup._
|
| Holy crap. A tape backup solution that doesn't allow the tape to
| be read by any other PC? That's madness.
|
| Companies do shitty things and programmers write bad code, but
| this one really takes the prize. I can only imagine someone
| inexperienced wrote the code, nobody ever did code review, and
| then the company only ever tested reading tapes from the same
| computer that wrote them, because it never occurred to them to do
| otherwise?
|
| But _yikes_.
| throw0101b wrote:
| > _Holy crap. A tape backup solution that doesn't allow the
| tape to be read by any other PC? That's madness._
|
| What is needed is the backup catalog. This is fairly standard
| on a lot of tape-related software, even open source; see for
| example "Bacula Tape Restore Without Database":
|
| * http://www.dayaro.com/?p=122
|
| When I was still doing tape backups the (commercial) backup
| software we were using would e-mail us the bootstrap
| information daily in case we had to do a from-scratch data
| centre restore.
|
| The first step would get a base OS going, then install the
| backup software, then import the catalog. From there you can
| restore everything else. (The software in question allowed
| restores even without a license (key?), so that even if you
| lost that, you could still get going.)
| ilyt wrote:
| Right, the on-PC database acts as an index to the data on the
| tape. That's pretty standard.
|
| But having a format where you can't easily recreate the index
| from the data is just abhorrently bad coding...
| tinus_hn wrote:
| Obviously to know what to restore, you need to index the data
| on the tapes. Tape is not a random access medium, there is no
| way around this.
|
| This is only for a complete disaster scenario, if you're
| restoring one PC or one file, you would still have the backup
| server and the database. But if you don't, you need to run
| the command to reconstruct the database.
| ShadowBanThis01 wrote:
| There is a way around this: You allocate enough space at
| the beginning (or the end, or both) of the tape for a
| catalog. There are gigabytes on these tapes; they could
| have reserved enough space to store millions of filenames
| and indices.
| IshKebab wrote:
| Wouldn't it make sense to _also_ write the backup catalog to
| the tape though? Seems like a very obvious thing to do to me.
| fsckboy wrote:
| you'd have to put the catalog at the end of the tape, but
| in that case you might as well rebuild the catalog by
| simply reading the tape on your way to the end (yeah, if
| the tape is partially unreadable blah blah backup of your
| backup...)
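|
| A minimal sketch of the "rebuild the catalog by reading the tape"
| idea discussed above. It assumes a made-up record layout (a small
| length header before every file), not ARCserve's or any real
| product's format, and uses an in-memory stream in place of a tape.
| The point is only that when each record describes itself, one
| sequential pass is enough to recreate the index.

```python
import io
import struct

# Hypothetical record header: 2-byte name length, 4-byte data length.
HEADER = struct.Struct(">HI")


def write_records(stream, files):
    """Append each (name, data) pair as a self-describing record."""
    for name, data in files:
        encoded = name.encode("utf-8")
        stream.write(HEADER.pack(len(encoded), len(data)))
        stream.write(encoded)
        stream.write(data)


def rebuild_catalog(stream):
    """Scan front to back and return {name: (data_offset, data_size)}."""
    catalog = {}
    stream.seek(0)
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of medium (or a truncated final record)
        name_len, data_len = HEADER.unpack(header)
        name = stream.read(name_len).decode("utf-8")
        catalog[name] = (stream.tell(), data_len)
        stream.seek(data_len, io.SEEK_CUR)  # skip the payload itself
    return catalog


tape = io.BytesIO()
write_records(tape, [("frogger.c", b"int main(void) { return 0; }"),
                     ("README", b"hello")])
catalog = rebuild_catalog(tape)
offset, size = catalog["README"]
tape.seek(offset)
print(tape.read(size))  # b'hello'
```

| A separate on-disk catalog then becomes a speed optimisation
| rather than a requirement for restoring.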
| Nextgrid wrote:
| I'd like to believe maybe that's why the company went out of
| business but that's just wishful thinking - a lot of
| incompetence is often ignored if not outright rewarded in
| business nowadays. Regardless, it's at least somewhat of a
| consolation those idiots did go out of business in the end,
| even if that wasn't the root cause.
| Neil44 wrote:
| I'm familiar with needing to re-index a backup if it's accessed
| from a 'foreign' machine and sometimes the procedure is non-
| obvious but just not having that option seems pretty bad.
| bluedino wrote:
| I worked for an MSP a million years ago and we had a customer
| that thought they had lost everything. They had backup tapes
| but the backup server itself had died. After I showed them the
| 'catalog tape' operation and we kept our fingers crossed for a
| few hours, they bought me many beers.
| EvanAnderson wrote:
| I always had the Customer keep a written log of which tapes
| were used on which days. It helped for accountability but
| also prevented the "Oh, shit, we have to catalog all the
| tapes because the log of which tapes were used on which day
| is on the now-failed server."
| winrid wrote:
| It's basically an index stored on faster media. You would have
| redundancy on that media, too.
| readyplayernull wrote:
| A few months ago I was looking for an external backup drive and
| thought that SSD would be great because it's fast and shock
| resistant. Years ago I killed a Macbook Pro HD by throwing it on
| my bed from a few inches high. Then I read a comment on Amazon
| about SSD losing information when unpowered for a long time. I
| couldn't find any quick confirmation in the product page, took me
| a few hours of research to find some paper about this phenomenon.
| If I remember correctly it takes a few weeks for the stored SSD
| to start losing its data. So I bought a mechanical HD.
|
| Another tech tip: don't buy two backup devices from the same
| batch or even the same model. Chances are they will fail in
| the same way.
| vidarh wrote:
| To the last bit, I've seen this first hand. Had a whole RAID
| array of the infamous IBM DeathStar drives fail one after the
| other while we frantically copied data off.
|
| Last time I ever had the same model drives in an array.
| jimbob45 wrote:
| F2 was a really neat game. It almost invented Crypt of the
| Necrodancer's genre decades early.
|
| It's a little sad that it took such a monumental effort to bring
| the source code back from the brink of loss. It's times like that
| that should inspire lawmakers to void copyright in the case that
| the copyright holders can't produce the thing they're claiming
| copyright over.
| LeoPanthera wrote:
| I really wish they would name the data recovery company so that I
| can never darken their door with my business.
| bluedino wrote:
| > Over the span of about a month, I received very infrequent
| and vague communications from the company despite me providing
| extremely detailed technical information and questions.
|
| Ahh the business model of "just tell them to send us the tape
| and we'll buy the drive on eBay"
| Nextgrid wrote:
| To be honest as long as they are very careful about not doing
| any damage to the original media then it might work and be a
| win-win for both sides in a "no fix no fee" model where the
| customer only pays if the data is successfully recovered.
|
| Their cardinal sin was that they irreparably damaged the tape
| without prior customer approval.
| nickt wrote:
| It's not too hard to find with the following search, "we can
| recover data from tape formats including onstream"
| stepupmakeup wrote:
| The OP explicitly didn't name them (despite many people
| recommending to, even preservationists in this field on
| Reddit and Discord) but it's easy to find just by googling
| the text on the screenshots
| ddtaylor wrote:
| Name them and we can set up a thread or site to publicly
| shame them
| stepupmakeup wrote:
| the comment I replied to edited the link out
| https://www.datarecovery.net/tape-data-recovery.aspx
| omoikane wrote:
| Reddit thread:
| https://www.reddit.com/r/DataHoarder/comments/13q1pv7/playst...
| a1369209993 wrote:
| https://news.ycombinator.com/item?id=36063114 claims it's
| https://www.datarecovery.net/tape-data-recovery.aspx (and that
| https://news.ycombinator.com/item?id=36062785 had been edited
| to censor the information, so I'm duplicating it here). Caveat
| that I don't know if that's actually correct, since efforts to
| suppress it are only circumstantial evidence in favor.
| omnibrain wrote:
| Is anyone else calling it "froggering/to frogger" if they have to
| cross a bigger street by foot without a dedicated crossing?
| hlandau wrote:
| Absolutely amazing story. Fantastic!
|
| I've actually long been stunned by the propensity of proprietary
| backup software to use undocumented, proprietary formats. I've
| always found this quite stunning, in fact. It seems to me like
| the first thing one should make sure to solve when designing a
| backup format is to ensure it can be read in the future even if
| all copies of the backup software are lost.
|
| I may be wrong but I think some open source tape backup software
| (Amanda, I think?) does the right thing and actually starts its
| backup format with emergency restoration instructions in ASCII. I
| really like this kind of "Dear future civilization, if you are
| reading this..." approach.
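|
| As a rough illustration of that approach (a hypothetical layout,
| not Amanda's actual on-tape format): put a fixed-size block of
| plain ASCII instructions in front of a standard tar stream, so
| that anyone who can read the raw bytes can see how to get the
| data out. The block size and wording here are assumptions.

```python
import io
import tarfile

BLOCK = 1024  # size of the human-readable note; an arbitrary choice


def write_self_describing(files):
    """Return bytes: an ASCII note padded to BLOCK, then a tar stream."""
    payload = io.BytesIO()
    with tarfile.open(fileobj=payload, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    note = (
        "BACKUP ARCHIVE (hypothetical layout)\n"
        f"The first {BLOCK} bytes are this plain ASCII note.\n"
        "Everything after it is an uncompressed tar stream.\n"
        f"To restore: skip {BLOCK} bytes, then use any tar reader.\n"
    ).encode("ascii")
    return note.ljust(BLOCK, b"\n") + payload.getvalue()


def restore(archive):
    """Follow the note's instructions: skip BLOCK bytes, read the tar."""
    with tarfile.open(fileobj=io.BytesIO(archive[BLOCK:]), mode="r") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}


archive = write_self_describing({"frogger2/main.c": b"/* source */"})
print(archive[:BLOCK].decode("ascii").strip())
print(restore(archive))
```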
|
| Frankly nobody should agree to use a backup system which
| generates output in a proprietary and undocumented format, but
| also I want a pony...
|
| It's interesting to note that the suitability of file formats for
| archiving is also a specialised field of consideration. I recall
| some article by someone investigating this very issue who argued
| formats like .xz or similar weren't very suited to archiving.
| Relevant concerns include, how screwed you are if the archive is
| partly corrupted, for example. The more sophisticated your
| compression algorithm (and thus the more state it records from
| longer before a given block), the more a single bit flip can
| result in massive amounts of run-on data corruption, so better
| compression essentially makes things worse if you assume some
| amount of data might be damaged. You also have the option of
| adding parity data to allow for some recovery from damage, of
| course. Though as this article shows, it seems like all of this
| is nothing compared to the challenge of ensuring you'll even be
| able to read the media at all in the future.
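|
| The parity idea mentioned above can be sketched in its simplest
| form: one XOR parity block over equal-sized data blocks, which can
| rebuild any single block known to be lost. This is only an
| illustration with made-up function names; real archival tools use
| stronger erasure codes that survive more damage.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks):
    """Parity block to be stored alongside the data blocks."""
    return xor_blocks(blocks)


def recover_missing(blocks, parity):
    """Rebuild the single block marked None from the others plus parity."""
    present = [block for block in blocks if block is not None]
    return xor_blocks(present + [parity])


blocks = [b"frog", b"ger2", b"src!"]
parity = make_parity(blocks)
damaged = [blocks[0], None, blocks[2]]    # pretend the middle block is unreadable
print(recover_missing(damaged, parity))   # b'ger2'
```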
|
| At some point the design lifespan of the proprietary ASICs in
| these tape drives will presumably just expire(?). I don't know
| what will happen then. Maybe people will start using advanced
| FPGAs to reverse engineer the tape format and read the signals
| off, but the amount of effort to do that would be astronomical,
| far more even than the amazing effort the author here went to.
| hlandau wrote:
| To add, thinking a bit more about it: Designing formats to be
| understandable by future civilizations actually reduces to a
| surprising degree to the same set of problems which METI has to
| face. As in, sending signals designed to be intelligible to
| extraterrestrials - Carl Sagan's Contact, etc.
|
| Even if you write an ASCII message directly to a tape, that
| data is obviously going to be encoded before being written to
| the tape, and you have no idea if anyone will be able to figure
| out that encoding in future. Trouble.
|
| What makes this particularly pernicious is the fact that LTO
| nowadays is a proprietary format(!!). I believe the spec for
| the first generation or two of LTO might be available, but last
| I checked, it's been proprietary for some time. The spec is
| only available to the (very small) consortium of companies
| which make the drives and media. And the number of companies
| which make the drives is now... two, I think? (They're often
| rebadged.) Wouldn't surprise me to see it drop to one in the
| future.
|
| This seems to make LTO a very untrustworthy format for
| archiving, which is deeply unfortunate.
| rootsudo wrote:
| Name and shame the company, you had a personal experience, you
| have proof. Name and shame. It helps nobody if you don't
| publicize it. Let them defend it, let them say whatever excuse,
| but your review will stand.
| phkahler wrote:
| >> The tape was the only backup for those things, and it
| completes Frogger 2's development archives, which will be
| released publicly.
|
| In cases like this I can imagine some company yelling "copyright
| infringement" even though they don't possess a copy themselves.
| It's a really odd situation.
| chrisstanchak wrote:
| I've been suffering through something similar with a DLT IV tape
| from 1999. Luckily I didn't send out to the data recovery
| company. But still unsuccessful.
| db48x wrote:
| Wow, that backup software sounds like garbage. Why not just use
| tar? Why would anyone reinvent that wheel?
| robotnikman wrote:
| The company that made it probably was hoping for vendor lock-in
| cosmotic wrote:
| Vendor lock in for backup and archival products is so
| ridiculous. It increases R&D to ensure the lock-in, and the
| company won't exist by the time the lock-in takes effect.
| fifteen1506 wrote:
| Well yes, but the boss probably is willing to invest more
| money (meaning higher salaries, more people, better tools)
| expecting a future return than when using reasonable
| formats.
| giantrobot wrote:
| IIRC tar has some Unixisms that don't necessarily work for
| Windows/NTFS. Not saying reinventing tar is appropriate but
| there are Windows/NTFS features that a Windows-based tape backup
| needs to support.
| cosmotic wrote:
| Most of what makes NTFS different than FAT probably doesn't
| need to be backed up. Complex ACLs, alternative data streams,
| shadow copies, etc, are largely irrelevant when it comes to
| making a backup. Just a simple warning "The data being backed
| up includes alternative data streams. These aren't supported
| and won't be included in the backup" would suffice.
| jandrese wrote:
| All of that stuff matters when you're using the backup for
| its intended purpose: to restore a system after hardware
| failure.
|
| Unix tar is obviously not the right solution, but a Windows
| tar seems like it shouldn't be that hard to do and yet we
| are in the situation we are today. I've been using
| dump/restore for decades now on Unix, including to actually
| recover from loss, but I admit that it's not that pleasant
| to use. I like that it is very simple and reliable however,
| unlike the mess that is Time Machine (recovering from a
| hardware loss on a Mac is a roll of the dice, and I've
| gotten snakes) or worse Deja Dup. I'm not sure I've ever
| successfully recovered a system from a Deja Dup backup.
| a1369209993 wrote:
| > using the backup for its intended purpose: to restore a
| system after hardware failure.
|
| No. The intended purpose of a backup is to restore the
| _data_ (such as the Frogger 2 source code) after a
| hardware failure. If it has the side effect of also
| producing a working system, that's _good_, but it's not
| the point. After all, the hardware necessary to build a
| working system may not exist any more; one (only-probably
| not the last) instance of said hardware just broke, after
| all.
| cosmotic wrote:
| I think the use case for disaster recovery is a bit
| different than long-term archival.
| nycdotnet wrote:
| If you're backing up a db or something sure, but for a file
| server this can be just as important as the data itself
| (ex: now everyone can read HR's personnel files which had
| strict permissions before)
| ilyt wrote:
| The format is extensible enough that it could be added
| bombcar wrote:
| The world of tape backup was (is?) absolutely filled with all
| sorts of vendor-lock in projects and tools. It's a complete
| mess.
|
| And even various versions of tar aren't compatible, and that's
| not even starting with star and friends.
| stepupmakeup wrote:
| It's not just limited to tape, most archiving and backup
| software is proprietary. It's impossible to open Acronis or
| Macrium Reflect images without their Windows software. In
| Acronis's case they even make it impossible to use offline or
| on a server OS without paying for a license. NTBackup is
| awfully slow and doesn't work past Vista, and it's not even
| part of XP POSReady for whatever reason, so I had to rip the
| exe from a XP ISO and unpack it (NTBACKUP._EX... I forgot
| microsoft's term for that) because the Vista version
| available on Microsoft's site specifically checks for
| longhorn or vista.
|
| Then there's slightly more obscure formats that didn't take
| off in the western world, and the physical mediums too. Not
| many people had the pleasure of having to extract hundreds of
| "GCA" files off of MO disks using obscure Japanese freeware
| from 2002. The English version of the software even has a
| bunch of flags on virustotal that the standard one doesn't.
| And there's obscure LZH compression algorithms that no tool
| available now can handle.
|
| I've found myself setting up one-time Windows 2000/XP VMs
| just to access backups made after 2000.
| jandrese wrote:
| I have at various times considered a tape backup solution for
| my home, but always give up when it seems every tape vendor
| is only interested in business clients. It was a race to stay
| ahead of hard drives and oftentimes they seemed to be losing.
| The price points were clearly aimed at business customers,
| especially on the larger capacity tapes. In the end I do
| backup to hard drives instead because it's much cheaper and
| faster.
| bombcar wrote:
| The only way to do tape at home is with used equipment and
| Linux/BSD. You can do quite a bit with tar and mt (iirc) -
| even controlling auto loaders.
|
| What's fun are the hard drive based systems designed to
| perfectly imitate a tape autoloader so you don't have to
| buy new backup software (virtual tape libraries).
| stepupmakeup wrote:
| Tape absolutely isn't viable for the consumer at all, but
| definitely worth exploring for the novelty. Even if you
| manage to get a pretty good deal on a legacy LTO system
| (other formats don't even come close to the tb/$ of 10+
| year old LTO and drives are still fairly cheap), the drives
| aren't being made any more and aren't getting any cheaper.
| Backwards compatibility may be in your favor depending on
| your choice of tape generation at least, I think there's at
| least two generations guaranteed. Optical will probably
| remain king though the pricing is worse than HDDs, there's
| no shortage of DVD or BD readers, but you might run into
| issues with quad layer 128 BD as they only hit the market
| fairly recently.
| ilyt wrote:
| Tape drive and Bareos/Bacula "just works"
|
| Absolutely not worth it tho. Drives are hideously expensive
| which means they only start making sense where you have at
| least dozens of tapes.
|
| There is an advantage of tapes not being electrically
| connected most of the time so lightning strike will not
| burn your archives, I have pondered making a separate box
| with a bunch of hard drives that boots once a month and
| just copies last months of backups on hard drives, powered
| from solar or something just to separate from the network
| EvanAnderson wrote:
| ARCServe was a Computer Associates product. That's all you need
| to know.
|
| It had a great reputation on Novell Netware but the Windows
| product was a mess. I never had a piece of backup management
| software cause blue screens (e.g. kernel panics) before an
| unfortunate Customer introduced me to ARCServe on Windows.
| nycdotnet wrote:
| My favorite ArcServe bug which they released a patch for (and
| which didn't actually fix the issue, as I recall) had a KB
| article called something along the lines of "The Open
| Database Backup Agent for Lotus Notes Cannot Backup Open
| Databases".
| h2odragon wrote:
| Truly noble effort. Hopefully the writeup and the tools will save
| others much heartbreak.
| bsder wrote:
| Is there a way to read magnetic tapes like these in such a way as
| to get the raw magnetic flux at high resolution?
|
| It seems like it would be easier to process old magnetic tapes by
| imaging them and then applying signal processing rather than
| finding working tape drives with functioning rollers. Most of the
| time, you're not worried about tape speed since you're just doing
| recovery read rather than read/write operations. So, a slow but
| accurate operation seems like it would be a boon for these kinds
| of things.
| fifteen1506 wrote:
| You still need to know where to look and what the format is, and
| you'd be using specialized equipment whose cost wasn't driven down
| by mass manufacturing. So in theory yes, in practice no.
|
| (Completely guessing here with absolutely no knowledge of the
| real state of things)
| EvanAnderson wrote:
| For anybody who is into this, this is a good excuse to share a
| presentation from Vintage Computer Fest West 2020 re: magnetic
| tape restoration: https://www.youtube.com/watch?v=sKvwjYwvN2U
|
| The presentation explores using software-defined signal
| processing to analyze a digitized version of the analog signal
| generated from the flux transitions. It's basically moving the
| digital portion of the tape drive into software (a lot like
| software-defined radio). This is also very similar to efforts
| in floppy disk preservation. Floppies are amazingly like tape
| drives, just with tiny circular tapes.
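|
| A toy sketch of what "moving the digital portion into software"
| can mean. It assumes the simplest possible case: an NRZI-style
| encoding (a flux transition means 1, no transition means 0), a
| clean signal, and a known, fixed number of samples per bit cell.
| Real drives need clock recovery, filtering and format-specific
| decoding that this leaves out entirely; the function names are
| invented for the example.

```python
def encode_nrzi(bits, samples_per_bit, start_level=-1.0):
    """Produce a sampled signal: one reference cell, then one cell per bit."""
    level = start_level
    samples = [level] * samples_per_bit
    for bit in bits:
        if bit:
            level = -level  # a 1 is written as a polarity flip
        samples.extend([level] * samples_per_bit)
    return samples


def decode_nrzi(samples, samples_per_bit):
    """Recover bits by comparing the polarity of neighbouring bit cells."""
    middle = samples_per_bit // 2
    last_start = len(samples) - samples_per_bit
    # Sample each cell at its midpoint and keep only the sign.
    levels = [samples[start + middle] > 0
              for start in range(0, last_start + 1, samples_per_bit)]
    return [int(a != b) for a, b in zip(levels, levels[1:])]


bits = [1, 0, 1, 1, 0, 0, 1]
signal = encode_nrzi(bits, samples_per_bit=8)
print(decode_nrzi(signal, samples_per_bit=8))  # [1, 0, 1, 1, 0, 0, 1]
```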
| bombcar wrote:
| Yes. There's some guy on YouTube who does stuff like that (he
| reverse engineered the audio recordings from a 747 tape array)
| but it can be quite complicated.
| Nextgrid wrote:
| Would you have a link by any chance? Thanks!
| iforgotpassword wrote:
| Sounds like at least in this case that ASIC in the drive was
| doing some (non trivial) signal processing. Would be
| interesting to know how hard it would be to get from the flux
| pattern back to zeros and ones. I guess with a working drive
| you can at least write as many test patterns as you want until
| you maybe figure it out.
| jandrese wrote:
| At the very least the drive needs to be able to lock onto the
| signal. It's probably encoded in a helix on the drive and if
| the head isn't synchronized properly you won't get anything
| useful, even with a high sampling rate.
| dpratt wrote:
| At the very least, and the cost for this perhaps would be
| prohibitive, but some mechanism to duplicate the raw flux off
| the tape onto another tape in an identical format, a backup of
| the backup. This would allow for attempts to read the data that
| may be potentially destructive to the media (for example,
| breaking the tape accidentally) and not lose the original
| signal.
| tombert wrote:
| This is giving me some anxiety about my tape backups.
|
| I have backed up my blu-ray collection to a dozen or so LTO-6
| tapes, and it's worked great, but I have no idea how long the
| drives are going to last for, and how easy it will be to repair
| them either.
|
| Granted, the LTO format is probably one of the more popular
| formats, but articles like this still keep me up at night.
| bombcar wrote:
| Do test restores. LTO is very good but without verification
| some will fail at some point.
|
| But your original bluray discs are _also_ a backup.
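|
| One way to make the test restores above checkable is to keep a
| checksum manifest of the originals and compare it against what
| comes back off the tape. A minimal sketch, assuming both copies
| are available as directory trees; the function names are made up.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {str(path.relative_to(root)): file_digest(path)
            for path in sorted(root.rglob("*")) if path.is_file()}


def compare(original, restored):
    """Return the relative paths that are missing or differ after a restore."""
    return sorted(name for name, digest in original.items()
                  if restored.get(name) != digest)
```

| Usage would look like compare(build_manifest(originals_dir),
| build_manifest(restored_dir)), where both directory names are
| placeholders; an empty list means every file matched.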
| EvanAnderson wrote:
| The only surefire method to keep the bits readable is to
| continue moving them onto new media every few years. Data has a
| built-in recurring cost. I'd love to see a solution to that
| problem but I think it's unlikely. It's at least possible,
| though, that we'll come up with a storage medium with
| sufficient density and durability that it'll be good enough.
|
| I don't even want to think about the hairy issues associated
| with keeping the bits able to be interpreted. That's a human
| behavior problem more than a technology problem.
| wazoox wrote:
| LTO-7 drives read LTO-6, and will be available for quite a
| while.
|
| In 2016 I used an LTO-3 drive to restore a bunch (150 or
| 200) of LTO-1/2 tapes from 2000-2003, and all but one or
| two worked fine.
| robotnikman wrote:
| I've always admired the tenacity of people who reverse engineer
| stuff. To be able to spend multiple months figuring out barely
| documented technologies with no promise of success takes a lot of
| willpower and discipline. It's something I wish I could improve
| more in myself.
| detrites wrote:
| I think you could. In some sense "easily". It may be about
| finding _that thing_ you're naturally so interested in or
| otherwise drawn to, that the months figuring out become a type
| of driven joy, and so the willpower kinda automatic.
|
| And if you find it, don't judge what it is or worry what others
| might think - or even necessarily tell anyone. Sometimes the
| most motivating things are highly personal, as with the OP; a
| significant part of their childhood.
| masto wrote:
| This brings back (unpleasant) memories. I remember trying to get
| those tape drives working with FreeBSD back in 1999, and it going
| nowhere.
| ilamont wrote:
| In The Singularity Is Near (2005) Ray Kurzweil discussed an idea
| for the "Document Image and Storage Invention", or DAISI for
| short, but concluded it wouldn't work out. I interviewed him a
| few years later about this and here's what he said:
|
| _The big challenge, which I think is actually important almost
| philosophical challenge -- it might sound like a dull issue, like
| how do you format a database, so you can retrieve information,
| that sounds pretty technical. The real key issue is that software
| formats are constantly changing.
|
| People say, "well, gee, if we could backup our brains," and I
| talk about how that will be feasible some decades from now. Then
| the digital version of you could be immortal, but software
| doesn't live forever, in fact it doesn't live very long at all if
| you don't care about it if you don't continually update it to new
| formats.
|
| Try going back 20 years to some old formats, some old programming
| language. Try resuscitating some information on some PDP1
| magnetic tapes. I mean even if you could get the hardware to
| work, the software formats are completely alien and [using] a
| different operating system and nobody is there to support these
| formats anymore. And that continues. There is this continual
| change in how that information is formatted.
|
| I think this is actually fundamentally a philosophical issue. I
| don't think there's any technical solution to it. Information
| actually will die if you don't continually update it. Which
| means, it will die if you don't care about it. ...
|
| We do use standard formats, and the standard formats are
| continually changed, and the formats are not always backwards
| compatible. It's a nice goal, but it actually doesn't work.
|
| I have in fact electronic information that in fact goes back
| through many different computer systems. Some of it now I cannot
| access. In theory I could, or with enough effort, find people to
| decipher it, but it's not readily accessible. The more backwards
| you go, the more of a challenge it becomes.
|
| And despite the goal of maintaining standards, or maintaining
| forward compatibility, or backwards compatibility, it doesn't
| really work out that way. Maybe we will improve that. Hard
| documents are actually the easiest to access. Fairly crude
| technologies like microfilm or microfiche which basically has
| documents are very easy to access.
|
| So ironically, the most primitive formats are the ones that are
| easiest._
| ChuckMcM wrote:
| This is very very true. I have archived a number of books and
| magazines that were scanned and converted into "simplified"
| PDF, and archived on DVD disks with C source code.
|
| There are external dependencies but one hopes that the
| descriptions are sufficient to figure out how to make those
| work.
| magpi3 wrote:
| One of the claimed benefits of the JVM (and obviously later
| VMs) was that it would solve this issue: Java programs written
| in 2000 should still be able to run in 2100. And as far as I
| know the JVM has continued to fulfill this promise.
|
| An honest question: If you are writing a program that you want
| to survive for 100+ years, shouldn't you specifically target a
| well-maintained and well-documented VM that has backward
| compatibility as a top priority? What other options are there?
| wongarsu wrote:
| In 2005 the computing world was much more in flux than it is
| now.
|
| PNG is 26 years old and basically unchanged since then. Same
| with 30 year old JPEG, or for those with more advanced needs
| the 36 year old TIFF (though there is a newer 21 year old
| revision). All three have stood the test of time against
| countless technologically superior formats by virtue of their
| ubiquity and the value of interoperability. The same could be
| said about 34 year old zip or 30 year old gzip. For executable
| code, the wine-supported subset of PE/WIN32 seems to be with us
| for the foreseeable future, even as Windows slowly drops
| compatibility.
|
| The latest Office365 Word version still supports opening Word97
| files as well as the slightly older WordPerfect 5 files, not to
| mention 36 year old RTF files. HTML1.0 is 30 years old and is
| still supported by modern browsers. PDF has also got constant
| updates, but I suspect 29 year old PDF files would still
| display fine.
|
| In 2005 you could look back 15 years and see a completely
| different computing landscape with different file formats. Look
| back 15 years today and not that much has changed. Lots of exciting
| new competitors as always (webp, avif, zstd) but only time will
| tell whether they will earn a place among the others or go the
| way of JPEG2000 and RAR. But if you store something today in a
| format that's survived the last 25 years, you have a good chance
| of still being able to open it in common software 50 years down
| the line.
| forgotmypw17 wrote:
| There is something called the Lindy effect, which states that a
| format's longevity is proportional to its current age.
|
| I try to take advantage of this by only using older, open,
| and free things (or the most stable subsets of them) in my
| "stack".
|
| For example, I stick to HTML that works across 20+ years of
| mainstream browsers.
| orbital-decay wrote:
| This is too shortsighted by archival standards. Even Word
| itself doesn't offer full compatibility. VB? 3rd party active
| components? Other Office software integration? It's a mess.
| HTML and other web formats are only readable by virtue of
| being constantly evolved while keeping the backwards
| compatibility, which is nowhere near complete and is
| hardware-dependent (e.g. aspect ratios, colors, pixel
| densities). The standards _will_ be pruned sooner or later,
| due to the tech debt or being sidestepped by something else.
| And I'm pretty sure there are plenty of obscure PDF features
| that will prevent many documents from being readable in mere
| half a century. I'm not even starting on the code and
| binaries. And cloud storage is simply extremely volatile by
| nature.
|
| Even 50 years (laughable for a clay tablet) is still pretty
| darn long in the tech world. We'll still probably see the
| entire computing landscape, including the underlying
| hardware, changing fundamentally in 50 years.
|
| Future-proofing anything is a completely different dimension.
| You have to provide an independent way to bootstrap, without
| relying on an unbroken chain of software standards,
| business/legal entities, and public demand for certain
| hardware platforms/architectures. This is unfeasible for the
| vast majority of knowledge/artifacts, so you also have to
| have a good mechanism to separate signal from noise and to
| transform volatile formats like JPEG or machine-executable
| code into more or less future proof representations, at least
| basic descriptions of what the notable thing did and what
| impact it had.
| ilyt wrote:
| >Future-proofing anything is a completely different
| dimension. You have to provide an independent way to
| bootstrap, without relying on an unbroken chain of
| software standards, business/legal entities, and public
| demand for certain hardware platforms/architectures. This is
| unfeasible for the vast majority of knowledge/artifacts, so
| you also have to have a good mechanism to separate signal
| from noise and to transform volatile formats like JPEG or
| machine-executable code into more or less future proof
| representations, at least basic descriptions of what the
| notable thing did and what impact it had.
|
| I'd argue that the best way would be not to do that, but to
| make sure the format is ubiquitous enough that the knowledge
| will never be lost in the first place.
| moron4hire wrote:
| While it's true that these standards are X years old, the
| software that encoded those formats yesteryear is very
| different from the software that decodes it today. It's a
| Ship of Theseus problem. They can claim an unbroken lineage
| since the distant future, the year 2000, but encoders and
| decoders had defects and opinions that were relied on--both
| intentionally and unintentionally--that are different from
| the defects and opinions of today.
|
| I have JPEGs and MP3s from 20 years ago that don't open
| today.
| matja wrote:
| Are they really JPEGs and MP3s, or just bitrot?
|
| I've found https://github.com/ImpulseAdventure/JPEGsnoop
| useful to fix corruption but I haven't come across a non-
| standard JFIF JPEG unless it was intentionally designed to
| accommodate non-standard features (alpha channel etc).
| orbital-decay wrote:
| I personally never encountered JPEGs or MP3s which were
| totally unreadable due to being encoded by ancient
| software versions, but the metadata in common media
| formats is a total mess. Cameras and encoders are writing
| all sorts of obscure proprietary tags, or even things
| like X-Ray (STALKER Shadow of Chernobyl game engine)
| keeping gameplay-relevant binary metadata in OGG Vorbis
| comments. Which is even technically compliant with the
| standard I think, but that won't help you much.
| ilyt wrote:
| Actually I'd argue it's wrong precisely because we _do_ manage
| to retrieve even such old artifacts. The only problem is that
| nobody cared for 30 years, so the process was harder than it
| should have been, but in the end it was possible.
|
| Sure, there is a risk that at some point, for example, every
| version of every PNG or H.264 decoder gets lost, so re-creating
| a decoder would be significantly more complicated. But the
| chances of that are pretty slim; looking at `ffmpeg -codecs`,
| I'm not really worried about that ever happening.
| krapp wrote:
| I'm certain that 100 years from now, when the collapse really
| gets rolling, we'll still have cuneiform clay tablets
| complaining about Ea-Nassir's shitty copper but most of the
| digital information and culture we've created and tried to
| archive will be lost forever. Eventually, we're going to lose
| the infrastructure and knowledge base we need to keep updating
| everything, and people will be too busy just trying to find food
| and fighting off mutants from the badlands to care.
| jakeinspace wrote:
| Well, almost all early tablets are destroyed or otherwise
| lost now. Do you think we will lose virtually all digital age
| information within a century? Maybe from a massive CME, I
| suppose.
| jcranmer wrote:
| Clay tablets were usually used for temporary records, as
| you could erase them simply by smearing the clay a little bit
| (a lot easier than writing on papyrus). The tablets we
| have exist because of something that caused the clay to be
| baked into ceramic, which is generally some sort of
| catastrophic fire that caused the records to accidentally
| be preserved for much longer.
| krapp wrote:
| I can see it happening. Not as a single catastrophic event
| but, like Rome falling bit by bit, our technological
| civilization fails and degenerates as climate change (in
| the worst possible scenario) wreaks havoc on everything.
| 0xdeadbeefbabe wrote:
| > Hard documents are actually the easiest to access. Fairly
| crude technologies like microfilm or microfiche which basically
| has documents are very easy to access.
|
| Maybe it isn't crude after all if it wins.
| hello_computer wrote:
| I was able to backup/restore an old COBOL system via cpio
| between modern GNU cpio (man page last updated June 2018), and
| SCO's cpio (c. 1989). This is neither to affirm nor contradict
| Kurzweil, but rather to praise the GNU userland for its solid
| legacy support.
| crazygringo wrote:
| But he seems to have written this before virtual machines
| became widespread.
|
| I think the concern is becoming increasingly irrelevant now,
| because if I really need to access a file I created in Word 4.0
| for the Mac back in 1990, it's not too hard to fire up System 6
| with that version of Word and read my file. In fact it's much
| _easier_ now than it was in 2005 when he was writing. Sure it
| might take half an hour to get it all working, but that's
| really not too bad.
|
| Most of this is probably technically illegal and will sometimes
| even have to rely on cracked versions, but also nobody cares.
| All the OSes and programs are still around and easy to
| find on the internet.
|
| Not to mention that while file formats changed all the time
| early on, these days they're remarkably long-lived -- used for
| decades, not years.
|
| The outdated hardware concern _was_ more of a concern (as the
| original post illustrates), but so much of everything important
| we create today is in the cloud. It's ultimately being saved
| in redundant copies on something like S3 or Dropbox or Drive or
| similar, that are kept up to date. As older hardware dies, the
| bits are moved to newer hardware without the user even knowing.
|
| So the problem Kurzweil talked about has basically become
| _less_ of an issue as time has marched on, not _more_. Which is
| kind of nice!
| ilyt wrote:
| >I think the concern is becoming increasingly irrelevant now,
| because if I really need to access a file I created in Word
| 4.0 for the Mac back in 1990, it's not too hard to fire up
| System 6 with that version of Word and read my file. In fact
| it's much easier now than it was in 2005 when he was writing.
| Sure it might take half an hour to get it all working, but
| that's really not too bad.
|
| And that was easy years ago.
|
| Now you can WASM it and run it in a browser.
| xigency wrote:
| As a kid, I got this game as a gift and really, really wanted to
| play it. But after beating the second level, the game would
| always crash on my computer with an Illegal Operation exception.
| I remember sending a crash report to the developer, and even
| updating the computer, but I never got it working.
| jakeinspace wrote:
| I adored this game as a kid, and I think I do have a faint
| memory of some stability issues, but I believe I was able to
| beat the game.
| bluedino wrote:
| This will be fun in 20 years, trying to recover 'cloud' backups from
| servers found in some warehouse.
| ilyt wrote:
| Nah it will be very simple:
|
| ....What do you mean "nobody paid for the bucket for the last 5
| years"?
|
| There is some chance someone might stash an old hard drive or tape
| with a backup somewhere in a closet. There is no chance there
| will be anything left when someone stops paying for the cloud.
| PicassoCTs wrote:
| The author has fantastic endurance, what a marathon to get the
| files off the tape.
___________________________________________________________________
(page generated 2023-05-24 23:00 UTC)