[HN Gopher] Original WWW proposal is a Word for Macintosh 4 file...
___________________________________________________________________
Original WWW proposal is a Word for Macintosh 4 file from 1990, can
we open it?
Author : jgrahamc
Score : 313 points
Date : 2024-02-13 14:06 UTC (8 hours ago)
(HTM) web link (blog.jgc.org)
(TXT) w3m dump (blog.jgc.org)
| zdw wrote:
| If you wanted exactly what would have been printed, on the
| emulator running Word for Mac 4.0 you should be able to install a
| print queue that can generate a .ps (PostScript) file, which
| could then be converted to PDF.
|
| Or Acrobat may be available for that old of an OS and would have
| a virtual print driver to go directly to PDF.
| detourdog wrote:
| I know I have running Macs with Word 5.1a which I consider the
| last Word version needed. I'm sure I opened Word 4.0 files.
| kps wrote:
| Yes, a few years ago I helped a friend recover a bunch of old
| documents. The solution was to use Mac Word 5 to open the
| Word 4 files and save them as something newer versions could
| read.
| jgrahamc wrote:
| Ah. Great suggestion! I just used Print2PDF to make a PDF from
| Word. Will update the blog.
| chrisfinazzo wrote:
| https://web.mit.edu/ghostscript/www/Ps2pdf.htm
|
| _Or, if you prefer to do more tweaking yourself, dive into the
| Ghostscript deep end :)_
|
| https://www.ghostscript.com
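|
| A minimal sketch of the PostScript-to-PDF step discussed above,
| assuming Ghostscript is installed and its ps2pdf wrapper is on the
| PATH. The function names and the file name proposal.ps are
| hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def ps2pdf_command(ps_path: str, pdf_path: str) -> list[str]:
    """Build the argument list for Ghostscript's ps2pdf wrapper."""
    return ["ps2pdf", ps_path, pdf_path]


def convert_ps_to_pdf(ps_path: str) -> Path:
    """Convert a PostScript file to a PDF written next to it."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf not found; install Ghostscript first")
    pdf_path = Path(ps_path).with_suffix(".pdf")
    # check=True raises CalledProcessError if Ghostscript reports a failure.
    subprocess.run(ps2pdf_command(ps_path, str(pdf_path)), check=True)
    return pdf_path
```

| Usage would be convert_ps_to_pdf("proposal.ps"), with the .ps file
| first copied out of the emulator's print queue.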
| msephton wrote:
| LibreOffice opens it right up. Its support for old document file
| formats is really excellent. I keep it around for just this
| purpose. https://imgur.com/a/JENgq6V
|
| But I also love using BasiliskII and InfiniteMac emulators!
| Karellen wrote:
| > LibreOffice opens it right up. It's support for old document
| formats is really excellent.
|
| Yes, the OP also mentions that LibreOffice opens it.
|
| ...but they also point out with LibreOffice that "Although
| there's something weird about the margins and there are other
| formatting problems." - which is also apparent in your
| screenshot? Certainly that level of support for such an old
| proprietary format is pretty good, but I'm not sure I'd class
| it as "really excellent" with those issues.
| jgrahamc wrote:
| Yes, LibreOffice opened it right up with the wrong font
| sizes, headers and footers messed up, incorrect gutter and
| margins, and a bunch of other problems. But they were all
| fixable.
| msephton wrote:
| I should have been clearer: what I meant was that its support
| for very many different old document formats is excellent.
| Atari ST, Amiga, Macintosh, and so on. The OP and you are
| quite right that it won't open the documents with exactly the
| right formatting, but it's good enough in a pinch so you
| don't have to learn how to use 40 year old computers. It's a
| good tool to have.
|
| 7zip has similar support for a wide range of compressed file
| formats, exes, data files, cabinets, and so on. Another good
| tool to save time and keep you on your modern operating
| system.
| sigspec wrote:
| Yeah we read the article--- which matches your screenshot.
| msephton wrote:
| This is for all the TL;DR folks.
| jgrahamc wrote:
| I think your summary is a bit short. Sure, LibreOffice
| opens the file but there are multiple problems with the
| formatting that need correcting. Your screenshot shows at
| least one of them (there shouldn't be any headers on the
| first page and the page layout should be different).
| graemep wrote:
| LibreOffice was the first thing I tried, and it worked with no
| problem.
| jgrahamc wrote:
| Well, except for all the problems I outlined in the post.
| soperj wrote:
| headline says "open" and libreoffice opened it with no
| problem.
| TaylorAlexander wrote:
| I simply opened the file with my hex editor. Problem
| solved. (sarcasm)
| jgrahamc wrote:
| I actually opened it in emacs in hexl-mode before I ran
| the file command!
| skissane wrote:
| In the past, I have in all seriousness read Microsoft
| Word documents on Linux using less. I might have had
| LibreOffice installed, but it can't run over SSH.
|
| It works okay with most old school (pre-XML) ones, since
| the document text is in the file in plain ASCII amidst
| all the binary formatting stuff. For the new XML formats,
| less by itself doesn't do anything useful, but unzip them
| and you can read the XML containing the document text.
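|
| A rough sketch of both tricks described above: pulling printable
| ASCII runs out of a binary document (similar in spirit to the Unix
| strings tool), and reading word/document.xml out of a .docx zip.
| The function names are hypothetical. Anything outside plain ASCII is
| dropped, so accented letters or curly quotes in an old Mac document
| (likely stored as Mac OS Roman) would be lost.

```python
import re
import zipfile


def ascii_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def docx_text(path: str) -> str:
    """Pull the text out of word/document.xml inside a .docx zip archive."""
    with zipfile.ZipFile(path) as zf:
        xml = zf.read("word/document.xml").decode("utf-8")
    # Crude tag stripping: good enough for reading, not a real XML parse.
    return re.sub(r"<[^>]+>", " ", xml)
```

| For a binary .doc, something like
| ascii_runs(open("old.doc", "rb").read()) gives the readable text
| mixed with stray fragments of formatting data.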
| lizknope wrote:
| Yeah, I stopped reading the article and downloaded the file. The
| only word processor I have is LibreOffice. It seemed to work fine
| so I didn't know what the issue was. Then I read the article
| and kept scrolling to the end where the author finally uses
| LibreOffice and it opens mostly okay.
| vdaea wrote:
| So does Word 2019 for Windows.
| jgrahamc wrote:
| Is the formatting correct? Are the images visible? Because
| others report (see other comments) that Word opens the file
| but the images are missing. See the Word generated PDF here:
| https://news.ycombinator.com/item?id=39359079
| vdaea wrote:
| Yes, you are right, apologies. I thought it wouldn't open
| at all, like in the screenshot in that blog post.
| ogurechny wrote:
| Well, StarOffice already existed back then. Now I wonder
| whether LibreOffice still has some early '90s third party
| format parsing code inside, or some reverse engineered
| compatibility and conversion code from a much later Word version
| actually does the job.
| arnaudsm wrote:
| Great cautionary tale about how quickly formats get obsolete,
| especially closed source ones.
|
| I use markdown, plaintext and png for all the documents I need to
| store long term.
|
| Even if these formats disappear, I could trivially reimplement my
| own parser.
| ComputerGuru wrote:
| Isn't markdown plaintext? (I didn't downvote.)
| williamcotton wrote:
| Isn't HTML plaintext?
|
| ;)
| ComputerGuru wrote:
| Yes, but not intended to be directly human readable by
| contrast.
| Narishma wrote:
| If it wasn't intended to be human readable it would have
| been a binary format.
| robinsonb5 wrote:
| It may have been intended to be human readable, but it
| failed dismally in that goal.
|
| Even before the web turned into the javascript infested
| swamp that it is now, the tags having the same visual weight
| as the text they enclose made it tiring to read.
|
| Markdown's genius is in the formatting tags being almost
| no hindrance to readability.
| williamcotton wrote:
| I definitely agree that Markdown is more readable than
| markup, but personally I abhor what some frameworks do to
| HTML. I make sure my HTML is legible! There is even a
| benefit when it comes to hyperlinks in that you can _see_
| the URL!
| elzbardico wrote:
| As a society we should have been thinking more about digital
| preservation since the time we started eschewing archiving hard
| copies in paper.
|
| People who don't know history are doomed to repeat it, but how
| can our future generations learn from our mistakes if all our
| documents are unreadable or lost by their time?
| zokier wrote:
| Are you just casually dismissing all the work that digital
| archivists have done over the past couple of decades?
|
| https://www.loc.gov/librarians/standards
|
| https://www.loc.gov/preservation/digital/
|
| https://www.loc.gov/programs/digital-collections-
| management/...
|
| and that's just Library of Congress, they are hardly alone in
| this field
| kragen wrote:
| implementing a markdown parser is far from trivial
|
| implementing a parser that tricks people into believing it
| parses markdown because it acts like a markdown parser in
| simple cases is what is trivial
|
| it's likely that your markdown data will indeed be recoverable,
| but if you're generating it yourself, html is probably safer
| arnaudsm wrote:
| Parsing markdown is multiple orders of magnitude easier than
| Microsoft Word, especially before docx.
|
| And it has the merit to be human-readable in plaintext!
| kragen wrote:
| that's probably true
| jprete wrote:
| But the Markdown document doesn't actually need a parser to
| still be usable. Markdown as a whole imitates the conventions
| of typed text. The table formats would even be usable on an
| old typewriter.
| kragen wrote:
| markdown doesn't have tables, although you can include html
| <table> tags in it. perhaps you mean
| indented fixed-width blocks you can use for
| ascii art or typewriter-style tables?
| kelnos wrote:
| Sure it does. It may not be in the original standard, but
| many/most parsers support tables that use pipe characters
| to separate columns.
|
| And regardless, markdown documents -- including the table
| extension -- are readable without a parser.
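|
| A small sketch of how little is needed to read a pipe table in the
| simple case, and where the simple case ends. This is a naive
| splitter, not an implementation of any Markdown specification; the
| function name and the sample rows are made up for illustration.

```python
def split_pipe_row(line: str) -> list[str]:
    """Split one pipe-delimited table row into trimmed cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


TABLE = """\
| Tool        | Result                |
|-------------|-----------------------|
| LibreOffice | opens, layout off     |
| Word 365    | opens, images missing |
"""

rows = [split_pipe_row(line) for line in TABLE.splitlines()]
header, body = rows[0], rows[2:]  # rows[1] is the dashed separator row

# A backslash-escaped pipe is meant to stay inside one cell, but the
# naive splitter cuts there too and yields three cells instead of two.
escaped = split_pipe_row(r"| ls \| less | pager |")
```

| The escaped-pipe row shows the gap between "works on simple
| cases" and a parser that follows a written specification.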
| kragen wrote:
| extensions to markdown aren't markdown; that's why
| commonmark is called commonmark
|
| not being able to tell which variant of a language is in
| use is one of the biggest problems for archival, and in
| particular various extensions to the microsoft word
| format (all made by the same company!) were what made
| jgc's archival work so difficult in this case
|
| language extensions are an especially bad problem when
| there's no extension mechanism--because sometimes a pipe
| is just a pipe. but unfortunately markdown's only
| extension mechanism is html
| samatman wrote:
| It's called CommonMark because Gruber insisted. Not
| because extensions to markdown aren't Markdown®, which
| no one cares about, and not because it isn't markdown in
| the ways that matter.
|
| Ironically, his objection was to the idea of a single and
| rigorous standard; you'll note that GitHub-flavored markdown
| never drew his wrath. And yet you're treating him and
| Swartz's implementation as if it was such a standard.
| Which it is not.
| zilti wrote:
| Or org-mode format. Then you even get tables properly.
| samatman wrote:
| The (only) issue is that Markdown isn't a format, it's a
| loose family of formats with many extensions. Implementing a
| CommonMark parser is not an especially difficult task in the
| grand scheme of things, it's quite well specified and has an
| extensive test suite.
|
| Although I find myself wondering what this "parsing Markdown"
| business is even about. It's perfectly legible as plain text,
| that was the main design principle behind it. If the goal is
| to have your data accessible in future, if you can read it
| now, and you don't go blind, you'll be able to read it later
| as well.
| inopinatus wrote:
| strictly speaking, markdown is a superset of html
| dzdt wrote:
| Somehow the author doesn't recognize that emulation is a
| legitimate answer to this question. Yes he was able to open the
| document, by using the original software on a highly accurate
| emulation of the original system. Everything beyond that point is
| a different question: can we get it inside of a modern word
| processor.
| jgrahamc wrote:
| Sort of. What I wanted was to be able to get a PDF version of
| it. I was hoping that a modern word processor would read the
| file format, and LibreOffice did. But it's also true that using
| emulation I was able to get a PDF (albeit one that has
| different fonts).
| nextaccountic wrote:
| > it's also true that using emulation I was able to get a PDF
| (albeit one that has different fonts).
|
| Maybe you needed to have the right fonts installed in your
| emulated Mac? Another comment in this thread pointed this out.
| londons_explore wrote:
| Emulation is starting to get gaps too... for example, running
| Windows 95 in an emulator on a modern machine is getting harder
| and harder (emulators like vmware and virtualbox don't emulate
| the CPU speed accurately, which causes the system not to boot,
| and they also don't emulate various paging behaviours of old
| intel CPU's accurately which causes windows applications to
| crash within a few seconds of starting).
|
| There are binary patches to windows 95 to fix these issues, but
| as the system gets older it's less likely people will put
| effort into binary patching it for compatibility with modern
| systems. And if it were more obscure, you'd be SOL.
| fourfour3 wrote:
| Whole system emulation like 86box does a much better job of
| emulating older hardware and OSes - I use it quite a bit for
| DOS/Win3.11/Win9x era stuff.
| thawkth wrote:
| PCem is far, far better for Win95 emulation - it can handle a
| P2 233 and a Voodoo3 fairly accurately - and tons and tons of
| hardware on top of that.
|
| It's amazing. I keep a 95 / 98 and some other vintage
| machines around as a hobby, but being able to play Unreal in
| an emulator with 3D acceleration blows my mind
| Narishma wrote:
| Those are virtual machines, not emulators. If you use a
| proper emulator like PCem or 86box, Windows 95 works fine.
| thaumasiotes wrote:
| > running Windows 95 in an emulator on a modern machine is
| getting harder and harder (emulators like vmware and
| virtualbox don't emulate the CPU speed accurately, which
| causes the system not to boot, and they also don't emulate
| various paging behaviours of old intel CPU's accurately which
| causes windows applications to crash within a few seconds of
| starting)
|
| I thought the normal way to run Windows 95 was in dosbox?
| markus92 wrote:
| As a testament to Microsoft's backwards compatibility: the file
| opened mostly fine in the Windows version of Word (version 2401),
| and the layout seems to be identical to the PDF of the article.
| It did block the file format by default but that was easy enough
| to allow.
|
| The graphics did not open however, due to a missing graphics
| filter for the Microsoft Word Picture format. Seems it's been
| deprecated for a while now but Word 2003 should be able to open
| it? Which is old, but not _that_ old not to run on modern
| systems.
| markus92 wrote:
| Installed a copy of Word 2003, document opened flawlessly
| immediately with default settings. Saving it from there
| converted it to a modern .doc which I could open with Office
| 365 and convert to PDF etc.
|
| I think the moral of the story is that the Windows Office team
| seems to spend a bit more time on backwards compatibility.
| jgrahamc wrote:
| I would be interested to see a PDF generated from Office 365
| to understand how flawless it really is.
| zokier wrote:
| Here you go, exported from desktop Word to PDF.
|
| https://drive.google.com/file/d/1lnaSr22l3kQbmFHnxg3Ggd3-46
| v...
|
| Full version string:
|
| Microsoft® Word for Microsoft 365 MSO (Version 2311 Build
| 16.0.17029.20140) 64-bit
| jgrahamc wrote:
| Right. So all the images are missing. LibreOffice still
| gives the best conversion I think.
| markus92 wrote:
| Yeah, that's why you need Word 2003 for the images, it's
| a deprecated format full of security holes I guess.
| giancarlostoro wrote:
| Ah... yeah I was wondering why they would deprecate an
| image format at all. My understanding is that Word in the
| old days serialized what was in memory, maybe that was a
| little too exploitable with images?
|
| Not sure, just curious; I'm not even sure where to look that
| one up, honestly.
| zokier wrote:
| Digging through the files a bit I think the images are in
| PICT format which is very specific to Macs (the original
| ones). It's not surprising that modern Word doesn't
| support those that well, as they are actually a somewhat
| complicated kinda-vector image format. I am surprised
| that even Word 2003 implemented PICT on Windows.
| zokier wrote:
| Probably the best approach would be to use LO for the
| images and Word otherwise... needs some manual twiddling
| but I suspect that way you can get pretty much perfect
| layout and images.
| ogurechny wrote:
| Office applications up to (and probably including) version
| 2010 break and crash on latest Windows versions. That
| behavior varies based on Office service packs and updates
| installed. You were lucky to be able to just _save the
| document_.
|
| Unless, of course, you've found some _portable version_ on
| the net that packs ThinApp and an assortment of old system
| libraries under the hood.
| crazygringo wrote:
| I'm surprised he didn't try an intermediate version of Word --
| not the original Word 4.0 for Mac, but not the current online
| version of Word either.
|
| I had a lot of old Word 4.0 for Mac files at one point, and
| remember at some point in the late 1990s or early 2000s opening
| them all up in a version of Word for Windows, and then re-saving
| them in a more up-to-date Word format. I believe there was an
| official converter tool Microsoft provided as a free add-on or an
| optional install component -- it wouldn't open the "ancient" Word
| formats otherwise.
|
| There's definitely going to be a chain here of 1 or 2
| intermediate versions of Word that should be able to open the
| document perfectly and get it into a modern Word format, I should
| think -- and I'm curious what the exact versions _are_. (Although
| as other people point out, if you don't need to edit it, then
| exporting it as PostScript in Word 4.0 and converting it to PDF
| works fine too.)
| elzbardico wrote:
| I am deeply disappointed that a company like Microsoft doesn't
| make a point of Microsoft Word being able to open any document
| created by any version of Word, no matter how ancient it is. I
| think they have the social/historical/economical responsibility
| of doing so.
|
| If they are worried about vulnerabilities in the old parsing
| code, move it to an external process, run it under isolation in a
| sandbox to spit out a newer readable version on the fly, but
| don't eliminate this capability from the software.
|
| EDIT: zokier pointed out to me that the desktop version of Word
| opens the file fine, it is only the web version that doesn't. So,
| consider this post void.
|
| EDIT 2: Well it opens the document, but is not able to display or
| print the embedded graphics, it seems.
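|
| A hedged sketch of the "external process" idea above. It uses
| headless LibreOffice as a stand-in converter, since Word's own
| internals are not described in this thread. A child process with a
| timeout gives crash and hang isolation only; it is not a security
| sandbox, which would need OS-level confinement that is not shown.
| The function names are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def soffice_convert_command(src: str, outdir: str, fmt: str = "pdf") -> list[str]:
    """Build a headless LibreOffice command that converts src into outdir."""
    return ["soffice", "--headless", "--convert-to", fmt, "--outdir", outdir, src]


def convert_in_child_process(src: str, timeout_s: float = 120.0) -> Path:
    """Run the legacy-format parser in a separate process.

    A crash or hang in the converter cannot take down the caller.
    """
    outdir = tempfile.mkdtemp(prefix="legacy-doc-")
    subprocess.run(
        soffice_convert_command(src, outdir),
        check=True,                # raise if the converter exits non-zero
        timeout=timeout_s,         # stop a hung or looping parser
        stdin=subprocess.DEVNULL,  # the child gets no interactive input
    )
    # LibreOffice names the output after the input file's stem.
    return Path(outdir) / (Path(src).stem + ".pdf")
```

| The caller only ever touches the freshly written output file, never
| the old parser's memory.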
| OJFord wrote:
| You don't have to go _anywhere near_ 1990 to find issues with
| modern Microsoft (especially cloud) apps opening documents
| created in older ones!
| zokier wrote:
| You missed the fact that the real Word does open this file just
| fine, it's just the toy web version that has issues (and maybe
| Mac too but eh)
| elzbardico wrote:
| Oh, really? I stand corrected. Thanks for pointing this out.
| jgrahamc wrote:
| No, you're not wrong, another commenter points out that
| latest Word opens the document but doesn't display the
| graphics.
| ben7799 wrote:
| The Office 365 Mac version refuses to open it.
|
| You can recover text but the result is horrible. No graphics
| and all formatting lost.
| jgrahamc wrote:
| Yes, it opens it and throws away the graphics, so not "just
| fine".
| zokier wrote:
| If we go into splitting hairs, it doesn't really throw the
| graphics away, it simply lacks the "filter" to display them
| but they are there still, as in it recognizes the graphics
| object correctly and lays it out on the page. Based on the
| error message, hypothetically I suppose you could even make
| a custom filter to handle the object.
|
| But this really goes more into the facet of Office files
| that allowed embedding pretty much anything into them, and
| relying on this "filter" system (I guess OLE) to handle
| embedded objects. So while the DOC file itself is getting
| parsed and rendered pretty much perfectly, the embedded
| objects are another story.
|
| In the same sense I'd say browser might open some HTML page
| "fine" even if it doesn't know how to handle some image
| format that is used on the page; it'd still handle the
| HTML correctly.
| jdofaz wrote:
| Makes me wonder if the graphics are in PICT format
| zokier wrote:
| I think they are. You can even find some PICT files
| inside the ODT in the github from TFA
| petersmagnusson wrote:
| if you read the blog, the main point of OP's project was
| to get at the diagrams, so hardly "splitting hairs".
| nullindividual wrote:
| This is expected with the web versions of Office. They can
| read (certain) binary Office formats but not edit them. The
| web version of Office is designed for OpenXml file formats.
| nullindividual wrote:
| Old file formats have security vulnerabilities. The online
| version of Word is designed for docx only, although it can open
| certain binary documents.
| o11c wrote:
| Fundamentally, a data file format can't have vulnerabilities.
| At most it can be prone to vulnerabilities, but more often
| it's just that popular implementations are bad.
| nullindividual wrote:
| Sorry, the Word parser does and Microsoft did not feel it
| important enough to fix as their focus is on OpenXml
| formats.
| kelnos wrote:
| Then that's on Microsoft. There's no fundamental reason
| why a secure parser can't be written for old formats.
| nullindividual wrote:
| Why would Microsoft do that? It makes zero financial
| sense to continue with a parser that may need to be
| rewritten from scratch for a ~30 year old format.
| genewitch wrote:
| they can do what they want, and i'll continue on my 2
| decade long decision to never give microsoft money, for
| anything. Same way i'll never give propellerhead another
| dime, or Plex[0], or any of these other consumer-hostile
| companies.
|
| I don't trust MS to maintain software, even though as far
| as that goes, they're better than a lot of companies that
| have been writing software for decades. "time marches on"
| is silly when we have millions of times the compute,
| storage, and transit speeds available to us. I also don't
| see why people see the need to shill for multi-billion
| dollar companies.
|
| What microsoft should have done is trademark a new name
| for their word processor the second they made the
| decision to not open word .doc from older versions. That
| way there's no confusion.
|
| [0] having a hard time remembering the name/company of
| the software i purchased for in-house streaming over a
| decade ago. Plex is still a hassle to use for in-house
| streaming compared to the "service" or whatever they're
| selling. Unfortunately Synology seems to have grown weary
| of releasing a version of their client for every
| newfangled device that comes to market, so i'm stuck with
| plex on my TV; that is, unless i want to use a stick/set-
| top/computer attached to it.
| nullindividual wrote:
| > I don't trust MS to maintain software
|
| Then you should champion removal of any "old" software
| they have that is under maintenance-only status. You
| wouldn't want security vulnerabilities to go unfixed,
| would you?
|
| > What microsoft should have done is trademark a new name
| for their word processor the second they made the
| decision to not open word .doc from older versions. That
| way there's no confusion.
|
| That makes zero sense. Word is still Word. It performs
| the same tasks (and more) as Word 1.0 did.
|
| And Word today still reads/writes .doc, just not versions
| that are that old.
| kelnos wrote:
| No they don't. Parsers can have security vulnerabilities, but
| you can fix those, and there's little reason why a parser for
| an old format would have more vulnerabilities than for a new
| format. Some formats can also have certain (intended)
| features that have security implications, but parsers can
| choose to disable them if they are concerned.
| larsrc wrote:
| Many old formats were essentially just binary dumps of memory,
| or something not far removed. Documenting the formats was not a
| standard. Yes, I agree that there is a social responsibility,
| but having worked in digital archiving I can tell you that the
| olden days were really, really messy. No, really.
| resters wrote:
| This is the point that many of the commenters who criticize
| Microsoft are missing, and it's why the old formats are not
| enabled by default (security vulnerabilities) and why it's
| not as simple as creating a parser.
| autoexec wrote:
| Microsoft still deserves criticism for designing their old
| word formats so badly. It was a design choice to turn
| documents of mostly text into obscure binary formats that
| were badly standardized and maintained.
| resters wrote:
| Not true at all. Some of Microsoft's best minds created
| _extremely ingenious_ methods that allowed early word
| processors to be usable on files that were dramatically
| larger than what would fit in memory. OSes didn't
| support suitable performance via VM infrastructure at the
| time. It was clever, outside of the box thinking that got
| MS to be able to beat WordPerfect (a worthy competitor)
| and the many other also-rans.
|
| There was (contrary to popular belief) not a deliberate
| strategy to limit interoperability. It was simply the
| reality of the approaches utilized that made them tightly
| coupled to the MS Word codebase and less standardizable
| than would have otherwise been ideal.
|
| Source: one of the guys who worked on it at MS.
| rietta wrote:
| Extremely interesting and thank you for doing this. I feel
| strongly that this goes to show just how important preserving
| historical software and emulation is. I have dabbled myself with
| old Windows 3.1 software for this very reason. We really, truly
| are going to have a period where web application driven software
| just disappears and we won't easily have this retro computing view
| of these decades in a short time from now.
| dfxm12 wrote:
| I also think it is important to show the importance of open
| formats or open source in general if we want future generations
| to read our documents or run/compile/understand our software.
| CharlesW wrote:
| _[silly pre-coffee post deleted]_
| jgrahamc wrote:
| Word is already available on the Infinite Mac as it's under
| Productivity inside the Infinite HD. No need to install it.
| whoopdedo wrote:
| > That way I can see actual fonts, font sizes and layout to
| confirm how the document should have looked.
|
| Or you would if you had the original fonts. Word 4.0 was released
| for System 6 with support as far back as System 3.2. Fonts at
| that time had separate screen and printer files for the different
| output resolutions. If you're missing the printer font it'll
| print a scaled (using nearest-neighbor) rendering of the screen
| font. If you're missing the screen font it'll substitute the
| system font. (Geneva by default, as seen in the screenshot.)
|
| In this case, only the well-known Palatino and Courier typefaces
| are needed. But LibreOffice substituted Times New Roman even
| though I have Palatino Linotype installed.
| jgrahamc wrote:
| That may go some way to explaining some of the differences I
| see, but the main thing I was looking for in the emulation was
| the font sizes.
| aidenn0 wrote:
| Doesn't the font matter almost as much as the font-size
| setting for font sizes, given that different font families
| can have wildly different metrics at the same font size?
| jgrahamc wrote:
| I bet it does. I should redo the final part after
| installing the required fonts.
| stuaxo wrote:
| This is good.
|
| It would be good to get some feature requests into libreoffice to
| fix the remaining mis-matches in the formatting.
| scaglio wrote:
| This raises a potential problem, often underrated by companies:
| some have backups with _infinite_ retention.
|
| It is common to have backups with retention of 10 years, some may
| have 20 years for legal reasons... but the majority of people
| don't understand the difference between "readable" and "usable".
|
| Of course, it depends on the data... And there are companies
| backing up whole _virtual machines_ with infinite retention,
| believing they will be able to run them: it is hard enough to restore a
| vSphere 5.x machine on a brand new vSphere 8. I really don't
| understand this waste of space.
| actionfromafar wrote:
| Often an old file or disk image is tiny compared to modern file
| sizes.
|
| So the waste of space is more of an administrative character
| than a waste of _disk_ space.
| rvnx wrote:
| If you back up everything, you can sort later, and even eventually
| never. It costs 1 USD per month at Google Cloud to store 1TB of
| data.
|
| At this price it's not worth sorting, when one single devops
| costs 100 USD+ per hour, not including the opportunity cost of
| not working on something more productive (and less boring for
| the developer).
|
| Then X years after the company is acquired, or sufficient time
| has lapsed, you can delete / drop the data without sorting.
|
| Regarding virtual machines, if it's VMDK for example, you can
| read the raw disks without booting it, and again, it's not
| worth taking a risk to lose data to potentially save 10 USD per
| month, which is similar to one developer taking one beer extra
| at a team event.
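|
| The arithmetic behind the argument above, as a small sketch. Both
| rates are the figures quoted in the comment, not checked prices, and
| the function names are made up for illustration.

```python
STORAGE_USD_PER_TB_MONTH = 1.0   # storage rate quoted in the comment
ENGINEER_USD_PER_HOUR = 100.0    # hourly devops cost quoted in the comment


def storage_cost_usd(terabytes: float, years: float) -> float:
    """Total cost of keeping the data for the given number of years."""
    return terabytes * years * 12 * STORAGE_USD_PER_TB_MONTH


def break_even_hours(terabytes: float, years: float) -> float:
    """Hours of engineer time that cost as much as simply keeping the data."""
    return storage_cost_usd(terabytes, years) / ENGINEER_USD_PER_HOUR
```

| With these numbers, keeping 10 TB for 10 years costs 1200 USD, the
| same as 12 hours of engineer time, so sorting only pays off if it
| takes less than that.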
| anonymouskimmer wrote:
| WordPerfect claims the ability to open MS Word 4.0 files. The
| standard edition is currently $175. I'm not buying it, but if
| you're willing to spend $175 it might be something to try.
| caboteria wrote:
| Yet another example of why Apache needs to take OpenOffice behind
| the barn.
| EasyMark wrote:
| You mean retire it to a nice farm upstate, little Jimmy might
| hear the shotgun blast!
| acheron wrote:
| "Here's a 4000 year old letter from a merchant to his partners
| describing how to avoid taxes by smuggling goods in their
| underwear." ( https://www.britishmuseum.org/blog/trade-and-
| contraband-anci... )
|
| vs
|
| "Not sure if it's possible to read this 30 year old file!"
| kelnos wrote:
| I get the point you're trying to make, but your former example
| is rare. While there are more exceedingly-old paper records
| that are still around and have been preserved than we might
| expect, we've lost so, so much. Paper and ink (and variations
| on that) are both fragile.
|
| Digital documents are otherwise easy to preserve indefinitely,
| if care is taken up-front to choose a simple document format
| that is likely to remain parseable (or at least documented) for
| a long time. And even when you don't do that, there's always
| the possibility of writing a parser later (assuming
| documentation is around) or reverse-engineering the format.
|
| And in this case, the 30-year-old file did end up getting
| opened, albeit not as trivially easily as one might hope.
| thaumasiotes wrote:
| > but your former example is rare. While there are more
| exceedingly-old paper records that are still around and have
| been preserved than we might expect, we've lost so, so much.
| Paper and ink (and variations on that) are both fragile.
|
| Depends what you mean by "rare". Ancient Near Eastern
| correspondence isn't rare at all, precisely because they
| didn't use paper. (And they went to war a lot.) You seem to
| be writing as if that letter was a paper document, but it
| isn't. Paper records that old only exist in Egypt.
|
| > Digital documents are otherwise easy to preserve
| indefinitely, if care is taken up-front to choose a simple
| document format that is likely to remain parseable (or at
| least documented) for a long time.
|
| This isn't a good match to the example either; Ancient Near
| Eastern records had to be deciphered. (The Semitic ones had
| to be deciphered. The Sumerian ones benefited from surviving
| documentation, but we had to find that and learn how to read
| it.)
|
| The original example isn't particularly apt; reading this
| 30-year-old file, or a similar one, is a task that one guy
| can do in less than a week using existing tools and know that
| he's done it correctly. Reading a 4000-year-old cuneiform
| letter was a much larger project than that.
| melomac wrote:
| I was able to download and transfer the proposal document to a
| Mini vMac emulator, set the Finder's type and creator to those of
| a Microsoft Word 5 document i.e. respectively WDBN and MSWD, and
| finally open the document with Microsoft Word 5 for Mac to export
| it as a RTF document.
|
| Here you have it: https://neko.melomac.net/tmp/proposal.rtf
|
| I certainly agree opening a document from this Macintosh era
| should be, by far, easier than the process I detailed above, but
| this is how it is ¯\_(ツ)_/¯
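|
| For readers unfamiliar with type and creator codes: they are two
| four-character tags stored in a file's Finder information. A minimal
| sketch, assuming the usual 32-byte FinderInfo layout in which the
| first four bytes are the type and the next four the creator. The
| function name is hypothetical, and actually writing the record onto
| a file (inside the emulator or on a modern system) is not shown.

```python
def finder_info(file_type: str, creator: str) -> bytes:
    """Build a 32-byte FinderInfo record with only type and creator set."""
    t = file_type.encode("mac_roman")
    c = creator.encode("mac_roman")
    if len(t) != 4 or len(c) != 4:
        raise ValueError("type and creator codes must be exactly 4 bytes")
    # Remaining 24 bytes (flags, icon position, extended info) stay zero.
    return t + c + bytes(24)


word_doc_info = finder_info("WDBN", "MSWD")  # codes given in the comment above
```

| Without these tags the classic Finder has no way to associate the
| file with Word, which is why the step is needed after a download.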
| jgrahamc wrote:
| Thanks. Unfortunately, the images are all missing.
| melomac wrote:
| It is even more frustrating that the images are in the
| document, and Microsoft Word for Mac would still display them
| accurately.
|
| And LibreOffice would display the images in the RTF document
| in a different size (a tiny block).
|
| If my old Mac's display worked, I would have been able to
| send the document over to CUPS via Netatalk, and make a PDF
| out of it. Unfortunately Mini vMac can't connect to that VM
| on the LAN...
|
| Anyhow, it is scandalous that opening legacy documents became
| such a PITA.
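
The type and creator step melomac describes (WDBN and MSWD) can be
sketched in code. This is a minimal illustration, not the procedure
used inside Mini vMac: it assumes the classic Finder info layout, in
which a 32-byte record starts with the four-byte file type followed by
the four-byte creator code. The function name finder_info is
hypothetical.

```python
def finder_info(file_type: str, creator: str) -> bytes:
    """Pack type and creator codes into a 32-byte Finder info record.

    Assumed layout: bytes 0-3 hold the file type, bytes 4-7 hold the
    creator, and the remaining 24 bytes are left as zeros here.
    """
    packed = b""
    for code in (file_type, creator):
        raw = code.encode("mac_roman")  # classic Mac OS text encoding
        if len(raw) != 4:
            raise ValueError(f"{code!r} must be exactly four bytes")
        packed += raw
    return packed.ljust(32, b"\x00")


# Codes quoted in the comment above: a Word document owned by Word.
info = finder_info("WDBN", "MSWD")
print(info.hex())
```

On a modern macOS host the same record is exposed as the
com.apple.FinderInfo extended attribute, so the hex string could in
principle be written with a tool such as xattr; that route is an
assumption here and was not tested against this file.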
| bluedino wrote:
| That Mac Word screenshot gives me claustrophobic flashbacks to
| trying to work on those tiny screens in middle school computer
| lab, writing science fair papers.
| cynicalsecurity wrote:
| It wasn't so bad. It's better now, but it was fine back then.
| whoopdedo wrote:
| I consider it more of not knowing how much better we could
| have had it. Small monitors were "normal." But I imagine
| people who got to work with the Portrait Display[1] (an
| impressive 640x870 resolution!) felt then as we do now when
| they had to switch back to the internal screen.
|
| [1] https://wiki.preterhuman.net/Apple_Macintosh_Portrait_Dis
| pla...
| retrac wrote:
| Heh, that screenshot is relatively high-resolution for the time
| in question, too. 800x600 maybe? The compact Macs were 512x342:
| https://www.betalogue.com/images/uploads/microsoft/pce-mac-w...
| (The toolbars, rulers, etc., could be hidden in the settings.)
| cranberryturkey wrote:
| libreoffice opened it.
| kelnos wrote:
| Sure, but the layout was screwed up and the fonts and sizes
| were wrong.
|
| Certainly this is helpful: it's better to be able to open a
| document and then have to manually fix those issues than to be
| unable to open it at all. But it was far from perfect.
| EasyMark wrote:
| It's orders of magnitude better than "I can't open this file
| at all, -1"?
| Sembiance wrote:
| This does an "okay" job at converting the document:
| https://archive.org/details/KeyViewPro
|
| Here is the converted PDF:
| https://smallpdf.com/result#r=091f20f23de353fac21376a3a49a60...
| jgrahamc wrote:
| Not sure that's really true. It did something but the images
| are a mess and a lot of formatting is gone. I think LibreOffice
| is still the winner here.
| bilsbie wrote:
| I wonder if it would be a viable business to keep running
| versions of computers going back say 40 years and offering to
| recover and convert files for people. (Just getting stuff off
| floppy disks and Zip drives might be useful)
| traceroute66 wrote:
| Interestingly, the latest and greatest version (desktop app via
| Office365) of Microsoft Word on Mac appears to know what it is
| _but_ refuses to open it.
|
| If you drag the file onto Word, it launches a dialogue box
| telling you "proposal uses a file type that is blocked from
| opening in this version" along with a link to the supporting page
| on the Microsoft website[1].
|
| [1] https://support.microsoft.com/en-us/office/error-filename-
| us...
| worik wrote:
| > telling you "proposal uses a file type that is blocked from
| opening in this version"
|
| "blocked"?
|
| That sounds like Microsoft has some IP problems with their old
| software.
| aidenn0 wrote:
| Normally I have good success with abiword, but it completely
| barfs on this file; it seems to be falling back on its RTF
| support.
| noufalibrahim wrote:
| One underappreciated (though mentioned) hero in this little saga
| is the venerable file(1) command:
|
|     proposal: Microsoft Word for Macintosh 4.0
|
| It's so incredibly useful and so easily overlooked. I almost
| reflexively reach out to it when I'm curious about a file and the
| information it returns is just sufficient to satiate my curiosity
| and be useful.
| cpach wrote:
| I agree, _file_ is such a great tool.
|
| I have cursed so many times in the past when I sat in front of
| a work computer that ran Windows and didn't have this tool
| easily available. (Later on, WSL made life easier, but now I'm
| luckily nearly Windows-free.)
| AdamJacobMuller wrote:
| One might even say that file has a lot of magic in it.
| pdmccormick wrote:
| file has a lot of magic, but a file typically has only one
| magic.
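
The "magic" joked about above refers to magic numbers: file(1)
identifies a file by comparing its leading bytes against a database of
known signatures. A minimal sketch of that idea follows. The
signature table is a tiny hand-picked assumption covering formats
mentioned in this thread, the sniff function is hypothetical, and the
real magic database is far richer (offsets, masks, nested tests). The
Word for Macintosh signature is deliberately left out because the
thread does not state it.

```python
# Leading-byte signatures for a few formats discussed in the thread.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"%!PS", "PostScript document"),
    (b"{\\rtf", "Rich Text Format document"),
    (b"PK\x03\x04", "ZIP archive (the container used by .docx)"),
]


def sniff(data: bytes) -> str:
    """Return a label for the first signature that prefixes data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "data"  # same fallback word file(1) uses for unknown input


print(sniff(b"%PDF-1.4\n"))
print(sniff(b"{\\rtf1\\ansi"))
```

Only the first few bytes of a file need to be read for this kind of
check, which is part of why the real tool is fast enough to reach for
reflexively.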
| dorfsmay wrote:
| LibreOffice is amazing: besides being able to open many document
| formats, it can run headless and has command line options which
| allow automating some tasks, such as format conversions that
| would not be possible otherwise.
|
| https://help.libreoffice.org/latest/en-US/text/shared/guide/...
|
| https://opensource.com/article/21/3/libreoffice-command-line
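
The headless conversion dorfsmay mentions might look like the sketch
below. It assumes the LibreOffice binary is on the PATH under the name
soffice and uses the --headless, --convert-to and --outdir options;
the helper names and the default output directory are hypothetical.
Whether the result for this particular file is faithful is a separate
question, given the layout problems reported elsewhere in the thread.

```python
import subprocess


def build_convert_command(source, target_format="pdf", outdir=".",
                          binary="soffice"):
    """Build the argument list for a headless LibreOffice conversion."""
    return [
        binary,
        "--headless",                    # no GUI
        "--convert-to", target_format,   # e.g. "pdf" or "docx"
        "--outdir", str(outdir),         # where the result is written
        str(source),
    ]


def convert(source, target_format="pdf", outdir="."):
    """Run the conversion; raises CalledProcessError on failure."""
    command = build_convert_command(source, target_format, outdir)
    subprocess.run(command, check=True)


# Example (requires LibreOffice to be installed):
#     convert("proposal", "pdf", "converted")
print(" ".join(build_convert_command("proposal")))
```

Keeping the command construction separate from the subprocess call
makes it easy to inspect or log the exact invocation before running it
over a whole directory of old documents.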
| j45 wrote:
| https://www.ebay.com/itm/235033043066
|
| The original Word for Macintosh software seems more than available.
| Dwedit wrote:
| Is there a way to make a PS or PDF file using the actual Word for
| Macintosh 4? I'd think that would be the definitive render.
| wrs wrote:
| Keep reading...he did that. But it's not clear he had the right
| PS fonts installed.
| jgrahamc wrote:
| I probably did not as I did it really fast after someone
| suggested it.
| aidenn0 wrote:
| Somewhat off-topic, but I remember Word for Windows 6.0 would
| take considerable time (like a minute for a 10 page document on
| my AM386DX/40) to reflow paragraphs across page-breaks (trying to
| handle widows, orphans &c). If I made an edit to the first page
| and hit print before it was done, I would end up with a printed
| document that contained either duplicated or dropped lines at
| page boundaries.
| jmclnx wrote:
| I have a few Wang WP Documents from decades ago. I could not open
| them at all. Libreoffice thought they were corrupted Word Docs.
|
| So the concern about some document formats being unreadable is
| still valid. Who knows what obscure proprietary formats exist out
| there.
| jtotheh wrote:
| Tragically, Postscript support has been largely removed from
| macOS now. Apparently the language was weird enough that
| supporting it made some (in)security hacks possible. I guess I'm
| old! I remember first finding out about it in 1986, when it was
| very "leet". Postscript printers were big $.
|
| I say tragically because Postscript was pretty key in making DTP
| as compelling as it used to be, which kind of saved the Mac in
| terms of being the "killer app" for it.
|
| I think you may be able to run some kind of postscript support in
| some tool from Adobe, or even Ghostscript. And probably, the
| newer software is better, but it's sad that you can't view a
| postscript file on macOS out of the box now.
| jxdxbx wrote:
| Amazing that you can just pop up an emulator in a browser window.
| Retro Mac emulation used to be such a pain in the ass.
| jasomill wrote:
| For anyone interested, here's the document in modern Word format,
| with all vector artwork and fonts intact:
|
| https://jasomill.at/proposal.docx
|
| To convert it, I first opened and re-saved using Word 98[1]
| running on a QEMU-emulated Power Mac, at which point it opened in
| modern Word for Mac ( _viz.,_ version 16.82).
|
| The pictures were missing, however, with Word claiming "There is
| not enough memory or disk space to display or print the picture."
| (given 64 GB RAM with 30+ GB free at the time, I assume the
| actual problem is that Word no longer supports the PICT image
| format).
|
| To restore the images, I used Acrobat (5.0.10) print-to-PDF in
| Word 98 to create a PDF, then extracted the three images to
| separate PDFs using (modern) Adobe Illustrator, preserving the
| original fonts, vector artwork, size, and exact bounding box of
| each image.
|
| At this point, restoring the images was a simple matter of
| deleting the original images and dragging and dropping the PDF
| replacements from the Finder.
|
| For comparison, here's the PDF created by Acrobat from Word 98 on
| the Power Mac
|
| https://jasomill.at/proposal-Word98.pdf
|
| and here's a PDF created by modern Word running on macOS Sonoma
|
| https://jasomill.at/proposal-Word16.82.pdf
|
| [1] https://archive.org/details/ms-word98-special-edition
| jasomill wrote:
| As an aside, MacClippy 98 knew the score:
|
| https://jasomill.at/Clippy.png
| api wrote:
| Today's historic working documents will mostly be SaaS hosted
| documents in systems like Google Docs, Notion, etc. In the future
| nobody will be able to open them. They won't exist, and the
| software won't exist, and there will be no way to restore it
| since the software is SaaS that can't be emulated or even
| installed anywhere.
| willmadden wrote:
| MS Word for Mac 16.16 opens it with the diagrams intact in
| "compatibility mode". The only issue is the text is indented
| slightly too far on the left.
|
| LibreOffice opens it with the same quality, but has some weird
| gray ghost lines around tables.
___________________________________________________________________
(page generated 2024-02-13 23:00 UTC)