[HN Gopher] Original WWW proposal is a Word for Macintosh 4 file...
       ___________________________________________________________________
        
       Original WWW proposal is a Word for Macintosh 4 file from 1990, can
       we open it?
        
       Author : jgrahamc
       Score  : 313 points
       Date   : 2024-02-13 14:06 UTC (8 hours ago)
        
 (HTM) web link (blog.jgc.org)
 (TXT) w3m dump (blog.jgc.org)
        
       | zdw wrote:
       | If you wanted exactly what would have been printed, on the
       | emulator running Word for Mac 4.0 you should be able to install a
       | print queue that can generate a .ps (PostScript) file, which
       | could then be converted to PDF.
       | 
       | Or Acrobat may be available for that old of an OS and would have
       | a virtual print driver to go directly to PDF.
        
         | detourdog wrote:
         | I know I have running Macs with Word 5.1a which I consider the
         | last Word version needed. I'm sure I opened Word 4.0 files.
        
           | kps wrote:
           | Yes, a few years ago I helped a friend recover a bunch of old
           | documents. The solution was to use Mac Word 5 to open the
           | Word 4 files and save them as something newer versions could
           | read.
        
         | jgrahamc wrote:
         | Ah. Great suggestion! I just used Print2PDF to make a PDF from
         | Word. Will update the blog.
        
         | chrisfinazzo wrote:
         | https://web.mit.edu/ghostscript/www/Ps2pdf.htm
         | 
         |  _Or, if you prefer to do more tweaking yourself, dive into the
         | Ghostscript deep end :)_
         | 
         | https://www.ghostscript.com
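
The PostScript-to-PDF step suggested above can be sketched in a few lines
of Python. This is a minimal sketch, assuming Ghostscript is installed and
its ps2pdf wrapper is on the PATH; the helper names are made up for
illustration and any file names passed in are the caller's own.

```python
import shutil
import subprocess


def build_ps2pdf_command(ps_path: str, pdf_path: str) -> list[str]:
    """Return the argument list for Ghostscript's ps2pdf wrapper.

    ps2pdf takes the input PostScript file followed by the output PDF path.
    """
    return ["ps2pdf", ps_path, pdf_path]


def convert_ps_to_pdf(ps_path: str, pdf_path: str) -> None:
    """Run ps2pdf, failing with a clear error if it is not installed."""
    if shutil.which("ps2pdf") is None:
        raise FileNotFoundError("ps2pdf not found; install Ghostscript first")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_ps2pdf_command(ps_path, pdf_path), check=True)
```

Keeping the command construction separate from the call makes it easy to
inspect or log the exact command before running it.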
        
       | msephton wrote:
       | LibreOffice opens it right up. Its support for old document file
       | formats is really excellent. I keep it around for just this
       | purpose. https://imgur.com/a/JENgq6V
       | 
       | But I also love using BasiliskII and InfiniteMac emulators!
        
         | Karellen wrote:
         | > LibreOffice opens it right up. Its support for old document
         | formats is really excellent.
         | 
         | Yes, the OP also mentions that LibreOffice opens it.
         | 
         | ...but they also point out with LibreOffice that "Although
         | there's something weird about the margins and there are other
         | formatting problems." - which is also apparent in your
         | screenshot? Certainly that level of support for such an old
         | proprietary format is pretty good, but I'm not sure I'd class
         | it as "really excellent" with those issues.
        
           | jgrahamc wrote:
           | Yes, LibreOffice opened it right up with the wrong font
           | sizes, headers and footers messed up, incorrect gutter and
           | margins, and a bunch of other problems. But they were all
           | fixable.
        
           | msephton wrote:
           | I should have been clearer: what I meant was that its support
           | for very many different old document formats is excellent.
           | Atari ST, Amiga, Macintosh, and so on. The OP and you are
           | quite right that it won't open the documents with exactly the
           | right formatting, but it's good enough in a pinch so you
           | don't have to learn how to use 40 year old computers. It's a
           | good tool to have.
           | 
           | 7zip has similar support for a wide range of compressed file
           | formats, exes, data files, cabinets, and so on. Another good
           | tool to save time and keep you on your modern operating
           | system.
        
         | sigspec wrote:
         | Yeah we read the article--- which matches your screenshot.
        
           | msephton wrote:
           | This is for all the TL;DR folks.
        
             | jgrahamc wrote:
             | I think your summary is a bit short. Sure, LibreOffice
             | opens the file but there are multiple problems with the
             | formatting that need correcting. Your screenshot shows at
             | least one of them (there shouldn't be any headers on the
             | first page and the page layout should be different).
        
         | graemep wrote:
         | LibreOffice was the first thing I tried, and it worked with no
         | problem.
        
           | jgrahamc wrote:
           | Well, except for all the problems I outlined in the post.
        
             | soperj wrote:
             | headline says "open" and libreoffice opened it with no
             | problem.
        
               | TaylorAlexander wrote:
               | I simply opened the file with my hex editor. Problem
               | solved. (sarcasm)
        
               | jgrahamc wrote:
               | I actually opened it in emacs in hexl-mode before I ran
               | the file command!
        
               | skissane wrote:
               | In the past, I have in all seriousness read Microsoft
               | Word documents on Linux using less. I might have had
               | LibreOffice installed, but it can't run over SSH.
               | 
               | It works okay with most old school (pre-XML) ones, since
               | the document text is in the file in plain ASCII amidst
               | all the binary formatting stuff. For the new XML formats,
               | less by itself doesn't do anything useful, but unzip them
               | and you can read the XML containing the document text.
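
Both tricks described above are easy to approximate in Python. The
following is a rough sketch, not a real Word parser: ascii_runs pulls
printable ASCII runs out of a binary file (much like the Unix strings
tool), and docx_text unzips a .docx and joins the text runs found in
word/document.xml. Both function names are made up for illustration, and
the sketch assumes the old file stores its text as plain ASCII, as the
comment describes.

```python
import re
import zipfile
from xml.etree import ElementTree

# Namespace used by WordprocessingML elements inside a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def ascii_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def docx_text(path_or_file) -> str:
    """Return the paragraph text of a .docx, one paragraph per line."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ElementTree.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each paragraph's visible text lives in its w:t elements.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)
```

Formatting, images, and any non-ASCII characters are ignored or lost, so
this is only good for skimming the text over SSH.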
        
         | lizknope wrote:
         | Yeah, I stopped reading the article and downloaded the file. The
         | only word processor I have is LibreOffice. It seemed to work fine
         | so I didn't know what the issue was. Then I read the article
         | and kept scrolling to the end where the author finally uses
         | LibreOffice and it opens mostly okay.
        
         | vdaea wrote:
         | So does Word 2019 for Windows.
        
           | jgrahamc wrote:
           | Is the formatting correct? Are the images visible? Because
           | others report (see other comments) that Word opens the file
           | but the images are missing. See the Word generated PDF here:
           | https://news.ycombinator.com/item?id=39359079
        
             | vdaea wrote:
             | Yes, you are right, apologies. I thought it wouldn't open
             | at all, like in the screenshot in that blog post.
        
         | ogurechny wrote:
         | Well, StarOffice already existed back then. Now I wonder
         | whether LibreOffice still has some early '90s third party
         | format parsing code inside, or some reverse engineered
         | compatibility and conversion code from a much later Word version
         | actually does the job.
        
       | arnaudsm wrote:
       | Great cautionary tale about how quickly formats get obsolete,
       | especially closed source ones.
       | 
       | I use markdown, plaintext and png for all the documents I need to
       | store long term.
       | 
       | Even if these formats disappear, I could trivially reimplement my
       | own parser.
        
         | ComputerGuru wrote:
         | Isn't markdown plaintext? (I didn't downvote.)
        
           | williamcotton wrote:
           | Isn't HTML plaintext?
           | 
           | ;)
        
             | ComputerGuru wrote:
             | Yes, but not intended to be directly human readable by
             | contrast.
        
               | Narishma wrote:
               | If it wasn't intended to be human readable it would have
               | been a binary format.
        
               | robinsonb5 wrote:
               | It may have been intended to be human readable, but it
               | failed dismally in that goal.
               | 
               | Even before the web turned into the javascript-infested
               | swamp that it is now, the tags having the same visual weight
               | as the text they enclose made it tiring to read.
               | 
               | Markdown's genius is in the formatting tags being almost
               | no hindrance to readability.
        
               | williamcotton wrote:
               | I definitely agree that Markdown is more readable than
               | markup, but personally I abhor what some frameworks do to
               | HTML. I make sure my HTML is legible! There is even a
               | benefit when it comes to hyperlinks in that you can _see_
               | the URL!
        
         | elzbardico wrote:
         | As a society we should have been thinking more about digital
         | preservation since the time we started eschewing archiving hard
         | copies in paper.
         | 
         | People who don't know history are doomed to repeat it, but how
         | can our future generations learn from our mistakes if all our
         | documents are unreadable or lost by their time?
        
           | zokier wrote:
           | Are you just casually dismissing all the work that digital
           | archivists have done over the past couple of decades?
           | 
           | https://www.loc.gov/librarians/standards
           | 
           | https://www.loc.gov/preservation/digital/
           | 
           | https://www.loc.gov/programs/digital-collections-
           | management/...
           | 
           | and that's just Library of Congress, they are hardly alone in
           | this field
        
         | kragen wrote:
         | implementing a markdown parser is far from trivial
         | 
         | implementing a parser that tricks people into believing it
         | parses markdown because it acts like a markdown parser in
         | simple cases is what is trivial
         | 
         | it's likely that your markdown data will indeed be recoverable,
         | but if you're generating it yourself, html is probably safer
        
           | arnaudsm wrote:
           | Parsing markdown is multiple orders of magnitude easier than
           | Microsoft Word, especially before docx.
           | 
           | And it has the merit to be human-readable in plaintext!
        
             | kragen wrote:
             | that's probably true
        
           | jprete wrote:
           | But the Markdown document doesn't actually need a parser to
           | still be usable. Markdown as a whole imitates the conventions
           | of typed text. The table formats would even be usable on an
           | old typewriter.
        
             | kragen wrote:
             | markdown doesn't have tables, although you can include html
             | <table> tags in it.
             | 
             |     perhaps you mean indented fixed-width blocks
             |     you can use for ascii art
             |     or typewriter-style tables?
        
               | kelnos wrote:
               | Sure it does. It may not be in the original standard, but
               | many/most parsers support tables that use pipe characters
               | to separate columns.
               | 
               | And regardless, markdown documents -- including the table
               | extension -- are readable without a parser.
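
To make the disagreement concrete, here is a deliberately naive sketch of
reading a pipe-separated table of the kind many Markdown variants accept.
It is an illustration only, not an implementation of any specification:
the function name parse_pipe_table is made up, and the sketch assumes a
header row, a delimiter row of dashes and colons, and no escaped or
literal pipes inside cells.

```python
def parse_pipe_table(text: str) -> list[dict[str, str]]:
    """Parse a simple pipe table into one dict per body row."""

    def cells(line: str) -> list[str]:
        # Drop the optional outer pipes, then split on the inner ones.
        return [c.strip() for c in line.strip().strip("|").split("|")]

    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return []
    header = cells(lines[0])
    # The second line must look like |---|:--:|, i.e. only dashes and colons.
    if not all(c and set(c) <= set("-:") for c in cells(lines[1])):
        raise ValueError("second line is not a delimiter row")
    return [dict(zip(header, cells(ln))) for ln in lines[2:]]
```

It works for simple cases and breaks as soon as a cell contains a pipe,
which is the gap between looking like a parser and being one.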
        
               | kragen wrote:
               | extensions to markdown aren't markdown; that's why
               | commonmark is called commonmark
               | 
               | not being able to tell which variant of a language is in
               | use is one of the biggest problems for archival, and in
               | particular various extensions to the microsoft word
               | format (all made by the same company!) were what made
               | jgc's archival work so difficult in this case
               | 
               | language extensions are an especially bad problem when
               | there's no extension mechanism--because sometimes a pipe
               | is just a pipe. but unfortunately markdown's only
               | extension mechanism is html
        
               | samatman wrote:
               | It's called CommonMark because Gruber insisted. Not
               | because extensions to markdown aren't Markdown(r), which
               | no one cares about, and not because it isn't markdown in
               | the ways that matter.
               | 
               | Ironically, his objection was to the idea of a single and
               | rigorous standard, you'll note that GitHub-flavored markdown
               | never drew his wrath. And yet you're treating him and
               | Swartz's implementation as if it was such a standard.
               | Which it is not.
        
           | zilti wrote:
           | Or org-mode format. Then you even get tables properly.
        
           | samatman wrote:
           | The (only) issue is that Markdown isn't a format, it's a
           | loose family of formats with many extensions. Implementing a
           | parser for CommonMark is not an especially difficult task in the
           | grand scheme of things, it's quite well specified and has an
           | extensive test suite.
           | 
           | Although I find myself wondering what this "parsing Markdown"
           | business is even about. It's perfectly legible as plain text,
           | that was the main design principle behind it. If the goal is
           | to have your data accessible in future, if you can read it
           | now, and you don't go blind, you'll be able to read it later
           | as well.
        
           | inopinatus wrote:
           | strictly speaking, markdown is a superset of html
        
       | dzdt wrote:
       | Somehow the author doesn't recognize that emulation is a
       | legitimate answer to this question. Yes he was able to open the
       | document, by using the original software on a highly accurate
       | emulation of the original system. Everything beyond that point is
       | a different question: can we get it inside of a modern word
       | processor.
        
         | jgrahamc wrote:
         | Sort of. What I wanted was to be able to get a PDF version of
         | it. I was hoping that a modern word processor would read the
         | file format, and LibreOffice did. But it's also true that using
         | emulation I was able to get a PDF (albeit one that has
         | different fonts).
        
           | nextaccountic wrote:
           | > it's also true that using emulation I was able to get a PDF
           | (albeit one that has different fonts).
           | 
           | Maybe you needed to have the right fonts installed in your
           | emulated Mac? Another comment in this thread pointed this out.
        
         | londons_explore wrote:
         | Emulation is starting to get gaps too... for example, running
         | Windows 95 in an emulator on a modern machine is getting harder
         | and harder (emulators like vmware and virtualbox don't emulate
         | the CPU speed accurately, which causes the system not to boot,
         | and they also don't emulate various paging behaviours of old
         | Intel CPUs accurately, which causes Windows applications to
         | crash within a few seconds of starting).
         | 
         | There are binary patches to windows 95 to fix these issues, but
         | as the system gets older it's less likely people will put
         | effort into binary patching it for compatibility with modern
         | systems. And if it were more obscure, you'd be SOL.
        
           | fourfour3 wrote:
           | Whole system emulation like 86box does a much better job of
           | emulating older hardware and OSes - I use it quite a bit for
           | DOS/Win3.11/Win9x era stuff.
        
           | thawkth wrote:
           | PCem is far, far better for Win95 emulation - it can handle a
           | P2 233 and a Voodoo3 fairly accurately - and tons and tons of
           | hardware on top of that.
           | 
           | It's amazing. I keep a 95 / 98 and some other vintage
           | machines around as a hobby, but being able to play Unreal in
           | an emulator with 3D acceleration blows my mind
        
           | Narishma wrote:
           | Those are virtual machines, not emulators. If you use a
           | proper emulator like PCem or 86box, Windows 95 works fine.
        
           | thaumasiotes wrote:
           | > running Windows 95 in an emulator on a modern machine is
           | getting harder and harder (emulators like vmware and
           | virtualbox don't emulate the CPU speed accurately, which
           | causes the system not to boot, and they also don't emulate
           | various paging behaviours of old Intel CPUs accurately, which
           | causes windows applications to crash within a few seconds of
           | starting)
           | 
           | I thought the normal way to run Windows 95 was in dosbox?
        
       | markus92 wrote:
       | As a testament to Microsoft's backwards compatibility: the file
       | opened mostly fine in the Windows version of Word (version 2401),
       | and the layout seems to be identical to the PDF of the article.
       | It did block the file format by default but that was easy enough
       | to allow.
       | 
       | The graphics did not open however, due to a missing graphics
       | filter for the Microsoft Word Picture format. Seems it's been
       | deprecated for a while now but Word 2003 should be able to open
       | it? Which is old, but not _that_ old not to run on modern
       | systems.
        
         | markus92 wrote:
         | Installed a copy of Word 2003, document opened flawlessly
         | immediately with default settings. Saving it from there
         | converted it to a modern .doc which I could open with Office
         | 365 and convert to PDF etc.
         | 
         | I think the moral of the story is that the Windows Office team
         | seems to spend a bit more time on backwards compatibility.
        
           | jgrahamc wrote:
           | I would be interested to see a PDF generated from Office 365
           | to understand how flawless it really is.
        
             | zokier wrote:
             | Here you go, exported from desktop Word to PDF.
             | 
             | https://drive.google.com/file/d/1lnaSr22l3kQbmFHnxg3Ggd3-46
             | v...
             | 
             | Full version string:
             | 
             | Microsoft(r) Word for Microsoft 365 MSO (Version 2311 Build
             | 16.0.17029.20140) 64-bit
        
               | jgrahamc wrote:
               | Right. So all the images are missing. LibreOffice still
               | gives the best conversion I think.
        
               | markus92 wrote:
               | Yeah, that's why you need Word 2003 for the images, it's
               | a deprecated format full of security holes I guess.
        
               | giancarlostoro wrote:
               | Ah... yeah I was wondering why they would deprecate an
               | image format at all. My understanding is that Word in the
               | old days serialized what was in memory, maybe that was a
               | little too exploitable with images?
               | 
               | Not sure, just curious; not even sure where to look that
               | one up, honestly.
        
               | zokier wrote:
               | Digging through the files a bit I think the images are in
               | PICT format which is very specific to Macs (the original
               | ones). It's not surprising that modern Word doesn't
               | support those that well, as PICT is actually a somewhat
               | complicated kinda-vector image format. I am surprised
               | that even Word 2003 implemented PICT on Windows.
        
               | zokier wrote:
               | Probably the best approach would be to use LO for the
               | images and Word otherwise... needs some manual twiddling
               | but I suspect that way you can get pretty much perfect
               | layout and images.
        
           | ogurechny wrote:
           | Office applications up to (and probably including) version
           | 2010 break and crash on latest Windows versions. That
           | behavior varies based on Office service packs and updates
           | installed. You were lucky to be able to just _save the
           | document_.
           | 
           | Unless, of course, you've found some _portable version_ on
           | the net that packs ThinApp and an assortment of old system
           | libraries under the hood.
        
       | crazygringo wrote:
       | I'm surprised he didn't try an intermediate version of Word --
       | not the original Word 4.0 for Mac, but not the current online
       | version of Word either.
       | 
       | I had a lot of old Word 4.0 for Mac files at one point, and
       | remember some point in the late 1990's or early 2000's opening
       | them all up in a version of Word for Windows, and then re-saving
       | them in a more up-to-date Word format. I believe there was an
       | official converter tool Microsoft provided as a free add-on or an
       | optional install component -- it wouldn't open the "ancient" Word
       | formats otherwise.
       | 
       | There's definitely going to be a chain here of 1 or 2
       | intermediate versions of Word that should be able to open the
       | document perfectly and get it into a modern Word format, I should
       | think -- and I'm curious what the exact versions _are_. (Although
       | as other people point out, if you don't need to edit it, then
       | exporting it as PostScript in Word 4.0 and converting it to PDF
       | works fine too.)
        
       | elzbardico wrote:
       | I am deeply disappointed that a company like Microsoft doesn't
       | make a point of Microsoft Word being able to open any document
       | created by any version of Word, no matter how ancient it is. I
       | think they have the social/historical/economical responsibility
       | of doing so.
       | 
       | If they are worried about vulnerabilities in the old parsing
       | code, move it to an external process, run it under isolation in a
       | sandbox to spit out a newer readable version on the fly, but
       | don't eliminate this capability from the software.
       | 
       | EDIT: zokier pointed out to me that the desktop version of Word
       | opens the file fine, it is only the web version that doesn't. So,
       | consider this post void.
       | 
       | EDIT 2: Well it opens the document, but is not able to display or
       | print the embedded graphics, it seems.
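
The "external process" idea above might look roughly like the following
sketch. It only shows process separation with a timeout; a real sandbox
would also need OS-level restrictions (no network, limited filesystem
access, dropped privileges), which are not shown. The converter command is
hypothetical: it stands for any program that reads a legacy document on
stdin and writes a converted one to stdout.

```python
import subprocess
import sys


def convert_out_of_process(
    converter_cmd: list[str], legacy_bytes: bytes, timeout_s: float = 10.0
) -> bytes:
    """Run a (hypothetical) legacy-format converter in a child process.

    A crash or hang in the old parsing code then takes down only the
    child, not the application that asked for the conversion.
    """
    try:
        result = subprocess.run(
            converter_cmd,
            input=legacy_bytes,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
            check=True,
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError("converter timed out") from exc
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(
            f"converter failed with exit code {exc.returncode}"
        ) from exc
    return result.stdout


if __name__ == "__main__":
    # Stand-in converter for demonstration: upper-cases whatever it reads.
    stand_in = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
    ]
    print(convert_out_of_process(stand_in, b"old document bytes"))
```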
        
         | OJFord wrote:
         | You don't have to go _anywhere near_ 1990 to find issues with
         | modern Microsoft (especially cloud) apps opening documents
         | created in older ones!
        
         | zokier wrote:
         | You missed the fact that the real Word does open this file just
         | fine, it's just the toy web version that has issues (and maybe
         | Mac too but eh)
        
           | elzbardico wrote:
           | Oh, really? I stand corrected. Thanks for pointing this out.
        
             | jgrahamc wrote:
             | No, you're not wrong, another commenter points out that
             | latest Word opens the document but doesn't display the
             | graphics.
        
           | ben7799 wrote:
           | The Office 365 Mac version refuses to open it.
           | 
           | You can recover text but the result is horrible. No graphics
           | and all formatting lost.
        
           | jgrahamc wrote:
           | Yes, it opens it and throws away the graphics, so not "just
           | fine".
        
             | zokier wrote:
             | If we go into splitting hairs, it doesn't really throw the
             | graphics away, it simply lacks the "filter" to display them
             | but they are there still, as in it recognizes the graphics
             | object correctly and lays it out on the page. Based on the
             | error message, hypothetically I suppose you could even make
             | a custom filter to handle the object.
             | 
             | But this really goes more into the facet of Office files
             | that allowed embedding pretty much anything into them, and
             | relying on this "filter" system (I guess OLE) to handle
             | embedded objects. So while the DOC file itself is getting
             | parsed and rendered pretty much perfectly, the embedded
             | objects are another story.
             | 
             | In the same sense I'd say browser might open some HTML page
             | "fine" even if it doesn't know how to handle some image
             | format that is used on the page; it'd still handle the
             | HTML correctly.
        
               | jdofaz wrote:
               | Makes me wonder if the graphics are in PICT format
        
               | zokier wrote:
               | I think they are. You can even find some PICT files
               | inside the ODT in the github from TFA
        
               | petersmagnusson wrote:
               | if you read the blog, the main point of OP's project was
               | to get at the diagrams, so hardly "splitting hairs".
        
           | nullindividual wrote:
           | This is expected with the web versions of Office. They can
           | read (certain) binary Office formats but not edit them. The
           | web version of Office is designed for OpenXml file formats.
        
         | nullindividual wrote:
         | Old file formats have security vulnerabilities. The online
         | version of Word is designed for docx only, although it can open
         | certain binary documents.
        
           | o11c wrote:
           | Fundamentally, a data file format can't have vulnerabilities.
           | At most it can be prone to vulnerabilities, but more often
           | it's just that popular implementations are bad.
        
             | nullindividual wrote:
             | Sorry, the Word parser does and Microsoft did not feel it
             | important enough to fix as their focus is on OpenXml
             | formats.
        
               | kelnos wrote:
               | Then that's on Microsoft. There's no fundamental reason
               | why a secure parser can't be written for old formats.
        
               | nullindividual wrote:
               | Why would Microsoft do that? It makes zero financial
               | sense to continue with a parser that may need to be
               | rewritten from scratch for a ~30 year old format.
        
               | genewitch wrote:
               | they can do what they want, and i'll continue on my 2
               | decade long decision to never give microsoft money, for
               | anything. Same way i'll never give propellerhead another
               | dime, or Plex[0], or any of these other consumer-hostile
               | companies.
               | 
               | I don't trust MS to maintain software, even though as far
               | as that goes, they're better than a lot of companies that
               | have been writing software for decades. "time marches on"
               | is silly when we have millions of times the compute,
               | storage, and transit speeds available to us. I also don't
               | see why people see the need to shill for multi-billion
               | dollar companies.
               | 
               | What microsoft should have done is trademark a new name
               | for their word processor the second they made the
               | decision to not open word .doc from older versions. That
               | way there's no confusion.
               | 
               | [0] having a hard time remembering the name/company of
               | the software i purchased for in-house streaming over a
               | decade ago. Plex is still a hassle to use for in-house
               | streaming compared to the "service" or whatever they're
               | selling. Unfortunately Synology seems to have grown weary
               | of releasing a version of their client for every
               | newfangled device that comes to market, so i'm stuck with
               | plex on my TV; that is, unless i want to use a stick/set-
               | top/computer attached to it.
        
               | nullindividual wrote:
               | > I don't trust MS to maintain software
               | 
               | Then you should champion removal of any "old" software
               | they have that is under maintenance-only status. You
               | wouldn't want security vulnerabilities to go unfixed,
               | would you?
               | 
               | > What microsoft should have done is trademark a new name
               | for their word processor the second they made the
               | decision to not open word .doc from older versions. That
               | way there's no confusion.
               | 
               | That makes zero sense. Word is still Word. It performs
               | the same tasks (and more) as Word 1.0 did.
               | 
               | And Word today still reads/writes .doc, just not versions
               | that are that old.
        
           | kelnos wrote:
           | No they don't. Parsers can have security vulnerabilities, but
           | you can fix those, and there's little reason why a parser for
           | an old format would have more vulnerabilities than for a new
           | format. Some formats can also have certain (intended)
           | features that have security implications, but parsers can
           | choose to disable them if they are concerned.
        
         | larsrc wrote:
         | Many old formats were essentially just binary dumps of memory,
         | or something not far removed. Documenting the formats was not a
         | standard. Yes, I agree that there is a social responsibility,
         | but having worked in digital archiving I can tell you that the
         | olden days were really, really messy. No, really.
        
           | resters wrote:
           | This is the point that many of the commenters who criticize
           | Microsoft are missing, and it's why the old formats are not
           | enabled by default (security vulnerabilities) and why it's
           | not as simple as creating a parser.
        
             | autoexec wrote:
             | Microsoft still deserves criticism for designing their old
             | word formats so badly. It was a design choice to turn
             | documents of mostly text into obscure binary formats that
             | were badly standardized and maintained.
        
               | resters wrote:
               | Not true at all. Some of Microsoft's best minds created
               | _extremely ingenious_ methods that allowed early word
               | processors to be usable on files that were dramatically
               | larger than what would fit in memory. OSes didn't
               | support suitable performance via VM infrastructure at the
               | time. It was clever, outside of the box thinking that got
               | MS to be able to beat WordPerfect (a worthy competitor)
               | and the many other also-rans.
               | 
               | There was (contrary to popular belief) not a deliberate
               | strategy to limit interoperability. It was simply the
               | reality of the approaches utilized that made them tightly
               | coupled to the MS Word codebase and less standardizable
               | than would have otherwise been ideal.
               | 
               | Source: one of the guys who worked on it at MS.
        
       | rietta wrote:
       | Extremely interesting and thank you for doing this. I feel
       | strongly that this goes to show just how important preserving
       | historical software and emulation is. I have dabbled myself with
       | old Windows 3.1 software for this very reason. We really, truly
       | are going to have a period where web application driven software
       | just disappears and we won't easily have this retro computing view
       | of these decades in a short time from now.
        
         | dfxm12 wrote:
         | I also think it is important to show the importance of open
         | formats or open source in general if we want future generations
         | to read our documents or run/compile/understand our software.
        
       | CharlesW wrote:
       | _[silly pre-coffee post deleted]_
        
         | jgrahamc wrote:
         | Word is already available on the Infinite Mac as it's under
         | Productivity inside the Infinite HD. No need to install it.
        
       | whoopdedo wrote:
       | > That way I can see actual fonts, font sizes and layout to
       | confirm how the document should have looked.
       | 
       | Or you would if you had the original fonts. Word 4.0 was released
       | for System 6 with support as far back as System 3.2. Fonts at
       | that time had separate screen and printer files for the different
       | output resolutions. If you're missing the printer font it'll
       | print a scaled (using nearest-neighbor) rendering of the screen
       | font. If you're missing the screen font it'll substitute the
       | system font. (Geneva by default, as seen in the screenshot.)
       | 
       | In this case, only the well-known Palatino and Courier typefaces
       | are needed. But LibreOffice substituted Times New Roman even
       | though I have Palatino Linotype installed.
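       | 
       | A rough sketch of the nearest-neighbor scaling mentioned above,
       | in Python. It assumes an integer scale factor and a made-up 3x3
       | glyph; real screen-to-printer scaling used non-integer ratios,
       | so this only illustrates why the result looks blocky.

```python
def scale_nearest(bitmap: list[str], factor: int) -> list[str]:
    """Enlarge a 1-bit glyph by an integer factor using nearest-neighbor.

    Each source pixel is simply repeated `factor` times horizontally and
    vertically; no smoothing or outline information is involved.
    """
    scaled = []
    for row in bitmap:
        wide = "".join(pixel * factor for pixel in row)
        scaled.extend([wide] * factor)
    return scaled


# Hypothetical 3x3 glyph; '#' is ink, '.' is background.
glyph = ["#.#",
         ".#.",
         "#.#"]

for line in scale_nearest(glyph, 2):
    print(line)
```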
        
         | jgrahamc wrote:
         | That may go some way to explaining some of the differences I
         | see, but the main thing I was looking for in the emulation was
         | the font sizes.
        
           | aidenn0 wrote:
           | Doesn't the font matter almost as much as the font-size
           | setting for font sizes, given that different font families
           | can have wildly different metrics at the same font size?
        
             | jgrahamc wrote:
             | I bet it does. I should redo the final part after
             | installing the required fonts.
        
       | stuaxo wrote:
       | This is good.
       | 
       | It would be good to get some feature requests into libreoffice to
       | fix the remaining mis-matches in the formatting.
        
       | scaglio wrote:
       | This raises a potential problem, often underrated by companies:
       | some have backups with _infinite_ retention.
       | 
       | It is common to have backups with retention of 10 years, some may
       | have 20 years for legal reasons... but the majority of people
       | don't understand the difference between "readable" and "usable".
       | 
       | Of course, it depends on the data... And there are companies
       | backing up whole _virtual machines_ with infinite retention,
       | believing they can run them: it is hard enough to restore a
       | vSphere 5.x machine on a brand new vSphere 8, I really don't
       | understand this waste of space.
        
         | actionfromafar wrote:
         | Often an old file or disk image is tiny compared to modern file
         | sizes.
         | 
         | So the waste of space is more of an administrative character
         | than a waste of _disk_ space.
        
         | rvnx wrote:
         | If you back up everything, you can sort it later, or possibly
         | never. It costs 1 USD per month at Google Cloud to store 1TB of
         | data.
         | 
         | At this price it's not worth sorting, when one single devops
         | costs 100 USD+ per hour, not including the opportunity cost of
         | not working on something more productive (and less boring for
         | the developer).
         | 
         | Then X years after the company is acquired, or sufficient time
         | has lapsed, you can delete / drop the data without sorting.
         | 
         | Regarding virtual machines, if it's VMDK for example, you can
         | read the raw disks without booting it, and again, it's not
         | worth taking a risk to lose data to potentially save 10 USD per
         | month, which is similar to one developer taking one beer extra
         | at a team event.
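         | 
         | The arithmetic can be made explicit. A small sketch in Python,
         | taking the 1 USD per TB-month and 100 USD per hour figures from
         | this comment as assumptions rather than current prices:

```python
# Figures quoted in the comment above; treat them as assumptions,
# not current list prices.
STORAGE_USD_PER_TB_MONTH = 1.0
ENGINEER_USD_PER_HOUR = 100.0


def months_of_storage_per_hour(tb: float) -> float:
    """How many months of storing `tb` terabytes one engineer-hour pays for."""
    return ENGINEER_USD_PER_HOUR / (STORAGE_USD_PER_TB_MONTH * tb)


# With 10 TB of unsorted backups, one hour of sorting costs as much as
# ten months of simply keeping everything.
print(months_of_storage_per_hour(10))
```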
        
       | anonymouskimmer wrote:
       | WordPerfect claims the ability to open MS Word 4.0 files. The
       | standard edition is currently $175. I'm not buying it, but if
       | you're willing to spend $175 it might be something to try.
        
       | caboteria wrote:
       | Yet another example of why Apache needs to take OpenOffice behind
       | the barn.
        
         | EasyMark wrote:
         | You mean retire it to a nice farm upstate, little Jimmy might
         | hear the shotgun blast!
        
       | acheron wrote:
       | "Here's a 4000 year old letter from a merchant to his partners
       | describing how to avoid taxes by smuggling goods in their
       | underwear." ( https://www.britishmuseum.org/blog/trade-and-
       | contraband-anci... )
       | 
       | vs
       | 
       | "Not sure if it's possible to read this 30 year old file!"
        
         | kelnos wrote:
         | I get the point you're trying to make, but your former example
         | is rare. While there are more exceedingly-old paper records
         | that are still around and have been preserved than we might
         | expect, we've lost so, so much. Paper and ink (and variations
         | on that) are both fragile.
         | 
         | Digital documents are otherwise easy to preserve indefinitely,
         | if care is taken up-front to choose a simple document format
         | that is likely to remain parseable (or at least documented) for
         | a long time. And even when you don't do that, there's always
         | the possibility of writing a parser later (assuming
         | documentation is around) or reverse-engineering the format.
         | 
         | And in this case, the 30-year-old file did end up getting
         | opened, albeit not as trivially easily as one might hope.
        
           | thaumasiotes wrote:
           | > but your former example is rare. While there are more
           | exceedingly-old paper records that are still around and have
           | been preserved than we might expect, we've lost so, so much.
           | Paper and ink (and variations on that) are both fragile.
           | 
           | Depends what you mean by "rare". Ancient Near Eastern
           | correspondence isn't rare at all, precisely because they
           | didn't use paper. (And they went to war a lot.) You seem to
           | be writing as if that letter was a paper document, but it
           | isn't. Paper records that old only exist in Egypt.
           | 
           | > Digital documents are otherwise easy to preserve
           | indefinitely, if care is taken up-front to choose a simple
           | document format that is likely to remain parseable (or at
           | least documented) for a long time.
           | 
           | This isn't a good match to the example either; Ancient Near
           | Eastern records had to be deciphered. (The Semitic ones had
           | to be deciphered. The Sumerian ones benefited from surviving
           | documentation, but we had to find that and learn how to read
           | it.)
           | 
           | The original example isn't particularly apt; reading this
           | 30-year-old file, or a similar one, is a task that one guy
           | can do in less than a week using existing tools and know that
           | he's done it correctly. Reading a 4000-year-old cuneiform
           | letter was a much larger project than that.
        
       | melomac wrote:
       | I was able to download and transfer the proposal document to a
       | Mini vMac emulator, set the Finder's type and creator to those of
       | a Microsoft Word 5 document i.e. respectively WDBN and MSWD, and
       | finally open the document with Microsoft Word 5 for Mac to export
       | it as a RTF document.
       | 
       | Here you have it: https://neko.melomac.net/tmp/proposal.rtf
       | 
       | I certainly agree opening a document from this Macintosh era
       | should be, by far, easier than the process I detailed above, but
       | this is how it is ¯\_(ツ)_/¯
        
         | jgrahamc wrote:
         | Thanks. Unfortunately, the images are all missing.
        
           | melomac wrote:
           | It is even more frustrating that the images are in the
           | document, and Microsoft Word for Mac would still display them
           | accurately.
           | 
           | And LibreOffice would display the images in the RTF document
           | in a different size (a tiny block).
           | 
           | If my old Mac display would work, I could have been able to
           | send the document over to CUPS via Netatalk, and make a PDF
           | out of it. Unfortunately Mini vMac can't connect to that VM
           | on the LAN...
           | 
           | Anyhow, it is scandalous that opening legacy documents became
           | such a PITA.
        
       | bluedino wrote:
       | That Mac Word screenshot gives me claustrophobic flashbacks to
       | trying to work on those tiny screens in middle school computer
       | lab, writing science fair papers.
        
         | cynicalsecurity wrote:
         | It wasn't so bad. It's better now, but it was fine back then.
        
           | whoopdedo wrote:
           | I consider it more of not knowing how much better we could
           | have had it. Small monitors were "normal." But I imagine
           | people who got to work with the Portrait Display[1] (an
           | impressive 640x870 resolution!) felt then as we do now when
           | they had to switch back to the internal screen.
           | 
           | [1] https://wiki.preterhuman.net/Apple_Macintosh_Portrait_Dis
           | pla...
        
         | retrac wrote:
         | Heh, that screenshot is relatively high-resolution for the time
         | in question, too. 800x600 maybe? The compact Macs were 512x342:
         | https://www.betalogue.com/images/uploads/microsoft/pce-mac-w...
         | (The toolbars, rulers, etc., could be hidden in the settings.)
        
       | cranberryturkey wrote:
       | libreoffice opened it.
        
         | kelnos wrote:
         | Sure, but the layout was screwed up and the fonts and sizes
         | were wrong.
         | 
         | Certainly this is helpful: it's better to be able to open a
         | document and then have to manually fix those issues than to be
         | unable to open it at all. But it was far from perfect.
        
           | EasyMark wrote:
           | It's orders of magnitude better than "I can't open this file
           | at all, -1"?
        
       | Sembiance wrote:
       | This does an "okay" job at converting the document:
       | https://archive.org/details/KeyViewPro
       | 
       | Here is the converted PDF:
       | https://smallpdf.com/result#r=091f20f23de353fac21376a3a49a60...
        
         | jgrahamc wrote:
         | Not sure that's really true. It did something but the images
         | are a mess and a lot of formatting is gone. I think LibreOffice
         | is still the winner here.
        
       | bilsbie wrote:
       | I wonder if it would be a viable business to keep running
       | versions of computers going back say 40 years and offering to
       | recover and convert files for people. (Just getting stuff off
       | floppy disks and Zip drives might be useful)
        
       | traceroute66 wrote:
       | Interestingly, the latest and greatest version (desktop app via
       | Office365) of Microsoft Word on Mac appears to know what it is
       | _but_ refuses to open it.
       | 
       | If you drag the file onto Word, it launches a dialogue box
       | telling you "proposal uses a file type that is blocked from
       | opening in this version" along with a link to the supporting page
       | on the Microsoft website[1].
       | 
       | [1] https://support.microsoft.com/en-us/office/error-filename-
       | us...
        
         | worik wrote:
         | > telling you "proposal uses a file type that is blocked from
         | opening in this version"
         | 
         | "blocked"?
         | 
         | That sounds like Microsoft has some IP problems with their old
         | software.
        
       | aidenn0 wrote:
       | Normally I have good success with abiword, but it completely
       | barfs on this file; it seems to be falling back on its RTF
       | support.
        
       | noufalibrahim wrote:
       | One underappreciated (though mentioned) hero in this little saga
       | is the venerable file(1) command.
       | 
       |     proposal: Microsoft Word for Macintosh 4.0
       | 
       | It's so incredibly useful and so easily overlooked. I almost
       | reflexively reach out to it when I'm curious about a file and the
       | information it returns is just sufficient to satiate my curiosity
       | and be useful.
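       | 
       | For anyone curious how this kind of detection works, here is a
       | minimal Python sketch of signature ("magic number") matching.
       | The table holds a handful of well-known signatures chosen for
       | illustration; it is not file(1)'s database, which is far larger
       | and also tests bytes at other offsets.

```python
# Signatures of a few common formats (leading bytes of the file).
# This small table is only illustrative; file(1) ships a far larger
# database with offsets, masks and nested tests.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP archive (also .docx, .odt)"),
    (b"{\\rtf", "Rich Text Format"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",
     "OLE2 compound file (e.g. Word 97-2003 .doc)"),
]


def sniff(data: bytes) -> str:
    """Return a description for the first matching signature."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def sniff_path(path: str) -> str:
    """Read only the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(16))


print(sniff(b"%PDF-1.7\n"))
```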
        
         | cpach wrote:
         | I agree, _file_ is such a great tool.
         | 
         | I have cursed so many times in the past when I sat in front of
         | a work computer that ran Windows and didn't have this tool
         | easily available. (Later on, WSL made life easier, but now I'm
         | luckily nearly Windows-free.)
        
           | AdamJacobMuller wrote:
           | One might even say that file has a lot of magic in it.
        
             | pdmccormick wrote:
             | file has a lot of magic, but a file typically has only one
             | magic.
        
       | dorfsmay wrote:
       | LibreOffice is amazing: besides being able to open many document
       | formats, it can run headless and has command line options which
       | allow automating some tasks, such as converting formats, that
       | would not be possible otherwise.
       | 
       | https://help.libreoffice.org/latest/en-US/text/shared/guide/...
       | 
       | https://opensource.com/article/21/3/libreoffice-command-line
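       | 
       | A hedged sketch of driving that headless conversion from
       | Python. It assumes the LibreOffice binary is called "soffice"
       | and is on PATH (this varies by platform); the helper names are
       | made up for this example, and the returned path is only where
       | the output is expected to land.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: str, target_format: str = "pdf",
                          outdir: str = ".") -> list[str]:
    """Build a headless LibreOffice conversion command line."""
    # "soffice" is assumed to be on PATH; the binary name and location
    # vary by platform and installation.
    return ["soffice", "--headless", "--convert-to", target_format,
            "--outdir", outdir, source]


def convert(source: str, target_format: str = "pdf",
            outdir: str = ".") -> Path:
    """Run the conversion and return the expected output path."""
    if shutil.which("soffice") is None:
        raise FileNotFoundError("soffice not found on PATH")
    subprocess.run(build_convert_command(source, target_format, outdir),
                   check=True)
    return Path(outdir) / (Path(source).stem + "." + target_format)
```

       | Calling convert("proposal") would then attempt to write
       | proposal.pdf into the current directory.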
        
       | j45 wrote:
       | https://www.ebay.com/itm/235033043066
       | 
       | The original Word for Macintosh software seems more than available.
        
       | Dwedit wrote:
       | Is there a way to make a PS or PDF file using the actual Word for
       | Macintosh 4? I'd think that would be the definitive render.
        
         | wrs wrote:
         | Keep reading...he did that. But it's not clear he had the right
         | PS fonts installed.
        
           | jgrahamc wrote:
           | I probably did not as I did it really fast after someone
           | suggested it.
        
       | aidenn0 wrote:
       | Somewhat off-topic, but I remember Word for Windows 6.0 would
       | take considerable time (like a minute for a 10 page document on
       | my AM386DX/40) to reflow paragraphs across page-breaks (trying to
       | handle widows, orphans &c). If I made an edit to the first page
       | and hit print before it was done, I would end up with a printed
       | document that contained either duplicated or dropped lines at
       | page boundaries.
        
       | jmclnx wrote:
       | I have a few Wang WP Documents from decades ago. I could not open
       | them at all. Libreoffice thought they were corrupted Word Docs.
       | 
       | So the concern about some document formats being unreadable is
       | still valid. Who knows what obscure proprietary formats exist out
       | there.
        
       | jtotheh wrote:
       | Tragically, Postscript support has been largely removed from
       | MacOS now. Apparently the language was weird enough that
       | supporting it made some (in)security hacks possible. I guess I'm
       | old! I remember first finding out about it in 1986, when it was
       | very "leet". Postscript printers were big $.
       | 
       | I say tragically because Postscript was pretty key in making DTP
       | as compelling as it used to be, which kind of saved the Mac in
       | terms of being the "killer app" for it.
       | 
       | I think you may be able to run some kind of postscript support in
       | some tool from Adobe, or even Ghostscript. And probably, the
       | newer software is better, but it's sad that you can't view a
       | postscript file on macOS out of the box now.
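       | 
       | For the Ghostscript route, a minimal Python sketch that renders
       | a PostScript file to PDF. It assumes the Ghostscript executable
       | is installed as "gs" on PATH; the function names are invented
       | for this example.

```python
import shutil
import subprocess


def build_ps2pdf_command(ps_path: str, pdf_path: str) -> list[str]:
    """Build a Ghostscript command that renders PostScript to PDF."""
    # -dSAFER restricts file access from within the PostScript program;
    # -o sets the output file and implies batch, no-pause operation.
    return ["gs", "-dSAFER", "-sDEVICE=pdfwrite", "-o", pdf_path, ps_path]


def ps_to_pdf(ps_path: str, pdf_path: str) -> None:
    """Run Ghostscript; raises if gs is missing or the conversion fails."""
    if shutil.which("gs") is None:
        raise FileNotFoundError("Ghostscript (gs) not found on PATH")
    subprocess.run(build_ps2pdf_command(ps_path, pdf_path), check=True)
```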
        
       | jxdxbx wrote:
       | Amazing that you can just pop up an emulator in a browser window.
       | Retro Mac emulation used to be such a pain in the ass.
        
       | jasomill wrote:
       | For anyone interested, here's the document in modern Word format,
       | with all vector artwork and fonts intact:
       | 
       | https://jasomill.at/proposal.docx
       | 
       | To convert it, I first opened and re-saved using Word 98[1]
       | running on a QEMU-emulated Power Mac, at which point it opened in
       | modern Word for Mac ( _viz.,_ version 16.82).
       | 
       | The pictures were missing, however, with Word claiming "There is
       | not enough memory or disk space to display or print the picture."
       | (given 64 GB RAM with 30+ GB free at the time, I assume the
       | actual problem is that Word no longer supports the PICT image
       | format).
       | 
       | To restore the images, I used Acrobat (5.0.10) print-to-PDF in
       | Word 98 to create a PDF, then extracted the three images to
       | separate PDFs using (modern) Adobe Illustrator, preserving the
       | original fonts, vector artwork, size, and exact bounding box of
       | each image.
       | 
       | At this point, restoring the images was a simple matter of
       | deleting the original images and dragging and dropping the PDF
       | replacements from the Finder.
       | 
       | For comparison, here's the PDF created by Acrobat from Word 98 on
       | the Power Mac
       | 
       | https://jasomill.at/proposal-Word98.pdf
       | 
       | and here's a PDF created by modern Word running on macOS Sonoma
       | 
       | https://jasomill.at/proposal-Word16.82.pdf
       | 
       | [1] https://archive.org/details/ms-word98-special-edition
        
         | jasomill wrote:
         | As an aside, MacClippy 98 knew the score:
         | 
         | https://jasomill.at/Clippy.png
        
       | api wrote:
       | Today's historic working documents will mostly be SaaS hosted
       | documents in systems like Google Docs, Notion, etc. In the future
       | nobody will be able to open them. They won't exist, and the
       | software won't exist, and there will be no way to restore it
       | since the software is SaaS that can't be emulated or even
       | installed anywhere.
        
       | willmadden wrote:
       | MS word for mac 16.16 opens it with the diagrams intact in
       | "compatibility mode". The only issue is the text is indented
       | slightly too far on the left.
       | 
       | LibreOffice opens it with the same quality, but has some weird
       | gray ghost lines around tables.
        
       ___________________________________________________________________
       (page generated 2024-02-13 23:00 UTC)