[HN Gopher] Researchers are pulling movements out of microfilm w...
       ___________________________________________________________________
        
       Researchers are pulling movements out of microfilm with digital
       history
        
       Author : rbanffy
       Score  : 24 points
       Date   : 2021-07-23 18:31 UTC (4 hours ago)
        
 (HTM) web link (vtx.vt.edu)
 (TXT) w3m dump (vtx.vt.edu)
        
       | marcodiego wrote:
       | Hard to evaluate if that is wise. Microfilm is a much easier
       | medium to read than any digital form that achieves the same
       | density.
        
         | kragen wrote:
         | It seems like a pure improvement if they're still preserving
         | the microfilms. Any digital form can be copied indefinitely
         | without degradation; microfilm can't. And a library of digital
         | copies is enormously cheaper than a library of microfilm.
         | 
         | A reasonable source for film costs might be
         | http://www.matthewwagenknecht.com/the-actual-costs-of-film/
         | which says 16mm (color) movie film is US$197 for a 400-foot
         | roll. That would be enough for 12000-20000 A4-sized pages at
         | typical microfilm linear reduction ratios of 20-25, so about
         | 1¢-1.5¢ per page. Black-and-white or diazo film might be a
         | little cheaper, but probably not more than a factor of 2.
         | Microfilm stock might actually be more expensive (finer grain,
         | lower-volume product?)--does anybody know?
         | 
         | By contrast, a page of ASCII text is about 4KB, or 1.2KB if
         | gzipped (which, yes, could create risks to accessibility). A
         | new 2TB disk costs US$45. So making a copy of the
         | transcriptions these volunteers created costs 0.000003¢ per
         | page. For the price of a single 400-foot roll of 16mm film,
         | capable of holding tens of thousands of pages, you can buy
         | _four_ 2TB disks, which can hold 1.7 _billion_ pages each, and
         | put a copy of your corpus to be preserved on each one.
         | 
         | If we're looking at preserving scans of the pages, well, I have
         | a PDF scan of Volume 3 of Dr. Dobb's Journal here from the
         | Internet Archive. The pages are scanned at 2550x3300 (300 dpi)
         | in grayscale, which is considerably better than any quality
         | I've ever seen on library microfilm, and the 484 pages weigh in
         | at 232MB, 500 kB per page. So a US$45 2TB disk can only hold 4
         | million pages scanned at this sort of archival quality, a cost
         | of 0.001¢ per page. That's still _literally a thousand times
         | cheaper_ than microfilm.
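         | 
         | The per-page figures above can be reproduced with a short
         | calculation. This is only a sketch in Python: it assumes
         | decimal units (1 TB = 10^12 bytes) and takes the prices and
         | page counts quoted above at face value; the names are
         | illustrative.
         | 
```python
# Per-page storage cost, using the figures quoted in the comment above.
# Assumption: decimal units (1 KB = 1e3 bytes, 1 TB = 1e12 bytes).

FILM_ROLL_USD = 197.0          # 400-foot roll of 16mm film
FILM_PAGES = (12_000, 20_000)  # A4 pages per roll at 20-25x reduction
DISK_USD = 45.0                # one 2 TB disk
DISK_BYTES = 2e12


def cents_per_page(price_usd, pages):
    """Cost in US cents of storing one page on a medium holding `pages`."""
    return 100.0 * price_usd / pages


def pages_per_disk(bytes_per_page):
    """How many pages of the given size fit on one 2 TB disk."""
    return DISK_BYTES / bytes_per_page


# Microfilm: cost range across the two page-count estimates.
film_cents = [cents_per_page(FILM_ROLL_USD, n) for n in FILM_PAGES]

# Transcribed text: about 1.2 KB per gzipped ASCII page.
text_pages = pages_per_disk(1.2e3)
text_cents = cents_per_page(DISK_USD, text_pages)

# Page scans: a 232 MB PDF holding 484 pages.
scan_bytes = 232e6 / 484
scan_pages = pages_per_disk(scan_bytes)
scan_cents = cents_per_page(DISK_USD, scan_pages)

print(f"film: {min(film_cents):.2f} to {max(film_cents):.2f} cents/page")
print(f"text: {text_pages:.3g} pages/disk, {text_cents:.2g} cents/page")
print(f"scan: {scan_pages:.3g} pages/disk, {scan_cents:.2g} cents/page")
print(f"film/scan ratio: {min(film_cents) / scan_cents:.0f}"
      f" to {max(film_cents) / scan_cents:.0f}")
```
         | 
         | Under these assumptions the film-to-scan cost ratio comes out
         | between roughly 900 and 1500, consistent with the "thousand
         | times cheaper" estimate.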
         | 
         | Hard disks have some real problems with longevity: paper will
         | last 1000 years if well-treated, microfilm will typically last
         | a century (though it tends to get scratched if used, and nearly
         | all microfilm is acetate rather than PET, so many microfilms
         | succumb to vinegar syndrome in only a couple of decades), but
         | disks tend to develop problems with stiction in only 10-20
         | years, and they wear out and fail catastrophically sooner than
         | that if they're in use. Right now the strategy is to copy the
         | digital data to new media every few years, and try to use
         | archival file formats.
         | 
         | So we need to do much, much better. But the costs are now so
         | low that a hobbyist can put ten thousand scanned books in their
         | pocket; in an afternoon they can make a digitally-perfect,
         | checksummed archival copy of a thousand books, using a couple
         | of dollars of disk space and BitTorrent. In the microfilm age
         | that would have been impossible for anyone but a librarian, and
         | a nontrivial cost for the library.
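         | 
         | BitTorrent already verifies downloaded pieces by hash. For a
         | plain file copy, "checksummed" could look something like the
         | sketch below, which records a SHA-256 digest per file and
         | re-checks a copy against it. The function names and the
         | choice of SHA-256 are assumptions made for illustration, not
         | a description of any particular archive's tooling.
         | 
```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest, copy_root):
    """Return the relative paths that are missing or differ in the copy."""
    copy_root = Path(copy_root)
    bad = []
    for rel, digest in manifest.items():
        target = copy_root / rel
        if not target.is_file() or sha256_of(target) != digest:
            bad.append(rel)
    return bad
```
         | 
         | An empty list from verify_copy(build_manifest(src), dst) means
         | every file in the source has a byte-identical counterpart in
         | the copy; it says nothing about extra files in the copy.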
         | 
         | You sound like you might be interested in
         | https://dercuano.github.io/topics/archival.html, which
         | discusses ways to improve archival, especially digital
         | archival. I think there are some straightforward approaches to
        
       | swiley wrote:
       | Man, using the output of some ML as "history" sounds like a
       | really, really bad idea.
        
       | huachimingo wrote:
       | GitHub's Arctic Vault comes to mind:
       | https://archiveprogram.github.com/arctic-vault/
       | 
       | See also Fallout's "Brotherhood of Steel" ;)
        
       ___________________________________________________________________
       (page generated 2021-07-23 23:00 UTC)