[HN Gopher] Researchers are pulling movements out of microfilm w...
___________________________________________________________________
Researchers are pulling movements out of microfilm with digital
history
Author : rbanffy
Score : 24 points
Date : 2021-07-23 18:31 UTC (4 hours ago)
(HTM) web link (vtx.vt.edu)
(TXT) w3m dump (vtx.vt.edu)
| marcodiego wrote:
| Hard to evaluate if that is wise. Microfilm is a much easier
| medium to read than any digital form that achieves the same density.
| kragen wrote:
| It seems like a pure improvement if they're still preserving
| the microfilms. Any digital form can be copied indefinitely
| without degradation; microfilm can't. And a library of digital
| copies is enormously cheaper than a library of microfilm.
|
| A reasonable source for film costs might be
| http://www.matthewwagenknecht.com/the-actual-costs-of-film/
| which says 16mm (color) movie film is US$197 for a 400-foot
| roll. That would be enough for 12000-20000 A4-sized pages at
| typical microfilm linear reduction ratios of 20-25, so about
| 1¢-1.5¢ per page. Black-and-white or diazo film might be a
| little cheaper, but probably not more than a factor of 2.
| Microfilm stock might actually be more expensive (finer grain,
| lower-volume product?)--does anybody know?
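|
| A minimal sketch of the film arithmetic above, assuming one A4
| page per frame with its 210 mm short side running along the
| film and no gap between frames. The function names and the
| layout assumption are illustrative, not taken from the linked
| article; only the price, roll length, and reduction ratios come
| from the comment.

```python
# Rough check of the microfilm cost-per-page estimate.
ROLL_PRICE_USD = 197.0      # price quoted for a 400-foot roll of 16mm film
ROLL_LENGTH_FT = 400
MM_PER_FOOT = 304.8
A4_SHORT_SIDE_MM = 210.0    # assumption: short side lies along the film


def pages_per_roll(reduction_ratio, frame_gap_mm=0.0):
    """Pages that fit on one roll at a given linear reduction ratio."""
    pitch_mm = A4_SHORT_SIDE_MM / reduction_ratio + frame_gap_mm
    return int(ROLL_LENGTH_FT * MM_PER_FOOT / pitch_mm)


def film_cents_per_page(pages):
    """Film cost per page, in US cents, for a roll holding `pages` pages."""
    return 100.0 * ROLL_PRICE_USD / pages


if __name__ == "__main__":
    for ratio in (20, 25):
        n = pages_per_roll(ratio)
        print(f"{ratio}x: {n} pages, {film_cents_per_page(n):.2f} cents/page")
    # The page counts quoted in the comment:
    for n in (12000, 20000):
        print(f"{n} pages: {film_cents_per_page(n):.2f} cents/page")
```

| Under these assumptions the geometric estimate should land in
| the same ballpark as the quoted page range; a denser frame
| layout would push it higher.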
|
| By contrast, a page of ASCII text is about 4KB, or 1.2KB if
| gzipped (which, yes, could create risks to accessibility). A
| new 2TB disk costs US$45. So making a copy of the
| transcriptions these volunteers created costs 0.000003¢ per
| page. For the price of a single 400-foot roll of 16mm film,
| capable of holding tens of thousands of pages, you can buy
| _four_ 2TB disks, which can hold 1.7 _billion_ pages each, and
| put a copy of your corpus to be preserved on each one.
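|
| The disk figures can be checked the same way. This is only a
| sketch, assuming decimal units (2TB = 2e12 bytes, 1.2KB = 1200
| bytes) and the prices quoted above; the constant and function
| names are made up for illustration.

```python
# Rough check of the cost of storing gzipped text pages on disk.
DISK_PRICE_USD = 45.0
DISK_BYTES = 2e12            # assumption: decimal terabytes
PAGE_BYTES_GZIPPED = 1.2e3   # assumption: 1.2KB means 1200 bytes
FILM_ROLL_PRICE_USD = 197.0


def text_pages_per_disk():
    """Gzipped text pages that fit on one disk."""
    return DISK_BYTES / PAGE_BYTES_GZIPPED


def text_cents_per_page():
    """Disk cost per gzipped text page, in US cents."""
    return 100.0 * DISK_PRICE_USD / text_pages_per_disk()


def disks_per_film_roll():
    """Whole disks affordable for the price of one roll of film."""
    return int(FILM_ROLL_PRICE_USD // DISK_PRICE_USD)


if __name__ == "__main__":
    print(f"{text_pages_per_disk():.3g} pages per disk")
    print(f"{text_cents_per_page():.2g} cents per page")
    print(f"{disks_per_film_roll()} disks per roll of film")
```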
|
| If we're looking at preserving scans of the pages, well, I have
| a PDF scan of Volume 3 of Dr. Dobb's Journal here from the
| Internet Archive. The pages are scanned at 2550x3300 (300 dpi)
| in grayscale, which is considerably better than any quality
| I've ever seen on library microfilm, and the 484 pages weigh in
| at 232MB, 500 kB per page. So a US$45 2TB disk can only hold 4
| million pages scanned at this sort of archival quality, a cost
| of 0.001¢ per page. That's still _literally a thousand times
| cheaper_ than microfilm.
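|
| A sketch of the scanned-page comparison, again assuming decimal
| units and using only the numbers given in the comment (232MB
| for 484 pages, US$45 for 2TB, 1¢-1.5¢ per page of film). The
| names are illustrative.

```python
# Rough check of the cost of storing 300 dpi grayscale scans on disk.
DISK_PRICE_USD = 45.0
DISK_BYTES = 2e12          # assumption: decimal terabytes
SCAN_BYTES = 232e6         # size of the 484-page PDF, decimal megabytes
SCAN_PAGES = 484


def bytes_per_scanned_page():
    return SCAN_BYTES / SCAN_PAGES


def scanned_pages_per_disk():
    return DISK_BYTES / bytes_per_scanned_page()


def scan_cents_per_page():
    """Disk cost per scanned page, in US cents."""
    return 100.0 * DISK_PRICE_USD / scanned_pages_per_disk()


def film_to_disk_ratio(film_cents_per_page):
    """How many times more one page costs on film than on disk."""
    return film_cents_per_page / scan_cents_per_page()


if __name__ == "__main__":
    print(f"{bytes_per_scanned_page() / 1e3:.0f} kB per page")
    print(f"{scanned_pages_per_disk() / 1e6:.1f} million pages per disk")
    print(f"{scan_cents_per_page():.4f} cents per page")
    for film in (1.0, 1.5):
        print(f"film at {film} cents: {film_to_disk_ratio(film):.0f}x")
```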
|
| Hard disks have some real problems with longevity: paper will
| last 1000 years if well-treated, microfilm will typically last
| a century (though it tends to get scratched if used, and nearly
| all microfilm is acetate rather than PET, so many microfilms
| succumb to vinegar syndrome in only a couple of decades), but
| disks tend to develop problems with stiction in only 10-20
| years, and they wear out and fail catastrophically sooner than
| that if they're in use. Right now the strategy is to copy the
| digital data to new media every few years, and try to use
| archival file formats.
|
| So we need to do much, much better. But the costs are now so
| low that a hobbyist can put ten thousand scanned books in their
| pocket; in an afternoon they can make a digitally-perfect,
| checksummed archival copy of a thousand books, using a couple
| of dollars of disk space and BitTorrent. In the microfilm age
| that would have been impossible for anyone but a librarian, and
| a nontrivial cost for the library.
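|
| One way to make a copy "checksummed" in the sense above is to
| record a hash of every file and re-check the hashes on the copy.
| The sketch below uses SHA-256 from Python's standard library;
| the function names and the manifest layout are assumptions for
| illustration, not a tool the comment refers to.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(root, manifest):
    """Return the relative paths that are missing or differ in the copy."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of_file(target) != expected:
            bad.append(rel)
    return bad
```

| Building the manifest on the original and running verify_copy
| on each new disk gives an empty list when the copy is
| bit-for-bit identical.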
|
| You sound like you might be interested in
| https://dercuano.github.io/topics/archival.html, which
| discusses ways to improve archival, especially digital
| archival. I think there are some straightforward approaches to
| swiley wrote:
| Man, using the output of some ML as "history" sounds like a
| really, really bad idea.
| huachimingo wrote:
| GitHub's Arctic Vault comes to mind:
| https://archiveprogram.github.com/arctic-vault/
|
| See also Fallout's "Brotherhood of Steel" ;)
___________________________________________________________________
(page generated 2021-07-23 23:00 UTC)