Subj : Re: King/Gassner/Shanahan
To : comp.programming,comp.lang.java.programmer,comp.lang.lisp
From : Ulrich Hobelmann
Date : Mon Aug 29 2005 11:18 am

Robert Maas, see http://tinyurl.com/uh3t wrote:
> Does anybody know of a free program that does OCR on PDF files and
> keeps track of layout so as to convert to reasonable HTML or plain
> ASCII? Or if I wrote such a program myself would anyone think it was a
> good thing and pay me money for all that effort? Or am I the only
> person who thinks that converting a megabyte PDF file to a 30K text
> file would be a useful utility?

Google does give you a link for each PDF to view that PDF as HTML. I don't know how those pages look in lynx, though.

I'm not sure how to get it to convert your own PDF. Maybe you need to put it online and wait a few weeks for Google to index it, or maybe you could look at what Google does with the URL and point it at your PDF file. But then I've also seen PDFs for which Google didn't show the "show as HTML" link.

-- 
My ideal for the future is to develop a filesystem remote interface (a la Plan 9) and then have it implemented across the Internet as the standard rather than HTML. That would be ultimate cool.
Ken Thompson