Newsgroups: comp.text
Path: utzoo!sq!lee
From: lee@sq.sq.com (Liam R. E. Quin)
Subject: Re: OCR as a way to measure printer quality
Message-ID: <1991Mar27.030244.3074@sq.sq.com>
Summary: This is measuring OCR quality, not printer quality!
Keywords: PostScript, OCR, HP
Organization: SoftQuad Inc., Toronto, Canada
References: <1991Mar22.224813.7086@cbnewsl.att.com>
Date: Wed, 27 Mar 91 03:02:44 GMT
Lines: 38

npn@cbnewsl.att.com (nils-peter.nelson) wrote:
> I went through by hand and found:
>	6 errors in 1728 characters on HP
>	10 errors in 1716 characters on Kodak
> (the errors are all in the OCR, *not* in the printer).
>
>As a free plug to HP, [...] the IIISi, at 15ppm for
>under $6,000 and outstanding quality, is absolutely
>worth considering.


Although I don't (yet) have an opinion on the HP IIISi (apart from disliking
the name!), I do wonder whether it is entirely reasonable to equate print
quality with OCR recognition accuracy.

To give a simple example, typographical features such as ligatures and
kerning often make life harder for an OCR program but easier for the human
eye.

Since the documents in question were not produced by the same software,
it seems a little unfair to Kodak.  If this metric were universally
established, the best printers would be those that used the OCR fonts that
one sees on cheques, followed closely by monospaced fonts like Courier.

The error rates quoted seem slightly on the high side to me, but I am not
up to date with OCR software, and perhaps the text was in small sizes.
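
For reference, the quoted figures work out to roughly 0.35% for the HP and
0.58% for the Kodak.  A minimal sketch of the arithmetic follows; the function
and variable names are purely illustrative, and only the error and character
counts come from the quoted posting.

```python
def error_rate(errors, characters):
    """Fraction of characters the OCR software got wrong."""
    return errors / characters

# Counts quoted from the original posting: (errors, characters scanned).
samples = {"HP": (6, 1728), "Kodak": (10, 1716)}

for printer, (errors, characters) in samples.items():
    rate = error_rate(errors, characters)
    # Also show the average run of characters between errors.
    print(f"{printer}: {rate:.2%}, about one error every "
          f"{characters / errors:.0f} characters")
```

With samples this small, a difference of four errors says little about either
printer.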

The people for whom this metric _would_ be useful are those who accomplish
file transfer by printing out a document and then scanning it in on another
system... and yes, I have seen this done!    :-(

Lee

-- 
Liam R. E. Quin,  lee@sq.com, SoftQuad Inc., Toronto, +1 (416) 963-8337
``Agree, for Law is costly. -- Very good advice to litigious Persons, founded
  upon Reason and Experience; for many Times the Charges of a Suit exceed the
  Value of the Thing in Dispute.''	Bailey's dictionary, 1636
