Usenet.com

Comp Thread Archive from Usenet.com

Re: Can't OCR old faded typewriter text





[EMAIL PROTECTED] wrote:

> The original documents were typed on white paper with a typewriter and
> probably a worn ribbon.  The characters are light, but quite legible
> and very easy to read by eyesight... more of a gray.

Sounds as if you need to scan to files, and then apply some kind of enhancement operator before you feed the result into the OCR program.

> But Scan Manager pro will not define the characters. I end up with
> about 80% errors. I can't reduce brightness finely enough before it
> completely blacks out the preview.

Brightness just moves the b/w-threshold up or down the grey spectrum. Do you have a 'contrast' control? Try fiddling with that.
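
To make the brightness/contrast distinction concrete, here is a minimal sketch in plain Python. The gray values are invented for illustration (paper around 230, faded type around 170), and the function names are hypothetical; none of this is taken from Scan Manager Pro or any other scanner software.

```python
def adjust_brightness(pixels, offset):
    """Add a constant to every gray value, clamped to 0..255."""
    return [max(0, min(255, p + offset)) for p in pixels]

def adjust_contrast(pixels, factor, pivot=128):
    """Scale each value's distance from a pivot gray, clamped to 0..255."""
    return [max(0, min(255, round(pivot + (p - pivot) * factor)))
            for p in pixels]

def threshold(pixels, cutoff=128):
    """Binarise: black (0) below the cutoff, white (255) at or above it."""
    return [0 if p < cutoff else 255 for p in pixels]

# Assumed scan line: light paper with three faded gray character pixels.
row = [230, 228, 170, 175, 231, 168, 229]

# Darkening by 50 and thresholding at 128 gives the same result as
# thresholding the untouched row at 178: brightness only moves the cutoff.
darker = adjust_brightness(row, -50)
same_as_shifted_cutoff = threshold(darker, 128) == threshold(row, 178)

# Contrast widens the gap between type and paper instead of shifting it.
stretched = adjust_contrast(row, 3.0, pivot=200)
print(same_as_shifted_cutoff)  # True
print(stretched)               # [255, 255, 110, 125, 255, 104, 255]
```

In the original row the lightest character pixel (175) and the darkest paper pixel (228) are 53 gray levels apart; after the contrast step they are 130 apart, which leaves far more room for a threshold to sit between them.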

  I usually scan two or three pages to files, start PaintShop Pro, and
fiddle with the image. There is an automatic contrast enhancement -- if that
works, fine. For very broken-up letters, you may need to blur or
'grow' the image first, then do any histogram operation you want
(contrast enhancement), and then perhaps do a sharpening operation at
the end, before you do the final thresholding to a binary image (or
let your OCR software do it for you).
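
The sequence above (blur, histogram operation, sharpen, threshold) can be sketched in plain Python on a tiny grayscale image held as a list of rows. This is only an illustration under assumptions: a 3x3 mean filter stands in for the blur, a linear min/max stretch stands in for the histogram operation, and an unsharp mask stands in for the sharpening. PaintShop Pro's own filters will differ, and the test image is invented.

```python
def box_blur(img):
    """3x3 mean filter; edge pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out

def stretch_contrast(img):
    """Linear stretch: darkest value becomes 0, lightest becomes 255."""
    lo = min(min(r) for r in img)
    hi = max(max(r) for r in img)
    if hi == lo:
        return [r[:] for r in img]
    return [[(p - lo) * 255 // (hi - lo) for p in r] for r in img]

def sharpen(img, amount=1.0):
    """Unsharp mask: add back the difference between image and its blur."""
    blurred = box_blur(img)
    return [[max(0, min(255, round(p + amount * (p - b))))
             for p, b in zip(r, br)] for r, br in zip(img, blurred)]

def threshold(img, cutoff=128):
    """Binarise: black (0) below the cutoff, white (255) at or above it."""
    return [[0 if p < cutoff else 255 for p in r] for r in img]

def enhance(img, cutoff=128):
    """Blur, stretch, sharpen, then threshold, in that order."""
    return threshold(sharpen(stretch_contrast(box_blur(img))), cutoff)

# Assumed test image: a faded vertical stroke (170 on 230 paper)
# with a one-pixel break in the middle row.
PAPER, INK = 230, 170
faded = [[PAPER, PAPER, INK, PAPER, PAPER] for _ in range(5)]
faded[2][2] = PAPER  # the break in the stroke

print(threshold(faded)[0])  # [255, 255, 255, 255, 255]
print(enhance(faded)[2])    # [255, 0, 0, 0, 255]
```

Thresholding the raw image at 128 returns a blank page, because even the "ink" is lighter than the cutoff. After the full sequence the stroke comes out solid black in every row, including the row with the break, at the cost of being three pixels wide instead of one. That thickening is the 'grow' effect of blurring first.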

  Once I've found something that seems to work, the operations can easily
be applied to a whole batch of scans.
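
As a rough sketch of what applying the same operations to a batch can look like outside an image editor, the snippet below composes a fixed list of steps into one "recipe" and runs it over several scans. The helper names, the page names, and the pixel values are all hypothetical, and each scan is reduced to a flat list of gray values to keep the example short.

```python
from functools import reduce

def make_recipe(*steps):
    """Compose image operations into one function, applied left to right."""
    def recipe(image):
        return reduce(lambda img, step: step(img), steps, image)
    return recipe

def stretch(image):
    """Linear stretch: darkest value becomes 0, lightest becomes 255."""
    lo, hi = min(image), max(image)
    if hi == lo:
        return list(image)
    return [(p - lo) * 255 // (hi - lo) for p in image]

def binarise(image, cutoff=128):
    """Black (0) below the cutoff, white (255) at or above it."""
    return [0 if p < cutoff else 255 for p in image]

# The recipe found by experimenting on a few pages...
recipe = make_recipe(stretch, binarise)

# ...is then applied unchanged to every scan in the batch (assumed data).
scans = {"page1": [230, 170, 228], "page2": [215, 160, 214, 158]}
results = {name: recipe(pixels) for name, pixels in scans.items()}
print(results)
# {'page1': [255, 0, 255], 'page2': [255, 0, 255, 0]}
```

Keeping the steps in one place means that a change found on a later test page only has to be made once before the batch is rerun.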

  Count on spending at least two full days experimenting with all the
various options of your scanner and OCR software before you're able to
turn out consistent work with originals that are less than perfect in terms
of contrast and clarity.

--
Anders Thulin     [EMAIL PROTECTED]     http://www.algonet.se/~ath



