Comp Thread Archive from Usenet.com

Re: prevent OCR






Bob wrote:


> I was thinking of typing the text in a graphics editor and saving it
> as GIF format, but how could I modify the final image so it makes OCR
> impossible (or at least hard).

Some ideas:


  Reduce resolution.  OCR needs something on the order of 200 dpi,
but a person can read text at much lower resolution, though with
increasing difficulty.  If you can also ensure that each instance of
the same glyph comes out different each time, it will be a bit
tricky to retrieve the text, I think.
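
A minimal sketch of the resolution idea, assuming the text starts life
as a 1-bit bitmap. The `downsample` helper, the `GLYPH` shape and the
offsets are all made up for illustration. Averaging blocks of pixels
lowers the resolution, and shifting the block grid by a pixel or two
makes the same glyph come out as a different pixel pattern each time.

```python
def downsample(bitmap, factor, dx=0, dy=0):
    """Box-average a 2D list of 0/1 pixels by `factor`.

    The block grid is shifted by (dx, dy); pixels that fall outside the
    bitmap count as background (0). Returns ink levels 0..255.
    """
    h, w = len(bitmap), len(bitmap[0])
    out = []
    for y0 in range(-dy, h, factor):
        row = []
        for x0 in range(-dx, w, factor):
            total = 0
            for y in range(y0, y0 + factor):
                for x in range(x0, x0 + factor):
                    if 0 <= y < h and 0 <= x < w:
                        total += bitmap[y][x]
            row.append(round(255 * total / (factor * factor)))
        out.append(row)
    return out


# Hypothetical 8x8 glyph (a crude "T"), purely for illustration.
GLYPH = [
    "########",
    "########",
    "...##...",
    "...##...",
    "...##...",
    "...##...",
    "...##...",
    "...##...",
]
bitmap = [[1 if c == "#" else 0 for c in row] for row in GLYPH]

aligned = downsample(bitmap, 4)               # grid starts at the glyph corner
shifted = downsample(bitmap, 4, dx=2, dy=1)   # same glyph, shifted grid

print(aligned)
print(shifted)
```

The two results differ even though the input glyph is identical, which
is the property that should make simple template matching harder.
Whether it is enough to stop a real OCR program is untested.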

  Write the text in colour A on a background in colour B, and select A
and B so that they translate to the same grey level.  (This works
very well in other contexts, too: I used to have a pure grey-scale monitor
on which very saturated colours tended to convert to black, so
web sites using black text on a fully saturated yellow background tended
to be fully black.)  Unless the OCR program has some good colour-to-grey
conversion, it will get confused.  Are they that good?  Haven't tested ...
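
As a rough illustration of picking A and B, the sketch here assumes the
colour-to-grey conversion is the common Rec. 601 weighted sum. That is
an assumption: a given OCR program may use different weights, in which
case the two colours will no longer match. The helper names are
invented for the example.

```python
def luma(rgb):
    """Approximate grey level of an (R, G, B) colour, Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def green_with_same_luma(rgb):
    """Return the pure-green colour whose luma is closest to that of `rgb`."""
    g = round(luma(rgb) / 0.587)
    if g > 255:
        raise ValueError("colour is too bright to match with pure green")
    return (0, g, 0)


foreground = (255, 0, 0)                         # colour A: saturated red
background = green_with_same_luma(foreground)    # colour B: a darker green

print(background, round(luma(foreground)), round(luma(background)))
```

Under this weighting both colours round to the same 8-bit grey level,
although they are easy to tell apart in colour.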

  But if there is even the slightest difference in grey level, the text
*can* be retrieved, using simple histogram-based techniques.
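
One example of such a histogram-based technique is Otsu's threshold,
shown here only as an illustration (it is not necessarily what any
particular OCR package does). It picks the grey level that best splits
the histogram into two classes, so a difference of a single grey level
between text and background is already enough.

```python
def otsu_threshold(pixels):
    """Return the threshold t that maximises between-class variance.

    `pixels` is a flat list of integer grey levels 0..255; pixels <= t
    form one class and pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their grey levels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, nothing to separate
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# One row of an image where text (77) and background (76) differ by one level.
row = [76, 76, 77, 77, 76, 77, 76, 76]
t = otsu_threshold(row)
mask = [1 if p > t else 0 for p in row]
print(t, mask)
```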

  So perhaps it would be better to add noise to the foreground and the
background separately (in terms of grey-level content) before merging
them.  Opticians have test charts for detecting various types of colour
blindness -- if you have seen them, you'll get the idea.
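
A small sketch of the noise idea, with made-up names and numbers
throughout: every pixel gets a random grey level drawn from the same
range whether it belongs to the text or not, and only the hue (reddish
or greenish) says which is which. The grey levels again assume Rec. 601
weights, which is an assumption about the attacker's conversion.

```python
import random


def luma(rgb):
    """Approximate grey level of an (R, G, B) colour, Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def noisy_pixel(is_text, rng):
    """Random grey level from a shared range; hue encodes text vs background."""
    grey = rng.uniform(30, 70)                 # same range for both classes
    if is_text:
        return (round(grey / 0.299), 0, 0)     # reddish text pixel
    return (0, round(grey / 0.587), 0)         # greenish background pixel


# Hypothetical text mask (a crude "O"): '#' marks text pixels.
MASK = [
    "..##..",
    ".#..#.",
    ".#..#.",
    "..##..",
]

rng = random.Random(0)   # fixed seed so the sketch is repeatable
image = [[noisy_pixel(c == "#", rng) for c in row] for row in MASK]

text_greys = [luma(p) for row, m in zip(image, MASK)
              for p, c in zip(row, m) if c == "#"]
back_greys = [luma(p) for row, m in zip(image, MASK)
              for p, c in zip(row, m) if c != "#"]
print(min(text_greys), max(text_greys))
print(min(back_greys), max(back_greys))
```

Because both sets of grey levels fall in the same range, a grey-level
histogram alone should not separate them. An attacker who looks at the
colour channels directly can still split the two hues, so this only
helps against grey-scale processing.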

These are just random ideas -- I haven't tested them.

> Not that I am against OCR, it is very useful, but it is to protect
> some important document I want to make available to some people.
> It's ok if it is difficult to convert to text, but I don't want to
> send some plain text that anybody could cut/paste anywhere.

It's just a question of who the adversary is: a general user, someone
who is prepared to go to a little or a lot of trouble to get at the
text, or perhaps even a fully funded government organization?  You
probably can't protect yourself against the latter ...


-- Anders Thulin [EMAIL PROTECTED] http://www.algonet.se/~ath



