
Impossible. As long as a human reader is able to comfortably read it, any state-of-the-art OCR software should be able to handle it.
I'm not sure where the requirement of reading the text comfortably comes from?
This is where the adversary modelling comes in. How much does he want to get OCR done? If it's a reasonably short text, why not touch-type it from the screen? If it's long, why not send it to a typing sweat-shop off-shore? Timing could be an issue, in which case it's a question of delaying the adversary as much as possible: of making it a huge job.
An average user will be stopped by the OCR program complaining that the resolution is too low -- there have been several posts on that earlier.
Even if the adversary does have a top-of-the-line OCR program, does it handle 8x5-pixel letters well? Especially if sampling problems are introduced so that an 'a' will produce thirty different bit patterns, as well as run into the immediately preceding and following glyphs?
(Try scanning an ordinary text at 60 dpi b/w -- that's approximately the effect I'm going for. A human can read it, with some difficulty, but an ordinary OCR program doesn't get enough information about segments to work properly.)
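To illustrate the sampling effect described above, here is a minimal sketch, not taken from the post: a hand-drawn high-resolution 'a' (the GLYPH bitmap, the factor of 3 and the 50% ink threshold are all assumptions made for the illustration) is reduced to 1-bit pixels with the sampling grid shifted by a few sub-pixel offsets. The same glyph comes out as several different low-resolution bit patterns, which is the kind of variation a pattern-matching OCR step would have to cope with.

```python
# Hypothetical 12x12 high-resolution 'a'; '#' is ink, '.' is background.
GLYPH = [
    "............",
    "..#######...",
    "..#######...",
    ".......##...",
    "..#######...",
    ".########...",
    ".##....##...",
    ".##....##...",
    ".########...",
    "..######.#..",
    "............",
    "............",
]


def downsample(rows, factor, dx, dy):
    """Reduce a 1-bit image by `factor`, with the sampling grid shifted by (dx, dy).

    A low-resolution pixel is set when at least half of the high-resolution
    pixels it covers are ink (an assumed, simplistic scanner model).
    """
    height, width = len(rows), len(rows[0])
    n_rows = -(-height // factor) + 1  # ceil, plus one cell for the shift
    n_cols = -(-width // factor) + 1
    out = []
    for j in range(n_rows):
        line = ""
        for i in range(n_cols):
            ink = 0
            for y in range(j * factor - dy, j * factor - dy + factor):
                for x in range(i * factor - dx, i * factor - dx + factor):
                    if 0 <= y < height and 0 <= x < width and rows[y][x] == "#":
                        ink += 1
            line += "#" if 2 * ink >= factor * factor else "."
        out.append(line)
    return tuple(out)


# Every sub-pixel offset of the sampling grid, collected as a set of patterns.
patterns = {downsample(GLYPH, 3, dx, dy) for dx in range(3) for dy in range(3)}

if __name__ == "__main__":
    print(len(patterns), "distinct low-resolution patterns for one glyph")
    for pattern in sorted(patterns):
        print("\n".join(pattern), end="\n\n")
```

Shifting the grid by a single high-resolution pixel is already enough to change which strokes survive; a real scanner adds noise and skew on top of that.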
I think it's quite possible to do OCR on a lot less dpi than 200.
Try the scenario I've just suggested: 60 dpi b/w. Please note: it's not a question of whether it can be done at any cost, but of whether some other process would be cheaper (in time or work). If typing by hand is cheaper, and gives a lower failure rate, there's no point in using OCR.
At 60 dpi automatic despeckling becomes a serious problem, unless you can turn it off. Another of those things the average user will be stopped by. And even without despeckling, letters run together in a way that makes them very difficult to split automatically.
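The splitting problem can be sketched with the simplest segmentation method there is: cutting the line wherever a pixel column contains no ink. This is only an assumed stand-in for what an OCR program does (real products use more elaborate segmentation), and the two tiny bitmaps are made up for the illustration. As soon as neighbouring glyphs touch, the blank column disappears and two letters come out as one blob.

```python
def split_on_blank_columns(rows):
    """Return (first_column, last_column) for each run of columns containing ink."""
    width = len(rows[0])
    has_ink = [any(row[x] == "#" for row in rows) for x in range(width)]
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                        # a glyph candidate begins
        elif not ink and start is not None:
            segments.append((start, x - 1))  # a blank column ends it
            start = None
    if start is not None:
        segments.append((start, width - 1))
    return segments


# Two hypothetical glyphs with a clear gap between them ...
SEPARATED = [
    "#.#..#.#",
    "###..###",
    "#.#..#.#",
]
# ... and the same two glyphs after low-resolution sampling has merged them.
TOUCHING = [
    "#.##.#",
    "######",
    "#.##.#",
]

if __name__ == "__main__":
    print(split_on_blank_columns(SEPARATED))
    print(split_on_blank_columns(TOUCHING))
```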
Just load the picture into any graphics program and juggle the palette for maximum contrast.
So why can't the adversary unjuggle it? Again, a general user would probably stop in confusion when the OCR program produces nothing. A more experienced user would understand the problem, and just change the colours into something that could be easily distinguished after greyscale conversion. If the colours are 'flat', it's trivial. If the colours are grainy -- as after adding noise -- it will again take a bit of work to do it.
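For the 'flat' colour case, a small sketch shows both the trick and how to undo it. It assumes the OCR program converts to greyscale with the common Rec. 601 luma weights (the post does not say which conversion is used), and the two colours are made-up examples chosen to have the same luma. After conversion they are identical, but looking at a single colour channel separates them at once.

```python
def luma(rgb):
    """Greyscale value using Rec. 601 weights (assumed conversion)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def best_channel(fg, bg):
    """Index (0=R, 1=G, 2=B) of the channel where the two colours differ most."""
    return max(range(3), key=lambda c: abs(fg[c] - bg[c]))


# Hypothetical text and background colours with equal luma.
FOREGROUND = (200, 0, 0)
BACKGROUND = (0, 102, 0)

if __name__ == "__main__":
    print("grey levels:", luma(FOREGROUND), luma(BACKGROUND))
    c = best_channel(FOREGROUND, BACKGROUND)
    print("channel", c, "values:", FOREGROUND[c], BACKGROUND[c])
```

With grainy colours the per-pixel values scatter around these two points, so a single channel pick is no longer enough on its own.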
I made a few tests, and in each I could distinguish foreground and background -- not by eye, but with histogram stretching and similar techniques. The question becomes: just how much effort is protection worth? How much time will an adversary be prepared to spend? And at what point is OCR not the best solution anymore? I'm just trying to push it over that point.
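Histogram stretching in its simplest form is a linear rescale of the occupied grey range to the full 0..255 range. The sketch below is an illustration under assumptions, not the tests mentioned above: the pixel values are invented, the foreground is assumed to be the darker side, and the threshold of 128 is arbitrary.

```python
def stretch(values):
    """Linearly map the occupied range [min, max] onto [0, 255]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]  # a flat image has nothing to stretch
    return [round((v - lo) * 255 / (hi - lo)) for v in values]


# Hypothetical row of grey values: text near 120, background near 130.
PIXELS = [121, 120, 129, 130, 122, 128, 120, 130]

stretched = stretch(PIXELS)
# Assumed convention: darker than mid-grey counts as ink.
mask = "".join("#" if v < 128 else "." for v in stretched)

if __name__ == "__main__":
    print(stretched)
    print(mask)
```

A difference of ten grey levels, invisible to the eye, becomes full black against full white; heavier noise would need smoothing or a better threshold first, which is where the extra work comes in.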
-- Anders Thulin [EMAIL PROTECTED] http://www.algonet.se/~ath