
Is anybody aware of an OCR solution for recognizing partly struck-out text (the sort that appears in published, redacted government files)?

As you probably know, there is a judicial inquiry under way right now in the UK, trying to establish the circumstances surrounding the death of chief chemical/biological weapons expert Dr. Kelly. Documents related to the inquiry (email printouts, notes, etc.) are being posted to http://www.the-hutton-inquiry.org.uk, but seemingly interesting parts (especially those involving the efforts to link Iraq to al-Qaeda) have been redacted with black marker. A good example is this one: http://www.the-hutton-inquiry.org.uk/content/cab/cab_11_0077to0078.pdf

As you can see, some of the redacted text can be made out with the naked eye. Other parts are only partly covered by the marker (sometimes the top and/or bottom parts of letters remain visible) and could use some statistical, automated help.

Can anyone point me to a readily available application or toolset optimized for this task? Free software is preferred; the operating system doesn't really matter.

If anybody wants to have a go at it themselves, I'd appreciate it if you posted your results (or mailed them to me), indicating the methodology and software used.

Thanks.