For no really good reason, I found myself looking up OCR (“Optical Character Recognition”) components on the weekend. OCR is the technology that looks at a scanned image of a page and figures out the typewritten text (handwriting recognition is usually an even harder problem).
(Yes, this is the kind of thing I do on my weekends. Yes, I know I’m a total nerd.)
I didn’t find any open source Java implementations. There are a few commercial products, though:
So, since I was already being a complete nerd, I downloaded the package and pondered some ways of integrating it with Java: JNI? Rewrite it?
And somewhere around there, I found myself reading the source code and thinking: “boy howdy, have I forgotten a lot of C++”.