Optical Character Recognition
Revision as of 04:53, 17 January 2012 by AlexanderR (moved OCR to Optical Character Recognition: Article naming guidelines)
The OCR process has several steps, and the OCR engine itself handles only one of them:
- document layout analysis
- optical character recognition
- post-processing (formatting, PDF creation)
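As a rough illustration of how these three steps fit together, the following Python sketch chains them as interchangeable functions. Everything in it is hypothetical: the `Box` region format, the `run_ocr_pipeline` helper and the three stand-in stages are assumptions made for illustration, and none of them belongs to any of the programs listed on this page.

```python
from typing import Callable, List, Sequence, Tuple

# (left, top, right, bottom) in pixels; a hypothetical region format.
Box = Tuple[int, int, int, int]


def run_ocr_pipeline(
    page: object,
    analyse_layout: Callable[[object], Sequence[Box]],
    recognise: Callable[[object, Box], str],
    post_process: Callable[[List[str]], str],
) -> str:
    """Chain the three stages; the OCR engine is only the middle one."""
    boxes = analyse_layout(page)                     # 1. document layout analysis
    texts = [recognise(page, box) for box in boxes]  # 2. optical character recognition
    return post_process(texts)                       # 3. post-processing


# Stand-in stages so the sketch runs without any OCR software installed.
def fake_layout(page: object) -> Sequence[Box]:
    # Pretend the page holds two text blocks, one above the other.
    return [(0, 0, 100, 20), (0, 30, 100, 50)]


def fake_recognise(page: object, box: Box) -> str:
    # A real engine would read the pixels inside the box.
    return f"text at y={box[1]}"


def join_paragraphs(texts: List[str]) -> str:
    # Minimal formatting step: one blank line between blocks.
    return "\n\n".join(texts)


result = run_ocr_pipeline(None, fake_layout, fake_recognise, join_paragraphs)
print(result)
```

In practice each stand-in would be replaced by a call into one of the tools described below, for example a layout analyser for the first stage and an OCR engine for the second.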
OCR (Optical Character Recognition) Engines
- CuneiForm — A command-line OCR system originally developed and open-sourced by Cognitive Technologies. Supported languages: eng, ger, fra, rus, swe, spa, ita, ruseng, ukr, srp, hrv, pol, dan, por, dut, cze, rum, hun, bul, slo, lav, lit, est, tur.
- GOCR/JOCR — An OCR engine which also supports barcode recognition.
- Ocrad — An OCR program based on a feature extraction method.
- Tesseract — "Probably one of the most accurate open source OCR engines available".
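All four engines are driven from the command line, so they can be scripted. The Python sketch below builds a basic invocation for each one and picks the first engine found on `PATH`. The helper names `build_command` and `first_available` are hypothetical, and the option layout reflects common usage of each tool rather than anything stated on this page, so check the respective man page before relying on it. Note also that accepted input image formats and language codes differ between engines; the `eng` default is only an example.

```python
import shutil
from typing import List, Optional, Sequence


def build_command(engine: str, image: str, out_txt: str, lang: str = "eng") -> List[str]:
    """Return an argument list that asks `engine` to write plain text to `out_txt`."""
    if engine == "tesseract":
        # Tesseract takes an output base name and appends ".txt" itself.
        base = out_txt[:-4] if out_txt.endswith(".txt") else out_txt
        return ["tesseract", image, base, "-l", lang]
    if engine == "cuneiform":
        return ["cuneiform", "-l", lang, "-o", out_txt, image]
    if engine == "gocr":
        # No language option is passed to GOCR in this sketch.
        return ["gocr", "-i", image, "-o", out_txt]
    if engine == "ocrad":
        # No language option is passed to Ocrad in this sketch.
        return ["ocrad", "-o", out_txt, image]
    raise ValueError(f"unknown OCR engine: {engine}")


def first_available(
    engines: Sequence[str] = ("tesseract", "cuneiform", "gocr", "ocrad"),
) -> Optional[str]:
    """Return the first engine whose executable is on PATH, or None."""
    for name in engines:
        if shutil.which(name):
            return name
    return None


engine = first_available()
if engine is None:
    print("no OCR engine installed")
else:
    # Pass the list to subprocess.run(..., check=True) to actually run it.
    print(build_command(engine, "page.png", "page.txt"))
```

Building the argument list separately from running it keeps the sketch usable on a machine without any engine installed, and makes it easy to inspect the command before executing it.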
Layout analysers and user interfaces
- YAGF — graphical interface for the CuneiForm text recognition program on the Linux platform. Available from the community repository.
- gscan2pdf — scans, runs Tesseract and creates a PDF all in one go
- Kooka — scanner GUI for KDE which supports the OCR engines GOCR, Ocrad or KADMOS. It used to be part of kdegraphics4, but was dropped due to lack of development.
- OCRFeeder — Python GUI for GNOME which performs document analysis and rendition, and can use CuneiForm, GOCR, Ocrad or Tesseract as its OCR engine. It can import from PDF or image files, and export to HTML or OpenDocument. Available from the AUR.
- OCRopus — OCR platform with modules for document layout analysis, OCR engines (it can use Tesseract or its own engine), natural language modelling, etc. Available from the AUR.