When comparing Tesseract vs GOCR, the Slant community recommends Tesseract for most people. In the question “What are the best OCR libraries?”, Tesseract is ranked 1st while GOCR is ranked 2nd. The most important reason people chose Tesseract is that it understands 40 languages.
A full list of languages can be found [here](https://github.com/tesseract-ocr/langdata).
Pro Understands 40 languages
A full list of languages can be found [here](https://github.com/tesseract-ocr/langdata).
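As a rough illustration of how a language is selected, the sketch below builds a Tesseract command line that uses the `-l` flag, joining several language codes with `+`. The file names `scan.png` and `scan` and both helper functions are hypothetical, and the language codes are only usable if the matching language data is installed.

```python
import shlex
import shutil
import subprocess


def tesseract_command(image, output_base, languages=("eng",)):
    """Build a Tesseract CLI call; language codes are joined with '+' for -l."""
    return ["tesseract", image, output_base, "-l", "+".join(languages)]


def run_if_installed(command):
    """Run the command only if its executable is on PATH; otherwise return None."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=True, capture_output=True, text=True)


# Hypothetical file names: scan.png is the input image, scan.txt would be the output.
cmd = tesseract_command("scan.png", "scan", languages=("eng", "deu"))
print(shlex.join(cmd))
```

Calling `run_if_installed(cmd)` would execute the command on a machine where Tesseract is installed, and `tesseract --list-langs` shows which languages are available locally.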
Pro Free, open source and cross-platform
Tesseract is licensed under the Apache License 2.0, with source code available on GitHub. It's available for free on Windows, Linux and macOS.
Pro Multiple front-ends available
Multiple GUIs have been developed for Tesseract, including OCRFeeder.
Pro Can work with custom training data
When given custom .traineddata files, Tesseract can adapt accurately to different styles of text.
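One possible way to point Tesseract at a custom model is sketched below, assuming the `.traineddata` file sits in its own directory. The path `models/handwriting.traineddata` and the helper `custom_model_command` are hypothetical. The directory is passed with `--tessdata-dir` and the file name (without its extension) with `-l`.

```python
from pathlib import Path


def custom_model_command(image, output_base, traineddata):
    """Build a Tesseract CLI call that loads a custom .traineddata file."""
    model = Path(traineddata)
    if model.suffix != ".traineddata":
        raise ValueError("expected a .traineddata file, got: " + str(model))
    return [
        "tesseract", image, output_base,
        "--tessdata-dir", str(model.parent),  # directory that holds the model
        "-l", model.stem,                     # model name without the extension
    ]


# Hypothetical model path, used only for illustration.
cmd = custom_model_command("form.png", "form", "models/handwriting.traineddata")
print(" ".join(cmd))
```

Because `--tessdata-dir` replaces the default data directory for that run, any other languages requested with `-l` would need to be in the same directory.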
Pro Easy, straightforward use
GOCR is very easy to use and is callable from the command line. Just type `gocr -h` and you will have all the available commands with the needed information on how to use them.
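Beyond the help screen, a basic GOCR call can be sketched as follows, using the `-i` (input image) and `-o` (output file) options. The file names `page.pnm` and `page.txt` and the helper `gocr_command` are hypothetical. The sketch only builds the argument list, so it makes no assumption about GOCR being installed.

```python
def gocr_command(image, output_text=None):
    """Build a GOCR CLI call; without output_text, GOCR writes to standard output."""
    command = ["gocr", "-i", image]
    if output_text is not None:
        command += ["-o", output_text]
    return command


# Hypothetical file names, used only for illustration.
to_stdout = gocr_command("page.pnm")
to_file = gocr_command("page.pnm", "page.txt")
print(" ".join(to_file))
```

The resulting list can be passed to `subprocess.run` on a system where the `gocr` executable is on the PATH.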
Pro Can be used with different frontends
Because GOCR works with several different frontends, it is easy to port to different operating systems.
Con Couldn't OCR a clean, image-only PDF saved to file and converted to PNM (GOCR's native format)
Con Fails with multiline layouts
GOCR does not work very well with multiline layouts.