Gamera Addon: GreekOCR Toolkit

This is a Gamera toolkit for building text recognition applications for polytonic (classical) Greek. It is based on the Gamera framework and requires a working installation of both Gamera and the Gamera OCR toolkit.

This toolkit still requires Gamera 3 and Python 2.x.

About the GreekOCR toolkit

The GreekOCR Toolkit is an optical character recognition (OCR) system for polytonic Greek text documents, i.e. Greek texts with a wide variety of accents. It is currently in an experimental stage and still requires extensive testing, but it is already usable. It provides:

Further improvements and complementary tools to this toolkit and results of Greek OCR can be found on Bruce Robertson's website on Greek OCR.

Documentation

Detailed documentation is included with the source code package in the subdirectory doc/html.

For testing purposes, we provide a basic demo package greekocr-demo.tar.gz, which includes a small test image, corresponding training data and symbol tables that can be useful for avoiding class name typos during training. See the file README for usage examples.

Authors and Acknowledgements

The authors of the GreekOCR toolkit are:

We are grateful to Georgios K. Michalakis for initiating this project and to the Association Stoudion for financial support of parts of the development.

Software Download

The source code of the GreekOCR toolkit is freely distributed under the terms of the GNU General Public License. Note that the toolkit relies on both Gamera and the Gamera OCR toolkit, so both software packages must be installed. Available file releases are:

For release notes, see the file CHANGES. For installation and usage instructions, see the file doc/html/index.html in the source package. Once all prerequisites are installed, installation only requires typing:

python setup.py build && sudo python setup.py install
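
Before running the command above, it can help to confirm that the prerequisites are importable from the Python interpreter that will run setup.py. The following is a minimal sketch, not part of the toolkit: the helper name missing_prerequisites is hypothetical, and the module paths gamera and gamera.toolkits.ocr are assumptions about where Gamera and the OCR toolkit are installed, so adjust them if your installation differs.

```python
from __future__ import print_function

import importlib

# Assumed module paths for Gamera and the Gamera OCR toolkit.
# These are illustrative; check your own installation.
PREREQUISITES = ["gamera", "gamera.toolkits.ocr"]


def missing_prerequisites(modules=PREREQUISITES):
    """Return the names in `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            # Module (or one of its parent packages) is not installed.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

Run the script with the same Python 2.x interpreter that you use for the build; an empty result suggests that both packages are visible to that interpreter.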