Gamera Addon: GreekOCR Toolkit

This is a Gamera toolkit for building text recognition applications for polytonal (classical) Greek. It is based on the Gamera framework and requires a working installation of both Gamera and the Gamera OCR toolkit.

This toolkit has been ported to Gamera 4 and Python 3.

About the GreekOCR toolkit

The GreekOCR Toolkit is an optical character recognition (OCR) system for polytonal Greek text documents, i.e. Greek texts with a wide variety of accents. It is currently in an experimental stage and still requires extensive testing, but it is nevertheless already usable. It provides:

Further improvements to this toolkit, complementary tools, and Greek OCR results can be found on Bruce Robertson's website on Greek OCR.

Documentation

Detailed documentation is included with the source code package in the subdirectory doc/html.

For testing purposes, we provide a basic demo package greekocr-demo.tar.gz, which includes a small test image, corresponding training data and symbol tables that can be useful for avoiding class name typos during training. See the file README for usage examples.

Authors and Acknowledgements

The authors of the GreekOCR toolkit are:

We are grateful to Georgios K. Michalakis for initiating this project and to the Association Stoudion for financial support of parts of the development.

Software Download

The latest version of the source code of the GreekOCR toolkit is freely available from GitHub under the terms of the GNU General Public License. Note that the toolkit relies on both Gamera and the OCR toolkit and therefore requires both software packages to be installed.
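Because the toolkit depends on two other packages, it can help to check what is installed before building. The following is a minimal sketch, not part of the toolkit, that uses Python's standard importlib.metadata module. The helper name missing_distributions is hypothetical. The distribution name gamera-greekocr is the pip package name given in this document, while "gamera" and "gamera-ocr" are assumed names that may differ in your installation.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the names in `names` that have no installed distribution."""
    missing = []
    for name in names:
        try:
            # version() raises PackageNotFoundError for unknown distributions
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "gamera-greekocr" is the pip package name stated in this document.
    # "gamera" and "gamera-ocr" are ASSUMED names; adjust them to match
    # the distributions actually present in your environment.
    required = ["gamera", "gamera-ocr", "gamera-greekocr"]
    for name in missing_distributions(required):
        print(f"missing: {name}")
```

Running the script prints one line per distribution that is not found. If nothing is printed, all listed names are installed in the current Python environment.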

See the file INSTALL or doc/html/index.html for installation instructions and the user guide for usage instructions. Available file releases for Gamera 4 and Python 3.x (these install the pip package gamera-greekocr):

If you still use Python 2 and Gamera 3, here is the last old version for that environment: