Gamera Addon: GreekOCR Toolkit
This is a Gamera toolkit for building text recognition applications for polytonal (classical) Greek. It is based on the Gamera framework and requires a working installation of both Gamera and the Gamera OCR toolkit.
This toolkit has been ported to Gamera 4 and Python 3.
About the GreekOCR toolkit
The GreekOCR Toolkit is an optical character recognition (OCR) system for polytonal Greek text documents, i.e. Greek texts with a wide variability of accents. It is currently in an experimental stage and still requires extensive testing, but is nevertheless already usable. It provides:
- two different approaches for dealing with accents (wholistic versus separatistic)
- two different output formats (Unicode or LaTeX utilizing the Teubner style)
- a ready-to-run Python script greekocr4gamera.py, which acts as a basic GreekOCR system. Note, however, that the character training must be done beforehand by the user: the toolkit does not include any training data.
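The difference between the two accent approaches can be illustrated with plain Unicode handling. The following is only a minimal sketch using Python's standard unicodedata module, not the toolkit's own code: the function names compose and decompose are hypothetical. It assumes that the wholistic approach treats a letter together with its accents as a single symbol, whereas the separatistic approach recognizes the base letter and its accents separately and joins them afterwards.

```python
import unicodedata


def compose(base, accents):
    """Join a base letter and its combining accents into one character.

    Hypothetical helper, not part of the GreekOCR toolkit. It mimics the
    last step of a separatistic approach, where letter and accents have
    been recognized separately.
    """
    return unicodedata.normalize("NFC", base + "".join(accents))


def decompose(char):
    """Split an accented character into its base letter and accents.

    Hypothetical helper, not part of the GreekOCR toolkit. A wholistic
    approach would treat the accented character as one symbol class;
    this shows which parts that single symbol consists of.
    """
    parts = unicodedata.normalize("NFD", char)
    return parts[0], list(parts[1:])


# Alpha with smooth breathing (psili, U+0313) and acute accent (oxia, U+0301)
alpha_psili_oxia = compose("\u03b1", ["\u0313", "\u0301"])
base, accents = decompose(alpha_psili_oxia)
```

Here alpha_psili_oxia is the single code point U+1F04, and decomposing it again yields the base letter alpha and the two combining accents.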
Further improvements and complementary tools to this toolkit and results of Greek OCR can be found on Bruce Robertson's website on Greek OCR.
Documentation
Detailed documentation is included with the source code package in the subdirectory doc/html.
For testing purposes, we provide a basic demo package greekocr-demo.tar.gz, which includes a small test image, corresponding training data and symbol tables that can be useful for avoiding class name typos during training. See the file README for usage examples.
Authors and Acknowledgements
The authors of the GreekOCR toolkit are:
We are grateful to Georgios K. Michalakis for initiating this project and to the Association Stoudion for financial support of parts of the development.
Software Download
The latest version of the source code of the GreekOCR toolkit is freely available from GitHub under the terms of the GNU General Public License. Note that the toolkit relies on both Gamera and the OCR toolkit and therefore requires both software packages to be installed.
See the file INSTALL or doc/html/index.html for installation instructions and the user guide for usage instructions. Available file releases for Gamera 4 and Python 3.x (these install the pip package gamera-greekocr):
- gamera-greekocr-2.0.0.tar.gz (Apr 28, 2025)
If you still use Python 2 and Gamera 3, the last release for that environment is:
- greekocr-1.0.1.tar.gz (Sep 19, 2011)
