Last modified: June 09, 2015
This documentation is for those who want to use the toolkit for tablature recognition but are not interested in extending its capabilities.
As every historic tablature print uses different tablature symbols, you must first train the system with some pages of lute tablature and then use this training data for subsequent recognition.
Hence the proper use of this toolkit requires two steps: first training the system on sample pages, then recognizing tablature with the resulting training data.
There are two ways to use this toolkit: you can either use it out-of-the-box, or you can build your own scripts with the aid of the Python library functions provided by the toolkit. Both options are described in detail below.
In any case, the output of the recognition process is a tablature or music encoding in the abc format.
The abc format was originally invented by Chris Walshaw as a simple ASCII notation for folk tunes. Meanwhile it has been extended by many people to encode a wide variety of music notation. In particular, Christoph Dalitz extended abc to support the notation of lute tablature.
Not every software listed on the abc home page supports all abc extensions. The abc tablature encoding generated by this toolkit is specific for the program abctab2ps, because this is the only abc program which supports lute tablature. See the user's guide of this program for a detailed description of the format.
The abc music transcription, however, conforms to the abc standard and thus should be readable by any abc program. For instance, you can use abc2midi to generate a MIDI file for "proofhearing" the recognized tablature.
The toolkit provides ready-to-run functionality for all tasks of optical tablature recognition:
Preprocessing can be done interactively from the image context menu entry OTR/preprocessing. As this is a cumbersome way to preprocess a batch of images, the wrapper script otr_prepare.py makes some of the plugins available on the command line. Example:
otr_prepare.py -smooth -deskew -depict scan.tiff -o prepped.png
This will read the file scan.tiff, perform deskew, smoothen and remove_pictures on it, and save the result to prepped.png.
For a full list of possible command line options with an explanation, call otr_prepare.py -?.
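For a batch of scans, the call above can be wrapped in a few lines of Python. The following is only a sketch: it assumes that otr_prepare.py is on the search path and reuses exactly the options from the example above; the helper names `prepare_command` and `prepare_all` are made up for this illustration and are not part of the toolkit.

```python
import os

def prepare_command(infile, outdir):
    """Build the otr_prepare.py command line for one scan (hypothetical helper)."""
    base = os.path.splitext(os.path.basename(infile))[0]
    outfile = os.path.join(outdir, base + ".png")
    # same options as in the example above; use identical options
    # for training images and for recognition images
    return ["otr_prepare.py", "-smooth", "-deskew", "-depict",
            infile, "-o", outfile]

def prepare_all(infiles, outdir, run):
    """Call run(argv) once per input file, e.g. with run=subprocess.check_call."""
    for infile in infiles:
        run(prepare_command(infile, outdir))
```

A possible invocation is `prepare_all(glob.glob("scans/*.tiff"), "prepped", subprocess.check_call)`, where the directory names are placeholders.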
Note
If you choose to use some of the routines from the preprocessing plugins, make sure you use the same preprocessing for both training and recognition.
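This toolkit provides a special method for starting the training dialog because the training description in the Gamera documentation cannot be directly applied to this OTR toolkit, owing to two peculiarities of lute tablature recognition: the staff lines must be removed before the image is segmented, and the classifier needs a symbol name list appropriate for tablature.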
The method described below takes care of these points because it removes the staff lines before segmenting the image and when it opens the classifier window, it loads an appropriate symbol name list.
As training is interactive, you must start the training dialog from gamera_gui with the following steps:
When training a French/Italian tablature page, the following options dialog appears (the dialog for German lute tablature is similar):
Here you can select an image file, and choose preprocessing options. The other options have the following meaning (the last three options are absent for German tablature, because it does not use staff lines):
Note
When you check some preprocessing options, make sure to use the same preprocessing options on all training images and later in the recognition phase.
The image will be segmented, which may take some time because it includes a staff removal step. The original image (with staff lines) will be displayed in a classifier window together with the default symbol list used in this toolkit.
Note
All symbols appear twice in the symbol list on the left: with and without the prefix _group. Make sure you use the right versions during manual classification, i.e. the _group class name for fragmented symbols and the plain class name for unfragmented symbols.
From here on you can follow the training tutorial in the Gamera documentation, which describes how to classify symbols, how to save the training data, and how to merge the training data from several images into a single training database. The possible symbol names are described in the next section.
Note
For training another image do not use the menu "Image/Open and Segment image"! Instead you must close the classifier window and use the toolkit icon context menu again.
The postprocessing routines in TabGlyphList and in otr_recognize.py rely on the glyph names given in the table below.
Symbol Class | Description |
---|---|
fret.a, fret.b, ... | Fret letters or numbers. Note that you must use letters even in Italian tablature and that the letter j is not used, so that the 9th fret is fret.k. Hence the glyph "2" is trained as fret.c in Italian or guitar tablature, and the glyph (actually a group) "11" is fret.m. Special symbols for diapasons can be trained as fret.a8, e.g., for the empty 8th course. |
flag.1, flag.2, flag.4, flag.8, ... | Rhythm flag for a whole note, half note, quarter note etc. It does not matter whether these are the real time values, essential is only their relative quotient. |
fermata | A fermata. |
bar | Bar line. It is possible to differentiate, e.g. with bar.double or bar.repeat; in that case the symbol interpretation will ignore anything after the first dot. Note that you will only need to train bar lines when you plan to classify bar lines statistically with Gamera's builtin kNN classifier. If you use the deterministic bar line removal TabPage.remove_barlines, training bar lines or groups (when the bar lines are broken) is of no use. |
dot | A dot. Dots can have different meanings in tablature (flag prolongation, index finger, repeat dots, ...) which the symbol interpretation tries to figure out and to classify further. |
deco.star.WHERE, deco.hash.WHERE | A star or hash sign as decoration. To which fret symbol it is attached depends on the modifier WHERE: when WHERE is left or right, it is attached to the fret symbol on the left or right, respectively. When WHERE is here, it is considered as a symbol on its own that replaces a fret letter. |
time.c, time.c_cut, time.3 | Mensural sign. |
*trash* ('*' stands for any string) | Glyphs that should be ignored. These will be removed by otr_recognize.py from the glyph list |
You can train additional names, e.g. tenuto signs as deco.tenuto. These names will be kept in the glyph list and sorted by TabGlyphList.set_glyph_properties(), but are ignored by the postprocessing routines which create abc output. You will need to write your own postprocessing for them; see the developer's manual for an example.
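The fret naming rule from the table (letters instead of numbers, with j skipped) can be made concrete with a small conversion sketch. The helper names and the upper limit of the letter sequence are assumptions chosen for this illustration; they are not toolkit functions.

```python
# fret letters in order; "j" is not used, so fret 9 is "k"
# (how far the sequence continues is an arbitrary choice for this sketch)
FRET_LETTERS = "abcdefghiklmnop"

def fret_class(fret_number):
    """Return the class name for a fret number, e.g. 2 -> 'fret.c'."""
    return "fret." + FRET_LETTERS[fret_number]

def fret_number(class_name):
    """Return the fret number for a plain fret class name, e.g. 'fret.m' -> 11."""
    prefix, letter = class_name.split(".", 1)
    if prefix != "fret" or len(letter) != 1 or letter not in FRET_LETTERS:
        raise ValueError("not a plain fret class name: " + class_name)
    return FRET_LETTERS.index(letter)
```

With this mapping, the Italian glyph "2" corresponds to `fret_class(2)` and the group "11" to `fret_class(11)`. Diapason names such as fret.a8 are deliberately not handled here.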
Note
Occasionally, one realizes at a later point that the rhythm flags should be interpreted in a different way. It may be that what seemed to be sixteenth notes actually are quarter notes, etc. For this situation, there is a script otr_factor_training.py in the subdirectory scripts for automatically changing all rhythm lengths.
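To illustrate what changing all rhythm lengths means for the class names, here is a minimal sketch of the renaming rule. It is not the implementation of otr_factor_training.py, whose options and internals are not described here; the function name `rescale_flag` is hypothetical and only plain names of the form flag.N are handled.

```python
def rescale_flag(name, divisor):
    """Divide the time value in a plain 'flag.N' class name by divisor.

    Names that are not of the form 'flag.N' are returned unchanged.
    """
    parts = name.split(".")
    if len(parts) != 2 or parts[0] != "flag" or not parts[1].isdigit():
        return name
    value = int(parts[1])
    if value % divisor != 0:
        raise ValueError("cannot divide %s by %d" % (name, divisor))
    return "flag.%d" % (value // divisor)
```

With a divisor of 4, a glyph trained as a sixteenth note (flag.16) becomes a quarter note (flag.4), while all other symbol names stay untouched.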
The postprocessing routines in GermanGlyph and in otr_german_recognize.py rely on the glyph names described below.
Symbol Class | Description |
---|---|
fret.c1.f0, fret.c1.f1, ... fret.c2.f0, ... | Symbol specifies both course and fret. fret.c1.f0 for instance means "fret 0 on course 1". In contrast to tablatures with stafflines, German tablature symbols uniquely specify course and fret. (see below for examples) |
flag.1.1, flag.2.1, flag.4.2, ... | Rhythm flags with the numbers specifying the time value and the number of grid stems. It does not matter whether these are the real time values, essential is only their relative quotient. (see below for examples) |
flag.fermata | A fermata. |
bar | Bar line. It is possible to differentiate, e.g. with bar.double or bar.repeat; in that case the symbol interpretation will ignore anything after the first dot. Note that you will only need to train bar lines when you plan to classify bar lines statistically with Gamera's builtin kNN classifier. If you use the deterministic bar line recognition, training bar lines or groups (when the bar lines are broken) is of no use. |
bar_part | Fragment of a bar line. |
dot | A dot. Dots can have different meanings in tablature (flag prolongation, index finger, repeat dots, ...) which the symbol interpretation tries to figure out and to classify further. |
stroke, higher_fret | Horizontal strokes have different meanings in German tablature: when over a number, the stroke means that this number specifies a fret on the sixth course; when over a character, it shifts the fret 5 frets higher. You can thus simply train all horizontal strokes identically, and the recognition postprocessing tries to figure out the meaning. |
*trash* ('*' stands for any string) | Glyphs that should be ignored. These will be removed by otr_german_recognize.py from the glyph list |
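Two of the naming rules in the table can be expressed in a few lines: a bar line is recognized by the part of the name before the first dot, and a trash glyph by the wildcard pattern `*trash*`. This is only an illustration of the rules as stated in the table; the helper names are hypothetical, and the actual matching in otr_german_recognize.py may be implemented differently.

```python
from fnmatch import fnmatchcase

def is_bar(name):
    """True for 'bar' and refinements like 'bar.double' (text after the first dot is ignored)."""
    return name.split(".", 1)[0] == "bar"

def is_trash(name):
    """True when the class name matches the pattern '*trash*'."""
    return fnmatchcase(name, "*trash*")
```

Note that bar_part is a class of its own and is therefore not treated as a bar line by `is_bar`.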
As German lute tablature is staffless, the symbols uniquely describe course and fret, which are encoded as fret.cm.fn where m is the course and n the fret number. For a typical German tablature using a Fraktur font, the following symbols are used up to the fifth fret:
[Table of symbol images: rows are the courses c1 to c6, columns are the frets f0 to f5.]
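When writing your own postprocessing, class names of the form fret.c1.f0 (as listed in the symbol table above) can be split into course and fret with a regular expression. This is a sketch only; `parse_german_fret` is a name invented for this example and not a toolkit function.

```python
import re

# matches names like "fret.c1.f0": course number after "c", fret number after "f"
_GERMAN_FRET = re.compile(r"^fret\.c(\d+)\.f(\d+)$")

def parse_german_fret(name):
    """Return (course, fret) for a German fret class name, or None otherwise."""
    match = _GERMAN_FRET.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For instance, fret.c1.f0 yields course 1 and fret 0, in line with the reading "fret 0 on course 1".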
Some symbols belong to fret/course symbols like, e.g., the asterisk for indicating held stops. Depending on the print source, these symbols appear before, behind, below or above their main symbol. To keep the toolkit flexible, the positioning of such "secondary" symbols can be trained with the modifier keywords left, right, above or below, e.g. deco.tenuto.right for an asterisk belonging to the symbol to its right.
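The position modifier is simply the last dot-separated component of the class name, so it can be split off as sketched below. The function name is hypothetical, and the sketch only knows the four keywords named in this paragraph.

```python
POSITIONS = ("left", "right", "above", "below")

def split_position(name):
    """Split a trailing position keyword off a class name.

    Returns (base_name, position); position is None when the name
    does not end in one of the four keywords.
    """
    head, sep, tail = name.rpartition(".")
    if sep and tail in POSITIONS:
        return head, tail
    return name, None
```

Applied to deco.tenuto.right, this returns the base name deco.tenuto together with the position right; names without a modifier are returned unchanged.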
Concerning rhythm flags, some tablature prints (e.g. German tablature or Phalese) arrange flags in "grids" which can include flags of different time values. For these, we must encode not only the rhythm value as for tablature, but also the number of stems in the grid. This is done with a dot-separated list as demonstrated in the following examples:
Class name | Meaning |
---|---|
flag.2.1 | half note on one stem |
flag.8.4 | eighth notes on 4 stems |
flag.2.1.4.2 | half note on one stem followed by two quarter notes |
flag.2.1.4.2.2.1 | |
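The dot-separated list alternates between time value and number of stems, so a grid name can be decoded into pairs as in the following sketch. `parse_grid_flag` is a hypothetical helper for illustration, not a function of the toolkit.

```python
def parse_grid_flag(name):
    """Decode a grid flag name into a list of (time_value, stems) pairs."""
    parts = name.split(".")
    numbers = parts[1:]
    if (parts[0] != "flag" or not numbers or len(numbers) % 2 != 0
            or not all(n.isdigit() for n in numbers)):
        raise ValueError("not a grid flag name: " + name)
    values = [int(n) for n in numbers]
    # pair every time value with the stem count that follows it
    return list(zip(values[0::2], values[1::2]))
```

For flag.2.1.4.2 the result is the pair (2, 1) followed by (4, 2), i.e. a half note on one stem and quarter notes on two stems. Names such as flag.fermata are rejected with a ValueError.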
There are two scripts for recognition, depending on the tablature type: otr_recognize.py for French and Italian tablature and otr_german_recognize.py for German tablature without staff lines.
otr_recognize.py takes a tablature image and a training data file as input and creates an abc encoding of the tablature as output. For a full list of possible command line options with an explanation, call otr_recognize.py -?.
Here is a description of the most important options:
The recognition of German lute tablature works similarly, except that you must use the script otr_german_recognize.py instead. For a full list of possible command line options with an explanation, call otr_german_recognize.py -?.
The most important options are:
If you want to write your own scripts for recognition, you can use otr_recognize.py as a good starting point. The following sections describe some aspects of writing scripts not only for recognition, but also for training.
In order to be able to use the OTR toolkit functions, you must import them at the beginning of your scripts with:
    # French/Italian tablature (with staff lines):
    from gamera.toolkits.otr.otr_staff import *
    from gamera.toolkits.otr.otr_glyph import *
    # German tablature (no staff lines):
    from gamera.toolkits.otr.german_page import *
    from gamera.toolkits.otr.german_glyph import *
    # all tablature types:
    from gamera.toolkits.otr.plugins import *
Before we can fire up the training dialog from gamera_gui, we need to do some preparations: load an image, do some preprocessing (optional), remove staves and segment the image.
    # load an image and do some preprocessing
    image = load_image("tablature.tiff")
    image = image.deskew()
    if image.data.pixel_type != ONEBIT:
        image = image.to_onebit()
    # remove staff lines with the TabPage class
    # and create a glyph list "ccs" with connected component analysis
    tab = TabPage(image, online=1)
    tab.remove_staves()
    ccs = tab.image.cc_analysis()
When the image has a lot of randomly touching symbols, you will need to add a custom segmentation function here.
Now that we have the tablature symbols isolated in the list ccs, we can pass this list to a classifier and open Gamera's training interface:
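    # create classifier and start training session
    # note: orig_image (the image with staff lines that is displayed) and
    # symbol_table (the symbol name list) must be defined beforehand
    classifier = knn.kNNInteractive([],
        ['aspect_ratio', 'moments', 'volume64regions', 'nrows_feature'], 0)
    classifier.display(ccs, orig_image, symbol_table)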
As this script requires the gamera GUI as its runtime environment (for the interactive training session), it is easiest to execute the script from the python shell in gamera_gui:
>>> execfile("/path/to/your/script.py")
Alternatively you could pass the script file name as a command line parameter to gamera_gui or use the Unix #! magic.
The initialization of a standalone script requires some more calls, because in the example above these are called implicitly by gamera_gui:
    # general stuff from gamera
    from gamera.core import *
    from gamera import knn
    from gamera.classify import BasicGroupingFunction
    # otr toolkit
    from gamera.toolkits.otr.otr_staff import *
    from gamera.toolkits.otr.otr_glyph import *
    from gamera.toolkits.otr.plugins import *
The first step in recognition is the same as in training: load an image, do some preprocessing (optional), remove staves and segment the image. From the glyph list we create a TabGlyphList object, which we will need for postprocessing:
    # load an image and do some preprocessing
    image = load_image("tablature.tiff")
    image = image.deskew()
    if image.data.pixel_type != ONEBIT:
        image = image.to_onebit()
    # remove staff lines with the TabPage class
    tab = TabPage(image, online=1)
    tab.remove_staves()
    # create TabGlyphList, glyphs are property "glyphs"
    ccs = tab.image.cc_analysis()
    glyphs = TabGlyphList(ccs)
Now we can load the training data into a classifier and use it for classifying the tablature symbols. Here we assume for simplicity that bar lines are also recognized statistically with the kNN classifier:
    # create classifier and load training database
    classifier = knn.kNNInteractive([],
        ['aspect_ratio', 'moments', 'volume64regions', 'nrows_feature'], 0)
    classifier.from_xml_filename("trainingdata.xml")
    # classify symbols and group them
    # note that grouping does not modify the list, and thus we need
    # to add the groups afterwards and remove their parts explicitly
    grp_distance = tab.staffspace_height/2
    (added, removed) = classifier.group_list_automatic(glyphs.glyphs,
        BasicGroupingFunction(grp_distance), max_parts_per_group=3)
    if len(added) > 0:
        # remove group parts
        glyphs.glyphs = [x for x in glyphs.glyphs
                         if not x.match_id_name("_group._part.*")]
        # add found groups
        glyphs.glyphs.extend(added)
Now all individual symbols are identified, but their meaning is not yet known (e.g. which glyphs belong together in a chord, to which course a fret letter applies, etc.). This semantic interpretation is done with TabGlyphList.set_glyph_properties. Afterwards we can export to abc:
    # set the semantic properties (course etc.) of the glyphs
    # and convert to abc tablature and music code
    glyphs.set_glyph_properties(tab)
    tabcode = glyphs.to_abctab()
    musiccode = glyphs.to_abcmusic(tuning="gdAFCG,", factor=1)
    # write abc code to a file
    abc_file = open("song.abc", "w")
    abc_file.write(tabcode)
    abc_file.write("\n\n")
    abc_file.write(musiccode)
    abc_file.close()
Now you have a file song.abc containing the tablature code and a music transcription in abc. You can create a nice-looking PostScript file from this with abctab2ps and create a MIDI file from the music transcription with abcselect and abc2midi:
    abctab2ps -O song.ps song.abc
    abcselect -X 2 song.abc > music.abc && abc2midi music.abc -o music.midi