class MusicStaves_rl_carter

Last modified: October 27, 2015

MusicStaves_rl_carter

In module gamera.toolkits.musicstaves.musicstaves_rl_carter

This code is a clean room re-implementation of the staff removal algorithm described in Carter's PhD thesis:

Carter, N. Automatic recognition of printed music in the context of electronic publishing. Ph.D. Thesis. Depts. of Physics and Music. University of Surrey.

Carter's approach segments the image using a concept known as the line-adjacency graph (LAG). Each segment resulting from this analysis is either part of a staffline or not, so once the sections have been found, no further analysis at the pixel level is necessary. Stafflines are located by first identifying obvious straight and horizontal candidates for staffline sections (called filaments), and then selecting the filaments that are evenly spaced from one another. These filaments are used to build models of the stafflines (modeled in this implementation as a set of connected line segments), and thin sections that fall along these modeled lines are then removed.

Notably, Carter's algorithm does not perform any deskewing or rotation to straighten the stafflines, and the line model allows the stafflines to be slightly curved and rotated below a certain (unknown) threshold. The stafflines are not even required to be parallel within a staff. Any system using the results of this algorithm will have to take the non-straight and non-horizontal line models into account when determining the pitches of noteheads and similar symbols.

The LAG differs from the general-purpose graph provided by the Gamera graph library in that each node must have two distinct sets of undirected edges (leftward and rightward).
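To make the two edge sets concrete, here is a minimal sketch of a line-adjacency graph built from vertical runs of black pixels. It is not the toolkit's implementation: the names RunNode, vertical_runs and build_lag are hypothetical, the image is assumed to be a list of rows of 0/1 values, and two runs are assumed to be adjacent when they sit in neighbouring columns and their row ranges overlap.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity; edges make the graph cyclic
class RunNode:
    """One vertical run of black pixels in column x, rows top..bottom inclusive."""
    x: int
    top: int
    bottom: int
    left: list = field(default_factory=list)   # adjacent runs in column x - 1
    right: list = field(default_factory=list)  # adjacent runs in column x + 1


def vertical_runs(image):
    """Return, for each column, the list of black runs found in it."""
    height, width = len(image), len(image[0])
    columns = []
    for x in range(width):
        runs, start = [], None
        # Scan one row past the bottom so a run touching the edge is closed.
        for y in range(height + 1):
            black = y < height and image[y][x] == 1
            if black and start is None:
                start = y
            elif not black and start is not None:
                runs.append(RunNode(x, start, y - 1))
                start = None
        columns.append(runs)
    return columns


def build_lag(image):
    """Link runs in neighbouring columns whose row ranges overlap."""
    columns = vertical_runs(image)
    for left_col, right_col in zip(columns, columns[1:]):
        for a in left_col:
            for b in right_col:
                if a.top <= b.bottom and b.top <= a.bottom:
                    a.right.append(b)  # rightward edge of a
                    b.left.append(a)   # leftward edge of b
    return columns


# A horizontal line with a short stem hanging from its second column.
image = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
lag = build_lag(image)
```

Keeping the leftward and rightward neighbours in separate lists is what lets a tracing step walk the graph in one direction only.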

There is (at least) one hole in Carter's description where an assumption was made.

"Once a stave had been found, each staveline was tracked across the page in both directions and all sections which fell within the projected path of the staveline and were below the threshold for permitted filament thickness were flagged as staveline sections."

How the "projected path" is calculated is not clear, so I have implemented something that seems to work and at least doesn't contradict Carter's thesis. Each staffline filament is used as a starting point and the graph is traced in both directions looking for additional line segments that are part of the staff. If a likely staffline candidate is not found directly through the graph, a second more expensive search is performed that allows for gaps in the staff lines. The projected path of the line is computed by using a least squares fit on the last n points in the opposite direction of tracing, where n is a relatively large number, in this case staffspace_height * 8.0. This keeps the projected path relatively free of errors in the angle, and allows for a moderate amount of local curvature. (Using the projected path of each line segment individually caused too many over-skewed projected angles, particularly on short, noisy segments. In some cases, these projected lines would cross into adjacent stafflines and all kinds of terrible things would happen.)
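The projection step can be sketched as follows. This is an illustration only, not the toolkit's code: fit_line and project_path are hypothetical names, the traced points are assumed to be (x, y) tuples stored in tracing order, and the window of staffspace_height * 8.0 points is taken directly from the description above.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # All points in one column give no usable slope; fall back to horizontal.
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def project_path(traced_points, target_x, staffspace_height):
    """Predict the staffline's y at target_x from the most recent traced points."""
    window = max(2, int(staffspace_height * 8.0))
    slope, intercept = fit_line(traced_points[-window:])
    return slope * target_x + intercept


# A gently sloping line traced from x = 0 to x = 199, projected to x = 210.
traced = [(x, 100.0 + 0.5 * x) for x in range(200)]
predicted_y = project_path(traced, 210, staffspace_height=10)
```

Fitting over a long window rather than a single segment is what damps the effect of short, noisy segments on the projected angle.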

Additionally, Carter does not specify a threshold for deciding when a set of filaments is evenly spaced enough to be considered a set of stafflines. From experimentation I arrived at

if (|dist - (staffspace_height + staffline_height)| <= 3) then is staff

but that may fail in some untested cases.
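As a sketch, the test above translates to Python as shown below. The function names are hypothetical, the tolerance of 3 is assumed to be in pixels, and applying the test to each pair of vertically adjacent filaments is an assumption about how the rule is used, not something stated above.

```python
def is_staff_spacing(dist, staffspace_height, staffline_height, tolerance=3):
    """The spacing test from the text: dist is the gap between two filaments."""
    return abs(dist - (staffspace_height + staffline_height)) <= tolerance


def evenly_spaced(y_positions, staffspace_height, staffline_height):
    """Assumed usage: every pair of neighbouring filaments must pass the test."""
    ys = sorted(y_positions)
    return all(
        is_staff_spacing(lower - upper, staffspace_height, staffline_height)
        for upper, lower in zip(ys, ys[1:])
    )


# Five filaments roughly 21 pixels apart, with staffspace 18 and line height 3.
candidate = [100, 121, 143, 163, 185]
looks_like_staff = evenly_spaced(candidate, staffspace_height=18, staffline_height=3)
```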

Author: Michael Droettboom after an algorithm by Nicholas Carter.

remove_staves

Detects and removes staff lines from a music/tablature image.

Signature:

remove_staves(crossing_symbols='all', num_lines=5, adj_ratio=2.5, noise_size=5, alpha=10.0, beta=0.1, angle_threshold=3.0, string_join_factor=0.25)

with

crossing_symbols
Ignored.
num_lines
The number of stafflines in each staff. (Autodetection of the number of stafflines is not yet implemented.)

It is unlikely one would need to provide the arguments below:

adj_ratio
The maximum ratio between adjacent vertical runs in order to be considered part of the same section. Higher values of this number will result in fewer, larger sections.
noise_size
The maximum size of a section that will be considered noise.
alpha
The minimum aspect ratio of a potential staffline section.
beta
The minimum "straightness" of a potential staffline section (as given by the gamma fitness function of a least-squares fit line).
angle_threshold
The maximum distance from the mean angle a section may be to be considered as a potential staffline (in degrees).
string_join_factor
The threshold for joining filaments together into filament strings.
debug
When True, returns a tuple of images to help debug each stage of the algorithm. The time taken by each stage is also displayed. The elements of the tuple are:

result
The normal result, with stafflines removed.
sections
Shows the segmentation by LAG. Use display_ccs to show the segments differently coloured.
potential_stafflines
Shows the sections that are potentially part of stafflines.
filament_strings
Shows how the vertically aligned potential stafflines are grouped into filament strings.
staffline_chunks
The filament strings believed to be part of stafflines
stafflines
Draws the modelled lines on top of the image.
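To illustrate how the alpha and angle_threshold parameters above might act on sections, here is a minimal sketch. It is not the toolkit's code: the function name, the dictionary layout of a section, and the choice to compute the mean angle over the sections that already passed the aspect-ratio test are all assumptions. The beta straightness test is left out.

```python
from statistics import mean


def potential_stafflines(sections, alpha=10.0, angle_threshold=3.0):
    """Keep sections that are elongated and close to the mean angle.

    Each section is assumed to be a dict with 'width' and 'height' in
    pixels and 'angle' in degrees.
    """
    # alpha: minimum aspect ratio (width over height).
    elongated = [s for s in sections if s["width"] / s["height"] >= alpha]
    if not elongated:
        return []
    # angle_threshold: maximum deviation from the mean angle, in degrees.
    mean_angle = mean(s["angle"] for s in elongated)
    return [s for s in elongated if abs(s["angle"] - mean_angle) <= angle_threshold]


sections = [
    {"width": 120, "height": 3, "angle": 0.5},
    {"width": 90, "height": 4, "angle": -0.5},
    {"width": 100, "height": 3, "angle": 0.0},
    {"width": 80, "height": 2, "angle": 0.0},
    {"width": 20, "height": 15, "angle": 0.0},  # too compact: fails alpha
    {"width": 60, "height": 3, "angle": 8.0},   # too steep: fails angle_threshold
]
kept = potential_stafflines(sections)
```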

get_staffpos

Returns the y-positions of all staff lines at a given x-position. Can only be called after remove_staves.

Signature:

get_staffpos(x=0)

Because each line is modelled as a connected set of line segments that need not be horizontal, the result depends on x.

The return value is a list of StaffObj.
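The dependence on x can be illustrated with a small sketch that evaluates a line modelled as connected segments. It does not use or reproduce get_staffpos or StaffObj: the function y_at is hypothetical, each line is assumed to be a list of (x, y) vertices with strictly increasing x, and clamping to the end vertices outside the modelled range is an assumption.

```python
from bisect import bisect_right


def y_at(vertices, x):
    """Linearly interpolate y at x along a polyline of (x, y) vertices."""
    xs = [vx for vx, _ in vertices]
    if x <= xs[0]:
        return vertices[0][1]
    if x >= xs[-1]:
        return vertices[-1][1]
    i = bisect_right(xs, x)  # first vertex strictly to the right of x
    (x0, y0), (x1, y1) = vertices[i - 1], vertices[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# One slightly curved staffline and four copies shifted down by 21 pixels.
top_line = [(0, 100.0), (100, 102.0), (200, 101.0)]
staff = [[(x, y + 21 * k) for x, y in top_line] for k in range(5)]

positions_at_50 = [y_at(line, 50) for line in staff]
positions_at_150 = [y_at(line, 150) for line in staff]
```

The two lists differ even though they describe the same staff, which is why a notehead's pitch should be judged against the staffline positions at the notehead's own x.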