Last modified: February 14, 2023
Contents
A kd-tree is multidimensional generalization of a binary search tree. It can be used efficiently for range queries and nearest neighbor searches, provided the dimension is not to high. In document analysis problems, the dimension is typically two, so that kd-trees can be a powerful utility for layout analysis problems.
For a detailed description of the present kd-tree implementation with examples for use cases, see
C. Dalitz: Kd-Trees for Document Layout Analysis. In C. Dalitz (Ed.): "Document Image Analysis with the Gamera Framework." Schriftenreihe des Fachbereichs Elektrotechnik und Informatik, Hochschule Niederrhein, vol. 8, pp. 39-52, Shaker Verlag (2009)
Additional background information on kd-trees can be found in the following literature:
Here is an example for looking up the three nearest neighbors to a given point from a set of sample points:
from gamera.kdtree import *
points = [(1,4), (2,4), (1,5), (3,6), (8,9),
(2,7), (4,4), (5,5), (4,6), (8,3)]
nodes = [KdNode(p) for p in points]
tree = KdTree(nodes)
# neighbors to a sample point not from the set
point = [5,6]
k = 3
knn = tree.k_nearest_neighbors(point, k)
print "%i neighbors of (%i,%i):" % (k,point[0], point[1]),
for node in knn:
print "(%i,%i)" % (node.point[0], node.point[1]),
print "" # final newline
# neighbors to a sample point from the set
# we must query k+1 neighbors, because one of them is
# the sample point (the first entry in the returned list)
point = [5,5]
k = 3
knn = tree.k_nearest_neighbors(point, k+1)
print "%i neighbors of (%i,%i):" % (k,point[0], point[1]),
for node in knn[1:]:
print "(%i,%i)" % (node.point[0], node.point[1]),
print "" # final newline
The property KdNode.data can store an arbitrary Python object associated with the point. The following example represents each connected component by its middle point and stores the actual CC with the point in the node:
ccs = image.cc_analysis()
nodes = [KdNode([cc.center.x,cc.center.y], cc) for cc in ccs]
It is also possible to search for only those neighbors that fulfil a certain predicate. To this end, you must define a function or callable class that takes a KdNode as input and returns True when it fulfils the search predicate. The following example only returns nearest neighbors below the search point:
# predicate definition as callable class
class predicate(object):
def __init__(self, point):
self.point = point
def __call__(self, node):
return (point[1] > node.point[1])
# find the three nearest neighbors below
point = (5,6)
knn = tree.k_nearest_neighbors(point, 3, predicate(point))
Beware however, that a search predicate can have the effect that less than k neighbors are returned, because too few nodes fulfil the predicate. In such a case, the runtime is O(n) instead of O(log(n)). In other words, a poorly chosen search predicate can have a considerable negative impact on the runtime. In some cases, modifying the distance metric (method set_distance) can be an alternative to a search predicate [Dalitz09].
Each KdNode has two properties:
- point
- The geometric location of the node as a sequence of coordinate numbers. The coordinate numbers can be floats or ints. This is an immutable property, because changing the geometric location of nodes belonging to an already built kd-tree obviously breaks subsequent search operations.
- data
- An arbitrary Python object connected to the location point.
KdNode (point, data = None)
The KdNode constructor creates a new node for use in a kd-tree.
point must not be of the Gamera data type Point, but a sequence of numerical values. The optional parameter data can be used to store arbitrary additional information connected to the location point.
Each kd-tree is represented by instances of the KdTree class. Even though there are general kd-tree algorithms to add and remove nodes dynamically (see [Bentley1975]), the present implementation does not support alteration of a once built tree. This has the consequence that tree nodes must be passed to the constructor of KdTree.
A KdTree has the following (read only) properties:
- dimension
- The dimension of the kd-tree. This is automatically determined by the constructor.
KdTree (nodes, distance_type = 2)
The KdTree constructor creates a new kd tree in O(n*log(n)) time from the given list of nodes.
The nodes in the list nodes must be of type KdNode. The dimension of the tree is automatically taken from the length of nodes[0].point.
The parameter distance_type specifies the distance measure that is to be used for nearest neighbor searches. It can be 0 (Linfinite or maximum norm), 1 (L1 or city block norm), or 2 (L2 or euklidean norm).
set_distance (distance_type, weights = None)
Sets the distance metrics used in subsequent k nearest neighbor searches.
distance_type can be 0 (Linfinite or maximum norm), 1 (L1 or city block norm), or 2 (L2 or euklidean norm).
weights is a list of floating point values, where each specifies a weight for a coordinate index in the distance computation. When weights are provided, the weight list must have exactly d entries, where d is the dimension of the kdtree. When no weights are provided, all coordinates are equally weighted with 1.0.
k_nearest_neighbors (point, k, predicate = None)
Returns the k nearest neighbors to the given point in O(log(n)) time. The parameter point must not be of Gamera's data type Point, but a list or tuple of numbers representing the coordinates. point must be of the same dimension as the kd-tree.
The result is a list of nodes ordered by distance from point,i.e. the closest node is the first. If your query point happens to coincide with a node, you can skip it by simply removing the first entry from the result list.
The optional parameter predicate is a function or callable class that takes a KdNode as argument and returns False when this node shall not be among the returned neighbors.
The module gamera.kdtree is only a thin Python wrapper around a C++ class KdTree. This can also be used directly in C++ plugins.
The header file kdtree.hpp declares the necessary structures in the namespace Gamera::Kdtree. It is installed with the other gamera header files, and can thus be included with
#include "geostructs/kdtree.hpp"
using namespace Gamera::Kdtree;
The tricky part is getting your plugin module to be linked with the actual kdtree implementation. This is achieved by adding the source file kdtree.cpp to the cpp_sources property in the plugin Python interface.
In the gamera core code, the following works:
class ExampleModule(PluginModule):
category = "MyPlugins"
cpp_headers = ["myplugins.hpp"]
cpp_sources = ["gamera/src/geostructs/kdtree.cpp"]
functions = [myplugin1, myplugin2]
module = ExampleModule()
In a toolkit, this will not work, because the path names in the cpp_sources property are relative to the location of the setup.py script. To allow for the use of the KdTree C++ class in toolkits, the source file kdtree.cpp is installed together with Gamera. You can thus specify this file in cpp_sources as follows:
class ExampleModule(PluginModule):
# ...
import os, sys, gamera
gamera_root = os.path.dirname(gamera.__file__)
cppfile = os.path.join(gamera_root,"gamera/src/geostructs/kdtree.cpp")
if not os.path.exists(cppfile):
gamera_root = os.path.join(sys.exec_prefix,"gamera")
cppfile = os.path.join(gamera_root,"gamera/src/geostructs/kdtree.cpp")
cpp_sources=[cppfile]
# ...
Querying the installation directory is a bit tricky, because the python setuptools do not install additional data_files in a predictable way: they might go into the gamera installation directory, which can be found out through the __file__ property of the gamera module, or they might go into sys.exec_prefix/gamera (this is what the setuptools documentation says, but apparently this does not hold on all platforms).
For normal use of the kdtree, you will need the classes CoordPoint (a typedef for vector<double>), KdNode, KdNodeVector, and KdTree. Beside the property point, a KdNode can also store an arbitrary pointer as data. See the header file kdtree.hpp for details.
Here is a usage example for a nearest neighbor search:
// set points for building the tree
KdNodeVector nodes;
double points[][2] = {
{1,4}, {2,4}, {1,5}, {3,6}, {8,9},
{2,7}, {4,4}, {5,5}, {4,6}, {8,3},
{-20,-20} // array terminator
};
size_t i;
for (i=0; points[i][0]>=-1.0; i++) {
CoordPoint p;
p.push_back(points[i][0]);
p.push_back(points[i][1]);
nodes.push_back(KdNode(p));
}
// build the tree
KdTree tree(&nodes);
// find the three nearest neighbors to (5,6)
KdNodeVector neighbors;
CoordPoint point(2);
point[0] = 5;
point[1] = 6;
tree.k_nearest_neighbors(point, 3, &neighbors);
If you want to restrict the returned neighbors to only those fulfilling a specific search predicate, you can define your search predicate as a callable class (aka functor, i.e. a class with the call operator operator() overwritten [Stroustrup1997]) derived from KdNodePredicate as in the following example:
// only search for nodes on the right hand side of search point
struct MyPredicate : public KdNodePredicate {
CoordPoint point(2);
MyPredicate(CoordPoint& p) {
point = p;
}
bool operator()(const KdNode& kn) const {
return point[0] < kn.point[0];
}
};
The call of the class only takes a single argument, the KdNode, and returns true when this node is an admissible search result. If this criterion depends on some information from the actual search point (like its x-component in the exampel above), this information must be passed to the class constructor and stored within the class.
An instance of this class can then be passed as the third, optional argument to KdTree.k_nearest_neighbors:
KdNodeVector neighbors;
CoordPoint point(2);
point[0] = 5;
point[1] = 6;
MyPredicate predicate(point);
tree.k_nearest_neighbors(point, 3, &neighbors, &predicate);
[deBerg2000] | M. de Berg, M. van Kreveld, M. Overmars, O. Schwarzkopf: Computational Geometry. Second edition, Springer (2000) |
[Bentley1975] | (1, 2) J.L. Bentley: Multidimensional Binary Search Trees Used for Associative Searching. Communications of the ACM 18, pp. 509-517 (1975) |
[Friedman1977] | J.H. Friedman, J.L. Bentley, R.A. Finkel: An Algorithm for Finding Best Matches in Logarithmic Expected Time. ACM Transcations on Mathematical Software 3, pp. 209-226 (1977) |
[Stroustrup1997] | Stroustrup, B. 1997. The C++ Programming Language: Third Edition. Reading, MA: Addison-Wesley. |
[Dalitz09] | C. Dalitz: Kd-Trees for Document Layout Analysis. In C. Dalitz (Ed.): "Document Image Analysis with the Gamera Framework." Schriftenreihe des Fachbereichs Elektrotechnik und Informatik, Hochschule Niederrhein, vol. 8, pp. 39-52, Shaker Verlag (2009) |