pyANN

pyANN provides Python bindings for David Mount and Sunil Arya's ANN library for approximate nearest neighbor searching. Like ANN, pyANN is licensed under LGPL.

Example

>>> import pyANN
>>> kdtree = pyANN.KDTree([[1,1,1], [1,2,3], [4,5,6]])
>>> kdtree.kSearch([0,0,0], 2, 0)
([0, 1], [3.0, 14.0])

Download

The easiest way to obtain pyANN is to check it out from my Mercurial repository:

hg clone http://hg.mrzv.org/pyANN

Alternatively, one can download the tarball.

Building and Installing

First of all, one needs to patch ANN to make it compile with more recent versions of GCC and to make it build shared libraries under Linux. If you are not under Linux and have an older version of GCC (below 4.3), you can safely skip these steps.

Download the gcc43.patch and shared-libs.patch. Within the directory where you've extracted ANN, run:

patch -p1 < .../gcc43.patch
patch -p1 < .../shared-libs.patch
make linux-g++-sl

(assuming you are on Linux).

Note

One can get a distribution of ANN with the above patches applied as well as other changes (e.g. reentrant annkSearch() method) from my repository:

hg clone http://hg.mrzv.org/ANN

Besides ANN, pyANN depends on Boost.Python. To compile and install the bindings, if you have waf installed, run:

waf configure
waf build
waf install

Otherwise, there is the original simple Makefile, which you can tweak (e.g., adjust paths), and then make will compile the bindings. You will have to manually put them somewhere on your PYTHONPATH.

KDTree class

The main content of pyANN is the class KDTree. The class is initialized with a list of points, and provides three key methods:

__init__(lst)
Initialize KDTree with a list of points (each point is a list of coordinates).
kSearch(q,k,eps)
Find k nearest neighbors of the query point q with an error of eps. Returns a pair of lists: (idxs, dists). The first is the list of indices of the nearest neighbors; the second is the list of squared distances to the corresponding points from the query point.
kPriSearch(q,k,eps)
Same as above using priority search.
kFRSearch(q, sqRad, k, eps)
Fixed radius search. Find at most k nearest neighbors of the query point q within radius sqRad, with an allowed error of eps. Returns (idx,dists,k); the first two are the same as above, and the last is the number of neighbors found.
__len__()
Returns number of points in the KDTree.
dim()
Dimensions of the point set.

Additional auxiliary (global) function mimicks ANN's functionality:

max_pts_visit(maxPts)
Maximum number of points to visit before terminating (will override larger values of k in the above search functions).