The problem of finding the closest point in high-dimensional spaces is
common in pattern recognition. Unfortunately, the complexity of most existing
search algorithms, such as k-d tree and R-tree, grows exponentially with
dimension, making them impractical for dimensionality above 15 or so. In nearly
all applications, the closest point is of interest only if it lies within a
user-specified distance e. We have developed a simple and practical algorithm
to efficiently search for the nearest neighbor within Euclidean distance e. The
use of projection search combined with a novel data structure dramatically
improves performance in high dimensions. A complexity analysis has been done
which aids in automatically determining e in structured problems. A
comprehensive set of benchmarks clearly demonstrates the benefits of the
proposed algorithm for a variety of structured and unstructured search
problems. Object recognition by appearance matching is demonstrated as an
example application. The simplicity of the algorithm makes it possible to
construct an inexpensive hardware search engine which can be at least 100 times
faster than its software equivalent. A C++ implementation of our algorithm has
been made available. |