What is a Good Nearest Neighbors Algorithm for Finding Similar Patches in Images? |
 | |
Finding similar patches in images is a critical, yet expensive, step in a
wide variety of computer vision tasks, including object and shape recognition,
texture synthesis, and image denoising. For an image with N patches, an
exhaustive search by scanning through the entire image for all N patches
takes O(N^2) time, which can take several minutes or even hours.
However, by treating each image patch as a point in a high-dimensional space,
we can use a Nearest Neighbors (NN) algorithm to compute the exact same results
in a fraction of the time. Although a large number of general-purpose NN
algorithms have been developed over the years, applying them directly to image
search will not achieve the best results -- images have distinct structure and
unique properties that can be taken advantage of, to greatly improve search
performance.
In this work, we optimize many NN approaches specifically for finding
similar patches in images and experimentally evaluate them to compare their
performance. The following are the main contributions of our work:
- We describe several NN methods, some of which have not been widely
used in computer vision. We restrict ourselves to exact approaches which
guarantee that all nearest neighbors are found -- approximate methods are not discussed.
- We use several techniques to significantly improve the efficiencies of the
different NN algorithms (in some cases exploiting the inherent properties of
images). These include precomputing or using look-up tables for computing
distances, as well as using priority queues for certain types of searches.
- We evaluate the off-line (construction) and on-line (search) performances
of the different methods using a set of real images. We study the behaviors of
the algorithms with respect to various parameters, including patch size, image
size, number of nearest neighbors, search range, and number of input images.
We find that the vantage point tree, which has not been widely used in
vision, has the best overall performance.
|
Publications
"What is a Good Nearest Neighbors Algorithm for Finding Similar Patches in Images?," N. Kumar, L. Zhang, and S. K. Nayar, European Conference on Computer Vision (ECCV), pp.364-378, Oct, 2008. [PDF] [bib] [©]
|
Images
 |
|
Finding Similar Patches:
The closest neighbors of the patches outlined in red are shown in blue.
The matching eyes in (a) took 2125 ms to compute using an exhaustive search,
but only 1.25 ms (1700X faster) using our image-optimized implementation of the
vp-tree, one of the methods discussed in this paper. Similarly, finding
the nosetips in (b) took 1375 ms using brute force, but only 3.43 ms (401X
faster) using the vp-tree.
|
| |
|
|
 |
|
Illustration of Various Nearest Neighbors Methods:
This figure illustrates the partitioning of a 2D point set using different
types of nearest neighbor trees, all with a maximum leaf size of 1 and a
branching factor of 2. Line thickness denotes partition order (thicker lines
were partitioned first). Note the very different structures created by the
methods, which result in very different search speeds.
|
| |
|
|
 |
|
The Image Dataset used for Evaluation:
This image shows thumbnails of the 15 images used to evaluate the various
nearest neighbors algorithms. They were chosen to represent a wide variety of
image types, e.g., indoor, outdoor, portraits, etc. Note that even with this
small number of test images, there are still millions of patches to search
through.
|
| |
|
|
 |
|
Comprehensive Results:
This table shows comprehensive results for a variety of tree types. The splits
are given as branching factors for ball trees, k-means, kd-trees,
and vp-trees (k), and as bin sizes for PCA trees and
vp-trees (d). Construction costs are given as average number of
distance calculations per patch. Search speeds are given as improvement over
brute-force search. Optimal values are shown in bold.
|
| |
|
|
 |
|
Summary of Construction and Search Performance:
Summaries of the (a) construction costs and (b) search performance for
different types of NN trees. The construction costs are given as average number
of distance computations per input patch, plotted on a logarithmic scale. The
search speeds are given as ratio of time taken for brute force scan versus time
taken using the given tree type. All numbers are for best parameter settings.
Note that although ball trees and vp-trees have similarly good search
performance, the latter has a much smaller construction cost.
|
| |
|
|
 |
|
Performance Along Various Dimensions:
Search performance of the vp-tree plotted as a function of (a) search
distances in log-log, for (top) e-NN and (bottom) k-NN searches;
(b) patch size in semi-log; and (c) number of random input images (top) and video
frames (bottom). See figure for details.
|
| |
|
|
 |
|
Summary of Results:
This table show a summary of the results for each tree type. The
vp-tree performs well in all respects.
|
| |
|
|
|
Nearest Neighbor Search
Appearance Matching
|
|
|