What is a Good Nearest Neighbors Algorithm for Finding Similar Patches in Images?

Finding similar patches in images is a critical, yet expensive, step in a wide variety of computer vision tasks, including object and shape recognition, texture synthesis, and image denoising. For an image with N patches, an exhaustive search by scanning through the entire image for all N patches takes O(N^2) time, which can take several minutes or even hours. However, by treating each image patch as a point in a high-dimensional space, we can use a Nearest Neighbors (NN) algorithm to compute the exact same results in a fraction of the time. Although a large number of general-purpose NN algorithms have been developed over the years, applying them directly to image search will not achieve the best results -- images have distinct structure and unique properties that can be exploited to greatly improve search performance.
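The quadratic cost of the exhaustive scan is easy to see in code. Below is a minimal, self-contained sketch (not the paper's implementation) that flattens each s-by-s patch of a toy grayscale image into a high-dimensional point and finds a given patch's nearest neighbor by brute force; the names `patches`, `ssd`, and `nearest` are illustrative:

```python
def patches(img, s):
    """Slide an s-by-s window over a 2D list and flatten each patch
    into a tuple -- i.e., a point in an (s*s)-dimensional space."""
    h, w = len(img), len(img[0])
    return [tuple(img[r + i][c + j] for i in range(s) for j in range(s))
            for r in range(h - s + 1) for c in range(w - s + 1)]

def ssd(a, b):
    # sum of squared differences, a standard patch distance
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(ps, idx):
    """Exhaustive scan: compare patch idx against every other patch.
    O(N) per query, hence O(N^2) to match all N patches."""
    best_j, best_d = None, float("inf")
    for j, p in enumerate(ps):
        if j != idx:
            d = ssd(ps[idx], p)
            if d < best_d:
                best_j, best_d = j, d
    return best_j, best_d
```

Running `nearest` once per patch reproduces the all-pairs search described above; the tree-based methods in the paper compute exactly the same answers while visiting far fewer candidates.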

In this work, we optimize many NN approaches specifically for finding similar patches in images and experimentally evaluate them to compare their performance. The following are the main contributions of our work:

  • We describe several NN methods, some of which have not been widely used in computer vision. We restrict ourselves to exact approaches which guarantee that all nearest neighbors are found -- approximate methods are not discussed.
  • We use several techniques to significantly improve the efficiency of the different NN algorithms (in some cases exploiting the inherent properties of images). These include precomputing distances or using look-up tables to compute them, as well as using priority queues for certain types of searches.
  • We evaluate the off-line (construction) and on-line (search) performances of the different methods using a set of real images. We study the behaviors of the algorithms with respect to various parameters, including patch size, image size, number of nearest neighbors, search range, and number of input images. We find that the vantage point tree, which has not been widely used in vision, has the best overall performance.
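As one illustration of the priority-queue technique mentioned above, the sketch below (our own hypothetical code, not the authors') keeps the k best matches found so far in a bounded max-heap. The current k-th smallest distance, which tree searches use as a pruning radius, is then always available in O(1) at the heap root:

```python
import heapq

def k_best(ps, q, k):
    """Track the k nearest patches to query point q with a max-heap.
    Distances are negated because heapq implements a min-heap, so the
    root holds the worst of the current k best candidates."""
    heap = []  # entries are (-distance, index)
    for i, p in enumerate(ps):
        d = sum((x - y) ** 2 for x, y in zip(q, p))
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            # closer than the current k-th best: replace the root
            heapq.heapreplace(heap, (-d, i))
        # -heap[0][0] is the pruning radius a tree search would use
    return sorted((-nd, i) for nd, i in heap)  # (distance, index), ascending
```

In a tree-based k-NN search the same heap lets entire subtrees be skipped whenever their minimum possible distance exceeds the root value.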


"What is a Good Nearest Neighbors Algorithm for Finding Similar Patches in Images?,"
N. Kumar, L. Zhang, and S. K. Nayar,
European Conference on Computer Vision (ECCV),
pp. 364-378, Oct. 2008.


  Finding Similar Patches:

The closest neighbors of the patches outlined in red are shown in blue. The matching eyes in (a) took 2125 ms to compute using an exhaustive search, but only 1.25 ms (1700X faster) using our image-optimized implementation of the vp-tree, one of the methods discussed in this paper. Similarly, finding the nose tips in (b) took 1375 ms using brute force, but only 3.43 ms (401X faster) using the vp-tree.

  Illustration of Various Nearest Neighbors Methods:

This figure illustrates the partitioning of a 2D point set using different types of nearest neighbor trees, all with a maximum leaf size of 1 and a branching factor of 2. Line thickness denotes partition order (thicker lines were partitioned first). Note the very different structures created by the methods, which result in very different search speeds.
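To make the vantage point tree's partitioning concrete, here is a minimal sketch of vp-tree construction and pruned nearest-neighbor search. It follows the standard formulation (random vantage point, median-distance split, triangle-inequality pruning) rather than the image-optimized implementation evaluated in the paper; all names are illustrative:

```python
import math, random

def dist(a, b):
    # Euclidean distance between two equal-length point tuples
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class VPNode:
    def __init__(self, vp, mu, inside, outside):
        self.vp, self.mu = vp, mu          # vantage point and split radius
        self.inside, self.outside = inside, outside

def build(points, rng=random.Random(0)):
    """Pick a vantage point, split the rest at the median distance mu:
    points closer than mu go inside, the others outside."""
    if not points:
        return None
    vp = points[rng.randrange(len(points))]
    rest = [p for p in points if p is not vp]
    if not rest:
        return VPNode(vp, 0.0, None, None)
    mu = sorted(dist(vp, p) for p in rest)[len(rest) // 2]
    inside = [p for p in rest if dist(vp, p) < mu]
    outside = [p for p in rest if dist(vp, p) >= mu]
    return VPNode(vp, mu, build(inside, rng), build(outside, rng))

def nn(node, q, best=None):
    """Exact 1-NN search; best is (distance, point). Subtrees are visited
    only if the triangle inequality says they could hold a closer point."""
    if node is None:
        return best
    d = dist(q, node.vp)
    if best is None or d < best[0]:
        best = (d, node.vp)
    if d < node.mu:                        # q falls inside the ball
        best = nn(node.inside, q, best)
        if d + best[0] >= node.mu:         # outside shell may still intersect
            best = nn(node.outside, q, best)
    else:                                  # q falls outside the ball
        best = nn(node.outside, q, best)
        if d - best[0] <= node.mu:         # inside ball may still intersect
            best = nn(node.inside, q, best)
    return best
```

Because each split is a distance threshold around the vantage point, the partitions are concentric shells, which is exactly the structure visible in the vp-tree panel of the figure.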

  The Image Dataset used for Evaluation:

This image shows thumbnails of the 15 images used to evaluate the various nearest neighbors algorithms. They were chosen to represent a wide variety of image types, e.g., indoor, outdoor, portraits, etc. Note that even with this small number of test images, there are still millions of patches to search through.

  Comprehensive Results:

This table shows comprehensive results for a variety of tree types. The splits are given as branching factors for ball trees, k-means, kd-trees, and vp-trees (k), and as bin sizes for PCA trees and vp-trees (d). Construction costs are given as average number of distance calculations per patch. Search speeds are given as improvement over brute-force search. Optimal values are shown in bold.

  Summary of Construction and Search Performance:

Summaries of the (a) construction costs and (b) search performance for different types of NN trees. The construction costs are given as the average number of distance computations per input patch, plotted on a logarithmic scale. The search speeds are given as the ratio of the time taken by a brute-force scan to the time taken using the given tree type. All numbers are for the best parameter settings. Note that although ball trees and vp-trees have similarly good search performance, the latter has a much smaller construction cost.

  Performance Along Various Dimensions:

Search performance of the vp-tree plotted as a function of (a) search distances in log-log, for (top) ε-NN and (bottom) k-NN searches; (b) patch size in semi-log; and (c) number of random input images (top) and video frames (bottom). See figure for details.

  Summary of Results:

This table shows a summary of the results for each tree type. The vp-tree performs well in all respects.

