Geometric Generalizations of the Power of Two Choices
MICHAEL MITZENMACHER
(Joint work with John Byers and Jeffery Considine.)
A well-known paradigm for load balancing in distributed systems is the
``power of two choices,'' whereby an item is stored at the less loaded
of two (or more) random alternative servers. We investigate the power
of two choices in natural settings for distributed computing where
items and servers reside in a geometric space and each item is
associated with the server that is its nearest neighbor. This is in
fact the backdrop for distributed hash tables such as Chord, where the
geometric space is determined by clockwise distance on a
one-dimensional ring. Theoretically, we consider the following load
balancing problem. Suppose that servers are initially hashed
uniformly at random to points in the space. Sequentially, each item
then considers $d$ candidate insertion points also chosen uniformly at
random from the space, and selects the insertion point whose
associated server has the least load. For the one-dimensional ring,
and for Euclidean distance on the two-dimensional torus, we
demonstrate that when $n$ data items are hashed to $n$ servers, the
maximum load at any server is $\log \log n/ \log d + O(1)$ with high
probability. While our results match the well-known bounds in the
standard setting in which each server is selected equiprobably, our
applications do not have this feature, since the sizes of the
nearest-neighbor regions around servers are non-uniform. Therefore,
the novelty in our methods lies in developing appropriate tail bounds
on the distribution of nearest-neighbor region sizes and in adapting
previous arguments to this more general setting.