Learning mixtures of structured distributions over discrete domains
S. Chan and I. Diakonikolas and R. Servedio and X. Sun.
In ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013, pp. 1380--1394.

Abstract:

Let $\mathfrak{C}$ be a class of probability distributions over the discrete domain $[n] = \{1,\dots,n\}.$ We show that if $\mathfrak{C}$ satisfies a rather general condition -- essentially, that each distribution in $\mathfrak{C}$ can be well-approximated by a variable-width histogram with few bins -- then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of $k$ unknown distributions from $\mathfrak{C}.$

We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems as described below. More precisely,

• Log-concave distributions: We learn any mixture of $k$ log-concave distributions over $[n]$ using $k \cdot \tilde{O}(1/\eps^4)$ samples (independent of $n$) and running in time $\tilde{O}(k \log(n) / \eps^4)$ bit-operations (note that reading a single sample from $[n]$ takes $\Theta(\log n)$ bit operations). For the special case $k=1$ we give an efficient algorithm using $\tilde{O}(1/\eps^3)$ samples; this generalizes the main result of \cite{DDS12stoc} from the class of Poisson Binomial Distributions to the much broader class of all log-concave distributions. Our upper bounds are not far from optimal since any algorithm for this learning problem requires $\Omega(k/\eps^{5/2})$ samples.

• Monotone hazard rate (MHR) distributions: We learn any mixture of $k$ MHR distributions over $[n]$ using $O(k \log (n/\eps)/\eps^4)$ samples and running in time $\tilde{O}(k \log^2(n) / \eps^4)$ bit-operations. Any algorithm for this learning problem must use $\Omega(k \log(n)/\eps^3)$ samples.

• Unimodal distributions: We give an algorithm that learns any mixture of $k$ unimodal distributions over $[n]$ using $O(k \log (n)/\eps^{4})$ samples and running in time $\tilde{O}(k \log^2(n) / \eps^{4})$ bit-operations. Any algorithm for this problem must use $\Omega(k \log(n)/\eps^3)$ samples.