Learning mixtures of product distributions over discrete domains

Learning mixtures of product distributions over discrete domains.
J. Feldman and R. O'Donnell and R. Servedio.
SIAM Journal on Computing, 37(5), 2008, pp. 1536--1564.
Preliminary version in 46th Symposium on Foundations of Computer Science (FOCS) , 2005, pp. 501--510.

Abstract:

We consider the problem of learning {\em mixtures of product distributions over discrete domains} in the distribution learning framework introduced by Kearns et al.~\cite{KMR+:94}. We give a $\poly(n/\eps)$ time algorithm for learning a mixture of $k$ arbitrary product distributions over the $n$-dimensional Boolean cube $\{0,1\}^n$ to accuracy $\eps$, for any constant $k$. Previous polynomial time algorithms could only achieve this for $k = 2$ product distributions; our result answers an open question stated independently in~\cite{Cryan:99} and~\cite{FreundMansour:99}. We further give evidence that no polynomial time algorithm can succeed when $k$ is superconstant, by reduction from a notorious open problem in PAC learning. Finally, we generalize our $\poly(n/\eps)$ time algorithm to learn any mixture of $k = O(1)$ product distributions over $\{0,1, \dots, b\}^n$, for any $b = O(1)$.

pdf of conference version

pdf of journal version

Back to main papers page