J. Feldman, R. O'Donnell, and R. Servedio.

We propose and analyze a new vantage point for the learning of mixtures of Gaussians: namely, the PAC-style model of learning probability distributions introduced by Kearns~et~al.~\cite{KMR+:94}. Here the task is to construct a hypothesis mixture of Gaussians that is statistically indistinguishable from the actual mixture generating the data; specifically, the KL~divergence between the two mixtures should be at most $\eps$.

In this scenario, we give a $\poly(n/\eps)$-time algorithm that learns the class of mixtures of any constant number of axis-aligned Gaussians in $\R^n$. Our algorithm makes \emph{no} assumptions about the separation between the means of the Gaussians, nor does it have any dependence on the minimum mixing weight. This is in contrast to learning results known in the ``clustering'' model, where such assumptions are unavoidable.
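
For concreteness, one standard way to formalize this setting is sketched here. The symbols $k$, $w_i$, $\mu_{i,j}$, $\sigma_{i,j}$, $\mathbf{Z}$, and $\mathbf{Z}'$ are notation assumed for illustration, and the direction of the KL~divergence (target relative to hypothesis) is likewise an assumption. A mixture $\mathbf{Z}$ of $k$ axis-aligned Gaussians in $\R^n$, i.e., Gaussians with diagonal covariance matrices, has density
\[
  \mathbf{Z}(x) \;=\; \sum_{i=1}^{k} w_i \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_{i,j}} \exp\!\left(-\frac{(x_j - \mu_{i,j})^2}{2\sigma_{i,j}^2}\right),
  \qquad w_i \ge 0,\quad \sum_{i=1}^{k} w_i = 1,
\]
and a hypothesis mixture $\mathbf{Z}'$ would be considered successful if
\[
  \mathrm{KL}(\mathbf{Z} \,\|\, \mathbf{Z}') \;=\; \int_{\R^n} \mathbf{Z}(x)\,\ln\frac{\mathbf{Z}(x)}{\mathbf{Z}'(x)}\,dx \;\le\; \eps .
\]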

Our algorithm relies on the method of moments and on a subalgorithm developed in~\cite{FOS:05} for a discrete mixture-learning problem.
