A. De and I. Diakonikolas and R. Servedio.

In

This paper studies the problem of learning ``low-complexity" probability distributions over the Boolean hypercube $\{-1,1\}^n$. As in the standard PAC learning model, a learning problem in our framework is defined by a class ${\cal C}$ of Boolean functions over $\{-1,1\}^n$, but in our model the learning algorithm is given uniform random satisfying assignments of an unknown $f \in \calC$ and its goal is to output a high-accuracy approximation of the uniform distribution over $f^{-1}(1).$ This distribution learning problem may be viewed as a demanding variant of standard Boolean function learning, where the learning algorithm only receives positive examples and --- more importantly --- must output a hypothesis function which has small \emph{multiplicative} error (i.e. small error relative to the size of $f^{-1}(1)$).

As our main results, we show that the two most widely studied classes of Boolean functions in computational learning theory --- linear threshold functions and DNF formulas --- have efficient distribution learning algorithms in our model. Our algorithm for linear threshold functions runs in time poly$(n,1/\eps)$ and our algorithm for polynomial-size DNF runs in time quasipoly$(n,1/\eps)$. We obtain both these results via a general approach that combines a broad range of technical ingredients, including the complexity-theoretic study of approximate counting and uniform generation; the Statistical Query model from learning theory; and hypothesis testing techniques from statistics. A key conceptual and technical ingredient of this approach is a new kind of algorithm which we devise called a ``densifier'' and which we believe may be useful in other contexts.

We also establish limitations on efficient learnability in our model by showing that the existence of certain types of cryptographic signature schemes imply that certain learning problems in our framework are computationally hard. Via this connection we show that assuming the existence of sufficiently strong unique signature schemes, there are no sub-exponential time learning algorithms in our framework for intersections of two halfspaces, for degree-2 polynomial threshold functions, or for monotone 2-CNF formulas. Thus our positive results for distribution learning come close to the limits of what can be achieved by efficient algorithms.

pdf of conference version

pdf of full version

Latvian translation of this page courtesy of Simona Auglis

Polish translation of this page courtesy of Nadia Karbowska