A major open problem in the theory of machine learning is to develop an efficient algorithm for learning Boolean formulas in Disjunctive Normal Form (DNF). This thesis reports progress towards such an algorithm. We derive a new algebraic characterization of DNF formulas and use this characterization to obtain a DNF learning algorithm which is substantially faster than the best previous approaches. The running time of our algorithm is exponential in the cube root of the number of variables in the DNF formula, as opposed to previous bounds which were exponential in the square root of the number of variables.
The two most important resources for a learning algorithm are computation time and input data; an ideal learning algorithm would simultaneously optimize its use of both resources. Using techniques from cryptography, we show that even very simple learning problems can exhibit strong inherent tradeoffs between these two resources. We describe a class of decision lists and prove that these structures can be learned in polynomial time from a large data set but cannot be learned in polynomial time using a minimal (constant size) data set. These results establish that for some natural learning problems ``ideal'' learning algorithms simultaneously optimizing both resources cannot exist.
Perceptron and Winnow are two classical algorithms for learning a linear separator which have some remarkable properties. We demonstrate a close connection between boosting, a learning technique which has gained wide use in recent years, and these algorithms for learning linear classifiers. Using boosting we construct a new family of learning algorithms for linear classifiers which closely match the sample complexity and noise tolerance of algorithms such as Perceptron and Winnow. We thus help unify the seemingly disparate topics of boosting and these classical learning algorithms.
We also present new algorithms which give quantitative performance improvements for a range of well-studied problems in learning theory. We present a new boosting algorithm which is guaranteed to use intermediate probability distributions which are optimally smooth. We use this smooth boosting algorithm to give the fastest known algorithm for learning DNF under the uniform distribution using membership queries, and also to improve on known hard-core set constructions in complexity theory. We also describe a fast, noise tolerant algorithm for learning linear classifiers under symmetric distributions. Finally, we give the fastest known algorithm for learning monotone DNF under the uniform distribution without membership queries.
Postscript or pdf.
Back to main papers page