# Boosting and Naive Bayesian Learning

### Abstract

So-called "naive" Bayesian (NB) learning makes the unrealistic assumption
that the values of the attributes of an example are independent given the
class of the example. Nevertheless, this learning method is remarkably
successful in practice, and no uniformly better learning method is known.
Boosting, due to Yoav Freund and Rob Schapire, is a general method for
combining multiple classifiers. This talk shows that boosting applied to NB classifiers
yields combination classifiers that are representationally equivalent to
standard feedforward multilayer perceptrons. A further result is that NB
learning is a nonparametric, nonlinear generalization of logistic regression.
Despite the representational similarity, as a training algorithm, boosted
naive Bayesian (BNB) learning is quite different from backpropagation.
BNB has definite advantages. It requires only linear time and constant
space, and "hidden" nodes are learned incrementally, starting with the
most important. On the real-world datasets on which the method has been
tried so far, generalization performance is as good as or better than the
best published result using any other learning method, and BNB was the
winner of the data mining competition at the recent KDD conference.
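To make the combination concrete, here is a minimal sketch of boosting over naive Bayesian base classifiers: a one-pass weighted NB learner (linear time, one counter per attribute-value-class triple) wrapped in discrete AdaBoost. This is an illustration of the general recipe, not the exact BNB algorithm from the talk; all function names are hypothetical, and the add-one smoothing denominator is simplified for brevity.

```python
import math
from collections import defaultdict

def train_weighted_nb(X, y, w):
    """One-pass weighted naive Bayes over categorical attributes:
    linear time, one counter per (attribute, value, class) triple."""
    prior = defaultdict(float)    # weighted class counts
    cond = defaultdict(float)     # weighted (attr, value, class) counts
    for xi, yi, wi in zip(X, y, w):
        prior[yi] += wi
        for j, v in enumerate(xi):
            cond[(j, v, yi)] += wi
    labels = sorted(prior)

    def predict(x):
        best, best_score = None, -math.inf
        for c in labels:
            # log P(c) + sum_j log P(x_j | c), with add-one smoothing
            # (denominator assumes binary attributes, for brevity)
            score = math.log(prior[c])
            for j, v in enumerate(x):
                score += math.log((cond[(j, v, c)] + 1.0) / (prior[c] + 2.0))
            if score > best_score:
                best, best_score = c, score
        return best
    return predict

def boosted_nb(X, y, rounds=5):
    """Discrete AdaBoost with weighted-NB base classifiers."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        h = train_weighted_nb(X, y, w)
        err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
        if err <= 0.0 or err >= 0.5:     # perfect or useless: stop early
            ensemble.append((1.0, h))
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, h))
        # reweight: raise the weight of mistakes, lower the rest
        w = [wi * math.exp(alpha if h(xi) != yi else -alpha)
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]

    def predict(x):
        votes = defaultdict(float)
        for alpha, h in ensemble:
            votes[h(x)] += alpha
        return max(votes, key=votes.get)
    return predict
```

Each round adds one weighted base classifier, so the "hidden units" of the resulting combination are learned incrementally, with the earliest (highest-weight) rounds fitting the bulk of the data.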

Unlike backpropagation and other standard learning methods, NB learning
can be done in logarithmic time with a linear number of processors. Accordingly,
it is more plausible neurocomputationally than other methods as a model
of animal learning. A review of experimentally established phenomena of
primate supervised learning leads to the conclusion that NB learning is
also more plausible behaviorally, since it alone exhibits the same phenomena.
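The parallel-time claim rests on the fact that NB training reduces to counting: sufficient statistics computed on disjoint shards of the data combine by addition, and because addition is associative, a balanced tree of merges has logarithmic depth given one processor per shard. A minimal sequential illustration of that structure (helper names are hypothetical):

```python
from collections import Counter
from functools import reduce

def shard_counts(shard):
    """Sufficient statistics for naive Bayes from one shard of
    examples: class counts and (attribute, value, class) counts."""
    c = Counter()
    for x, y in shard:
        c[('class', y)] += 1
        for j, v in enumerate(x):
            c[(j, v, y)] += 1
    return c

def merge(a, b):
    """Counters add associatively, so shard statistics combine in
    any order -- a balanced merge tree gives logarithmic depth."""
    out = a.copy()
    out.update(b)   # Counter.update adds counts rather than replacing
    return out

# three single-example shards, merged into one global count table
shards = [[(('a', 'x'), 0)], [(('b', 'y'), 1)], [(('a', 'y'), 0)]]
total = reduce(merge, map(shard_counts, shards))
```

The `reduce` here is sequential for simplicity; the point is that nothing in `merge` depends on the order of combination, which is exactly the property a logarithmic-depth parallel reduction needs.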

*Luis Gravano*

*gravano@cs.columbia.edu*