First, I consider the problem of using neural networks (radial basis function networks) for pattern classification and regression. I investigate the number of parameters and the number of examples needed to achieve a certain generalization error with prescribed confidence. The generalization error is due in part to representational inadequacy (a finite number of parameters) and in part to informational inadequacy (a finite number of examples). I bound each of these two contributions. In doing so, I characterize (a) the inherent tension between these two sources of error: attempting to reduce one increases the other; (b) the class of problems effectively solved by regularization networks; and (c) how to choose an appropriately sized network for such a class of problems.
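The tension described above can be made concrete with a small numerical sketch. The following is not from the thesis itself but a minimal illustration, assuming a one-dimensional regression task, Gaussian basis functions with evenly spaced centers, and least-squares output weights: with the number of parameters held fixed, increasing the number of examples shrinks the informational (estimation) component of the test error, while the representational (approximation) component is set by the network size.

```python
import numpy as np

def rbf_design(x, centers, width):
    # Gaussian radial basis functions evaluated at the points x.
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

def fit_rbf(x, y, n_centers, width=0.2):
    # Place centers evenly over the input range and solve for the
    # output-layer weights by linear least squares.
    centers = np.linspace(x.min(), x.max(), n_centers)
    Phi = rbf_design(x, centers, width)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, w

def predict(x, centers, w, width=0.2):
    return rbf_design(x, centers, width) @ w

rng = np.random.default_rng(0)
target = lambda x: np.sin(2 * np.pi * x)   # hypothetical target function
x_test = np.linspace(0, 1, 200)

# Fixed network size (15 parameters); vary the number of noisy examples.
mse = {}
for n_examples in (20, 200):
    x = rng.uniform(0, 1, n_examples)
    y = target(x) + rng.normal(0, 0.1, n_examples)
    centers, w = fit_rbf(x, y, n_centers=15)
    mse[n_examples] = np.mean((predict(x_test, centers, w) - target(x_test)) ** 2)
    print(f"n={n_examples:4d}  test MSE={mse[n_examples]:.4f}")
```

With few examples relative to the number of centers, the fit tracks the noise; with many examples the test error approaches the floor set by how well 15 Gaussians can represent the target at all.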
Next, I consider the problem of human language acquisition within the principles-and-parameters framework of Chomsky. I show how language acquisition reduces to parameter estimation within this framework and analyze the informational complexity of such parameter estimation problems in certain settings. I comment on similarities between the two learning problems and on various related issues, including the paradoxical problem of language change.
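The reduction of acquisition to parameter estimation can be caricatured in a few lines of code. The sketch below is not from the thesis: it assumes a hypothetical grammar space of binary parameter vectors and sentences that act as direct "triggers", each revealing the target's setting for one parameter. An error-driven learner flips a parameter only when the incoming sentence contradicts its current hypothesis; the number of sentences needed before convergence is one crude measure of informational complexity.

```python
import random

def acquire(target, rng, max_sentences=10_000):
    """Toy error-driven parameter setter. Each 'sentence' is a trigger
    drawn from the target grammar: a parameter index i whose correct
    value target[i] the learner can read off. The learner changes its
    hypothesis only on a parsing failure (a mismatch at position i)."""
    n = len(target)
    hyp = [rng.randrange(2) for _ in range(n)]   # random initial grammar
    for t in range(1, max_sentences + 1):
        i = rng.randrange(n)                     # target utters a trigger
        if hyp[i] != target[i]:                  # sentence unparsable:
            hyp[i] = target[i]                   # flip the offending bit
        if hyp == list(target):
            return hyp, t                        # converged to the target
    return hyp, max_sentences

rng = random.Random(0)
hyp, steps = acquire((1, 0, 1, 1, 0), rng)
print(hyp, steps)
```

Under these assumptions the sample complexity is a coupon-collector quantity: with n parameters, roughly n ln n triggers are expected before every parameter has been witnessed, which is one way the informational cost of acquisition can be quantified.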