COMS 4252: Introduction to Computational Learning Theory
Fall 2009

SYLLABUS

General Information | Introduction | Topics | Prerequisites | Grading | Readings | Problem Sets | Final Projects | Schedule of Topics | Lecture Notes

General Information

Instructor: Rocco Servedio. Email: rocco-at-cs-dot-columbia-dot-edu. Office: 517 CSB. Office hours: Tues 12:15-2:15.
Teaching assistants: Li-Yang Tan (Email: liyang-at-cs-dot-columbia-dot-edu; Office: 511 CSB; Office hours: Tues 10-11, Thurs 10-11) and Andrew Wan (Email: atw12-at-columbia-dot-edu; Office: 516 CSB; Office hours: Wed 11:00-12:00,1:00-2:00). Classroom: 535 Mudd
Time: Tues/Thurs 11:00-12:15.
Course website (this page): http://www.cs.columbia.edu/~cs4252/

Introduction

The question "Can machines learn from experience?" is one that has fascinated people for a long time. Over the past few decades, many researchers in computer science have studied this question from a range of applied and theoretical perspectives. This course will give an introduction to some of the central topics in computational learning theory, a field which studies the above question from a theoretical computer science perspective. We will study well-defined mathematical models of learning in which it is possible to give precise and rigorous analyses of learning problems and learning algorithms. A big focus of the course will be the computational efficiency of learning in these models. We'll develop computationally efficient algorithms for certain learning problems, and will see why efficient algorithms are not likely to exist for other problems.

List of Topics

This is a preliminary list of core topics. Other topics may be covered as well depending on how the semester progresses. Most topics will take several lectures.

Prerequisites

You should be familiar with the following topics:

Some basic knowledge of the theory of NP-completeness (e.g. Chapter 7 of Sipser or Chapter 34 of CLRS) and of elementary cryptography (e.g. pages 881-887 of CLRS) would be helpful but is not required -- the course will be completely self-contained with respect to these topics.

Grading and Requirements

The requirements of the course are as follows:

Homework policy: You are encouraged to discuss the course material and the homework problems with each other in small groups (2-3 people), but you must list all discussion partners on your problem set. Discussion of homework problems may include brainstorming and verbally walking through possible solutions, but should not include one person telling the others how to solve the problem.  In addition, each person must write up their solutions independently;  you may not look at another student's written solutions. You may consult outside materials, but all materials used must be appropriately acknowledged, and you must always write up your solutions in your own words. Violation of this policy will result in a penalty to be assessed at the instructor's discretion. This may include receiving a zero grade for the assignment in question AND a failing grade for the whole course, even for the first infraction.

Students are expected to adhere to the Academic Honesty policy of the Computer Science Department; this policy can be found in full here. Please contact the instructor with any questions.

Re-grade policy: If you dispute the grade received for an assignment, you must submit, in writing (Latex, not email), your detailed and clearly stated argument for what you believe is incorrect and why. This must be submitted no later than the beginning of class 1 week after the assignment was returned. (For example, if the assignment was returned to the class on Tuesday, your write-up is due by beginning of class on the next Tuesday). Requests for re-grade after this time will not be accepted. A written response will be provided within one week indicating your final score. Requests of re-grade of a specific problem may result in a regrade of the entire assignment. This re-grade and written response is final; no additional re-grades or debate for that assignment.

Readings

The textbook for this course is:

This book is available for purchase on-line and at the Columbia Bookstore; it's also available for reserve in the engineering library, but you will likely want to have your own copy. It's an excellent book, but several topics which we'll cover (particularly early in the semester) are not in the book. A good source for material which we will cover in the first few weeks of class are the following two papers:

For the unit on boosting, a good reference is Rob Schapire's survey paper "The Boosting approach to machine learning: An overview". The first three sections give a (very condensed) treatment of the AdaBoost algorithm we'll cover in class, and later sections have good overviews of more advanced material (this is a good starting point for a project on boosting). Of course, the original paper on Adaboost, "A decision-theoretic generalization of on-line learning and an application to boosting", is a good sources as well; you can find it near the bottom of this page . Also, click here for a handy copy of the AdaBoost algorithm.

Problem Sets

Click here to get to the homework page.

Final Projects

Click here for more information about final projects.

Schedule of Topics

Here is an anticipated schedule of lectures:

Lecture Notes

Click here to obtain SmartBoard notes from lectures.