COMS 4252:
Introduction to Computational Learning Theory
Fall 2009
SYLLABUS
General Information |
Introduction | Topics |
Prerequisites
| Grading | Readings
| Problem Sets
| Final Projects
| Schedule of Topics
| Lecture Notes
General Information
Instructor: Rocco Servedio. Email: rocco-at-cs-dot-columbia-dot-edu. Office: 517 CSB. Office hours: Tues 12:15-2:15.
Teaching assistants:
Li-Yang Tan (Email: liyang-at-cs-dot-columbia-dot-edu; Office: 511 CSB;
Office hours: Tues 10-11, Thurs 10-11) and Andrew Wan
(Email: atw12-at-columbia-dot-edu; Office: 516 CSB; Office hours: Wed 11:00-12:00,1:00-2:00).
Classroom: 535 Mudd
Time: Tues/Thurs 11:00-12:15.
Course website (this page): http://www.cs.columbia.edu/~cs4252/
Introduction
The question "Can machines learn from experience?"
is one that has fascinated people for a long time.
Over the past few decades, many researchers in computer science
have studied this question from a range of applied and
theoretical perspectives.
This course will give an introduction to some of the central
topics in computational learning theory, a field
which studies the above question from a theoretical computer science
perspective. We will study
well-defined mathematical models of learning in which it is possible
to give precise and rigorous analyses of learning problems and
learning algorithms.
A big focus of the course will be the computational efficiency
of learning in these models. We'll develop computationally
efficient algorithms for certain learning problems, and will see
why efficient algorithms are not likely to exist for other
problems.
List of Topics
This is a preliminary list of core topics.
Other topics may be covered as well depending on how
the semester progresses. Most topics will take several lectures.
-
Introduction: what is computational learning theory (and why)?
-
The online mistake-bound learning model. Online algorithms for simple concept
classes.
-
General algorithms for online learning. Lower bounds for
online learning.
-
The Probably Approximately Correct (PAC) learning model: definition
and examples. Online to PAC conversions.
- Occam's Razor: learning by finding a consistent hypothesis.
-
The VC dimension and uniform convergence.
-
Weak versus strong learning: boosting algorithms.
-
Computational hardness results for efficient learning based on cryptography.
-
PAC learning from noisy data. Malicious noise and
random classification noise. Learning from Statistical Queries.
-
Exact learning from membership and equivalence queries.
Learning monotone DNF and learning finite automata.
Prerequisites
You should be familiar with the following topics:
-
Important: Big-O notation and basic analysis of algorithms.
Chapter 3 in "Introduction to Algorithms" by
Cormen, Leicerson, Rivest and Stein is more than sufficient.
-
Important: Basic discrete math and probability. The course
Computer Science 3203 and the "Mathematical Background" section
(Appendix VIII) of CLRS are good sources here.
-
Important: You should be comfortable with
reading and writing proofs.
Chapter 0 of Sipser's book "Introduction to the Theory of Computation"
is a good source here.
-
Less important but helpful:
Basic knowledge of finite automata would be helpful but isn't
absolutely required.
Chapter 1 of Sipser's book
"Introduction to the Theory of Computation," or the coverage
of finite automata in CS 3261, is more than sufficient.
Some basic knowledge of the theory of NP-completeness (e.g. Chapter 7 of
Sipser or Chapter 34 of CLRS) and of elementary cryptography
(e.g. pages 881-887 of CLRS) would be helpful but is not required --
the course will be completely self-contained with respect to these topics.
Grading and Requirements
The requirements of the course are as follows:
-
Biweekly problem sets (70% of grade). Your solutions must be typed
using LaTeX (instructions will be given on how to do this.)
The biweekly problem sets will be due on Thursdays by 11:00am (start of class);
you must bring a hard copy to class and give it to the TA.
You are allowed six late days for the semester.
Each late day is exactly 24 hours; late days
cannot be subdivided -- five minutes late counts as one late day.
Note also that problem sets cannot be subdivided with respect to late days
(i.e. you can't use 1/6 of a late day on a 6-problem problem set by
handing in one problem one day late.)
Late days over and beyond the allotted six late days will be
penalized at a rate of 10% per late day (with the same system in place;
five minutes late counts as one late day).
For
an exception, you must have your undergraduate advisor (for undergrads) or your
graduate advisor (for graduate students) contact me.
Some problems
will be challenging; you are advised to start the problem
sets early. Late problem sets should be emailed to the TAs.
-
Final project (30% of grade).
You may do an experimental project or a theoretical one. An experimental
project might involve implementing and testing some learning algorithm.
A theoretical project might involve reading one or more research papers
to get a more in-depth understanding of some topic we touched on in class,
or some other topic that you're interested in.
For either type of project, you'll give a brief presentation
and submit a written report. Talk to me before settling on a project topic.
Final Projects will be due in December (exact date TBA, but it will be
around the last day of classes).
-
Classroom Participation.
Students are expected to come to class and participate. In cases
where a student's grade is on the borderline between two grades,
coming to class and making a contribution through participation can
make the difference.
Homework policy:
You are encouraged to discuss the course material and the homework
problems
with each other in small groups (2-3 people), but you must list all
discussion
partners on your problem set.
Discussion of homework problems may include
brainstorming and verbally walking through possible solutions, but should
not include one person telling the others how to solve the problem.
In addition, each person must write up their solutions
independently;
you may not look at another student's written solutions.
You may consult outside materials, but all materials used must be
appropriately acknowledged, and you must always write up your
solutions in your own words. Violation of this policy will result
in a penalty to be assessed at the instructor's discretion. This may include
receiving a zero grade for the assignment in question AND a failing grade
for the whole course, even for the first infraction.
Students are expected to adhere to the Academic Honesty policy of the
Computer Science Department; this policy can be found in full
here.
Please contact the instructor with any questions.
Re-grade policy:
If you dispute the grade received for an assignment, you must
submit, in writing (Latex, not email), your detailed and clearly stated
argument for what you believe is incorrect and why. This must be submitted
no later than the beginning of class 1 week after the assignment was returned.
(For example, if the assignment was returned to the class on Tuesday, your
write-up is due by beginning of class on the next Tuesday). Requests for
re-grade after this time will not be accepted. A written response will be
provided within one week indicating your final score.
Requests of re-grade of a specific problem may
result in a regrade of the entire assignment. This re-grade and written response is final; no additional re-grades or debate for that assignment.
Readings
"I therefore perused this Sheet with wonderful
Application, and in about a Day's time discovered that I could not
understand it." -- Henry Fielding, A Journey from This
World to the Next
The textbook for this course is:
This book is available for purchase on-line and at the Columbia Bookstore;
it's also available for reserve in the engineering library, but you will
likely want to have your own copy.
It's an excellent book, but several topics which we'll cover
(particularly early in the
semester) are not in the book.
A good source for material which we will cover in the first few weeks
of class are the following two papers:
- Avrim Blum's survey paper
"Online Algorithms in Machine Learning"
This has an overview of the online mistake-bound learning model, the Weighted
Majority algorithm, the online decision list learning algorithm, and much
more.
For the unit on boosting, a good reference is Rob Schapire's survey paper
"The Boosting approach to machine learning: An overview".
The first three sections give a (very condensed) treatment of the
AdaBoost algorithm we'll cover in class, and later sections have good
overviews of more advanced material (this is a good starting point
for a project on boosting). Of course, the original paper on
Adaboost, "A decision-theoretic generalization of on-line learning and
an application to boosting", is a good sources as well; you can find it
near the bottom of
this page .
Also, click here for a handy copy of the AdaBoost
algorithm.
Problem Sets
Click here to get to the homework page.