General Information | Introduction | Prerequisites | Syllabus | Grading | Readings | Problems | Final Projects | Scribe Notes | Academic Honesty Policy
Instructor: Rocco Servedio.
Office hours: Fri 9:30-11:30, CSB 517.
Teaching assistant: Clement Canonne (OH: Tues 2-4, CSB 516)
Classroom: 644 Mudd
Time: Wed 4:10-6pm.
Course website (this page): http://www.cs.columbia.edu/~rocco/Teaching/S14/
Course email (use for all course-related issues): coms6998learntests14 at gmail dot com.
Administrative Note: This course can be used as a theory elective for the Ph.D. program in computer science, as a track elective course for MS students in the "Foundations of Computer Science" track or the "Machine Learning" track, or as a track elective for undergraduates in the "Foundations" track. Auditing the class requires the approval of the instructor.
In this course we will study sub-linear time algorithms which are aimed at helping us understand massive objects. The algorithms we study inspect only a tiny portion of an unknown object and have the goal of coming up with some useful information about the object. Algorithms of this sort provide a foundation for principled analysis of truly massive data sets and data objects.
In particular we will consider
We will study these algorithms for three different types of objects: Boolean functions, graphs, and probability distributions.
There are no specific prerequisities for the course, but you should be mathematically mature and comfortable with the usual style of theoretical computer science arguments and analyses. Prior graduate level coursework in theoretical computer science will be helpful, but the course should be accessible to mathematically strong undergraduates as well. It may be helpful to have taken COMS 4252 (the introductory learning theory course), but this is not a requirement -- this course has been designed to be accessible without COMS 4252.
In a bit more detail, we will freely use standard tail bounds for sums of independent random variables (i.e. Chernoff bounds and Hoeffding bounds) so it may be helpful to come into the course familiar with these basic tools. Here is a quick summary of these results that has been helpfully prepared by Clement.
Scribing lectures: For each lecture there will be one or more assigned scribes. Your duties as a scribe are to take careful notes which will be made available on the class web page. Your notes should be a clear, detailed and readable exposition of the material covered in that lecture. You are encouraged to use outside sources (in particular, relevant papers) to make your notes as good as possible. A template will be provided.
Homework problem: As the semester progresses I will designate (sometimes in class) certain questions/lemmas as "official problems." These will either be problems relating to our course material, or lemmas that I will state but not prove in class. You are to choose one of these "official problems" at some point in the semester and turn in a clear, detailed, self-contained solution to the problem by the last day of class, April 30. Click here for a list of the "official problems".
The project is your chance to become an expert on some particular learning / testing / sublinear-time algorithms problem that interests you (and is aligned with the flavor of the course), and hopefully to contribute to what is known about this topic. You will turn in a project proposal fairly early in the semester to identify the general area of your project, and ultimately a final report as described below.
There are two aspects to the final project. The first is a literature review: this will involve identifying the relevant literature on your topic (typically one or two research papers), closely reading and understanding this literature, and writing up a presentation of the results you have learned in your own words. You are encouraged to use the scribe notes template for this portion of your project.
The second aspect of the project is to identify a specific research question or direction on your chosen topic, think about it, and write up the results of your research. Ideally, your project will result in some research progress that could lead to a new publishable research result in the topic you pursue, but this is not a course requirement. All that you need to do for this portion of the project to be successful is come up with a research question/direction and give evidence of having made a real attempt to attack it.
So to summarize the project requirements, your final project submission should (1) give a clear, self-contained exposition of what you have learned about your chosen topic -- as mentioned above, the class scribe notes format is fine for this portion; and (2) it should also explain the specific research problem you investigated, what you did to try to solve this problem, and what progress you made.
Final projects will be due on Wed May 7; more details are available on the projects page, which lists some possible project topics.
There is no required textbook for the course. Pointers to required and recommended readings (papers and occasional book chapters) will be posted on the ``Syllabus'' portion of this webpage above as the semester progresses.
for more information about final projects.
Click here to obtain scribe notes from the various lectures.
Academic Honesty Policy Students are expected to adhere to the Academic Honesty policy of the Computer Science Department; this policy can be found in full here. Please contact the instructor with any questions.