MACHINE LEARNING (January 1, 2013)
COMS4771-001
COURSE INFO

Time & Location: Tue/Thu 1:10pm-2:25pm, Hamilton Hall 702
Instructors: Adrian Weller, adrian(at)cs(dot)columbia(dot)edu, and Ilia Vovsha, iv2121(at)columbia(dot)edu
Office Hours: Tue & Thu 2:30-3:15pm, CEPSR 6LE5 (Adrian)
TAs: Xu Tan, ttanxu(at)gmail(dot)com; Peng Jiang, pj2243(at)columbia(dot)edu; Ran Yu, ry2239(at)columbia(dot)edu
Bulletin Board: Available via courseworks.columbia.edu; this is the best place to post questions and discuss course material.
Prerequisites:
Background in calculus, linear algebra, and statistics.
Programming ability in some (any) language.
Description: The course introduces various topics in machine learning. Material will include: Bayesian inference & decision theory, Gaussian and exponential family distributions, maximum likelihood, least squares, linear regression, linear classification, neural networks, statistical learning theory, support vector machines, kernel methods, mixture models, the EM algorithm, graphical models, and hidden Markov models. Students are expected to implement several algorithms in Matlab.
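To give a sense of the flavor of implementation expected, here is a minimal, purely illustrative Matlab sketch fitting one of the topics listed above (least-squares linear regression). The synthetic data and variable names are invented for this example; it is not an actual course assignment.

    % Illustrative sketch only (not an assignment): fit a linear
    % regression by least squares on synthetic data.
    n = 100;                            % number of training examples
    X = [ones(n,1), randn(n,1)];        % design matrix with an intercept column
    w_true = [2; -3];                   % made-up "true" weights
    y = X*w_true + 0.1*randn(n,1);      % noisy targets
    w_hat = X \ y;                      % least-squares estimate via backslash
    fprintf('estimated weights: %.3f, %.3f\n', w_hat(1), w_hat(2));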
Recommended Texts:
The following three books are highly recommended.

Michael I. Jordan and Christopher M. Bishop, Introduction to Graphical Models. Still unpublished; available online (password-protected) on the class home page.

Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006. First Edition is preferred. ISBN: 0387310738.

R.O. Duda, P.E. Hart and D.G. Stork, Pattern Classification, John Wiley & Sons, 2001.

The Duda, Hart & Stork text is a very gentle introduction to many of the topics covered in the first part of the course. The Bishop book is a slightly more advanced discussion of many topics in machine learning. The Jordan & Bishop text is very good on graphical models, which will be covered in the second half of the course.

Optional Texts:
Available at the library (additional handouts and pointers to useful sites will also be provided).

V. Vapnik, Statistical Learning Theory, Wiley-Interscience, 1998.

Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning, 2nd Edition, Springer-Verlag, New York, 2009. ISBN: 0387848576.

D. MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, 2003. Available to download online.
Graded Work:
Grades will be based on homeworks (40%), the midterm (around 25%), and the final exam (around 35%). Any material covered in assigned readings, handouts, homeworks, solutions, or lectures may appear in exams. Your worst homework will not count towards your grade. If you miss the midterm without an official reason, you will receive a zero for it; with an official reason, your midterm grade will be based on the final exam.
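To make the weighting concrete, here is a small Matlab sketch of the grading scheme described above. All scores are hypothetical, and since the midterm and final weights are only "around" 25% and 35%, treat the exact values below as assumptions.

    % Hypothetical scores; weights follow the stated (tentative) scheme.
    hw        = [85 92 78 88 95];       % homework scores (made up)
    midterm   = 80;                     % midterm score (made up)
    final_exam = 90;                    % final exam score (made up)
    hw_sorted = sort(hw);               % ascending order
    hw_avg    = mean(hw_sorted(2:end)); % the worst homework does not count
    grade     = 0.40*hw_avg + 0.25*midterm + 0.35*final_exam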
Tentative Schedule:

Date         Topic
January 22   Lecture 01: Introduction
January 24   Lecture 02: Basic Statistics
January 29   Lecture 03: Parametric Statistical Inference
January 31   Lecture 04: Parametric Statistical Inference
February 5   Lecture 05: Cross Validation & Parametric Paradigm
February 7   Lecture 06: Perceptron
February 12  Lecture 07: Neural Networks & BackProp
February 14  Lecture 08: Statistical Learning Theory (intro)
February 19  Lecture 09: Statistical Learning Theory (capacity)
February 21  Lecture 10: Statistical Learning Theory (bounds)
February 26  Lecture 11: VC Dimension
February 28  Lecture 12: Support Vector Machines
March 5      Lecture 13: Kernels
March 7      Lecture 14: Dimensionality Reduction
March 12     Lecture 15: Clustering
March 14     MIDTERM EXAM
March 19     Spring Recess (NO CLASS)
March 21     Spring Recess (NO CLASS)
March 26     Lecture 16: Mixtures of Gaussians, Latent Variables, EM Intro
March 28     Lecture 17: EM in More Detail
April 2      Lecture 18: Graphical Models
April 4      Lecture 19: TBA
April 9      Lecture 20: TBA
April 11     Lecture 21: TBA
April 16     Lecture 22: TBA
April 18     Lecture 23: TBA
April 23     Lecture 24: TBA
April 25     Lecture 25: TBA
April 30     Lecture 26: TBA
May 2        Lecture 27: TBA
Class Attendance:
You are responsible for all material presented in class lectures, recitations, and so forth. Some material will diverge from the textbooks, so regular attendance is important.

Late Policy:
Homework is due at the beginning of class on the due date. If you hand in late work without the approval of the instructor or TAs, you will receive zero credit.

Cooperation on Homework:
You are encouraged to discuss HW problems with each other in small groups (2-3 people), but you must list your discussion partners on your submission. Solutions (including code) must be written independently; sharing or copying of solutions is not allowed. Of course, no cooperation is allowed during exams. This policy will be strictly enforced.

Discussion of Course Material:
We have many interesting topics to cover, and many of you will have good questions. Please post questions or ideas to the bulletin board on Courseworks so that everyone can participate (see also the Bulletin Board entry at the top of this page).
Web Page:
The class URL is http://www.cs.columbia.edu/~coms4771 and will contain copies of class notes, news updates, and other information.

Computer Accounts:
You will need an ACIS computer account for email, use of Matlab (Windows, Unix, or Mac version), and so forth.