COMS 4995-1 Spring 2020 Syllabus


COMS 4995-1 (“Machine Learning Theory”) is a graduate-level course on the theoretical study of algorithms for machine learning and high-dimensional data analysis. Topics include high-dimensional probability, theory of generalization and statistical learning, online learning and optimization, and spectral analysis.

In the future, this course will be called “COMS 4773”. You should be able to treat this course as a course with that number for a requirement within a CS degree program. For instance, if you have a requirement that is fulfilled by a course of the form “COMS 47xx”, then this course should count.

Basic information

  • Time: Mon/Wed 8:40am–9:55am
  • Venue: 417 Mathematics Building
  • Instructor: Daniel Hsu (office hours: Monday 10:00am–11:30am in 426 Mudd; also by appointment)
  • Teaching/course assistants:

Learning goals

  • Rigorously analyze machine learning problems and algorithms.
  • Read and understand research papers in machine learning theory.

In this course, we won’t really discuss practical aspects of machine learning. For practice-oriented course in machine learning, see COMS 4771.


You should have a basic level of mathematical maturity and be comfortable reading and writing mathematical proofs. We’ll use a fair amount of probability and linear algebra; there’ll also be a bit of convex analysis.

You should have taken a course in machine learning (e.g., COMS 4771), or should be willing to pick up the gist of that material as we go along. This is primarily useful to understand the motivation for problems and methods we discuss in the course.

You should have taken some “advanced” (proof-based) course in mathematics, theoretical statistics, or theoretical computer science (e.g., COMS 3261, CSOR 4231).

A previous course in learning theory (e.g., COMS 4252) is not required!

If you have concerns about whether the course is suitable for you, please contact the instructor.


We plan to cover topics in generalization theory, online learning and optimization, and high-dimensional data analysis.

Topics may include:

  • Concentration of measure
  • Statistical learning
  • Empirical process theory
  • Boosting and margins
  • Convex surrogates and optimization
  • Approximation theory
  • Online learning with experts
  • Online classification
  • Online convex optimization
  • Multi-armed bandits
  • Gaussian concentration
  • Best-fit subspaces
  • Spectral clustering

Note that there is some overlap with COMS 42521.

Related courses on machine learning theory at other institutions:


Readings will be assigned from various sources, primarily the following texts:

  • Understanding Machine Learning (UML) by Shalev-Shwartz and Ben-David
  • Online Learning and Online Convex Optimization (OLOCO) by Shalev-Shwartz
  • Foundations of Data Science (FDS) by Blum, Hopcroft, and Kannan

Other recommended texts are:

  • A Probabilistic Theory of Pattern Recognition (PTPR) by Devroye, Györfi, and Lugosi
  • An Introduction to Computational Learning Theory (ICLT) by Kearns and Vazirani
  • Foundations of Machine Learning (FML) by Mohri, Rostamizadeh, Talwalker

Course requirements

  • Complete reading assignments.
  • Homework assignments (75% of overall grade).
  • Project (25% of overall grade).

All assignments must be submitted as PDF documents compiled using TeX, LaTeX, or similar systems with bibliographic references (e.g., using BibTeX) included as necessary.

If you have not used LaTeX before, or if you only have a passing familiarity with it, it is highly recommended that you read and complete the lessons and exercises in The Bates LaTeX Manual.

Disability services

If you require accommodations or support services from Disability Services, please make necessary arrangements in accordance with their policies within the first two weeks of the semester.

Academic rules of conduct

You are expected to adhere to the Academic Honesty policy of the Computer Science Department, as well as the following course-specific policies.

Collaboration on homework assignments

You are welcome and encouraged to discuss homework assignments with fellow students, subject to the following rules.

  • You must first make a serious effort to solve the problem yourself before discussing with another student.
  • Discussion about a homework problem can only take place with at most two other students, and only with students who have not already solved the problem.
  • Discussion about a homework problem may include brainstorming and verbally discussing possible solution approaches, but must not go as far as one person telling others how to solve the problem.
    • You may not take any notes (whether handwritten or typeset) from the discussions.
    • Written “discussions” (e.g., over messaging platforms, email) are not allowed.
  • You may not look at another student’s homework write-up/solutions (whether partial or complete).
    • You may not show your homework write-up/solutions (whether partial or complete) to another student.
  • You must write-up your solutions by yourself.
  • You must list all discussants in your homework write-up.

Use of outside references on homework assignments

Outside reference materials and sources (i.e., texts and sources beyond the assigned reading materials for the course) may be used on homework assignments only if given explicit written permission from the instructor and if the following rules are followed.

  • Any outside reference must be acknowledged and cited in the homework write-up.
  • Sources obtained by searching the literature/internet for answers or hints on homework assignments are never permitted.
  • You are permitted to use texts and sources on course prerequisites (e.g., a linear algebra textbook).
    • If you need to look up a result in such a source, provide a citation in your homework write-up.
  • If you inadvertently come across a solution to (or substantial hint about) a homework problem, you must:
    • acknowledge this source and document the circumstance in your homework write-up;
    • produce a solution without looking at the source; and
    • as always, write your solution in your own words.
  • If you have already seen one of the homework problems before (e.g., in a different course), please re-solve the problem without referring to any previous solutions.
    • In your write-up, please also indicate that you had seen the problem before. (You won’t lose any credit for this; it would just be helpful for us to know about this fact.)


Violation of any portion of these policies will result in a penalty to be assessed at the instructor’s discretion (e.g., a zero grade for the assignment in question, a failing letter grade for the course). All violations are reported to the relevant dean’s office.

Getting help

You are encouraged to use office hours and Piazza to discuss and ask questions about course material and reading assignments, and to ask for high-level clarification on and possible approaches to homework problems. If you need to ask a detailed question specific to your solution, please do so on Piazza and mark the post as “private” so only the instructors can see it.

Questions, of course, are also welcome during lecture. If something is not clear to you during lecture, there is a chance it may also not be clear to other students. So please raise your hand to ask for clarification during lecture. Some questions may need to be handled “off-line”; we’ll do our best to handle these questions in office hours or on Piazza.

  1. COMS 4252 is a computational complexity-theoretic approach to learning, with polynomial-time learnability as its primary focus. While computational complexity will be a major concern in this course, we will balance it with other considerations, such as statistical modeling and efficiency.↩︎