This is the website for the course entitled “Machine Learning” for the Fall 2025 semester.
Please see the CS Course Registration Policy. I am not managing the waitlists myself (even though it may appear as “Instructor Managed” on Vergil), and I will not be able to respond to questions about the waitlist or enrollment issues.
Under Construction
Below is the planned course schedule. (The date convention used is MM/DD; sorry.) The dates for the in-class exams are definitive. Everything else is subject to change, so this section is always “under construction”.
COMS 4771 is a graduate-level introduction to machine learning. The course introduces basic statistical principles and algorithmic paradigms of supervised machine learning.
There are several prerequisites for this course.
Note: The list of prerequisites on Vergil and SIS is incorrect. In particular, COMS 3770 is not a substitute for any of the prerequisites listed above.
Machine learning is a confluence of ideas from many disciplines, including computer science, optimization, physics, and statistics. However, the common language of machine learning is rooted in the mathematical subjects of calculus, linear algebra, and probability. This language is used to describe both the basic methods of machine learning and their underlying principles.
While many basic machine learning methods have been implemented in software packages, adapting these methods to new applications may require knowledge of their inner workings, and the ability to read, write, and reason about programs.
Despite the common language used in machine learning, the descriptions of the core methods, problems, and principles in textbooks, software manuals, research articles, and lecture slides/notes are quite varied and possibly even contradictory. Machine learning is a relatively young field and is constantly changing. Mathematical maturity is essential to make sense of this “wild west”.
Review notes for some of the prerequisites are available here.
Additional online resources for some course prerequisites are as follows.
If you find this material unfamiliar, you should not take COMS 4771.
The anticipated list of topics is as follows. The topics may not correspond one-to-one to lectures.
You are expected to attend lectures, complete reading and homework assignments, and complete in-class exams.
Lectures will be mostly self-contained; required reading assignments will be posted alongside the course schedule. Pointers to optional reading from (some of) the following texts will also be given.
All of these texts are available online, possibly through Columbia University Libraries.
Homework will be assigned throughout the semester. The purpose of these assignments is to help you learn the course material through practice and active engagement. (I suspect many students learn more effectively this way than via “passive learning” alone.)
The types of homework assignments may include short online multiple-choice/short-answer quizzes (on Gradescope), word problems, algorithm implementation and experimentation, and data analysis.
Model solutions for most of the assignments will be provided so that students can evaluate their own solutions. Specific feedback from the course staff may be provided during office hours, or upon submission of solutions to Gradescope.
The three in-class exams will take place during the lecture on the following dates.
You must take all exams during the lecture times for the section in which you are registered.
The kinds of questions on the exams may be similar to those from the homework assignments, but naturally adjusted (e.g., scaled down) for the format of a time-constrained in-class exam. You will not be asked to write any large amount of Python code, but you could be asked to write some short pseudocode or answer questions about small snippets of code.
The material covered by each exam is cumulative but emphasizes the material since the last exam.
Your final grade is based on the scores you earned for the in-class exams and homework assignments. Let Ei denote your score (out of 100) for in-class exam #i (for i ∈ {1, 2, 3}), and let H denote your total score (out of 100) for the homework assignments. Then your overall score (out of 100) is 0.36 × (E1 + E2 + E3) − 0.18 × min{E1, E2, E3} + 0.1 × H. (Your lowest in-class exam score is counted half as much as each of the others.)
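As a rough illustration of the formula above, here is a minimal Python sketch. The function name overall_score and the example scores are hypothetical, chosen only for illustration; this is not an official course tool, and it computes only the overall score, not the letter grade.

```python
def overall_score(exam_scores, homework_score):
    """Overall score (out of 100) following the formula stated above.

    exam_scores: the three in-class exam scores E1, E2, E3 (each out of 100).
    homework_score: the total homework score H (out of 100).
    Hypothetical helper for illustration only.
    """
    if len(exam_scores) != 3:
        raise ValueError("expected exactly three exam scores")
    # Each exam starts with weight 0.36; subtracting 0.18 times the minimum
    # leaves the lowest exam with weight 0.18, half that of the other two.
    return 0.36 * sum(exam_scores) - 0.18 * min(exam_scores) + 0.1 * homework_score


# Made-up scores: exams of 90, 80, 70 and a homework total of 95.
example = overall_score([90, 80, 70], 95)
print(round(example, 1))  # 83.3
```

The effective weights are 0.36 + 0.36 + 0.18 + 0.1 = 1, so a student with 100 on every component gets an overall score of 100.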
As required by the university, your overall score will be discretized to determine your final letter grade (one of A+, A, A−, B+, B, B−, C+, C, C−, D, F). The discretization process will take into account the distribution of overall scores across all students in the class (i.e., the final grade is “curved”).
There are no “make-up” homework assignments or exams available. Do not enroll in the course if you do not expect to be able to take the in-class exams at the scheduled times.
If you miss the deadline for submitting a homework assignment due to a medical or family emergency, then fill out the following form (as soon as possible) and it will be excused: https://forms.gle/WJXGQqDUNoQ3Zmtv7.
If you miss an exam due to a medical or family emergency, you may have the following options (subject to the rules of your degree program). You may be granted an “incomplete” for the course; the “incomplete” grade is removed after you complete a comparable exam in a future offering of this course (to be arranged with the pertinent instructors). Or, you may “withdraw” from the course, in which case you will receive a “W” grade instead of a standard letter grade for the course. Please consult with an academic advising staff member to determine which (if any) of these options are available to you.
If you require accommodations or support services from Disability Services, please make necessary arrangements in accordance with their policies within the first two weeks of the semester.
You must adhere to the Academic Honesty policy of the Computer Science Department, as well as the course-specific policies described below.
All exams must be completed individually. Collaboration or discussion between students on exams is not permitted. Use of abaci, calculators, phones, the internet, laptop computers, desktop computers, tablets, “smart” watches, AI tools, AR/VR goggles, etc. during exams is not permitted. Use of any items explicitly declared by the instructor to be unauthorized during exams is not permitted.
You are welcome to discuss homework with other students in the class, but any homework you submit must be your own and written up by yourself in your own words. Any use of AI tools on homework must be explicitly declared.
Violation of any portion of these policies will result in a penalty to be assessed at the instructor’s discretion (e.g., a zero grade for the assignment in question, a failing letter grade for the course), even for a first offense.
TA office hours are, by default, held in the TA room on the first floor of Mudd.
As you exit the elevators on the first floor of Mudd, the couches will be in front of you. Turn right and you will come to a corridor: turn right again. The TA room is the first door on the left.
(The above information and image were adapted from an archive copy of https://ia.cs.columbia.edu/tamap.shtml.)