**Time**: Mon/Wed 5:40 PM–6:55 PM
**Venue**: 501 Schermerhorn
**Instructor**: Daniel Hsu
**Links**: syllabus, announcements, homework, schedule, office hours
**More links**: Piazza (sign-up link), Courseworks

**If you have questions about course logistics, course material, etc., please ask them (at appropriate times) during lecture or during office hours, or post them on Piazza.**

On Homework 3 Problem 4, you should also apply the *standardization* transformation to the data in `hw3data.mat`, and also use \(C := 10/n\) (as opposed to \(0.1/n\)).
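As a reminder, standardization rescales each feature to have zero mean and unit variance. A minimal sketch, assuming the data matrix has one example per row (the function and variable names here are illustrative, not from the assignment):

```python
import numpy as np

def standardize(X):
    """Standardize each feature (column) to zero mean and unit variance."""
    mu = X.mean(axis=0)        # per-feature means
    sigma = X.std(axis=0)      # per-feature standard deviations
    sigma[sigma == 0] = 1.0    # leave constant features alone (avoid divide-by-zero)
    return (X - mu) / sigma

# Example on a small 3x2 data matrix
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
Z = standardize(X)
# Each column of Z now has mean 0 and standard deviation 1.
```

If you standardize using statistics computed from training data, remember to apply the *same* \(\mu\) and \(\sigma\) to any held-out data.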

Homework 3 was posted; it is due Sunday 3/4. You can work in groups of up to three people. (The group may be different from the one you participated in for previous homework assignments.) Please follow the instructions detailed on the problem sheet.

The first exam is coming up in a few weeks (Wednesday 3/7 during lecture). The exam covers course material from lectures and reading (including the handouts) up to and including the neural networks lecture, as well as material in Homeworks 0–3. (For the reading, the focus will be on the material that was also covered in lecture and in the handouts.) I have posted several practice problems to help you prepare for the exam.

Homework 2 was posted; it is due Monday 2/19. You can work in groups of up to three people. (The group may be different from the one you participated in for Homework 1.) Please follow the instructions detailed on the problem sheet.

I have posted handouts on Perceptron and online-to-batch that review parts of Monday’s lecture in more detail.

Homework 1 was posted; it is due Monday 2/5. You can work in groups of up to three people. Please follow the instructions detailed on the problem sheet.

I have posted a handout on linear regression that reviews parts of today’s lecture.

Homework 0 was posted; it is due Monday 1/22. All homework assignments should be submitted as neatly typeset (not scanned or handwritten) PDF documents.

For Problem 5, it is fine to round your answer to, say, three decimal places (e.g., \(0.905\) or \(90.5\%\)).

This week I will not have office hours on Wednesday. Instead, I will hold office hours on Friday 1/19 from 11:00 AM to noon.

I have posted a handout on linear algebra that reviews the main concepts from linear algebra you should know.

If you are blocked from the waiting list in SSOL, you may instead use this survey provided by the DSI to get on the list. The survey questions (including those about previous coursework) are for DSI-internal purposes only; the actual prerequisites are listed on the syllabus, and the previous announcement about “Homework 0” still applies.

All students who register for this course will be placed on a waiting list. Students in Data Science Institute (DSI) academic programs will be manually added to the course after they appear on the waiting list.

This course is essentially the same as COMS 4771 (of which there are two sections in Spring 2018). Enrollment for all (non-DSI) students in either COMS 4721 or COMS 4771 will be based primarily on a “Homework 0” assignment distributed at the beginning of the semester.

The goal of Homework 0 is to review and assess prerequisites, and to introduce the homework submission process. It is required for *all* students (whether on the waiting list or not), and it must be submitted by the stated deadline. Late submissions are not accepted.

- HW0 (.pdf, .md, .tex), due 1/22. (solutions)
- HW1 (.pdf, .md, .tex), due 2/5. (solutions)
- HW2 (.pdf, .md, .tex), due 2/19. (solutions)
- HW3 (.pdf, .md, .tex), due 3/4.

- Overview (1/17)
- Reading: CML 1.1-1.2.

- Predictions (1/17, 1/22)
- Reading: CML 9.1-9.2; coin tosses handout.

- Linear regression (1/22, 1/24)
- Reading: ESL 3.1-3.2 (3.2.3–3.2.4 are optional); linear regression handout.

- Decision trees (1/29)
- Reading: CML 1.3-1.6, 2.1-2.7.

- Model selection (1/31)
- Logistic regression and linear classifiers (1/31, 2/5)
- Reading: CML 4.1-4.7; 5.1, 5.3-5.4; ESL 4.4-4.4.2; perceptron handout; online-to-batch handout.

- Support vector machines (2/7, 2/12)
- Reading: CML 7.7, 11.1-11.2, 11.4-11.6.

- Convex optimization (2/12)
- Optimization algorithms (2/14)
- Reading: CML 7.4, 14.2; CO 9.2-9.3; stochastic gradient tricks.

- Neural networks (2/19)
- Reading: CML 10.1-10.5; efficient backprop; multi-layer networks handout.

- Classification objectives (2/21)
- Reading: CML 5.5, 6.1-6.2, 8.1; one-against-all handout.

- Fairness
- Reading: CML 8.4; how big data is unfair.

- Learning theory
- Ensemble methods

Topics may include inference in simple Bayesian networks, discriminant analysis, factor analysis, mixture models, and hidden Markov models.

**Note**: Probabilistic modeling is the topic of another course at Columbia taught by Dave Blei. In COMS 4721, we’ll just cover some highlights of this rich and fascinating subject.

Topics may include multi-armed bandits, inverse propensity weighting, \(\epsilon\)-greedy algorithm.

- linear algebra
- coin tosses
- linear regression
- perceptron
- online-to-batch
- multi-layer networks
- one-against-all

Wed 2:30-4:30 PM in 426 Mudd.

(Note that office hours are *not* held in the hour immediately after lecture.)

All course assistant office hours are held in the TA room.

| Che Shen | Ben Lai | Ben Lai | Connor Hargus | John Lee |
|---|---|---|---|---|
| Mon 3-5 PM | Tue 4-6 PM | Thu 4-5 PM | Thu 5-7 PM | Fri 4-6 PM |