COMS 6975: Large Language Model Interpretability and Alignment

Instructor: John Hewitt — Columbia University — Department of Computer Science

Course Description

The use of large language models pervades artificial intelligence research and, increasingly, our world. This PhD seminar in natural language processing is intended to bring together students in the shared goal of synthesizing, critiquing, and extending recent (and not so recent) research in this subfield.

The first three weeks will consist of an intensive review of the mathematics and technical aspects of large language models (their architecture, pretraining, and alignment), as well as attempts to understand them. Then, students will present research papers to the rest of the class, which will jointly assess, critique, and extend those papers.

Prerequisites

There are no formal prerequisites, though having taken COMS 4705 (natural language processing) would be useful. I am admitting PhD students primarily, though I will send out an interest form to the rest of the wait list to gauge research interest for the rest of the slots in the class.

Schedule

Lectures: Fridays, 1:10PM-2:00PM
Location: TBD

We'll use Ed for discussion forums, and Gradescope for assignment submission. You should have been added automatically to both. If you just enrolled, ping us to sync the Canvas roster.

Grading

This grading breakdown is provisional and subject to change.

Letter grades will be determined by the teaching staff as a function of the following breakdown; cutoffs for each letter grade will be set at the end of the class rather than fixed in advance. All written elements of the assignments, as well as the final project writeups, must be written in LaTeX and submitted as PDFs. Students are allowed to use AI tools in whatever capacity they desire. The content students submit is their responsibility alone.

Materials and Expectations

This course has no required textbook; I will provide lecture notes for the fundamentals, and then we will be reading and presenting papers.

Attendance is required. In general, I expect you to be at effectively every lecture. However, I dislike grading on attendance, so there's no penalty for not attending, and I understand that everyone will need to miss a lecture or two.

Please see the grading section for our policies on AI tools in this class. Otherwise, please refer to the Faculty Statement on Academic Integrity and the Columbia University Undergraduate Guide to Academic Integrity.

The teaching team is committed to accommodating students with disabilities in line with the Faculty Statement on Disability Accommodations.