John Hewitt
Email: jh5020@columbia.edu
Learning from, and learning to generate, natural language is one of the core strategies in modern artificial intelligence. Systems built with the tools taught in this class are increasingly deployed in the world. This section (Section 2) provides an introduction to the field of natural language processing with a focus on generative models, with the goal of understanding and implementing the foundational ideas beneath state-of-the-art systems.
Topics will include: language modeling, neural network design, text tokenization, web-scale text datasets, machine translation, summarization, accelerators like GPUs and TPUs, linguistics and the structure of language, reinforcement learning, and many others.
For this class, it would be useful to be familiar with linear algebra, Python programming, probability, and differential calculus. We've provided a set of notes to fill some gaps in preparation: Lecture Note 0 (also available here as a PDF).
Lectures: Tuesdays and Thursdays, 2:40 PM – 3:55 PM
Location: TBD
Background Lecture Notes (PDF)

Week | Tuesday | Thursday |
---|---|---|
1 | Introduction, Language Modeling | Tokenization |
2 | Background Review | Representation Learning 1 (Architectures) |
3 | Representation Learning 2 (Learning Algorithms) | Tasks and Evaluation |
4 | Exam 1 | Parallelization and GPUs |
5 | Parallelizable Architectures | Self-Attention and Transformers |
6 | Finetuning | Pretraining 1 |
7 | Generation Algorithms | Posttraining 1: Instruction Following |
8 | Posttraining 2: Reinforcement Learning | Exam 2 |
9 | Experimental Design | Retrieval and Tools |
10 | AI Safety | Bias, Fairness, Privacy |
11 | History of NLP | Debugging Language Models |
12 | Guest Lecture 1 | Interpretability and Analysis |
13 | Guest Lecture 2 | Looking to the future |
14 | Final Project Help | Final Project Help |
John's Office Hours: TBD
TA Office Hours: TBD