COMS W4705: Natural Language Processing




Instructor: Michael Collins
Time & Location: Tues & Thurs, 4:10-5:25 PM, 535 Mudd
Office Hours: TBD

TAs: Please send all questions to nlpfall2013.columbia at gmail dot com
Hyungtae Kim [hk2561] (OH Thursday 1:00-2:30 PM, in 7LE5 CEPSR NLP Lab)
Mohammad Sadegh Rasooli [mr3254] (OH Tuesday 1:45-3:15 PM, in 7LE5 CEPSR NLP Lab)
Victor Soto [vs2411] (OH Monday 4:00-5:30 PM, in 7LE5 CEPSR NLP Lab)
Yanting Zhao [yz2487] (OH Wednesday 4:00-5:30 PM, in 7LE5 CEPSR NLP Lab)

Announcements:

This year we will be using Piazza for open discussion of the lectures and homeworks. Please sign up here.

A substantial portion of this class was offered on Coursera in Spring 2013. You may want to sign up at Coursera, at this link, so that you can view video lectures for the topics in this class. The video lectures follow the content of the class very closely.


Lectures:
Date | Topic | References
9/3 | Introduction and Overview |
9/5 | Language Modeling | Notes on language modeling (required reading)
9/10 | Tagging, and Hidden Markov Models | Notes on HMMs (required reading)
9/12 | Tagging, and Hidden Markov Models (continued) |
9/17 | Parsing, and Context-free Grammars | Note on PCFGs (required reading)
9/19 | Parsing, context-free grammars, and probabilistic CFGs (continued) |
9/24 | Weaknesses of PCFGs, Lexicalized probabilistic CFGs | Note on Lexicalized PCFGs (required reading)
9/26 | Models 1 and 2 from Collins, 1999 | (additional material on lexicalized PCFGs)




Lectures from Fall 2012 (this year's lectures will be similar to those from Fall 2012, though there may be some changes):
Date | Topic | References
10/1 | Guest lecture by Nizar Habash |
10/3 | Machine translation part 1 | (Note: we didn't cover the final section, on evaluation using BLEU, but I've kept it in the slides in case it's of interest.)
10/8 | Machine translation part 2 | Note on IBM Models 1 and 2 (required reading)
10/10 | Phrase-based translation models | Note on phrase-based models (required reading); Slides from the tutorial by Philipp Koehn
10/15 | Phrase-based translation models: the decoding algorithm |
10/17 | Mid-term (in class) |
10/22 | Reordering for statistical MT |
10/24 | Log-linear models | Note on log-linear models (required reading)
10/31 | Log-linear tagging (MEMMs) |
11/7 | Global linear models |
11/12 | Global linear models part II |
11/14 | Global linear models part III |
11/19 | Guest lecture: Joint Decoding | Tutorial on dual decomposition
11/26 | The Brown word-clustering algorithm |
11/28 | Semi-supervised learning for word-sense disambiguation, and cotraining for named-entity detection |
12/2 | The EM algorithm for Naive Bayes | Notes on the EM algorithm for Naive Bayes (Sections 4 and 6 provide useful technical background, but can be safely skipped.)