
TAs:
Daniel Bauer
(office hours Monday 4-5 and Friday 3-4:30 in the SpeechLab, 7LW3 CEPSR (Shapiro) building)
Sara Rosenthal
(office hours Tuesday 1.30-2.30, and Thursday from 3-4, in 726 CEPSR)
The Final exam will take place on Tuesday, 12/20/2011, from 4:10pm to 7:00pm in 602 Hamilton Hall. All material from the semester may be covered, excluding the last lecture (on 12/8). The exam is closed book, closed electronic devices (except standard calculators). Students are allowed to bring one letter-sized study sheet (double sided).
Nov. 5th: Homework 2 is now posted on the homeworks page.
Reminder: the second quiz will be in class, on November 1st. All material up to and including the lecture on 10/27 will be covered. For some example questions on the IBM translation models, see questions 1-4 of this previous quiz.
Sept. 19th: Homework 1 is now posted on the homeworks page.
Reminder: the first quiz will be in class, on October 4th. All material up to and including the class on 9/29 will be covered. For some example questions, see questions 1, 9, 10 of this previous quiz (solutions here), and questions 1 and 4 of this previous quiz (other questions are on topics not yet covered in the class).
Prof. Collins's office hours on September 28th will be held 1-2pm, instead of the usual time of 4-5pm.
Some people are asking for clarification on problem 2 of homework 1. Here is a hint: take a look at slide 20 of the lecture slides on language models.
Date  Topic  References 
9/6  Introduction and Overview  
9/8  Language Modeling  Notes on language modeling (Required reading: updated version posted Sep 15th, fixing a typo in the definition of missing mass on pages 11 and 12.). 
9/13  Tagging, and Hidden Markov Models  Notes on HMMs (Required reading) 
9/15  Tagging, and Hidden Markov Models (continued)  
9/20  Parsing, context-free grammars, and probabilistic CFGs  Note on PCFGs (required reading) 
9/22  Parsing, context-free grammars, and probabilistic CFGs (continued)  
9/27  Parsing, context-free grammars, and probabilistic CFGs (continued)  
9/29  Lexicalized probabilistic CFGs  
10/4  Quiz 1  
10/6  Guest lecture by Nizar Habash  
10/11  Lexicalized probabilistic CFGs (continued, see previous slides)  
10/13  Machine translation part 1  (Note: we didn't cover the final section, on evaluation using BLEU, but I've kept it in the slides in case it's of interest.) 
10/18  Machine translation part 2  Note on IBM Models 1 and 2 (required reading) 
10/20  Machine translation part 2 (continued)  
10/27  Phrase-based translation models  Note on phrase-based models (required reading: updated on 30th October to include details of learning a phrase-based lexicon from training examples (sections 1 and 2)); Slides from the tutorial by Philipp Koehn 
11/3  Reordering for statistical MT  
11/10  Log-linear models  Note on log-linear models (required reading) 
11/15  Log-linear tagging (MEMMs)  
11/17  Global linear models  
11/22  Global linear models (continued)  
11/29  Global linear models part II  
12/1  Global linear models part III; Word-sense disambiguation  
12/6  The Brown word-clustering algorithm  
12/8  The EM algorithm for HMMs 