6.864: About 



Instructor:
Michael Collins, mcollins AT csail.mit.edu
Time & Location:
Tues & Thurs 12.30, 32-144
Office Hours:
By appointment
TA:
Igor Malioutov, igorm AT csail.mit.edu
Course Description:
6.864 is a graduate introduction to natural language processing, the
study of human language from a computational perspective. We will
cover syntactic, semantic and discourse processing models. The
emphasis will be on machine learning or corpus-based methods and
algorithms. We will describe the use of these methods and models in
applications including syntactic parsing, information extraction,
statistical machine translation, dialogue systems, and summarization.
This subject qualifies as an Artificial Intelligence and Applications
concentration subject.
Problem sets:
There will be 4 problem sets during the class, due roughly every
two weeks. The problem sets will include both theoretical problems and
some programming assignments.
Exams:
There will be a midterm and a final in the class.
Projects:
There will be a final project for the class, more details to follow.
Grading:
The overall grade will be determined roughly as follows:
Midterm 20%, Final 30%, Problem sets 25%, Final project 25%.
Syllabus:
Here is a tentative syllabus for the class:
 Introduction (1 lecture)
 Estimation techniques, and language modeling (1 lecture)
 Parsing and Syntax (4 lectures)
 Log-linear models (1 lecture)
 Stochastic tagging (1 lecture)
 History-based models (1 lecture)
 The EM algorithm in NLP (2 lectures)
 Machine Translation (3 lectures)
 Global linear models (2 lectures)
 Discourse Processing: segmentation, anaphora resolution (2 lectures)
 Probabilistic similarity measures and clustering (1 lecture)
 Word-sense disambiguation (1 lecture)
 Information extraction (1 lecture)
 Unsupervised/semi-supervised learning in NLP (1 lecture)
 Tree-adjoining grammar, combinatory categorial grammars (2 lectures)
Readings:
Course readings will be available either on the web or as in-class
handouts.