CS 4705: Introduction to Natural Language Processing, Fall 2007

Time: TTh 2:40-3:55
Place:

Professor:
Office Hours: TBA
Email: julia@cs.columbia.edu
Phone: 212-939-7114

Teaching Assistant:
Office Hours: TBA
Email: frankl@cs.columbia.edu
Phone: 212-939-
Announcements || Academic Integrity || Contributions || Description || Links to Resources || Requirements || Syllabus || Text
This course provides an introduction to the field of computational linguistics, aka natural language processing (NLP). We will learn how to create systems that can understand and produce language, for applications such as information extraction, machine translation, automatic summarization, question-answering, and interactive dialogue systems. The course will cover linguistic (knowledge-based) and statistical approaches to language processing in the three major subfields of NLP: syntax (language structures), semantics (language meaning), and pragmatics/discourse (the interpretation of language in context). Homework assignments will reflect research problems computational linguists currently work on, including analyzing and extracting information from large online corpora.
The text for the course is Speech and Language Processing by Jurafsky and Martin. It will be available from the University Bookstore, as well as from Amazon and other online providers. It should also be on reserve in the Engineering Library. Please check the online errata for each chapter of the text as you read it.
The course requires three homework assignments, a midterm, and a final exam. Each student in the course is allowed a total of 4 late days on homeworks with no questions asked; after that, 10% per late day will be deducted from the homework grade, unless you have a note from your doctor. Do not use these up early! Save them for real emergencies. Homeworks are due by midnight on the due date.
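As an illustration only, the late-day arithmetic above can be sketched in Python. This is not an official grading script. The names `Submission` and `apply_late_policy` are hypothetical, and the sketch assumes that late days are counted in whole days, that free late days are spent in submission order, that the 10% deduction is taken from the homework's own score, and that a doctor's note waives lateness without using up free days. None of these details are specified in the policy itself.

```python
from dataclasses import dataclass

FREE_LATE_DAYS = 4       # from the policy: 4 late days in total per student
PENALTY_PER_DAY = 0.10   # from the policy: 10% per late day after that


@dataclass
class Submission:
    """Hypothetical record for one homework submission."""
    raw_score: float           # score before any late penalty
    days_late: int             # assumption: lateness is counted in whole days
    doctors_note: bool = False


def apply_late_policy(submissions):
    """Return penalized scores, spending free late days in submission order."""
    remaining = FREE_LATE_DAYS
    results = []
    for sub in submissions:
        # Assumption: a doctor's note waives lateness and uses no free days.
        if sub.doctors_note or sub.days_late <= 0:
            results.append(float(sub.raw_score))
            continue
        covered = min(remaining, sub.days_late)
        remaining -= covered
        penalized_days = sub.days_late - covered
        # Assumption: the deduction is a fraction of the raw score, floored at 0.
        factor = max(0.0, 1.0 - PENALTY_PER_DAY * penalized_days)
        results.append(sub.raw_score * factor)
    return results


if __name__ == "__main__":
    homeworks = [
        Submission(raw_score=90.0, days_late=3),
        Submission(raw_score=80.0, days_late=2),
        Submission(raw_score=100.0, days_late=0),
    ]
    print(apply_late_policy(homeworks))
```

Under these assumptions, the first homework uses three of the four free days and keeps its score, while the second has one day covered and one day penalized, so it loses 10% of its score.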
All students are required to have a Computer Science Account for this class. To sign up for one, go to the CRF website and then click on "Apply for an Account".
Homework submission procedure.
Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done. Your grade should reflect your own work. If you believe you are going to have trouble completing an assignment, please talk to the instructor or TA in advance of the due date.
| Week | Class | Topic | Reading | Assignments |
| 1 | Sep 4 | | | |
| | Sep 6 | Natural Language and Formal Language: Regular Expressions and Finite State Automata | Ch 1-2 | |
| 2 | Sep 11 | | Ch 3.1 | |
| | Sep 13 | Word Construction and Analysis: Morphological Parsing | Ch 3: 2-6 | |
| 3 | Sep 18 | | Ch 5:1-8 | |
| | Sep 20 | | Ch 6 | |
| 4 | Sep 25 | | Ch 8 | |
| | Sep 27 | | | |
| 5 | Oct 2 | | Ch 9 | |
| | Oct 4 | | Ch 10 | |
| | Oct 9 | | Ch 12 ( | |
| | Oct 11 | Representing Meaning | Ch 14 | |
| | Oct 16 | | Ch 15:1,4-6 | |
| | Oct 18 | Midterm Examination | | |
| | Oct 23 | Relations Among Words | Ch 16:1-2 | |
| | Oct 25 | Roles Words Can Play | Ch 16:3-5 | |
| | Oct 30 | | Ch 17:1-2 | |
| | Nov 1 | | Ch 17:3-5 | |
| 10 | Nov 6 | Holiday | Holiday | Holiday |
| | Nov 8 | Pronouns and Reference Resolution | Ch 18: 18.1 | |
| | Nov 13 | Algorithms for Reference Resolution | | |
| | Nov 15 | Text Coherence and Discourse Structure | Ch 18.2-18.5; Grosz&Sidner86 | |
| | Nov 20 | Turn-taking, Grounding and Dialogue | Ch 19:1 | |
| | Nov 22 | | | |
| 13 | Nov 27 | | Ch 19:2-6 | |
| | Nov 29 | Natural Language Generation | Ch 20 | |
| 14 | Dec 4 | | Ch 21 | |
| | Dec 6 | Spoken Language Processing and Final Review | | |
| | Dec. 11-13 | Study Days | | |
| | Dec. 14-21 | Final Exams | | |
Places to look up definitions and descriptions of terminology:
Try out one of the many versions of Eliza on the web.
AT&T Labs -