CS4706: Spoken Language Processing, Spring 2010

Time: Mon/Wed 2:40-3:55
Place: CEPSR 415

Professor Julia Hirschberg (Office Hours Tu 3-5)
julia@cs.columbia.edu, 212-939-7114

Teaching Assistant Bob Coyne (Office Hours Mo 12-2)
coyne@cs.columbia.edu, 212-939-7147



This course introduces students to research on spoken language within computational linguistics, also known as natural language processing (NLP). We will study the different "meanings" that can be conveyed by the way that speakers produce sentences, techniques for analyzing spoken language, methods of developing speech technologies such as text-to-speech systems and speech recognition systems, and applications of speech technologies in the real world, such as spoken dialogue systems. NB: This course can be counted as a PhD elective in Advanced AI. It is a requirement for the MS NLP Track. There are no official prerequisites for this course except Data Structures or equivalent, and no prior knowledge of NLP will be assumed.


There will be weekly homework assignments and two longer homeworks/projects, which involve building a text-to-speech system and a speech recognition system from components we will provide; these projects can be done in pairs if you wish. There will be no exams. Each student in the course is allowed a total of 5 late days on homeworks with no questions asked; once those are used up, 10% per late day will be deducted from the homework grade, unless you have a note from your doctor. Do not use these up early! Save them for real emergencies.
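As a rough illustration of how the late-day allowance might play out over a semester, here is a minimal Python sketch. It rests on assumptions the policy does not spell out: that free late days are consumed in submission order, that "10% per late day" means 10 percentage points off that homework's grade for each charged day, and that the deduction is capped at 100%. The function name and parameters are hypothetical, not part of any course tooling.

```python
def late_penalties(days_late_per_hw, free_days=5, penalty_per_day=0.10):
    """Return the fraction deducted from each homework, in submission order.

    Assumptions (not stated in the policy): free late days are used up
    in order of submission, and the deduction is capped at 100%.
    """
    remaining = free_days
    penalties = []
    for days in days_late_per_hw:
        covered = min(days, remaining)   # late days absorbed by the allowance
        remaining -= covered
        charged = days - covered         # late days that incur the deduction
        penalties.append(min(1.0, charged * penalty_per_day))
    return penalties


# Four homeworks handed in 2, 0, 4, and 1 days late, respectively.
print(late_penalties([2, 0, 4, 1]))
```

Under these assumptions the first two homeworks lose nothing, the third loses 10% (three of its four late days are covered by the remaining allowance), and the fourth loses 10%.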

All students are required to have a Computer Science Account for this class. To sign up for one, go to the CRF website and then click on "Apply for an Account".  The Speech Lab is available for use in homeworks as needed on a signup basis.


Academic Integrity

Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done. Your grade should reflect your own work. If you believe you are going to have trouble completing an assignment, please talk to Prof. Hirschberg or to Robert Coyne in advance of the due date.  Please see the university policy.

Required texts:

    Daniel Jurafsky and James H. Martin. Speech and Language Processing (second edition). Pearson: Prentice Hall. 2009. See the errata before you do each reading assignment; there are some typos in the algorithms.

    Keith Johnson. Acoustic & Auditory Phonetics (second edition). Blackwell. 2003.

    Other required readings are available online via links from this syllabus.


· 10% Class Participation
· 50% Homeworks
· 40% Course projects
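The weighting above amounts to a simple weighted average. The following Python sketch shows the arithmetic, assuming each component is scored on a 0-100 scale; the dictionary keys and function name are made up for illustration, and the actual grading procedure may differ.

```python
# Component weights taken from the breakdown above.
WEIGHTS = {"participation": 0.10, "homeworks": 0.50, "projects": 0.40}


def course_grade(scores):
    """Combine per-component scores (assumed 0-100) into a weighted total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Example: full participation, 90 on homeworks, 80 on projects.
example = {"participation": 100, "homeworks": 90, "projects": 80}
print(course_grade(example))
```

For the example scores this works out to 10 + 45 + 32 = 87.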

Homework submission procedure is described HERE.


Lab Signup.

Sign up to use the Linux computers in the Speech Lab.



· Praat - Praat resources
· Help using ToBI - ToBI Annotation Environments
· Text-to-Speech Links and more...
· Text-to-Song synthesis





Date | Topic | Reading Assignments | HW Due Dates and Other Assignments
Jan 20 | It's not what you said, it's how you said it [pdf] | |
Jan 25 | From Sounds to Language [pdf] | J&M 7.1-7.3, 7.5 |
Jan 27 | Acoustics of Speech [pdf] | J&M 7.4; Johnson Ch. 1-2 | HW1 due
Feb 1 | Tools for Speech Analysis [pdf] | Praat tutorial 1 | Class will be held in the Speech Lab. Download Praat to your laptop if you have one, and bring it to class with headphones if you have them.
Feb 3 | Speech Generation Overview [pdf] | J&M 8 (pp. 249-50, 281-84); TTS-history; Historical examples | HW2 due
Feb 8 | Building a TTS System [pdf] | | Project 1 (TTS) assigned
Feb 15 | Text Normalization [pdf] | J&M 8.1, Yarowsky97 |
Feb 17 | Modeling Pronunciation [pdf] | J&M 8.2; Fackrell&Skut04, Ghoshaletal09 | HW3 due
Feb 22 | Prosody Modeling [pdf] | Hirschberg03, J&M 8.3.0-8.3.4, ToBI labeling conventions | Download and listen to all the ToBI examples; try to imitate them and decide what they mean.
Feb 24 | Prosody Modeling 2 | |
Mar 1 | Predicting Prosody from Text [pdf] | J&M 8.3.4-8.3.7 | Project 1 due
Mar 3 | Michael Collins' talk | |
Mar 8 | Information Status: Focus and Given/New [pdf] | GBrown83, Prince92, Terken&Hirschberg93 | HW4 due
Mar 10 | Backend Synthesis and Evaluation [pdf] | J&M 8.4-5, 8.6; Tokudaetal02 |
Mar 15-19 | Spring Break | |
Mar 22 | ASR: Overview [pdf] | J&M 9-9.2, 6-6.3 | HW5 due
Mar 24 | Building an ASR System [pdf] | J&M 9.3-9.7, Johnson Ch. 1-2 (review) | Fadi Biadsy; Project 2 (ASR) assigned
Mar 29 | Language Modeling [pdf] | J&M 4, 9.5 |
Mar 31 | ASR Evaluation [pdf] | J&M 9.8 |
Apr 5 | Human Speech Perception [pdf] | J&M 10.7; Johnson 3-4 |
Apr 7 | Metadata: Speaker, Sentence and Topic Segmentation and Disfluencies [pdf] | J&M 10.5, Liuetal04, Liuetal05, Snoveretal04 |
Apr 12 | Dialect Identification [pdf] | | Fadi Biadsy
Apr 14 | Deception Detection | | Project 2 due
Apr 19 | Spoken Dialogue: Human and Machine [pdf] | J&M 24-24.1, 24.8 | Project 3 (SDS) assigned
Apr 21 | SDS System Architectures [pdf] | J&M 24.2-3, Goldberg03 | Guest Lecturer: Joshua Gordon
Apr 26 | Turn-taking in SDS [pdf] | |
Apr 28 | Dialogue Acts and Information State | J&M 24.5, Hirschbergetal04 |
May 3 | SDS Evaluation [pdf] | J&M 24.4, Walkeretal97 |
May 4-6 | Study Days | |
May 12 | Project Presentations | | Project 3 due





Julia Hirschberg
Professor, Computer Science

Columbia University
Department of Computer Science
1214 Amsterdam Avenue
M/C 0401
450 CS Building
New York, NY 10027

email: julia@cs.columbia.edu
phone: (212) 939-7114


