COMS 6998: Advanced Topics in Spoken Language Processing

Instructor: Julia Hirschberg

Time: Tuesdays, 4:10-6:00pm (Spring 2020)

Location: 524 Mudd


Prerequisite: COMS 4705 or another speech or NLP class

Description: This class will introduce students to spoken language processing: basic concepts, analysis approaches, and applications.

Required readings:

Jurafsky & Martin 2019 chapters

These and other readings are linked from this syllabus for each class.


Keith Johnson. Acoustic & Auditory Phonetics (3rd edition). Wiley. 2011.



A list of resources can be found here.


Office Hours

Julia Hirschberg: Wednesdays, 3-4pm (CEPSR 705)

Jessica Yin Huynh: Thursdays, 4:10-5:10pm (CEPSR 7LW3)

Rebecca Calinsky: Mondays, 2-4pm (CEPSR 7LW3)

Grade Breakdown

10% attendance and participation

20% weekly posts

20% HW1

25% HW2

25% HW3



Academic Integrity

The SEAS academic integrity policy is found here.

The CS academic integrity policy is found here.


Note: Schedule and readings are subject to change.






Week 1: 1/21

Introduction to Speech Processing


Week 2: 1/28

From Sounds to Language

Jurafsky & Martin Chapter 7 (sections 1-3)

Week 3: 2/4

Acoustics of Speech

Jurafsky & Martin Chapter 7 (sections 4-7)

Week 4: 2/11

Tools for Speech Analysis

Praat Tutorial (Chapter 11, on scripting, is optional)

Download Praat

HW1: Praat Recording and Analysis (assigned)

Week 5: 2/18

Analyzing Speech Prosody

ToBI Conventions

ToBI Tutorial

Prosody and Meaning

Week 6: 2/25

Text-to-Speech Synthesis (Guest speaker: Rose Sloan)

Jurafsky & Martin Chapter 8

Merlin Tutorial

Tacotron: Towards end-to-end speech synthesis

Where do the improvements come from in sequence-to-sequence neural TTS?

HW1 due

Week 7: 3/3

Spoken Dialogue Systems

Jurafsky & Martin Chapter 24

Jurafsky & Martin Chapter 25

HW2: Identifying Dialogue Acts (assigned)

Week 8: 3/10

Speech Recognition: Then and Now (Guest speaker: Fadi Biadsy)

Jurafsky & Martin Chapter 9

An Overview of End-to-End Automatic Speech Recognition



Week 9: 3/17

Spring Break: No classes



Week 10: 3/24

Speech Analysis: Entrainment in Spoken Language

Measuring acoustic-prosodic entrainment with respect to multiple levels and dimensions

Mark My Words! Linguistic Style Accommodation in Social Media

Week 11: 3/31

Speech Analysis: Personality and Mental State

Detecting late-life depression in Alzheimer's disease through analysis of speech and language

A Cross-modal Review of Indicators for Depression Detection Systems

Automatic Recognition of Personality in Conversation

HW2 due

Week 12: 4/7

Speech Analysis: Emotion and Sentiment Detection (Guest speaker: Zixiaofan Yang)

Classifying Subject Ratings of Emotional Speech Using Acoustic Features

Using Context to Improve Emotion Detection in Spoken Dialog Systems

Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network

HW3: Emotional Speech Detection (assigned)

Week 13: 4/14

Speech Analysis: Sarcasm

Sarcastic or Not: Word Embeddings to Predict the Literal or Sarcastic Meaning of Words

"Sure, I did the right thing": A system for sarcasm detection in speech

"Yeah, right": Sarcasm recognition for spoken dialogue systems

The sound of sarcasm

Why can’t robots understand sarcasm?

Multimodal Indicators of Humor in Video

Week 14: 4/21

Speech Analysis: Deception and Trust (Guest speaker: Sarah Ita Levitan)

Linguistic Cues to Deception and Perceived Deception in Interview Dialogues

Lying Words: Predicting Deception from Linguistic Styles

Personality Factors in Human Deception Detection: Comparing Human to Machine Performance


Week 15: 4/28

Speech Analysis: Humor (Guest speaker: Zixiaofan Yang), Charisma, Likability and Style

Charisma perception from text and speech

"Would You Buy A Car From Me?"-- On the Likability of Telephone Voices

Extracting Social Meaning: Identifying Interactional Style in Spoken Conversation

HW3 due