Grounding Emotions in Spoken Dialog Systems

Speaker Name: Jackson Liscombe
Speaker Info: Graduate Student, NLP Group; jaxin@cs.columbia.edu
Date: Thursday, October 14th
Time: 11:30am-12:30pm
Location: CS Conference Room

Abstract:
In this talk I will present work from my summer internship at AT&T Shannon Labs, done jointly with Giuseppe Riccardi and Dilek Hakkani-Tur.

The research explored the affective component of human-machine communication. State-of-the-art spoken dialog systems usually ignore this dimension, even though it plays a major role in engaging users to communicate with machines. Past research on predicting human emotion in spoken language has tended to exploit only isolated speech utterances (whether acted or spontaneous), using little or no contextual information. Over the summer we explored the dynamic nature of emotion as it is manifested in human-machine spoken communication. We augmented traditional emotion indicators, such as prosody and language, with contextual information: the temporal variation of emotion indicators both within and across utterances. We observed a 1.2% relative improvement in the prediction of negative user state when dialog acts were added to lexical and prosodic information alone, and a 3.8% relative improvement when further contextual features were added. Overall classification accuracy using all features was 79%.
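To make the feature setup concrete, below is a minimal sketch of how lexical, prosodic, and dialog-act features might be combined with inter-utterance contextual deltas for a binary negative/non-negative user-state classifier. The feature names, the dialog-act labels, the toy data, and the use of scikit-learn's DictVectorizer and LogisticRegression are all illustrative assumptions, not the actual features or classifier used in this work.

    # Hypothetical sketch: per-utterance features plus contextual deltas
    # computed against the previous utterance in the dialog.
    from dataclasses import dataclass
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression

    @dataclass
    class Utterance:
        mean_pitch: float   # prosodic indicator (assumed)
        mean_energy: float  # prosodic indicator (assumed)
        text: str           # source of lexical features
        dialog_act: str     # e.g. "request", "reject" (assumed inventory)
        negative: bool      # label: negative user state

    def features(dialog, i):
        """Feature dict for utterance i, augmented with inter-utterance context."""
        cur = dialog[i]
        feats = {
            "pitch": cur.mean_pitch,
            "energy": cur.mean_energy,
            "act=" + cur.dialog_act: 1.0,
        }
        for w in cur.text.lower().split():  # crude lexical unigram features
            feats["w=" + w] = 1.0
        if i > 0:  # contextual features: temporal variation across utterances
            prev = dialog[i - 1]
            feats["d_pitch"] = cur.mean_pitch - prev.mean_pitch
            feats["d_energy"] = cur.mean_energy - prev.mean_energy
            feats["prev_act=" + prev.dialog_act] = 1.0
        return feats

    # Toy dialog standing in for real annotated data.
    dialog = [
        Utterance(180.0, 60.0, "i want to check my balance", "request", False),
        Utterance(210.0, 72.0, "no that is wrong", "reject", True),
        Utterance(230.0, 80.0, "i said balance not transfer", "reject", True),
    ]

    X_dicts = [features(dialog, i) for i in range(len(dialog))]
    y = [u.negative for u in dialog]

    vec = DictVectorizer()
    X = vec.fit_transform(X_dicts)
    clf = LogisticRegression().fit(X, y)  # binary: negative vs. non-negative
    print(clf.predict(X))

The point of the sketch is the feature construction, not the classifier: the contextual features (d_pitch, d_energy, prev_act) capture how prosodic and dialog-act indicators change from one utterance to the next, which is the kind of inter-utterance information the baseline lexical and prosodic features alone cannot see.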