I am a fifth-year PhD student in the Department of Computer Science at Columbia University. I am part of the Spoken Language Processing Group, directed by Dr. Julia Hirschberg. My current research involves identifying spoken cues to deception and examining cultural and gender differences in the way people communicate and perceive lies. I am currently funded through the NSF GRFP, and I am also an IGERT fellow.
Last semester I co-taught a course called Computational Models of Speech and Language, along with Dr. Michelle Levine.
The goal of this work is to automatically detect deception using acoustic-prosodic and lexical-syntactic cues. We are interested in exploring the factors that play a role in deception and deception detection, such as culture, gender, and personality. Toward that end, we have collected a large corpus of deceptive and non-deceptive speech, consisting of conversations between adult native speakers of American English and of Mandarin Chinese. We are applying machine learning techniques to automatically identify deceptive statements, and exploring how deceptive behavior differs across cultures, genders, and personalities.
Collaborators: Julia Hirschberg, Andrew Rosenberg, Michelle Levine, Guozhen An
Automatic identification of speaker traits such as gender, age and emotional state from speech is an important problem for personalized speech-driven services. In this work, we present a novel approach that leverages pitch feature trajectories with the goal of identifying the speaker’s gender with as little speech as possible.
We use the f0 (fundamental frequency) trajectory, the most discriminative feature between male and female speech, but instead of computing summary statistics of the trajectory, we use the entire trajectory as input to the classifier. We model each trajectory as “text” input, with each token corresponding to a binned f0 value. Our results show that the trajectory approach can be useful for obtaining fairly accurate gender predictions with as little as one second of speech.
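As a rough illustration of the "trajectory as text" idea, the sketch below turns a sequence of per-frame f0 values into tokens by binning, then counts token n-grams the way a text classifier might. The 10 Hz bin width, the `F<n>` token names, the `UNV` token for unvoiced frames, and the function names are all assumptions made for this example; they are not the settings or the classifier used in the project.

```python
import math
from collections import Counter


def f0_to_tokens(f0_hz, bin_width_hz=10.0, unvoiced_token="UNV"):
    """Map per-frame f0 values (Hz) to discrete tokens.

    Hypothetical scheme: frames with no usable pitch (None, NaN, or <= 0)
    become `unvoiced_token`; voiced frames become "F<bin index>".
    """
    tokens = []
    for value in f0_hz:
        if value is None or value <= 0 or math.isnan(value):
            tokens.append(unvoiced_token)
        else:
            tokens.append(f"F{int(value // bin_width_hz)}")
    return tokens


def ngram_counts(tokens, n=2):
    """Count token n-grams, a common bag-of-n-grams text representation."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# A short made-up trajectory: one unvoiced frame followed by voiced frames.
trajectory = [0.0, 118.0, 121.5, 212.0]
tokens = f0_to_tokens(trajectory)
bigrams = ngram_counts(tokens, n=2)
```

Because the tokens keep their order, n-gram features retain some of the local shape of the pitch contour that summary statistics such as the mean or range would discard.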
Collaborators: Taniya Mishra, Srinivas Bangalore
In conversation, people tend to become similar to their dialogue partner by adopting lexical, acoustic, prosodic, and syntactic characteristics of the interlocutor’s speech. Research shows that this phenomenon, known as entrainment, is associated with task success and dialogue quality. We studied entrainment patterns in the Supreme Court corpus, and examined relationships between trial success and entrainment between lawyers and justices. We used Amazon Mechanical Turk to preprocess the data and excise noisy regions of the audio files that would otherwise skew the analysis. We found that lawyers entrain more than justices, supporting the theory that the less dominant interlocutor is more likely to entrain to the more dominant speaker.
Collaborators: Julia Hirschberg, Rivka Levitan