I am a Postdoctoral Research Scientist in the Department of Computer Science at Columbia University. I am part of the Spoken Language Processing Group, directed by Dr. Julia Hirschberg.
I am currently working on identifying acoustic-prosodic and linguistic indicators of trustworthy speech, as well as linguistic characteristics of trustworthy news.
Previously, I worked on identifying spoken cues to deception and examining cultural and gender differences in the way that people communicate and perceive lies. I defended my dissertation, "Deception in Spoken Dialogue: Classification and Individual Differences," in January 2019. My PhD was funded through an NSF GRFP fellowship and an NSF IGERT fellowship.
This semester I am co-teaching Advanced Topics in Spoken Language Processing with Prof. Julia Hirschberg.
The goal of this work is to automatically detect deception using acoustic-prosodic and lexical-syntactic cues. We are interested in exploring the factors that play a role in deception and deception detection, such as culture, gender, and personality. Toward that end, we have collected a large corpus of deceptive and non-deceptive speech, consisting of conversations between adult native speakers of American English and of Mandarin Chinese. We are applying machine learning techniques to automatically identify deceptive statements, and exploring how deceptive behavior differs across cultures, genders, and personalities.
Collaborators: Julia Hirschberg, Andrew Rosenberg, Michelle Levine, Guozhen An
Automatic identification of speaker traits such as gender, age, and emotional state from speech is an important problem for personalized speech-driven services. In this work, we present a novel approach that leverages pitch feature trajectories to identify the speaker’s gender from as little speech as possible.
We use the f0 (fundamental frequency) trajectory, the most discriminative feature between male and female speech. Instead of computing summary statistics of the f0 trajectory, we use the entire trajectory as input to the classifier. We model each trajectory as “text” input, with each token corresponding to a binned f0 value. Our results show that the trajectory approach can yield fairly accurate gender predictions from as little as one second of speech.
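As a rough illustration of the tokenization idea described above (a minimal sketch, not our actual pipeline), the snippet below maps frame-level f0 values to discrete tokens and joins them into a "text" string. The function names, the 50–400 Hz range, the 10 Hz bin width, and the convention that a value of 0 marks an unvoiced frame are all assumptions made for this example.

```python
from typing import List, Sequence


def f0_to_tokens(
    f0_hz: Sequence[float],
    fmin: float = 50.0,       # assumed lower bound of the pitch range (Hz)
    fmax: float = 400.0,      # assumed upper bound of the pitch range (Hz)
    bin_width: float = 10.0,  # assumed bin width (Hz)
) -> List[str]:
    """Map each frame-level f0 value (Hz) to a discrete token."""
    tokens = []
    for f0 in f0_hz:
        if f0 <= 0:
            # Assumption: the pitch tracker reports 0 for unvoiced frames.
            tokens.append("UNV")
            continue
        # Clip to the assumed range, then assign a fixed-width bin index.
        clipped = min(max(f0, fmin), fmax)
        bin_index = int((clipped - fmin) // bin_width)
        tokens.append(f"F{bin_index}")
    return tokens


def trajectory_to_text(f0_hz: Sequence[float]) -> str:
    """Join the tokens into a whitespace-separated 'document'."""
    return " ".join(f0_to_tokens(f0_hz))


# Hypothetical f0 values for six consecutive frames.
example = [0.0, 118.0, 121.5, 125.0, 0.0, 210.0]
print(trajectory_to_text(example))  # prints "UNV F6 F7 F7 UNV F16"
```

A string produced this way could then be passed to any standard text classifier, for example one built on token n-gram counts. The choice of classifier is left open here.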
Collaborators: Taniya Mishra, Srinivas Bangalore
In conversation, people tend to become similar to their dialogue partner by adopting lexical, acoustic, prosodic, and syntactic characteristics of the interlocutor’s speech. Research shows that this phenomenon, known as entrainment, is associated with task success and dialogue quality. We studied entrainment patterns in the Supreme Court corpus and examined relationships between trial success and entrainment between lawyers and justices. We used Amazon Mechanical Turk to preprocess the data and excise noisy regions of the audio files that would otherwise skew the analysis. We found that lawyers entrain more than justices, supporting the theory that the less dominant interlocutor is more likely to entrain to the more dominant speaker.
Collaborators: Julia Hirschberg, Rivka Levitan