Erica Cooper 

I was formerly a PhD student researcher with the Speech Lab at Columbia University. My research was on speech synthesis and recognition, with a focus on low-resource languages. I have also previously worked on spoken keyword search, pronunciation modeling, natural language question answering, and deception detection. For my thesis, I worked within a parametric speech synthesis framework to discover ways of creating intelligible and natural-sounding text-to-speech voices from low-resource and found data.

I graduated in February of 2019, and I am now working as a postdoctoral researcher with the Yamagishi Lab at the National Institute of Informatics in Tokyo. My current webpage can be found here.

 curriculum vitae

Thesis Project

Text-to-Speech for Low-Resource Languages

We are developing methods for building intelligible, natural-sounding TTS voices from limited data. While most commercial TTS voices are built from audio recorded by a professional speaker in a controlled acoustic environment, such data are time-consuming and expensive to collect. We are exploring the use of radio broadcast news, speech recorded on mobile phones, and other found data for building TTS voices in diverse languages, investigating data selection and model adaptation techniques for making the most of noisy data.
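As a concrete illustration of what utterance-level data selection can look like, here is a minimal Python sketch that keeps only found-data utterances with a plausible duration and a crude energy-based SNR estimate above a threshold. The function names, thresholds, and the SNR heuristic are hypothetical illustrations, not the selection criteria actually used in this work.

# Illustrative sketch only: filter found speech data by duration and a crude
# energy-based SNR estimate before TTS training. Names and thresholds are hypothetical.
import numpy as np


def estimate_snr_db(waveform: np.ndarray, frame_len: int = 1024) -> float:
    """Crude SNR proxy: loudest 10% of frame energies vs. the quietest 10%."""
    n_frames = len(waveform) // frame_len
    if n_frames < 2:
        return 0.0
    frames = waveform[: n_frames * frame_len].astype(np.float64)
    frames = frames.reshape(n_frames, frame_len)
    energies = np.sort((frames ** 2).mean(axis=1))  # per-frame energy, ascending
    k = max(1, n_frames // 10)
    noise = energies[:k].mean()       # quietest frames approximate the noise floor
    speech = energies[-k:].mean()     # loudest frames approximate speech energy
    return float(10.0 * np.log10(speech / max(noise, 1e-12)))


def select_utterances(utterances, sample_rate=16000,
                      min_snr_db=15.0, min_sec=1.0, max_sec=10.0):
    """Filter (utt_id, waveform) pairs by duration and estimated SNR."""
    kept = []
    for utt_id, wav in utterances:
        duration = len(wav) / sample_rate
        if min_sec <= duration <= max_sec and estimate_snr_db(wav) >= min_snr_db:
            kept.append((utt_id, wav))
    return kept

In practice, criteria of this kind would be tuned against listening tests or objective intelligibility measures rather than fixed thresholds.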

Past Projects

Spoken Keyword Search for Low-Resource Languages

The IARPA BABEL program aims to develop spoken keyword search systems for diverse low-resource languages. Our group focuses on the use of prosodic features to improve recognition accuracy and keyword search performance, as well as on cross-lingual adaptation of models for detecting prosodic events.

Charisma in Political Speech

We investigated acoustic and lexical correlates of political success in the 2004 and 2008 primary elections, focusing on speech from debates and television interviews.

Deception Detection

This project examined the feasibility of automatic detection of deception in speech, using linguistic, prosodic, and other acoustic cues.

Natural Language Online Question Answering

Contributed to the START natural language question answering tool at the MIT CSAIL InfoLab Group by adding recovery and repair modules that handle failures of web scraping scripts when website formats change.

Open Image Annotation for Machine Vision

Contributed to the LabelMe project at the MIT CSAIL Computer Vision Group by adding usability features to the online image annotation tool.

Publications

More recent publications can be found on Researchmap.

A Comparison of Speaker-based and Utterance-based Data Selection for Text-to-Speech Synthesis

Kai-Zhan Lee, Erica Cooper, Julia Hirschberg. Interspeech, September 2018, Hyderabad, India.

Adaptation and Frontend Features to Improve Naturalness in Found-Data Synthesis

Erica Cooper, Julia Hirschberg. Speech Prosody, June 2018, Poznań, Poland.

Characteristics of Text-to-Speech and Other Corpora

Erica Cooper, Emily Li, Julia Hirschberg. Speech Prosody, June 2018, Poznań, Poland.

Utterance Selection for Optimizing Intelligibility of TTS Voices Trained on ASR Data

Erica Cooper, Xinyue Wang, Alison Chang, Yocheved Levitan, Julia Hirschberg. Interspeech, August 2017, Stockholm, Sweden.

Data Selection and Adaptation for Naturalness in HMM-based Speech Synthesis

Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg. Interspeech, September 2016, San Francisco, California.

Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search

Gideon Mendels, Erica Cooper, Julia Hirschberg. 10th Web as Corpus Workshop (WAC-X), August 2016, Berlin, Germany.

Data Selection for Naturalness in HMM-based Speech Synthesis

Erica Cooper, Yocheved Levitan, Julia Hirschberg. Speech Prosody, June 2016, Boston, Massachusetts.

Improving Speech Recognition and Keyword Search for Low Resource Languages Using Web Data

Gideon Mendels, Erica Cooper, Victor Soto, Julia Hirschberg, Mark Gales, Kate Knill, Anton Ragni, Haipeng Wang. Interspeech, September 2015, Dresden, Germany.

Rescoring Confusion Networks for Keyword Search

Victor Soto, Erica Cooper, Lidia Mangu, Andrew Rosenberg, Julia Hirschberg. International Conference on Acoustics, Speech and Signal Processing, May 2014, Florence, Italy.

Cross-Language Phrase Boundary Detection

Victor Soto, Erica Cooper, Andrew Rosenberg, Julia Hirschberg. International Conference on Acoustics, Speech and Signal Processing, May 2013, Vancouver, Canada.

Cross-Language Prominence Detection

Andrew Rosenberg, Erica Cooper, Rivka Levitan, Julia Hirschberg. Speech Prosody, May 2012, Shanghai, China.

Effect of Pronunciations on OOV Queries in Spoken Term Detection

Dogan Can, Erica Cooper, Abhinav Sethy, Chris White, Bhuvana Ramabhadran, Murat Saraclar. International Conference on Acoustics, Speech and Signal Processing, April 2009, Taipei, Taiwan.

Unsupervised Pronunciation Validation

Christopher M. White, Abhinav Sethy, Bhuvana Ramabhadran, Patrick Wolfe, Erica Cooper, Murat Saraclar, James K. Baker. International Conference on Acoustics, Speech and Signal Processing, April 2009, Taipei, Taiwan.

Web-derived Pronunciations for Spoken Term Detection

Dogan Can, Erica Cooper, Arnab Ghoshal, Martin Jansche, Sanjeev Khudanpur, Bhuvana Ramabhadran, Michael Riley, Murat Saraclar, Abhinav Sethy, Morgan Ulinski, Christopher White. Special Interest Group on Information Retrieval, July 2009, Boston, Massachusetts.

Teaching

Spoken Language Processing
Teaching assistant, Columbia, Spring 2011, Spring 2012.

Computation Structures
Teaching assistant, MIT, Spring 2010.

Introduction to Python
Laboratory assistant, MIT, January 2010.

Artificial Intelligence
Teaching assistant, MIT, Fall 2009.