About This Site

This site collects how-to instructions and other resources for building voices with Festival and HTS. The instructions are fairly specific to our research and do not cover everything one might want to do with these tools. It is not a manual, but rather a guide and a set of recipes for the tasks involved in our experiments. The site is mainly intended for Speech Lab students, but anyone is welcome to use it. Please address questions about the individual software tools to their respective help lists.

Tools

HTS: The Hidden Markov Model-based speech synthesis toolkit. Main website and searchable email help list. There is no official manual for HTS, but the HTK Book is generally recommended as a reference.

Festival: General speech synthesis framework. Main website and mailing lists.

Merlin: Neural network-based speech synthesis toolkit. GitHub page and issues page with Q&A.

Corpora

Many of the corpora we are using are available through the LDC: CALLHOME, MACROPHONE, BABEL, and Turkish broadcast news.

The CMU ARCTIC databases are also available online.

Publications

Some publications related to this work:

Data Selection and Adaptation for Naturalness in HMM-based Speech Synthesis.
Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg. Interspeech, September 2016, San Francisco, California.
paper poster

Data Selection for Naturalness in HMM-based Speech Synthesis.
Erica Cooper, Yocheved Levitan, Julia Hirschberg. Speech Prosody, June 2016, Boston, Massachusetts.
paper poster