About This Site
This is a set of how-to instructions and other resources for building
voices with Festival, HTS, and Merlin. These instructions are fairly specific
to the research we are doing, and as such do not cover everything that
one might want to do with these tools. This is not a manual,
but rather a guide and set of recipes for the various tasks
involved in the experiments we are doing. This site is mainly
intended for Speech Lab students, but anyone is welcome to use it.
Please address any questions about using the various software tools
to their respective help lists.
HTS: The Hidden Markov Model-based speech synthesis toolkit.
Main website and
help list. There is no official manual for HTS, but it is
generally recommended to look at the
Festival: General speech synthesis framework.
and mailing lists.
Merlin: Neural network-based speech synthesis toolkit.
GitHub page and
page with Q&A.
Many of the corpora we are using are available through the LDC:
and Turkish broadcast
The CMU ARCTIC databases
are also available online.
Some publications related to this work:
Data Selection and Adaptation for Naturalness in
HMM-based Speech Synthesis.
Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg.
Interspeech, September 2016, San
Data Selection for Naturalness in HMM-based Speech
Erica Cooper, Yocheved Levitan, Julia Hirschberg.
Speech Prosody, June 2016, Boston,
Utterance Selection for Optimizing Intelligibility
of TTS Voices Trained on ASR Data.
Erica Cooper, Xinyue Wang, Alison Chang, Yocheved
Levitan and Julia Hirschberg. Interspeech, August
2017, Stockholm, Sweden.