Creating Babel Language gen .lab Files for Synthesis

Please note: these instructions are for Columbia Speech Lab students only. Many of these tools and data sets are not yet publicly available.

1. Make your Festival-format .data file.

For example:

( gen_001 "sentence to synthesize." )
( gen_002 "another new sentence." )
( gen_003 "here is one more." )
....

The sentences above are English placeholders; yours will be in the target Babel language.
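If your sentences start out in a plain-text file, one per line, the .data file can be generated rather than typed by hand. The following is a minimal sketch, not part of babel_scripts: the function name make_done_data and the input file name are hypothetical, and it assumes the sentences contain no double quotes.

```shell
# Hypothetical helper: turn one-sentence-per-line text into Festival .data lines.
# Assumes the sentences contain no double-quote characters.
make_done_data() {
    # $1 = plain-text input file, one sentence per line; blank lines are skipped
    awk 'NF { printf "( gen_%03d \"%s\" )\n", ++n, $0 }' "$1"
}
```

Usage would look like: make_done_data sentences.txt > your_text.done.data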

2. Create utts

You presumably already have a frontend setup in babel_scripts that you have used to create your training labels. Use this same frontend setup to generate test utts. e.g.:

cd /proj/tts/tools/babel_scripts/ecooper/turkish_bn_clean
cp your_text.done.data etc/txt.done.data.test

Before running the scripts on your test utterances, move the training data files out of the way so they don't get clobbered:

mv prompt-utt prompt-utt-train
mv lab lab-train
mv festival/utts festival/utts-train
mkdir prompt-utt
mkdir lab
mkdir festival/utts

Create prompts, fake alignment files, and test data utts:

./bin/do_build build_prompts etc/txt.done.data.test
# Write a fake alignment file (dummy "0 0" times) for each prompt utt.
for u in prompt-utt/*.utt; do
   b=$(basename "$u" .utt)
   grep "dur_factor" "$u" | sed 's:"::g' | awk '{print "0 0 " $6}' > "lab/$b.lab"
done
./bin/do_build build_utts etc/txt.done.data.test

Just for clarity, rename the test output files:

mv prompt-utt prompt-utt-test
mv lab lab-test
mv festival/utts festival/utts-test

Your output .utt files are in festival/utts-test.
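Before moving on, it may be worth confirming that every utterance in the .data file actually produced a .utt file. This is a rough sketch only: the function name check_utts is hypothetical, and it assumes each .data line has the form ( id "text" ) and that the utts are named id.utt.

```shell
# Hypothetical helper: report utterance IDs from a .data file that have no .utt file.
# Assumes lines of the form: ( gen_001 "text" ) and utts named <id>.utt.
check_utts() {
    data_file=$1
    utt_dir=$2
    missing=0
    for id in $(awk '$1 == "(" { print $2 }' "$data_file"); do
        # -s is true only if the file exists and is non-empty
        if [ ! -s "$utt_dir/$id.utt" ]; then
            echo "missing or empty: $id"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if nothing was missing
    [ "$missing" -eq 0 ]
}
```

Usage would look like: check_utts etc/txt.done.data.test festival/utts-test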

3. Phoneme mapping

If you are using the same frontend setup that you used to create your training labels, then the phonesets should match and there should be no problem. Nevertheless, this is just a reminder to check the phoneme names in the .utt files to make sure they are what you expect.
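One way to do this check is to compare the set of phone names in the training and test prompt utts. The sketch below reuses the extraction from the label loop in step 2, so it carries the same assumption: that the phone name is the sixth field on lines containing "dur_factor". The function name phone_inventory and the .phones file names are hypothetical.

```shell
# Hypothetical helper: list the unique phone names found in a directory of prompt utts.
# Assumes, as in the step 2 loop, that the phone name is field 6 on "dur_factor" lines.
phone_inventory() {
    # $1 = directory containing .utt files
    cat "$1"/*.utt | grep "dur_factor" | sed 's:"::g' | awk '{ print $6 }' | sort -u
}
```

Usage would look like:

phone_inventory prompt-utt-train > train.phones
phone_inventory prompt-utt-test > test.phones
comm -13 train.phones test.phones

The last command prints phones that occur only in the test utts; ideally it prints nothing.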

4. Festival .utt format to HTS .lab format

We can convert from Festival format to HTS format using the HTS demo scripts. However, the HTS demo scripts do not include a recipe for creating gen labels, since the demo ships with pre-made gen labels for US English, so we need to repurpose the 'full' label recipe to create gen labels. The two formats are essentially the same, except that full labels have timestamps and gen labels should not.

Follow the steps in the data Makefile (e.g. /proj/tts/hts-2.3/template_si_htsengine/data/Makefile) for make label. We typically just copy over the Makefile to wherever the utts are, edit it to just do the fullcontext step, and run it.

Make sure the DUMPFEATS command contains the pointer to the Festival voice you want to use for your language. If you are re-using the utt-to-lab setup that you used for creating training labels, then it should already be there. Recall that the Festival voice is only used to note which phonemes are vowels, so it does not have to be exactly the same frontend that you used to create the utt files, as long as the phoneset contains properly named phones with properly identified vowels.

If you see warnings about phoneset Radio, these can be safely ignored. They only indicate that we are using phonemes outside of the default US English set.

The output will be in labels/full. For an HTS voice, remove the timestamps and put the final labels into yourvoicedir/data/labels/gen. For Merlin, keep the timestamps and continue with the next step.
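For the HTS case, where gen labels carry no timestamps, removing them amounts to dropping the first two numeric fields of each line. A minimal sketch, assuming each full label line has the form "start end label"; the function name strip_timestamps is hypothetical. Do not run this if the labels are going to Merlin.

```shell
# Hypothetical helper: copy .lab files from one directory to another,
# dropping the leading start/end timestamps from each line.
# Assumes lines of the form: <start> <end> <full-context label>
strip_timestamps() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for f in "$in_dir"/*.lab; do
        # Remove two leading integer fields and the whitespace after them;
        # lines without timestamps are passed through unchanged.
        sed 's/^[[:space:]]*[0-9][0-9]*[[:space:]][[:space:]]*[0-9][0-9]*[[:space:]][[:space:]]*//' \
            "$f" > "$out_dir/$(basename "$f")"
    done
}
```

Usage would look like: strip_timestamps labels/full yourvoicedir/data/labels/gen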

5. Normalize for Merlin

We used to strip the artificial timestamps from these label files for HTS. At the time of writing, however, the Merlin scripts expect them (and replace them later), so leave them in. The label files still need to be normalized for Merlin ('sil' phones renamed, etc.), so run the script merlin/misc/scripts/frontend/utils/normalize_lab_for_merlin.py. Finally, put the normalized labels into your voice experiment directory under experiments/yourvoicename/test_synthesis/prompt-lab.
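Since the Merlin scripts expect the timestamps to be present, a quick check that none were stripped by accident can save a failed run. This is only a sketch: the function name labels_have_timestamps is hypothetical, and it assumes every non-empty label line should begin with two integer fields.

```shell
# Hypothetical helper: succeed only if every non-empty line in every .lab file
# of the given directory begins with two integer fields (start and end time).
labels_have_timestamps() {
    # $1 = directory containing .lab files
    awk 'NF && !($1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/) {
             printf "%s:%d: no timestamps\n", FILENAME, FNR
             bad = 1
         }
         END { exit bad + 0 }' "$1"/*.lab
}
```

Usage would look like: labels_have_timestamps labels/full && echo "labels look OK"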