Questions Files

You will need to create or modify a questions file for your voice if:
  1. You are creating a voice for a new language.
  2. You are working with custom frontend features.
If neither applies, please use the appropriate existing questions file.

The questions file is yourvoicedir/data/questions/questions_qst001.hed. It depends on the language (strictly speaking, on the phoneset), so its phoneme lists must be filled in with the symbols of your phoneset. See the scripts and instructions under /proj/tts/examples/qfile/ for generating questions files for a new language.

The questions file is used in the decision tree part of the training, and it is tied to the fullcontext label file format. Each question is just a pattern match on the fullcontext labels. On the left side is the name of the question (e.g. "LL-Vowel" asks, "Is the phoneme two to the left of the current one a vowel?") and on the right side is a list of patterns, any one of which matching means the answer is "yes" (e.g. one pattern per vowel). The fullcontext label file uses symbols to delimit the different features. That is why, for questions about the current phoneme (those starting with "C-"), each possible match is written between - and +: according to lab_format.pdf, that is where the current phoneme sits in a label.
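As a rough illustration of the pattern matching described above, here is a minimal Python sketch. It assumes the common HTS-style label layout LL^L-C+R=RR@... and glob-style patterns with * as a wildcard; the label, phonemes and patterns are made up for the example, so check lab_format.pdf for the delimiters your labels actually use.

```python
from fnmatch import fnmatchcase

# Hypothetical fullcontext label, assuming the layout LL^L-C+R=RR@...
label = "sil^m-e+r=h@1_2/A:0_0_0"

def answer(label, patterns):
    """Return True ("yes") if any pattern matches the whole label."""
    return any(fnmatchcase(label, pattern) for pattern in patterns)

# "C-Vowel": the current phoneme sits between - and +.
c_vowel_patterns = ["*-a+*", "*-e+*", "*-i+*", "*-o+*", "*-u+*"]

# "LL-Vowel": the phoneme two to the left comes first, followed by ^.
ll_vowel_patterns = ["a^*", "e^*", "i^*", "o^*", "u^*"]

c_vowel = answer(label, c_vowel_patterns)    # current phoneme is "e"
ll_vowel = answer(label, ll_vowel_patterns)  # LL phoneme is "sil"
print(c_vowel, ll_vowel)
```

Here the "C-Vowel" question is answered "yes" because the label contains -e+, while "LL-Vowel" is answered "no" because the label starts with sil^.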

The questions repeat for LL, L, C, R, and RR, so it is possible to fill in just the first section and use a script to populate the rest. In fact, we already have a script for this in /proj/tts/resources/qfile/. You only have to fill in a file in the same format as turkish_categories, and make_qfile_from_categories.py can then generate the questions file (including the 'repeat' question categories) in the right format. Figuring out which phonemes go in which category will require looking up the definitions of the categories on Wikipedia or elsewhere.
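To show the idea of expanding one set of categories into the five repeated positions, here is a small Python sketch. It is not make_qfile_from_categories.py and does not read the turkish_categories format. The category dictionary, the function name make_questions, and the delimiters (again assuming LL^L-C+R=RR@... labels and HTS-style QS lines) are all assumptions made for illustration.

```python
# Pattern template per position, assuming labels shaped like LL^L-C+R=RR@...
POSITION_TEMPLATES = {
    "LL": "{}^*",
    "L": "*^{}-*",
    "C": "*-{}+*",
    "R": "*+{}=*",
    "RR": "*={}@*",
}

def make_questions(categories):
    """Expand each category into one question line per position."""
    lines = []
    for position, template in POSITION_TEMPLATES.items():
        for name, phones in categories.items():
            patterns = ",".join(template.format(phone) for phone in phones)
            lines.append('QS "{}-{}" {{{}}}'.format(position, name, patterns))
    return lines

# Hypothetical categories; a real phoneset will have many more.
categories = {"Vowel": ["a", "e"], "Nasal": ["m", "n"]}

questions = make_questions(categories)
for line in questions:
    print(line)
```

With two categories this produces ten question lines, two for each of LL, L, C, R, and RR, which is the repetition the script takes care of.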