CS4706 Spring 2012


PocketSphinx is an open-source toolkit for speech recognition from Carnegie Mellon University. It is cross-platform and ships with built-in language and acoustic models, but you can also train and use your own. You will be using PocketSphinx as the recognizer for your term projects. You can use the lab machines or your own personal computers for the projects. If you choose to work on your own computer, instructions for getting set up with PocketSphinx can be found below.

The Python example from the setup instructions shows both how to use the default n-gram language model, which was trained on the Wall Street Journal (the commented-out decoder), and how to use a custom JSGF grammar and pronunciation dictionary, which is what you will be designing as part of your project.
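To give a rough idea of what a custom JSGF grammar and pronunciation dictionary look like, here is a minimal sketch. The grammar name `commands`, its rules, and the five-word vocabulary are all made up for illustration and are not taken from the setup instructions. The pronunciations use CMU-dictionary-style phones and are assumed to match the phone set of your acoustic model. The code does not call PocketSphinx at all; it only checks that every word in the grammar has an entry in the dictionary, since a word without a pronunciation is a common source of trouble.

```python
import re

# Hypothetical toy grammar in JSGF syntax (not from the course materials).
GRAMMAR = """#JSGF V1.0;
grammar commands;
public <command> = <action> [the] <object>;
<action> = open | close;
<object> = door | window;
"""

# Hypothetical pronunciation dictionary: one word per line, then its phones.
# The spelling and case of each word must match the grammar exactly.
DICTIONARY = """close K L OW Z
door D AO R
open OW P AH N
the DH AH
window W IH N D OW
"""


def grammar_words(jsgf):
    """Return the set of literal words used in the grammar's rule bodies."""
    words = set()
    for line in jsgf.splitlines():
        if "=" not in line:
            continue  # skip the header and the grammar declaration
        body = line.split("=", 1)[1]
        body = re.sub(r"<[^>]+>", " ", body)  # drop references to other rules
        words.update(re.findall(r"[A-Za-z']+", body))
    return words


def dictionary_words(text):
    """Return the set of words that have a pronunciation entry."""
    return {line.split()[0] for line in text.splitlines() if line.strip()}


def missing_words(jsgf, dictionary):
    """Return grammar words that have no dictionary entry, sorted."""
    return sorted(grammar_words(jsgf) - dictionary_words(dictionary))


if __name__ == "__main__":
    missing = missing_words(GRAMMAR, DICTIONARY)
    print("Missing pronunciations:", missing if missing else "none")
```

This simple parser only handles one rule per line and plain-word tokens, which is enough for small project grammars; weights, tags, and multi-line rules would need more care.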

The PocketSphinx website has extensive documentation, as well as a help forum for questions.

Please note that while we are providing these instructions for getting PocketSphinx set up on your own computer, we will not be able to provide extensive debugging support for your installation, because individual settings and configurations vary too much from machine to machine. Please also note that even if you work on your own computer, you are still responsible for making sure that your project runs on the lab machines.

Debugging FAQ