Decision Tree State Clustering with Word and Syllable Features
Venue
Interspeech, ISCA (2010), 2958 – 2961
Publication Year
2010
Authors
Hank Liao, Chris Alberti, Michiel Bacchiani, Olivier Siohan
Abstract
In large vocabulary continuous speech recognition, decision trees are widely used
to cluster triphone states. In addition to commonly used phonetically based
questions, others have proposed additional questions such as phone position within
word or syllable. This paper examines using the word or syllable context itself as
a feature in the decision tree, providing an elegant way of introducing word- or
syllable-specific models into the system. Positive results are reported on two
state-of-the-art systems, a voicemail transcription task and a search-by-voice
task, across a variety of acoustic model and training set sizes.
