In large vocabulary continuous speech recognition, decision trees are widely used
to cluster triphone states. Beyond the commonly used phonetically based
questions, additional questions have been proposed, such as the position of a phone
within a word or syllable. This paper examines using the word or syllable context itself as
a feature in the decision tree, providing an elegant way of introducing word- or
syllable-specific models into the system. Positive results are reported on two
state-of-the-art systems, a voicemail transcription task and a search-by-voice task,
across a variety of acoustic model and training set sizes.