Direct Construction of Compact Context-Dependency Transducers From Data
Abstract
This paper describes a new method for building compact context-dependency
transducers for finite-state transducer (FST) based ASR decoders. Instead of
growing a conventional phonetic decision tree and then compiling it into an FST,
this approach incorporates phonetic context splitting directly into the transducer
construction. The objective function of the split optimization is augmented with a
regularization term that measures the number of transducer states introduced by a
split. We give results on a large spoken-query task for various n-phone orders and
other phonetic features that show this method can greatly reduce the size of the
resulting context-dependency transducer with no significant impact on recognition
accuracy. This permits using context sizes and features that might otherwise be
unmanageable.
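
As a rough illustration of the regularized split objective described above, the following minimal sketch scores each candidate split by its likelihood gain minus a penalty proportional to the number of transducer states it would introduce. The abstract does not give the exact form of the objective, so the linear penalty, the weight `alpha`, and the names `CandidateSplit`, `regularized_gain`, and `best_split` are all assumptions made for illustration; the numbers are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSplit:
    """Hypothetical record for one candidate context split."""
    name: str               # illustrative label for the context question
    likelihood_gain: float  # gain in training-data likelihood from the split
    new_states: int         # transducer states the split would introduce


def regularized_gain(split: CandidateSplit, alpha: float) -> float:
    # Assumed form: likelihood gain minus a linear penalty on new states.
    return split.likelihood_gain - alpha * split.new_states


def best_split(candidates: list[CandidateSplit], alpha: float) -> CandidateSplit:
    # Pick the candidate with the highest regularized gain.
    return max(candidates, key=lambda s: regularized_gain(s, alpha))


# Made-up candidates: one with a larger gain but many new states,
# one with a smaller gain but few new states.
candidates = [
    CandidateSplit("left-context question", likelihood_gain=120.0, new_states=40),
    CandidateSplit("right-context question", likelihood_gain=100.0, new_states=5),
]

unregularized_choice = best_split(candidates, alpha=0.0)
regularized_choice = best_split(candidates, alpha=1.0)
```

With `alpha = 0` the choice reduces to plain likelihood-gain splitting; with a positive `alpha` the split that adds fewer states can win even though its likelihood gain is smaller, which is the size/accuracy trade-off the regularization term is meant to control.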
