Backoff Inspired Features for Maximum Entropy Language Models
Venue
Proceedings of Interspeech, ISCA (2014)
Publication Year
2014
Authors
Fadi Biadsy, Keith Hall, Pedro Moreno, Brian Roark
Abstract
Maximum Entropy (MaxEnt) language models are linear models that are typically
regularized via well-known L1 or L2 terms in the likelihood objective, hence
avoiding the need for the kinds of backoff or mixture weights used in smoothed
n-gram language models using Katz backoff and similar techniques. Even though
backoff cost is not required to regularize the model, we investigate the use of
backoff features in MaxEnt models, as well as some backoff-inspired variants. These
features are shown to improve model quality substantially, as demonstrated by perplexity and word-error rate reductions, even in very large-scale training scenarios with tens or hundreds of billions of words and hundreds of millions of features.
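
To give a concrete picture of the kind of features the abstract refers to, the sketch below contrasts standard n-gram features for a MaxEnt language model with a backoff-inspired variant that replaces part of the context with a generic backoff symbol. The feature templates, feature-name strings, and the <BO> token are illustrative assumptions for this sketch, not the exact templates used in the paper.

from typing import List

BO = "<BO>"  # hypothetical placeholder marking a backed-off context position

def ngram_features(history: List[str], word: str, max_order: int = 3) -> List[str]:
    # Standard n-gram features: the predicted word paired with suffixes of the history.
    feats = [f"1gram|{word}"]
    for k in range(1, max_order):
        if len(history) >= k:
            context = "_".join(history[-k:])
            feats.append(f"{k + 1}gram|{context}|{word}")
    return feats

def backoff_inspired_features(history: List[str], word: str, max_order: int = 3) -> List[str]:
    # Backoff-inspired variants: replace the earliest context word with a generic
    # backoff symbol, so the feature fires regardless of which word filled that slot.
    # This loosely mimics how a backed-off n-gram model falls back to shorter contexts
    # while still recording that a longer context was available.
    feats = []
    for k in range(2, max_order):
        if len(history) >= k:
            context = "_".join([BO] + history[-(k - 1):])
            feats.append(f"{k + 1}gram-bo|{context}|{word}")
    return feats

if __name__ == "__main__":
    hist = ["the", "quick", "brown"]
    print(ngram_features(hist, "fox"))            # ['1gram|fox', '2gram|brown|fox', '3gram|quick_brown|fox']
    print(backoff_inspired_features(hist, "fox")) # ['3gram-bo|<BO>_brown|fox']

In a MaxEnt model, each such feature string would be mapped to an indicator feature with its own learned weight, with L1 or L2 regularization on the weights rather than explicit backoff or mixture weights.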
