Sparse Non-negative Matrix Language Modeling For Skip-grams
Venue
Proceedings of Interspeech 2015, ISCA, pp. 1428-1432
Publication Year
2015
Authors
Noam M. Shazeer, Joris Pelemans, Ciprian Chelba
Abstract
We present a novel family of language model (LM) estimation techniques named Sparse
Non-negative Matrix (SNM) estimation. A first set of experiments empirically
evaluating these techniques on the One Billion Word Benchmark [3] shows that with
skip-gram features, SNMLMs are able to match the state-of-the-art recurrent neural
network (RNN) LMs; combining the two modeling techniques yields the best known
result on the benchmark. The computational advantages of SNM over both maximum
entropy and RNNLM estimation are probably its main strength, promising an approach
that has the same flexibility in combining arbitrary features effectively and yet
should scale to very large amounts of data as gracefully as n-gram LMs do.
