This paper presents recent advances in large-scale neural language modeling, a task
central to language understanding. Our goal is to show how well large neural
language models perform on a large language modeling benchmark, for which we chose the
One Billion Word Benchmark. Using various techniques, our best single model
significantly lowers the state-of-the-art perplexity from 51.3 to 30.0, while an
ensemble of models sets a new record, lowering perplexity from 41.0 to 23.7.
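
For readers less familiar with the metric, the following is a minimal sketch of how perplexity is conventionally computed (the exponential of the average negative log-probability per token; lower is better) and of the relative reductions implied by the figures above. The function names `perplexity` and `relative_reduction` and the toy input are illustrative assumptions, not part of the paper's code or evaluation setup.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Standard definition: exp(-(1/N) * sum(log p(token_i | context_i))).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(old, new):
    """Fractional reduction when going from `old` to `new` perplexity."""
    return (old - new) / old


# Toy input (hypothetical): a model that assigns probability 1/30 to every
# token has a perplexity of 30 by definition.
toy_log_probs = [math.log(1 / 30)] * 5
print(f"toy perplexity: {perplexity(toy_log_probs):.1f}")

# Relative reductions implied by the perplexities reported in the abstract.
print(f"single model: {relative_reduction(51.3, 30.0):.1%}")
print(f"ensemble:     {relative_reduction(41.0, 23.7):.1%}")
```

Under this definition, a perplexity of 30.0 means the model is, on average, as uncertain about the next word as if it were choosing uniformly among 30 equally likely candidates.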