Back-off Language Model Compression
Venue
Proceedings of Interspeech 2009, International Speech Communication Association (ISCA), pp. 352-355
Publication Year
2009
Authors
Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat
Abstract
With the availability of large amounts of training data relevant to speech
recognition scenarios, scalability becomes a very productive way to improve
language model performance. We present a technique that represents a back-off
n-gram language model using arrays of integer values and thus renders it amenable
to effective block compression. We propose a few such compression algorithms and
evaluate the resulting language model along two dimensions: memory footprint and
speed reduction relative to the uncompressed one. We experimented with a model that
uses a 32-bit word vocabulary (at most 4B words) and quantizes both
log-probabilities and back-off weights to 1 byte each. The best compression
algorithm achieves 2.6 bytes/n-gram at the cost of ≈18X slower lookups than the
uncompressed model.
For faster LM operation we found it feasible to represent the LM at ≈4.0
bytes/n-gram with lookups only ≈3X slower than the uncompressed LM. The memory
footprint of an LM containing one billion n-grams can thus be reduced to 3-4 GB
without significantly impacting its speed.
See the presentation material from a talk about this paper.
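To make the core idea concrete, below is a minimal sketch of block compression over a sorted array of integer n-gram keys, assuming delta coding plus variable-length (varint) byte encoding within each block. This is an illustration of the general technique, not the paper's exact scheme (the paper evaluates several concrete block compression algorithms); the names BlockCompressedArray and BLOCK_SIZE are hypothetical.

```python
import bisect

BLOCK_SIZE = 64  # keys per block; larger blocks compress better but decode slower


def _encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a little-endian base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def _decode_varints(buf: bytes) -> list:
    """Decode every varint stored in buf, in order."""
    result, acc, shift = [], 0, 0
    for b in buf:
        acc |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            result.append(acc)
            acc, shift = 0, 0
    return result


class BlockCompressedArray:
    """A sorted integer array stored as delta+varint compressed blocks.

    Each block keeps its first key uncompressed so lookups can binary-search
    the block index and then decode only one block, trading CPU for memory.
    """

    def __init__(self, sorted_keys):
        self.first_keys = []  # first key of each block, uncompressed
        self.blocks = []      # varint-encoded deltas for the rest of the block
        for start in range(0, len(sorted_keys), BLOCK_SIZE):
            chunk = sorted_keys[start:start + BLOCK_SIZE]
            self.first_keys.append(chunk[0])
            payload = bytearray()
            for prev, cur in zip(chunk, chunk[1:]):
                payload += _encode_varint(cur - prev)  # deltas stay small
            self.blocks.append(bytes(payload))

    def index(self, key: int) -> int:
        """Return the rank of key, or -1 if absent. Decodes a single block."""
        b = bisect.bisect_right(self.first_keys, key) - 1
        if b < 0:
            return -1
        cur = self.first_keys[b]
        if cur == key:
            return b * BLOCK_SIZE
        for i, delta in enumerate(_decode_varints(self.blocks[b]), start=1):
            cur += delta
            if cur == key:
                return b * BLOCK_SIZE + i
            if cur > key:
                break
        return -1
```

In a representation like the one the abstract describes, the rank returned by index could then address parallel byte arrays holding the 1-byte quantized log-probabilities and back-off weights, so the compressed key array carries no per-entry pointer overhead.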
