We study the problem of compressing recurrent neural networks (RNNs). In
particular, we focus on compressing RNN acoustic models, motivated by the
goal of building compact and accurate speech recognition systems that can run
efficiently on mobile devices. In this work, we present a technique for
general recurrent model compression that jointly compresses both recurrent and
non-recurrent inter-layer weight matrices. We find that the proposed technique
allows us to reduce our Long Short-Term Memory (LSTM) acoustic model to a
third of its original size with negligible loss in accuracy.
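To make the notion of joint compression concrete, the following is a minimal NumPy sketch of one standard way to realize it: a truncated SVD of the vertical stack of a layer's recurrent weight matrix and the corresponding inter-layer weight matrix, so that both factorizations share a single low-rank projection. This is an illustrative sketch only; the abstract does not specify the exact factorization used, and the names here (joint_low_rank_compress, W_rec, W_inter, rank) are hypothetical.

```python
import numpy as np

def joint_low_rank_compress(W_rec, W_inter, rank):
    """Jointly factor W_rec (m x n) and W_inter (p x n), which both act on
    the same n-dimensional hidden state, through a shared rank-r projection:
    W_rec ~ Z_rec @ P and W_inter ~ Z_inter @ P, with P of shape (r, n)."""
    stacked = np.vstack([W_rec, W_inter])          # shape (m + p, n)
    U, s, Vt = np.linalg.svd(stacked, full_matrices=False)
    Z = U[:, :rank] * s[:rank]                     # absorb singular values
    P = Vt[:rank, :]                               # shared low-rank projection
    m = W_rec.shape[0]
    return Z[:m], Z[m:], P

# Example: compress a 512-unit layer's two weight matrices to rank 128.
rng = np.random.default_rng(0)
W_rec = rng.normal(size=(512, 512))
W_inter = rng.normal(size=(512, 512))
Z_rec, Z_inter, P = joint_low_rank_compress(W_rec, W_inter, rank=128)

orig = W_rec.size + W_inter.size
comp = Z_rec.size + Z_inter.size + P.size
print(f"parameters: {orig} -> {comp} ({comp / orig:.0%} of original)")
```

Because both matrices multiply the same hidden state, sharing the input-side factor P lets the two products reuse one rank-reduced representation; in this toy configuration the factored form keeps roughly 38% of the original parameters, with the achievable rank (and hence compression) governed by the accuracy one is willing to trade.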