Structured Transforms for Small-footprint Deep Learning
Abstract
We consider the task of building compact deep learning pipelines suitable for deployment
on storage- and power-constrained mobile devices. We propose a unified
framework to learn a broad family of structured parameter matrices that are
characterized by the notion of low displacement rank. Our structured transforms
admit fast function and gradient evaluation, and span a rich range of parameter
sharing configurations whose statistical modeling capacity can be explicitly tuned
along a continuum from structured to unstructured. Experimental results show
that these transforms can significantly accelerate inference and forward/backward
passes during training, and offer superior accuracy-compactness-speed tradeoffs
in comparison to a number of existing techniques. In keyword spotting applications
in mobile speech recognition, our methods are much more effective than
standard linear low-rank bottleneck layers and nearly retain the performance of
state-of-the-art models, while providing more than 3.5-fold compression.
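
To make the notion of low displacement rank concrete, the following is a minimal NumPy sketch, not the paper's implementation. The abstract does not name the displacement operator, so the sketch assumes the Sylvester form L(M) = Z_1 M - M Z_{-1}, where Z_f is the shift matrix with f in its top-right corner; this is a common choice for Toeplitz-like matrices. The helper names (`shift_matrix`, `toeplitz_from`, `displacement_rank`) are hypothetical and exist only for illustration.

```python
import numpy as np


def shift_matrix(n, f):
    """Return Z_f: ones on the subdiagonal and f in the top-right corner."""
    Z = np.zeros((n, n))
    Z[np.arange(1, n), np.arange(n - 1)] = 1.0
    Z[0, n - 1] = f
    return Z


def toeplitz_from(col, row):
    """Build an n x n Toeplitz matrix from its first column and first row.

    col[0] is used for the diagonal; row[0] is ignored.
    """
    n = len(col)
    T = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            T[i, j] = col[i - j] if i >= j else row[j - i]
    return T


def displacement_rank(M):
    """Rank of the assumed Sylvester displacement Z_1 M - M Z_{-1}."""
    n = M.shape[0]
    D = shift_matrix(n, 1.0) @ M - M @ shift_matrix(n, -1.0)
    return int(np.linalg.matrix_rank(D))


rng = np.random.default_rng(0)
n = 8
T = toeplitz_from(rng.standard_normal(n), rng.standard_normal(n))
G = rng.standard_normal((n, n))  # unstructured (dense) matrix

rank_toeplitz = displacement_rank(T)  # at most 2 for any Toeplitz matrix
rank_dense = displacement_rank(G)     # typically much larger for a random matrix
```

Under this assumed operator, the displacement of a Toeplitz matrix is nonzero only in its first row and last column, so its rank is at most 2, and the matrix is described by 2n - 1 parameters rather than n^2. Allowing a larger displacement rank loosens the structure, which is one way to read the "structured to unstructured" continuum mentioned above.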