A Neural Transducer
Venue
NIPS 2016 (to appear)
Publication Year
2016
Authors
Navdeep Jaitly, David Sussillo, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, Samy Bengio
Abstract
Sequence-to-sequence models have achieved impressive results on various tasks.
However, they are unsuitable for tasks that require incremental predictions to be
made as more data arrives, or for tasks with long input and output sequences.
This is because they generate an output sequence conditioned on an
entire input sequence. In this paper, we present a Neural Transducer that can make
incremental predictions as more input arrives, without redoing the entire
computation. Unlike sequence-to-sequence models, the Neural Transducer computes the
next-step distribution conditioned on the partially observed input sequence and the
partially generated sequence. At each time step, the transducer can decide to emit
zero or more output symbols. The input data can be processed by an encoder before
being presented to the transducer. The discrete decision of whether to emit a
symbol at each time step makes the model difficult to train with conventional
backpropagation. It is, however, possible to train the transducer using a dynamic
programming algorithm to generate the target discrete decisions. Our experiments
show that the
Neural Transducer works well in settings where it is required to produce output
predictions as data come in. We also find that the Neural Transducer performs well
for long sequences even when attention mechanisms are not used.
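
To make the emission mechanism concrete, here is a minimal sketch of block-wise
greedy inference. It is an illustration under stated assumptions, not the authors'
implementation: the toy `encoder_step` and `transducer_step` functions, the block
size, and the `<e>` end-of-block marker are all invented for the example, and a
real model would use learned RNNs for both components.

```python
import numpy as np

VOCAB = ["a", "b", "c", "<e>"]          # "<e>" is an assumed end-of-block marker
END_OF_BLOCK = VOCAB.index("<e>")
rng = np.random.default_rng(0)

def encoder_step(x_t, enc_state):
    """Toy stand-in for one step of the encoder RNN."""
    return np.tanh(0.5 * enc_state + x_t)

def transducer_step(enc_state, dec_state, prev_symbol):
    """Toy stand-in for the transducer RNN: produces a next-symbol
    distribution conditioned on the encoder state (partially observed
    input) and the decoder state (partially generated output)."""
    dec_state = np.tanh(0.5 * dec_state + enc_state + 0.1 * prev_symbol)
    logits = dec_state                   # toy readout: state doubles as logits
    probs = np.exp(logits - logits.max())
    return probs / probs.sum(), dec_state

def incremental_decode(inputs, block_size=3, max_emits_per_block=4):
    """Greedy inference: after each block of `block_size` input frames,
    the transducer emits zero or more symbols, closing the block when it
    emits "<e>". Earlier computation is never redone as input grows."""
    enc_state = np.zeros(len(VOCAB))
    dec_state = np.zeros(len(VOCAB))
    prev, outputs = END_OF_BLOCK, []
    for t, x_t in enumerate(inputs, start=1):
        enc_state = encoder_step(x_t, enc_state)
        if t % block_size == 0:          # a new block of input has arrived
            for _ in range(max_emits_per_block):
                probs, dec_state = transducer_step(enc_state, dec_state, prev)
                prev = int(np.argmax(probs))
                if prev == END_OF_BLOCK: # zero or more emissions, then move on
                    break
                outputs.append(VOCAB[prev])
    return outputs

print(incremental_decode([rng.normal(size=len(VOCAB)) for _ in range(9)]))
```

The key property illustrated is that arriving input only extends the loop; nothing
already computed is revisited.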
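
Training is where the abstract's dynamic programming algorithm enters: the target
emit/no-emit decisions are not observed, so an alignment of output symbols to input
blocks must be inferred. The sketch below is one plausible form of such a DP,
assuming a hypothetical `score(b, i, j)` that returns the model's log-probability
of emitting symbols `i..j-1` (plus the end-of-block marker) within block `b`; the
paper's actual training procedure is not specified in this abstract.

```python
def best_alignment(num_blocks, num_symbols, score):
    """DP over alignments: assign output symbols 0..num_symbols-1, in order,
    to blocks 0..num_blocks-1. `score(b, i, j)` is an assumed log-probability
    that the transducer emits symbols i..j-1 (then "<e>") within block b."""
    NEG = float("-inf")
    # dp[b][s]: best log-prob of having emitted the first s symbols
    # by the end of block b; back[b][s] remembers the split point.
    dp = [[NEG] * (num_symbols + 1) for _ in range(num_blocks + 1)]
    back = [[0] * (num_symbols + 1) for _ in range(num_blocks + 1)]
    dp[0][0] = 0.0
    for b in range(1, num_blocks + 1):
        for s in range(num_symbols + 1):
            for k in range(s + 1):       # k symbols emitted before this block
                if dp[b - 1][k] == NEG:
                    continue
                cand = dp[b - 1][k] + score(b - 1, k, s)
                if cand > dp[b][s]:
                    dp[b][s], back[b][s] = cand, k
    # Walk back to recover, for each block, which symbols it should emit.
    splits, s = [], num_symbols
    for b in range(num_blocks, 0, -1):
        splits.append((b - 1, back[b][s], s))
        s = back[b][s]
    return dp[num_blocks][num_symbols], splits[::-1]

# Toy score that simply prefers one symbol per block (purely illustrative).
toy_score = lambda b, i, j: -abs((j - i) - 1)
print(best_alignment(num_blocks=3, num_symbols=3, score=toy_score))
# -> (0, [(0, 0, 1), (1, 1, 2), (2, 2, 3)])
```

The recovered split points can then serve as the target discrete decisions for
standard backpropagation.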
