We propose providing additional utterance-level features as inputs to a deep neural
network (DNN) to facilitate speaker, channel and background normalization.
We develop modifications of the basic algorithm that yield significant
reductions in word error rates (WERs). We show that the algorithms combine well
with speaker adaptation by backpropagation, resulting in a 9\% relative WER
reduction. We also address the implementation of the algorithm for a streaming task.
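
As a rough illustration of the input augmentation described above, the following minimal sketch appends one utterance-level vector to every frame-level input before it would be passed to a DNN. The choice of the per-dimension frame mean as the utterance-level feature, and the names `utterance_mean` and `augment_frames`, are assumptions made for illustration only; the text does not specify which utterance-level features are used.

```python
from typing import List

Frame = List[float]


def utterance_mean(frames: List[Frame]) -> Frame:
    """Per-dimension mean over all frames of one utterance.

    Assumption: the mean is only a stand-in for an utterance-level feature.
    """
    if not frames:
        raise ValueError("utterance has no frames")
    dim = len(frames[0])
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def augment_frames(frames: List[Frame], utt_feat: Frame) -> List[Frame]:
    """Append the same utterance-level vector to every frame-level input."""
    return [list(f) + list(utt_feat) for f in frames]


# Toy utterance: three frames of two-dimensional acoustic features.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
utt = utterance_mean(frames)
dnn_inputs = augment_frames(frames, utt)
```

Each augmented vector has the original frame dimension plus the dimension of the utterance-level feature, so the DNN input layer would need to be widened accordingly. In a streaming setting the full-utterance mean is not available up front, so a running estimate would be one possible substitute under the same assumptions.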