We present a technique for jointly denoising bursts of images captured with a handheld
camera. In particular, we propose a convolutional neural network architecture for
predicting spatially varying kernels that can both align and denoise frames, a
synthetic data generation approach based on a realistic noise formation model, and
an optimization guided by an annealed loss function to avoid undesirable local
minima. Our model matches or outperforms the state of the art across a wide range
of noise levels on both real and synthetic data.
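
To make the idea of spatially varying kernels concrete, the sketch below applies one K x K kernel per pixel and per frame to a burst and sums the results into a single output image. It is only an illustration under stated assumptions: the function name apply_per_pixel_kernels, the array layout (N, H, W, K, K), the grayscale frames, and the edge padding are all hypothetical choices made here, and in the method described above the kernels would be predicted by the network rather than supplied by hand.

```python
import numpy as np


def apply_per_pixel_kernels(burst, kernels):
    """Filter a burst with a distinct kernel at every pixel of every frame.

    burst:   (N, H, W) array, N grayscale frames (assumed layout).
    kernels: (N, H, W, K, K) array, one K x K kernel per pixel per frame
             (assumed layout; K is assumed odd).
    Returns an (H, W) array: the sum over frames of the locally filtered images.
    """
    n, h, w = burst.shape
    k = kernels.shape[-1]
    r = k // 2
    # Replicate border pixels so every kernel has a full neighborhood
    # (an assumption of this sketch, not something the text specifies).
    padded = np.pad(burst, ((0, 0), (r, r), (r, r)), mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            # Neighbor at offset (dy - r, dx - r) for every pixel of every frame.
            shifted = padded[:, dy:dy + h, dx:dx + w]
            # Weight each neighbor by its per-pixel kernel tap, then sum frames.
            out += (kernels[:, :, :, dy, dx] * shifted).sum(axis=0)
    return out
```

Two special cases show why a single set of kernels can both align and denoise. If every kernel is a centered delta with weight 1/N, the output is the plain average of the frames, which is pure denoising without alignment. If a kernel is a delta placed off center, it picks the pixel at that offset, which undoes a small translation of that frame. Kernels between these extremes do some of each.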