Net2Net: Accelerating Learning via Knowledge Transfer
Venue
International Conference on Learning Representations (2016)
Publication Year
2016
Authors
Tianqi Chen, Ian Goodfellow, Jonathon Shlens
Abstract
We introduce techniques for rapidly transferring the information stored in one
neural net into another neural net. The main purpose is to accelerate the training
of a significantly larger neural net. During real-world workflows, one often trains
very many different neural networks during the experimentation and design process.
This is a wasteful process in which each new model is trained from scratch. Our
Net2Net technique accelerates the experimentation process by instantaneously
transferring the knowledge from a previous network to each new deeper or wider
network. Our techniques are based on the concept of function-preserving
transformations between neural network specifications. This differs from previous
approaches to pre-training that altered the function represented by a neural net
when adding layers to it. Using our knowledge transfer mechanism to add depth to
Inception modules, we demonstrate a new state of the art accuracy rating on the
ImageNet dataset.
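
To make the idea of a function-preserving transformation concrete, below is a minimal NumPy sketch of the two operations the abstract describes: widening a layer and deepening a network (the paper calls these Net2WiderNet and Net2DeeperNet). The toy two-layer network, the layer sizes, and helper names such as net2wider and net2deeper are illustrative assumptions for this sketch, not code from the paper.

import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def net2deeper(weight):
    # Return (W_new, b_new) for a new layer inserted after an existing
    # ReLU layer. Initializing it to the identity means
    # ReLU(ReLU(h) @ I + 0) == ReLU(h), so the network's function is
    # unchanged at the moment of insertion.
    n = weight.shape[1]  # output width of the layer being deepened
    return np.eye(n), np.zeros(n)

def net2wider(w_in, b_in, w_out, new_width):
    # Widen a hidden layer from w_in.shape[1] units to new_width units.
    # Extra units copy randomly chosen existing units; each replicated
    # unit's outgoing weights are divided by its replica count, so the
    # widened layer's contribution to the next layer is unchanged.
    old_width = w_in.shape[1]
    assert new_width >= old_width
    # g maps each new unit to the old unit it copies
    # (identity mapping for the first old_width units).
    g = np.concatenate([np.arange(old_width),
                        rng.integers(0, old_width, new_width - old_width)])
    counts = np.bincount(g, minlength=old_width)  # replicas per old unit
    w_in_new = w_in[:, g]
    b_in_new = b_in[g]
    w_out_new = w_out[g, :] / counts[g][:, None]
    return w_in_new, b_in_new, w_out_new

# Toy teacher network: x -> ReLU(x @ W1 + b1) -> @ W2
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=3)
W2 = rng.normal(size=(3, 2))
x = rng.normal(size=(4, 5))
teacher = relu(x @ W1 + b1) @ W2

# Net2WiderNet: widen the hidden layer from 3 to 6 units.
W1w, b1w, W2w = net2wider(W1, b1, W2, 6)
wider = relu(x @ W1w + b1w) @ W2w

# Net2DeeperNet: insert an identity ReLU layer after the widened layer.
Wd, bd = net2deeper(W1w)
deeper = relu(relu(x @ W1w + b1w) @ Wd + bd) @ W2w

# Both student networks compute the same function as the teacher.
assert np.allclose(teacher, wider) and np.allclose(teacher, deeper)

The division by the replica count in net2wider is what preserves the function: each copied unit produces the same activation as its source, so scaling its outgoing weights down makes the replicas jointly contribute exactly what the original unit did. Training of the larger student can then resume from this state rather than from scratch.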
