Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Venue
ICLR 2016 Workshop (to appear)
Publication Year
2016
Authors
Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke
Abstract
Very deep convolutional networks have been central to the largest advances in image
recognition performance in recent years. One example is the Inception architecture
that has been shown to achieve very good performance at relatively low
computational cost. Recently, the introduction of residual connections in
conjunction with a more traditional architecture has yielded state-of-the-art
performance in the 2015 ILSVRC challenge; its performance was similar to the latest
generation Inception-v3 network. This raises the question of whether there is any
benefit in combining the Inception architecture with residual connections. Here we
give clear empirical evidence that training with residual connections accelerates
the training of Inception networks significantly. There is also some evidence of
residual Inception networks outperforming similarly expensive Inception networks
without residual connections by a thin margin. We also present several new
streamlined architectures for both residual and non-residual Inception networks.
These variations improve the single-frame recognition performance on the ILSVRC
2012 classification task significantly. We further demonstrate how proper
activation scaling stabilizes the training of very wide residual Inception
networks. With an ensemble of three residual networks and one Inception-v4 network, we
achieve 3.08 percent top-5 error on the test set of the ImageNet classification (CLS)
challenge.
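
The abstract's point about activation scaling refers to the paper's practice of scaling down the residual branch before adding it to the shortcut (the paper reports scaling factors roughly between 0.1 and 0.3 for wide residual Inception variants). The following is a minimal sketch of that idea, assuming PyTorch; ScaledResidualBlock and its two-convolution branch are illustrative stand-ins, not the paper's actual Inception-ResNet modules.

import torch
import torch.nn as nn

class ScaledResidualBlock(nn.Module):
    """Illustrative block: add a down-scaled residual branch to the input.
    The branch is a generic pair of convolutions, standing in for an
    Inception-style residual module."""

    def __init__(self, channels: int, scale: float = 0.2):
        super().__init__()
        self.scale = scale  # residual scaling factor (paper: roughly 0.1-0.3)
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale the residual before summing with the shortcut; with a very
        # large number of filters, an unscaled sum can destabilize training.
        return self.relu(x + self.scale * self.branch(x))

# Usage: the block preserves spatial and channel dimensions.
x = torch.randn(1, 256, 35, 35)
y = ScaledResidualBlock(256, scale=0.2)(x)
print(y.shape)  # torch.Size([1, 256, 35, 35])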
