Neumann Optimizer: A Practical Optimizer for Deep Neural Networks

Shankar Krishnan

Ying Xiao

Rif A. Saurous

International Conference on Learning Representations (ICLR) (2018)

Abstract

Progress in deep learning is slowed by the days or weeks it takes to train large models. The natural solution of using more hardware is limited by diminishing returns and leads to inefficient use of additional resources. In this paper, we present a large-batch stochastic optimization algorithm that is both faster than widely used algorithms for a fixed amount of computation and able to scale up substantially better as more computational resources become available. Our algorithm implicitly computes the inverse Hessian of each mini-batch to produce descent directions. We demonstrate the effectiveness of our algorithm by successfully training large ImageNet models (Inception V3, ResNet-50, ResNet-101 and Inception-ResNet) with mini-batch sizes of up to 32000, with no loss in validation error relative to current baselines and no increase in the total number of steps. At smaller mini-batch sizes, our optimizer improves the validation error in these models by 0.8-0.9%. Alternatively, we can trade off this accuracy to reduce the number of training steps needed by roughly 10-30%. Our work is practical and easily usable by others: only one hyperparameter (the learning rate) needs tuning, and the algorithm is as computationally cheap as the commonly used Adam optimizer.
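
The abstract does not spell out how the inverse Hessian is obtained implicitly. As a rough illustration of the idea the optimizer's name points to, the sketch below applies a truncated Neumann series, H^{-1} g = eta * sum_k (I - eta*H)^k g, which is valid when the eigenvalues of I - eta*H lie strictly inside the unit circle. The function names, the small explicit matrix, and the values of eta and num_terms are assumptions made for this example; this is not the algorithm from the paper, which avoids forming the Hessian at all.

```python
def matvec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def neumann_inverse_apply(hessian, grad, eta, num_terms):
    """Approximate H^{-1} g with a truncated Neumann series.

    Uses the recurrence z_{k+1} = g + (I - eta*H) z_k with z_0 = g,
    so that eta * z_K is the sum of the first K+1 series terms.
    Hypothetical helper for illustration only.
    """
    z = list(grad)
    for _ in range(num_terms):
        hz = matvec(hessian, z)
        z = [g + zi - eta * hzi for g, zi, hzi in zip(grad, z, hz)]
    return [eta * zi for zi in z]


# Toy symmetric positive-definite "Hessian" and gradient (made-up values).
H = [[2.0, 0.5], [0.5, 1.0]]
g = [1.0, -1.0]

direction = neumann_inverse_apply(H, g, eta=0.4, num_terms=200)

# If the series has converged, H @ direction should reproduce g.
residual = [hd - gi for hd, gi in zip(matvec(H, direction), g)]
```

Only matrix-vector products with H appear in the loop, which is what makes a series of this kind attractive when the Hessian itself is too large to store or invert.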
