# Geometry of Neural Network Loss Surfaces via Random Matrix Theory

### Venue

ICML (2017)

### Publication Year

2017

### Authors

Jeffrey Pennington, Yasaman Bahri

## Abstract

Understanding the geometry of neural network loss surfaces is important for the
development of improved optimization algorithms and for building a theoretical
understanding of why deep learning works. In this paper, we study the geometry in
terms of the distribution of eigenvalues of the Hessian matrix at critical points
of varying energy. We introduce an analytical framework and a set of tools from
random matrix theory that allow us to compute an approximation of this distribution
under a set of simplifying assumptions. The shape of the spectrum depends strongly
on the energy and another key parameter, $\phi$, which measures the ratio of
parameters to data points. Our analysis predicts, and numerical simulations support,
that for critical points of small index, the number of negative eigenvalues scales
like the $3/2$ power of the energy. We leave as an open problem an explanation for
our observation that, in the context of a certain memorization task, the energy of
minimizers is well-approximated by the function $\frac{1}{2}(1-\phi)^2$.
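
As a rough illustration of the two quantities the abstract refers to, the following minimal Python sketch evaluates the empirical fit for the energy of minimizers, one half of $(1-\phi)^2$, and computes the index of a critical point from a list of Hessian eigenvalues. The function names `minimizer_energy` and `hessian_index` are hypothetical, and two choices here are assumptions rather than statements from the abstract: the index is taken to be the fraction (not the count) of negative eigenvalues, and $\phi$ is sampled only in $[0, 1]$.

```python
from typing import Iterable


def minimizer_energy(phi: float) -> float:
    """Evaluate the empirical fit 1/2 * (1 - phi)^2 quoted in the abstract.

    phi is the ratio of parameters to data points. Restricting phi to
    [0, 1] in the usage below is an assumption of this sketch.
    """
    return 0.5 * (1.0 - phi) ** 2


def hessian_index(eigenvalues: Iterable[float]) -> float:
    """Return the fraction of strictly negative eigenvalues.

    Assumption: "index" is treated as a fraction of the spectrum; the
    eigenvalues are supplied by the caller (e.g. from a numerical
    eigensolver applied to the Hessian at a critical point).
    """
    values = list(eigenvalues)
    if not values:
        raise ValueError("at least one eigenvalue is required")
    negative = sum(1 for v in values if v < 0.0)
    return negative / len(values)


if __name__ == "__main__":
    # Tabulate the fitted minimizer energy for a few parameter/data ratios.
    for phi in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"phi={phi:.2f}  energy={minimizer_energy(phi):.5f}")

    # Toy spectrum with one negative direction out of four.
    print("index:", hessian_index([-1.0, 2.0, 3.0, 4.0]))
```

Under this fit the energy decreases from $1/2$ at $\phi = 0$ to $0$ at $\phi = 1$, which matches the intuition that more parameters per data point make the memorization task easier to solve exactly.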