SVCCA: Singular Vector Canonical Correlation Analysis for Deep Understanding and Improvement
Abstract
With the continuing empirical successes of deep networks, it becomes increasingly
important to develop better methods for understanding the training of models
and the representations learned within. In this paper we propose Singular Vector
Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations
in a way that is both invariant to affine transformations (allowing comparison
between different layers and networks) and fast to compute (allowing more
comparisons to be calculated than with previous methods). We deploy this tool
to measure the intrinsic dimensionality of layers, showing in some cases needless
over-parameterization; to probe learning dynamics throughout training, finding
that networks converge to final representations from the bottom up; to show
where class-specific information in networks is formed; and to suggest new training
regimes that simultaneously save computation and overfit less.
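
To make the idea of comparing two representations concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes, as the name suggests, that SVCCA first reduces each representation with a singular value decomposition and then applies canonical correlation analysis to the reduced subspaces. The function name `svcca_correlations`, the `var_kept` threshold, and the layout of the activation matrices (rows are datapoints, columns are neurons) are illustrative assumptions. Note that only the CCA step is invariant to invertible affine transformations; the SVD truncation preserves this invariance only insofar as it keeps the same subspace.

```python
import numpy as np


def svcca_correlations(acts1, acts2, var_kept=0.99):
    """Sketch of an SVD-then-CCA comparison of two representations.

    acts1, acts2: arrays of shape (num_datapoints, num_neurons), evaluated
    on the same datapoints (assumed layout, not taken from the paper).
    var_kept: fraction of variance retained by the SVD step (assumed value).
    Returns the canonical correlations between the two reduced subspaces.
    """

    def top_subspace(acts):
        # Center each neuron so that constant offsets do not matter.
        centered = acts - acts.mean(axis=0, keepdims=True)
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        # Keep the fewest directions whose variance reaches var_kept.
        energy = np.cumsum(s ** 2) / np.sum(s ** 2)
        k = min(int(np.searchsorted(energy, var_kept)) + 1, len(s))
        # Columns of u[:, :k] are an orthonormal basis of the kept subspace.
        return u[:, :k]

    q1 = top_subspace(acts1)
    q2 = top_subspace(acts2)
    # With orthonormal bases, the canonical correlations are the singular
    # values of q1^T q2 (cosines of the principal angles between subspaces).
    rho = np.linalg.svd(q1.T @ q2, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)


def demo():
    # Hypothetical data: a random "layer" and an affine transform of it.
    rng = np.random.default_rng(0)
    layer = rng.standard_normal((200, 10))
    transformed = layer @ rng.standard_normal((10, 10)) + 3.0
    unrelated = rng.standard_normal((200, 10))
    same = svcca_correlations(layer, transformed, var_kept=1.0).mean()
    diff = svcca_correlations(layer, unrelated, var_kept=1.0).mean()
    return same, diff
```

Averaging the returned correlations gives a single similarity score. In `demo`, the affinely transformed copy is expected to score near 1 when no directions are truncated, while an unrelated random representation scores noticeably lower.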