Spherical Random Features for Polynomial Kernels
Venue
Neural Information Processing Systems (NIPS) (2015)
Publication Year
2015
Authors
Jeffrey Pennington, Felix X. Yu, Sanjiv Kumar
Abstract
Compact explicit feature maps provide a practical framework to scale kernel methods
to large-scale learning, but deriving such maps for many types of kernels remains a
challenging open problem. Among the commonly used kernels for nonlinear
classification are polynomial kernels, for which low approximation error has thus
far necessitated explicit feature maps of large dimensionality, especially for
higher-order polynomials. Meanwhile, because polynomial kernels are unbounded, they
are frequently applied to data that has been normalized to unit ℓ2 norm. The
question we address in this work is: if we know a priori that data is normalized,
can we devise a more compact map? We show that a putative affirmative answer to
this question based on Random Fourier Features is impossible in this setting, and
introduce a new approximation paradigm, Spherical Random Fourier (SRF) features,
which circumvents these issues and delivers a compact approximation to polynomial
kernels for data on the unit sphere. Compared to prior work, SRF features are less
rank-deficient, more compact, and achieve better kernel approximation, especially
for higher-order polynomials. The resulting predictions have lower variance and
typically yield better classification accuracy.
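The setting the abstract describes rests on a simple identity: for unit-norm data, the inner product x·y equals 1 − ‖x − y‖²/2, so a polynomial kernel of x·y becomes a function of the distance alone, i.e. shift-invariant on the sphere. The toy sketch below (illustrative values for the degree p and bias q; not code from the paper) verifies this identity numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy vectors normalized to unit l2 norm.
x = rng.standard_normal(8)
x /= np.linalg.norm(x)
y = rng.standard_normal(8)
y /= np.linalg.norm(y)

p, q = 3, 1.0  # illustrative polynomial degree and bias

# Polynomial kernel evaluated directly from the inner product.
k_direct = (q + x @ y) ** p

# For unit-norm x and y, x.y = 1 - ||x - y||^2 / 2, so the kernel
# depends only on the distance z = ||x - y||: it is effectively
# shift-invariant on the sphere, the property SRF features exploit.
z = np.linalg.norm(x - y)
k_dist = (q + 1.0 - z ** 2 / 2.0) ** p

assert np.isclose(k_direct, k_dist)
```

This shift-invariance is what makes a Fourier-style approximation conceivable on the sphere even though, as the abstract notes, a direct Random-Fourier-Features construction is impossible for polynomial kernels in general.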
