Generalization Bounds for Learning Kernels
Abstract: This paper presents several novel generalization bounds for
the problem of learning kernels based on a combinatorial analysis of the Rademacher
complexity of the corresponding hypothesis sets. Our bound for learning kernels with a
convex combination of p base kernels using L1 regularization admits only a √log p
dependency on the number of kernels, which is tight and considerably more favorable
than the previous best bound given for the same problem. We also give a novel bound for
learning with a non-negative combination of p base kernels with an L2 regularization
whose dependency on p is also tight and only in p^(1/4). We present similar results for
Lq regularization with other values of q, and outline the relevance of our proof
techniques to the analysis of the complexity of the class of linear functions.
Experiments with a large number of kernels further validate the behavior of the
generalization error as a function of p predicted by our bounds.