Learning Using Large Datasets
Abstract
This contribution develops a theoretical framework that takes into account
the effect of approximate optimization on learning algorithms. The analysis
shows distinct tradeoffs for the case of small-scale and large-scale learning
problems. Small-scale learning problems are subject to the usual approximation–
estimation tradeoff. Large-scale learning problems are subject to a qualitatively different
tradeoff involving the computational complexity of the underlying optimization
algorithms in non-trivial ways. For instance, a mediocre optimization algorithm
such as stochastic gradient descent is shown to perform very well on large-scale
learning problems.
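
To make the stochastic gradient descent example concrete, the sketch below (an illustrative aside, not part of the original contribution) runs plain SGD on a synthetic least-squares problem. The function names, learning rate, and data are hypothetical choices for illustration; the point is that each update costs O(d) independently of the dataset size n, which is why a "mediocre" optimizer can excel at large-scale learning.

    import numpy as np

    def sgd(grad, w0, X, y, lr=0.01, epochs=1, rng=None):
        """Plain SGD: one noisy gradient step per training example.

        Each update costs O(d) regardless of n, so total cost scales
        with the number of examples visited, not with dataset size
        per step; each step is only a crude estimate of the full
        gradient (hypothetical illustrative implementation).
        """
        rng = rng or np.random.default_rng()
        w = w0.copy()
        n = len(y)
        for _ in range(epochs):
            for i in rng.permutation(n):       # visit examples in random order
                w -= lr * grad(w, X[i], y[i])  # single-example gradient step
        return w

    # Toy example: least-squares linear regression on synthetic data.
    rng = np.random.default_rng(0)
    n, d = 10_000, 5
    w_true = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.1 * rng.normal(size=n)

    # Per-example gradient of the squared loss (w.x - y)^2 / 2.
    grad = lambda w, x, y: (x @ w - y) * x

    w_hat = sgd(grad, np.zeros(d), X, y, lr=0.01)
    print("parameter error:", np.linalg.norm(w_hat - w_true))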