An Online Algorithm for Large Scale Image Similarity Learning
Venue
Advances in Neural Information Processing Systems (2009)
Publication Year
2009
Authors
Gal Chechik, Varun Sharma, Uri Shalit, Samy Bengio
Abstract
Learning a measure of similarity between pairs of objects is a fundamental problem
in machine learning. It lies at the core of classification methods like kernel
machines, and is particularly useful for applications like searching for images
that are similar to a given image or finding videos that are relevant to a given
video. In these tasks, users look for objects that are not only visually similar
but also semantically related to a given object. Unfortunately, current approaches
for learning similarity do not scale to large datasets, especially when imposing
metric constraints on the learned similarity. We describe OASIS, a method for
learning pairwise similarity that is fast and scales linearly with the number of
objects and the number of non-zero features. Scalability is achieved through online
learning of a bilinear model over sparse representations using a large margin
criterion and an efficient hinge loss cost. OASIS is accurate at a wide range of
scales: on a standard benchmark with thousands of images, it is more precise than
state-of-the-art methods, and faster by orders of magnitude. On 2 million images
collected from the web, OASIS can be trained within 3 days on a single CPU. The
non-metric similarities learned by OASIS can be transformed into metric
similarities, achieving higher precision than similarities that are learned as
metrics in the first place. This suggests an approach for learning a metric from
data that is larger by two orders of magnitude than was handled before.
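
The online learning scheme described in the abstract (a bilinear model over sparse representations, trained with a large-margin hinge loss) can be illustrated with a minimal sketch. The abstract does not spell out the update rule, so the passive-aggressive step used here is an assumption, as are the triplet form of the loss, the identity initialization, the aggressiveness parameter `C`, and every function name. Sparse vectors are plain dictionaries mapping feature index to value, so each step touches only non-zero features.

```python
from typing import Dict, Tuple

SparseVec = Dict[int, float]          # feature index -> non-zero value
Matrix = Dict[Tuple[int, int], float]  # (row, col) -> non-zero entry of W


def identity(dim: int) -> Matrix:
    """Hypothetical starting point: W = I (assumed, not stated in the abstract)."""
    return {(i, i): 1.0 for i in range(dim)}


def similarity(W: Matrix, p: SparseVec, q: SparseVec) -> float:
    """Bilinear score S_W(p, q) = p^T W q over non-zero features only."""
    return sum(pv * W.get((i, j), 0.0) * qv
               for i, pv in p.items()
               for j, qv in q.items())


def triplet_hinge_loss(W: Matrix, p: SparseVec,
                       pos: SparseVec, neg: SparseVec) -> float:
    """Large-margin hinge loss: pos should outscore neg by a margin of 1."""
    return max(0.0, 1.0 - similarity(W, p, pos) + similarity(W, p, neg))


def pa_update(W: Matrix, p: SparseVec, pos: SparseVec, neg: SparseVec,
              C: float = 0.1) -> float:
    """One assumed passive-aggressive step, applied to W in place.

    Returns the loss measured before the update.
    """
    loss = triplet_hinge_loss(W, p, pos, neg)
    if loss == 0.0:
        return loss  # margin already satisfied: leave W unchanged

    # diff = pos - neg, kept sparse
    diff = dict(pos)
    for j, v in neg.items():
        diff[j] = diff.get(j, 0.0) - v

    # Gradient direction V = p diff^T; its squared Frobenius norm factorizes.
    norm_sq = (sum(v * v for v in p.values())
               * sum(v * v for v in diff.values()))
    if norm_sq == 0.0:
        return loss

    tau = min(C, loss / norm_sq)  # step size capped by C
    for i, pv in p.items():
        for j, dv in diff.items():
            W[(i, j)] = W.get((i, j), 0.0) + tau * pv * dv
    return loss


if __name__ == "__main__":
    W = identity(2)
    query, relevant, irrelevant = {0: 1.0}, {1: 1.0}, {0: 1.0}
    before = pa_update(W, query, relevant, irrelevant, C=1.0)
    after = triplet_hinge_loss(W, query, relevant, irrelevant)
    print(before, after)
```

In this sketch the cost of one step depends only on the number of non-zero entries in the three vectors, which is consistent with the linear scaling claimed above. The learned `W` is not constrained to be symmetric or positive semidefinite, so the resulting similarity is not a metric.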
