Learning fine-grained image similarity is a challenging task. It needs to capture
between-class and within-class image differences. This paper proposes a deep
ranking model that employs deep learning techniques to learn similarity metric
directly from images. It has higher learning capability than models based on
hand-crafted features. A novel multiscale network structure has been developed to
describe the images effectively. An efficient triplet sampling algorithm is
proposed to learn the model with distributed asynchronized stochastic gradient.
Extensive experiments show that the proposed algorithm outperforms models based on
hand-crafted visual features and deep classification models.