Learning the representation and the similarity metric in an end-to-end fashion with
deep networks has demonstrated outstanding results for clustering and retrieval.
However, these recent approaches still suffer from performance degradation
stemming from the local metric training procedure, which is unaware of the global
structure of the embedding space. We propose a global metric learning scheme for
optimizing the deep metric embedding with a learnable clustering function and the
normalized mutual information (NMI) clustering metric in a novel structured
prediction framework. Our experiments
on the CUB200-2011, Cars196, and Stanford Online Products datasets show
state-of-the-art performance on both the clustering and retrieval tasks, as measured
by the NMI and Recall@K evaluation metrics.
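
For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how NMI and Recall@K are commonly computed. It is not the authors' evaluation code. The function names `nmi` and `recall_at_k` are hypothetical, and the sketch assumes three conventions that the text does not specify: NMI is normalized by the arithmetic mean of the two entropies, retrieval uses squared Euclidean distance, and each query is excluded from its own neighbor list.

```python
import numpy as np


def nmi(labels_true, labels_pred):
    """Normalized mutual information between two labelings.

    Assumption: normalization by the arithmetic mean of the two entropies.
    """
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    n = labels_true.size

    # Map arbitrary label values to contiguous indices 0..C-1.
    _, t = np.unique(labels_true, return_inverse=True)
    _, p = np.unique(labels_pred, return_inverse=True)

    # Joint distribution over (true class, predicted cluster).
    contingency = np.zeros((t.max() + 1, p.max() + 1))
    np.add.at(contingency, (t, p), 1)
    joint = contingency / n
    p_true = joint.sum(axis=1)
    p_pred = joint.sum(axis=0)

    # Mutual information, summing only over non-empty cells.
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / np.outer(p_true, p_pred)[nz]))

    h_true = -np.sum(p_true * np.log(p_true))
    h_pred = -np.sum(p_pred * np.log(p_pred))
    denom = 0.5 * (h_true + h_pred)
    # Both labelings are a single cluster: treat them as identical.
    return 1.0 if denom == 0 else float(mi / denom)


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with at least one same-class item among
    their k nearest neighbors (the query itself is excluded)."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances.
    sq = np.sum(embeddings ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * embeddings @ embeddings.T
    np.fill_diagonal(dist, np.inf)  # never retrieve the query itself

    neighbors = np.argsort(dist, axis=1)[:, :k]
    hits = (labels[neighbors] == labels[:, None]).any(axis=1)
    return float(hits.mean())
```

NMI is invariant to how cluster indices are permuted, so it compares a clustering of the embedding with the ground-truth classes without requiring the cluster labels to be matched first. Recall@K instead evaluates the embedding through nearest-neighbor retrieval.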