DaMN – Discriminative and Mutually Nearest: Exploiting Pairwise Category Proximity for Video Action Recognition
Venue
Proceedings of the European Conference on Computer Vision (ECCV 2014)
Publication Year
2014
Authors
Rui Hou, Amir Roshan Zamir, Rahul Sukthankar, Mubarak Shah
Abstract
We propose a method for learning discriminative category-level features and
demonstrate state-of-the-art results on large-scale action recognition in video.
The key observation is that one-vs-rest classifiers, which are ubiquitously
employed for this task, face challenges in separating very similar categories (such
as running vs. jogging). Our proposed method automatically identifies such pairs of
categories using a criterion of mutual pairwise proximity in the (kernelized)
feature space, based on a category-level similarity matrix in which each entry is
the one-vs-one SVM margin between the corresponding pair of categories. We then exploit
the observation that while splitting such "Siamese Twin" categories may be
difficult, separating them from the remaining categories in a two-vs-rest framework
is not. This enables us to augment one-vs-rest classifiers with a judicious
selection of "two-vs-rest" classifier outputs, formed from such discriminative and
mutually nearest (DaMN) pairs. By combining one-vs-rest and two-vs-rest features in
a principled probabilistic manner, we achieve state-of-the-art results on the
UCF101 and HMDB51 datasets. More importantly, the same DaMN features, when treated
as a mid-level representation, also outperform existing methods in knowledge
transfer experiments, both across datasets (from UCF101 to HMDB51) and to new
categories with limited training data (one-shot and few-shot learning). Finally, we
study the generality of the proposed approach by applying DaMN to other
classification tasks; our experiments show that DaMN outperforms related approaches
in direct comparisons, not only on video action recognition but also on the image
datasets for which those approaches were originally designed.
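The abstract compresses the whole pipeline into a few sentences; the sketch below walks
through its three main steps (pairwise margin computation, mutually nearest pair
selection, and feature augmentation) in Python. It is a minimal illustration only,
assuming scikit-learn's LinearSVC as a linear stand-in for the paper's kernelized SVMs;
the function names are hypothetical, and the principled probabilistic combination
described above is simplified here to plain concatenation of classifier scores.

    import numpy as np
    from itertools import combinations
    from sklearn.svm import LinearSVC

    def pairwise_margin_matrix(X, y, C=1.0):
        # One-vs-one SVM margin for every pair of categories. A small margin
        # means the pair is hard to separate, i.e. the categories are "near"
        # each other in feature space (linear stand-in for the kernelized case).
        classes = np.unique(y)
        k = len(classes)
        S = np.full((k, k), np.inf)  # inf on the diagonal so argmin skips self
        for i, j in combinations(range(k), 2):
            mask = (y == classes[i]) | (y == classes[j])
            svm = LinearSVC(C=C).fit(X[mask], y[mask])
            S[i, j] = S[j, i] = 2.0 / np.linalg.norm(svm.coef_)  # geometric margin
        return S, classes

    def mutually_nearest_pairs(S):
        # DaMN pair selection: keep pairs whose members are each other's
        # nearest neighbour (smallest one-vs-one margin).
        nearest = S.argmin(axis=1)
        return [(i, int(nearest[i])) for i in range(len(S))
                if i < nearest[i] and nearest[nearest[i]] == i]

    def damn_features(X_tr, y_tr, X_te, classes, pairs, C=1.0):
        # Mid-level DaMN representation: one-vs-rest scores for all categories,
        # augmented with two-vs-rest scores for the selected pairs.
        cols = []
        for c in classes:  # one-vs-rest classifiers
            svm = LinearSVC(C=C).fit(X_tr, (y_tr == c).astype(int))
            cols.append(svm.decision_function(X_te))
        for i, j in pairs:  # two-vs-rest classifiers for DaMN pairs
            pos = np.isin(y_tr, [classes[i], classes[j]]).astype(int)
            svm = LinearSVC(C=C).fit(X_tr, pos)
            cols.append(svm.decision_function(X_te))
        return np.stack(cols, axis=1)

In this simplified form, the resulting score matrix can be fed to any downstream
classifier, mirroring how the abstract treats DaMN outputs as a mid-level
representation for cross-dataset and few-shot transfer.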
