The Power of Comparative Reasoning
Venue
International Conference on Computer Vision, IEEE (2011)
Publication Year
2011
Authors
Jay Yagnik, Dennis Strelow, David Ross, Ruei-Sung Lin
Abstract
Rank correlation measures are known for their resilience to perturbations in
numeric values and are widely used in many evaluation metrics. Such ordinal
measures have rarely been applied in the treatment of numeric features as a
representational transformation. We emphasize the benefits of ordinal
representations of input features both theoretically and empirically. We present a
family of algorithms for computing ordinal embeddings based on partial order
statistics. Apart from having the stability benefits of ordinal measures, these
embeddings are highly nonlinear, giving rise to sparse feature spaces highly
favored by several machine learning methods. These embeddings are deterministic,
data-independent and, by virtue of being based on partial order statistics, add
another degree of resilience to noise. These machine-learning-free methods, when
applied to the task of fast similarity search, outperform state-of-the-art machine
learning methods with complex optimization setups. For solving classification
problems, the embeddings provide a nonlinear transformation resulting in sparse
binary codes that are well-suited for a large class of machine learning algorithms.
These methods show significant improvement on VOC 2010 using simple linear
classifiers, which can be trained quickly. Our method can be extended to the case of
polynomial kernels, while permitting very efficient computation. Further, since the
popular MinHash algorithm is a special case of our method, we demonstrate an
efficient scheme for computing MinHash on conjunctions of binary features. The
actual method can be implemented in about 10 lines of code in most languages (2
lines in MATLAB), and does not require any data-driven optimization.
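The abstract does not spell out the embedding itself, so the sketch below is only an illustration of the idea as described: a deterministic, data-independent code built from partial order statistics, where each output symbol records which of a few randomly chosen feature positions holds the largest value, rather than the numeric values themselves. Function and parameter names (ordinal_codes, num_codes, window) are assumptions for illustration, not the paper's API.

```python
import numpy as np

def ordinal_codes(x, num_codes=64, window=4, seed=0):
    """Minimal sketch of a partial-order (ordinal) embedding.

    Assumption-laden illustration: for each output code, a fixed random
    permutation picks `window` positions of x, and the code is the index
    (within that window) of the largest value. The permutations depend
    only on the seed, so the embedding is deterministic and data-independent.
    """
    rng = np.random.RandomState(seed)           # fixed permutations, no learning
    x = np.asarray(x)
    codes = np.empty(num_codes, dtype=np.int64)
    for i in range(num_codes):
        perm = rng.permutation(len(x))[:window]  # a small window of feature positions
        codes[i] = np.argmax(x[perm])            # record only which position wins
    return codes

# Small numeric perturbations rarely change which element of a window is
# largest, so the codes of a vector and its noisy copy agree on most positions.
a = np.array([0.9, 0.1, 0.4, 0.7, 0.3, 0.8])
b = a + 0.05 * np.random.RandomState(1).randn(6)
print((ordinal_codes(a) == ordinal_codes(b)).mean())
```

The sparse binary representation mentioned in the abstract would then follow by one-hot encoding each code, which is what makes the output convenient for linear classifiers; the exact encoding used in the paper is not given in this abstract.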
