Learning Multiple Non-Linear Sub-Spaces using K-RBMs
Venue
Computer Vision and Pattern Recognition (2013)
Publication Year
2013
Authors
Siddhartha Chandra, Shailesh Kumar, C. V. Jawahar
Abstract
Understanding the nature of data is the key to building good representations. In
domains such as natural images, the data comes from very complex distributions
which are hard to capture. Feature learning aims to discover or approximate
these underlying distributions and to use that knowledge to discard irrelevant
information while preserving most of what is relevant. Feature learning can thus
be seen as a form of dimensionality reduction. In this paper, we describe a feature
learning scheme for natural images. We hypothesize that image patches do not all
come from the same distribution; instead, they lie in multiple non-linear subspaces. We
propose a framework that uses K-Restricted Boltzmann Machines (K-RBMs) to learn
multiple non-linear subspaces in the raw image space. Projections of the image
patches into these subspaces give us features, which we use to build image
representations. Our algorithm solves the coupled problem of finding the right
non-linear subspaces in the input space and associating image patches with those
subspaces in an iterative EM-like algorithm to minimize the overall reconstruction
error. Extensive empirical results over several popular image classification
datasets show that representations based on our framework outperform the
traditional feature representations such as the SIFT-based Bag-of-Words (BoW) and
convolutional deep belief networks.
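
The alternation described in the abstract (assign each patch to a subspace, then refit the subspaces) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes hard assignment of each patch to the RBM with the lowest squared reconstruction error, linear visible units, a mean-field CD-1 update without sampling, and a fixed number of iterations. The names `TinyRBM`, `fit_k_rbms`, and `features` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyRBM:
    """Toy RBM: linear visible units, sigmoid hidden units (an assumption)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible bias
        self.c = np.zeros(n_hidden)   # hidden bias

    def hidden(self, v):
        # Hidden activation probabilities: the non-linear projection of v.
        return sigmoid(v @ self.W + self.c)

    def reconstruct(self, v):
        return self.hidden(v) @ self.W.T + self.b

    def recon_error(self, v):
        # Squared reconstruction error per row of v.
        return np.sum((v - self.reconstruct(v)) ** 2, axis=1)

    def cd1_step(self, v0, lr):
        # One contrastive-divergence (CD-1) update using mean-field
        # probabilities instead of sampled binary states (a simplification).
        h0 = self.hidden(v0)
        v1 = h0 @ self.W.T + self.b
        h1 = self.hidden(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)

def fit_k_rbms(X, k, n_hidden, n_iters=10, lr=0.01, seed=0):
    """Alternate between updating each RBM and reassigning patches."""
    rng = np.random.default_rng(seed)
    rbms = [TinyRBM(X.shape[1], n_hidden, rng) for _ in range(k)]
    assign = rng.integers(0, k, size=X.shape[0])  # random initial clusters
    for _ in range(n_iters):
        # M-like step: update each RBM on the patches currently assigned to it.
        for j, rbm in enumerate(rbms):
            members = X[assign == j]
            if len(members) > 0:
                rbm.cd1_step(members, lr)
        # E-like step: move each patch to the RBM that reconstructs it best.
        errors = np.stack([rbm.recon_error(X) for rbm in rbms], axis=1)
        assign = errors.argmin(axis=1)
    return rbms, assign

def features(rbms, assign, X):
    # Feature of a patch = hidden activations of the RBM it is assigned to.
    return np.stack([rbms[j].hidden(x[None, :])[0] for j, x in zip(assign, X)])
```

A call such as `rbms, assign = fit_k_rbms(patches, k=3, n_hidden=4)` followed by `features(rbms, assign, patches)` would return one hidden-activation vector per patch, where `patches` is an array of flattened image patches with one patch per row. How the paper turns such per-patch features into full image representations is not specified in the abstract and is not covered by this sketch.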
