Indoor Scene Understanding with Geometric and Semantic Contexts
Venue
International Journal of Computer Vision (IJCV) (2014)
Publication Year
2014
Authors
Wongun Choi, Yu-Wei Chao, Caroline Pantofaru, Silvio Savarese
Abstract
Truly understanding a scene involves integrating information at multiple levels as
well as studying the interactions between scene elements. Individual object
detectors, layout estimators and scene classifiers are powerful but ultimately
confounded by complicated real-world scenes with high variability, different
viewpoints and occlusions. We propose a method that can automatically learn the
interactions among scene elements and apply them to the holistic understanding of
indoor scenes from a single image. This interpretation is performed within a
hierarchical interaction model which describes an image by a parse graph, thereby
fusing together object detection, layout estimation and scene classification. At
the root of the parse graph is the scene type and layout while the leaves are the
individual detections of objects. In between is the core of the system, our 3D
Geometric Phrases (3DGP). We conduct extensive experimental evaluations on
single-image 3D scene understanding using both 2D and 3D metrics. The results demonstrate
that our model with 3DGPs can provide robust estimation of scene type, 3D space,
and 3D objects by leveraging the contextual relationships among the visual
elements.
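
The parse graph described above can be pictured as a small tree: scene type and layout at the root, 3DGPs in the middle, and object detections at the leaves. The following is only a minimal sketch of that structure, not the authors' implementation. All class names, field names, and example values (Detection, GeometricPhrase, ParseGraph, the scores, the "table-with-chairs" phrase) are hypothetical, and letting ungrouped detections attach directly to the root is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Detection:
    """Leaf node: one detected object (hypothetical fields)."""
    label: str
    score: float


@dataclass
class GeometricPhrase:
    """Intermediate node: a group of objects treated as one 3D arrangement."""
    name: str
    parts: List[Detection] = field(default_factory=list)


@dataclass
class ParseGraph:
    """Root node: scene type and layout, with phrases and detections below."""
    scene_type: str
    layout: str
    phrases: List[GeometricPhrase] = field(default_factory=list)
    # Assumption: detections not explained by any phrase hang off the root.
    objects: List[Detection] = field(default_factory=list)

    def leaves(self) -> List[Detection]:
        """Collect every detection, grouped ones first, then standalone ones."""
        grouped = [d for p in self.phrases for d in p.parts]
        return grouped + self.objects


# Illustrative values only; none of these come from the paper's experiments.
graph = ParseGraph(
    scene_type="dining room",
    layout="layout hypothesis 0",
    phrases=[
        GeometricPhrase(
            "table-with-chairs",
            [Detection("table", 0.9), Detection("chair", 0.7)],
        )
    ],
    objects=[Detection("sofa", 0.4)],
)
leaf_labels = [d.label for d in graph.leaves()]
```

In this sketch, interpreting an image would amount to choosing the scene type, layout, and grouping of detections that best explain it; the scoring and inference steps are not shown because the abstract does not specify them.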
