Discovering Groups of People in Images
Venue
European Conference on Computer Vision (ECCV) (2014)
Publication Year
2014
Authors
Wongun Choi, Yu-Wei Chao, Caroline Pantofaru, Silvio Savarese
Abstract
Understanding group activities from images is an important yet challenging task.
This is because there is an exponentially large number of semantic and geometrical
relationships among individuals that one must model in order to effectively
recognize and localize the group activities. Rather than focusing on directly
recognizing group activities, as most previous work does, we advocate the
importance of introducing an intermediate representation for modeling groups of
humans, which we call structured groups. Such groups define the way people spatially
interact with each other. People might be facing each other to talk, while others
sit on a bench side by side, and some might stand alone. In this paper we
contribute a method for identifying and localizing these structured groups in a
single image despite their varying viewpoints, number of participants, and
occlusions. We propose to learn an ensemble of discriminative interaction patterns
to encode the relationships between people in 3D and introduce a novel efficient
iterative augmentation algorithm for solving this complex inference problem. A nice
byproduct of the inference scheme is an approximate 3D layout estimate of the
structured groups in the scene. Finally, we contribute an extremely challenging new
dataset of images, each showing multiple people performing multiple
activities. Extensive evaluation confirms our theoretical findings.
