Recognition of Complex Events: Exploiting Temporal Dynamics between Underlying Concepts
Venue
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014)
Publication Year
2014
Authors
Subhabrata Bhattacharya, Mahdi M. Kalayeh, Rahul Sukthankar, Mubarak Shah
Abstract
While approaches based on bags of features excel at low-level action
classification, they are ill-suited for recognizing complex events in video, where
concept-based temporal representations currently dominate. This paper proposes a
novel representation that captures the temporal dynamics of windowed mid-level
concept detectors in order to improve complex event recognition. We first express
each video as an ordered vector time series, where each time step consists of the
vector formed from the concatenated confidences of the pre-trained concept
detectors. We hypothesize that the dynamics of time series for different instances
from the same event class, as captured by simple linear dynamical system (LDS)
models, are likely to be similar even if the instances differ in terms of low-level
visual features. We propose a two-part representation formed by fusing: (1) a
singular value decomposition of block Hankel matrices (SSID-S) and (2) a harmonic
signature (H-S) computed from the corresponding eigen-dynamics matrix. The proposed
method offers several benefits over alternative approaches: it is straightforward to
implement, directly employs existing concept detectors, and can be plugged into
linear classification frameworks. Results on standard datasets such as NIST's
TRECVID Multimedia Event Detection task demonstrate the improved accuracy of the
proposed method.
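
To make the SSID-S part of the abstract concrete, here is a minimal sketch, assuming
each video arrives as a list of per-window concept-detector confidence vectors. The
function and parameter names (concept_time_series, ssid_signature, block_rows, k) are
hypothetical illustrations, not taken from the paper.

    import numpy as np

    def concept_time_series(window_confidences):
        """Stack per-window concept-detector confidences into a (T, d) array."""
        return np.asarray(window_confidences, dtype=float)

    def ssid_signature(series, block_rows=3, k=20):
        """Sketch of an SVD-of-block-Hankel (SSID-S style) descriptor.

        Column j of the Hankel matrix stacks block_rows consecutive concept
        vectors starting at window j; the leading singular values summarise
        the subspace spanned by the windowed dynamics.
        """
        T, d = series.shape
        cols = T - block_rows + 1                 # requires T >= block_rows
        hankel = np.empty((block_rows * d, cols))
        for j in range(cols):
            hankel[:, j] = series[j:j + block_rows].reshape(-1)
        singular_values = np.linalg.svd(hankel, compute_uv=False)
        sig = np.zeros(k)
        n = min(k, singular_values.size)
        sig[:n] = singular_values[:n]
        # Normalise so videos of different lengths stay comparable (assumption).
        return sig / (np.linalg.norm(sig) + 1e-12)

    # Example on synthetic confidences: 40 windows, 60 concept detectors (made up).
    rng = np.random.default_rng(0)
    ssid_s = ssid_signature(concept_time_series(rng.random((40, 60))))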

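The harmonic signature part can be sketched in the same spirit. The abstract only
states that H-S is computed from the eigen-dynamics matrix of an LDS fit, so the
sketch below assumes a simple pipeline (PCA for hidden states, a least-squares
transition estimate, and eigenvalue magnitudes and phases as the signature); the
names harmonic_signature, n_states, and k are hypothetical.

    import numpy as np

    def harmonic_signature(series, n_states=5, k=5):
        """Sketch of a harmonic signature from an estimated LDS transition matrix.

        Assumed pipeline (not necessarily the authors' exact formulation):
        project the concept time series onto its top principal directions to
        obtain hidden states, fit x_{t+1} ~ A x_t by least squares, and describe
        the dynamics by the magnitudes and phases of the eigenvalues of A.
        """
        series = series - series.mean(axis=0, keepdims=True)
        # PCA via SVD: right singular vectors give an n_states-dim state trajectory.
        _, _, Vt = np.linalg.svd(series, full_matrices=False)
        states = series @ Vt[:n_states].T                  # shape (T, n_states)
        X_past, X_next = states[:-1], states[1:]
        # Least-squares estimate of the state-transition (eigen-dynamics) matrix A.
        W, *_ = np.linalg.lstsq(X_past, X_next, rcond=None)
        A = W.T                                            # x_{t+1} ~ A @ x_t
        eigvals = np.linalg.eigvals(A)
        order = np.argsort(-np.abs(eigvals))[:k]
        # Signature: decay rates |lambda| and oscillation phases angle(lambda).
        return np.concatenate([np.abs(eigvals[order]), np.angle(eigvals[order])])

    # As the abstract suggests, the SSID-S and H-S vectors can then be concatenated
    # per video and fed to any linear classifier.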