Discriminative Segment Annotation in Weakly Labeled Video
Venue
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013)
Publication Year
2013
Authors
Kevin Tang, Rahul Sukthankar, Jay Yagnik, Li Fei-Fei
BibTeX
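A minimal entry assembled from the metadata on this page (the citation key is an assumption):

@inproceedings{tang2013discriminative,
  title     = {Discriminative Segment Annotation in Weakly Labeled Video},
  author    = {Tang, Kevin and Sukthankar, Rahul and Yagnik, Jay and Fei-Fei, Li},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2013}
}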
Abstract
This paper tackles the problem of segment annotation in complex Internet videos. Given a
weakly labeled video, we automatically generate spatiotemporal masks for each of
the concepts with which it is labeled. This is a particularly relevant problem in
the video domain, as large numbers of YouTube videos are now available, tagged with
the visual concepts that they contain. Given such weakly labeled videos, we focus
on the problem of spatiotemporal segment classification. We propose a
straightforward algorithm, CRANE, that utilizes large amounts of weakly labeled
video to rank spatiotemporal segments by the likelihood that they correspond to a
given visual concept. We make publicly available segment-level annotations for a
subset of the Prest et al. dataset and show convincing results. We also show
state-of-the-art results on Hartmann et al.'s more difficult, large-scale object
segmentation dataset.
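For intuition, below is a minimal sketch of one way segments from weakly labeled positive videos could be ranked using negative evidence from videos that lack the concept. It assumes each spatiotemporal segment has already been reduced to a fixed-length feature vector; the nearest-neighbor penalty and the exponential weighting are illustrative assumptions, not the paper's exact formulation of CRANE.

import numpy as np

def rank_segments_by_negative_evidence(pos_feats, neg_feats):
    """Rank segments from positively labeled videos by how little negative
    evidence they receive from segments of negative videos.

    pos_feats: (P, D) array of segment features from videos weakly labeled
               with the concept.
    neg_feats: (N, D) array of segment features from videos known not to
               contain the concept.

    Returns indices of positive segments, most to least likely to belong
    to the concept.
    """
    penalties = np.zeros(len(pos_feats))
    for z in neg_feats:
        # Each negative segment penalizes only its nearest positive segment;
        # closer matches contribute stronger negative evidence.
        d = np.linalg.norm(pos_feats - z, axis=1)
        nearest = np.argmin(d)
        penalties[nearest] += np.exp(-d[nearest])
    # Segments that accumulate the least negative evidence rank highest.
    return np.argsort(penalties)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.normal(size=(100, 64))  # segments from videos tagged with the concept
    neg = rng.normal(size=(200, 64))  # segments from videos without the concept
    print(rank_segments_by_negative_evidence(pos, neg)[:10])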
