Publication Data
Finding Meaning on YouTube: Tag Recommendation and Category Discovery
Abstract: We present a system that automatically recommends tags for
YouTube videos solely based on their audiovisual content. We also propose a novel
framework for unsupervised discovery of video categories that exploits knowledge mined
from the World-Wide Web text documents/searches. First, video content to tag
association is learned by training classifiers that map audiovisual content-based
features from millions of videos on YouTube.com to existing uploader-supplied tags for
these videos. When a new video is uploaded, the labels provided by these classifiers
are used to automatically suggest tags deemed relevant to the video. Our system has
learned a vocabulary of over 20,000 tags. Secondly, we mined large volumes of Web pages
and search queries to discover a set of possible text entity categories and a set of
associated is-A relationships that map individual text entities to categories. Finally,
we apply these is-A relationships mined from web text on the tags learned from
audiovisual content of videos to automatically synthesize a reliable set of categories
most relevant to videos -- along with a mechanism to predict these categories for new
uploads. We then present rigorous rating studies that establish that: (a) the average
relevance of tags automatically recommended by our system matches the average relevance
of the uploader-supplied tags at the same or better coverage and (b) the average
precision@K of video categories discovered by our system is 70% with K=5.
