Many recent breakthroughs in machine learning and machine perception have come from the availability of large labeled datasets, such as ImageNet, which has millions of images labeled with thousands of classes and has significantly accelerated research in image understanding. Google recently announced the YouTube-8M dataset, which spans millions of videos labeled with thousands of classes, and we hope it will spur similar innovation and advancement in video understanding. YouTube-8M represents a cross-section of our society, and was designed with scale and diversity in mind so that lessons we learn on this dataset can transfer to all areas of our lives, from learning, to communication, to entertainment. It covers over 20 broad domains of video content, including entertainment, sports, commerce, hobbies, science, news, jobs & education, and health.
We are excited to announce the CVPR 2017 Workshop on YouTube-8M Large-Scale Video Understanding, to be held July 26, 2017, at the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017) in Honolulu, Hawaii. We invite researchers to participate in a large-scale video classification challenge and to report their results at this workshop, as well as to submit papers describing research, experiments, or applications based on YouTube-8M. The classification challenge will be hosted as a kaggle.com competition, sponsored by Google Cloud, and will feature a $100,000 prize pool for the top performers (details here). In order to enable wider participation in the competition, Google Cloud is also offering limited compute credits so participants can optionally do model training and exploration using the Google Cloud Machine Learning platform (this is for the convenience of participants and not a requirement for participation).
Time | Content | Presenter | |
9:00 - 9:05 | Opening Remarks | Paul Natsev | |
9:05 - 9:30 | Overview of YouTube-8M Dataset, Challenge | Challenge Organizers | |
Session 1 | |||
9:30 - 10:00 | Invited Talk 1: Video understanding: what we understood and what we still need to learn | Alex Hauptmann | |
10:00 - 10:30 | Invited Talk 2: Structured Models for Human Action Recognition | Cordelia Schmid | |
10:30 - 10:45 | Coffee Break | ||
Session 2 | |||
10:45 - 12:00 | Oral Session 1 | ||
12:00 - 1:00 | Lunch on your own | ||
Session 3 | |||
1:00 - 1:30 | Invited Talk 3: Learning from Synthetic Humans | Ivan Laptev | |
1:30 - 2:00 | Invited Talk 4: Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos | Mubarak Shah | |
2:00 - 2:30 | YouTube-8M Classification Challenge Summary, Organizers' Lightning Talks | Challenge Organizers | |
2:30 - 3:30 | Poster Session | Participants | |
3:30 - 3:45 | Coffee Break | ||
Session 4 | |||
3:45 - 5:00 | Oral Session 2 | ||
5:00 - 5:20 | Closing and Award Ceremony | Paul Natsev |
This track will be organized as a Kaggle competition for large-scale video classification based on the YouTube-8M dataset. Researchers are invited to participate in the classification challenge by training a model on the public YouTube-8M training and validation sets and submitting video classification results on a blind test set. Open-source TensorFlow code implementing a few baseline classification models for YouTube-8M, along with training and evaluation scripts, is available on GitHub. For details on getting started with local or cloud-based training, please see our README and the getting started guide on Kaggle. Results will be scored by a Kaggle evaluation server and published on a public leaderboard, updated live for all submissions (scored on a portion of the test set), along with a final (private) leaderboard, published after the competition is closed (scored on the rest of the test set). Authors of top-ranking submissions on the challenge leaderboard will be invited to the workshop to present their methods in an oral talk. Please see details on the Kaggle competition page.
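The public/private leaderboard split described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the hash-based partition, the 50% public fraction, and the top-1 accuracy metric are hypothetical stand-ins, not the partition or evaluation metric actually used by the Kaggle evaluation server.

```python
import hashlib


def is_public(video_id: str, public_fraction: float = 0.5) -> bool:
    """Deterministically assign a test video to the public partition.

    Hypothetical scheme: hash the ID to a number in [0, 1) and compare it
    with the public fraction. The real split is not described in this page.
    """
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < public_fraction


def top1_accuracy(predictions: dict, truth: dict, ids: list) -> float:
    """Fraction of the given IDs whose predicted label matches the truth.

    Stand-in metric only; missing predictions count as errors.
    """
    if not ids:
        return 0.0
    hits = sum(predictions.get(v) == truth[v] for v in ids)
    return hits / len(ids)


def leaderboard_scores(predictions: dict, truth: dict,
                       public_fraction: float = 0.5) -> tuple:
    """Score one submission separately on the public and private partitions."""
    public_ids = [v for v in truth if is_public(v, public_fraction)]
    private_ids = [v for v in truth if not is_public(v, public_fraction)]
    return (top1_accuracy(predictions, truth, public_ids),
            top1_accuracy(predictions, truth, private_ids))
```

In this sketch the public score would be shown while the competition is open, and the private score, computed on the disjoint remainder of the test set, would only be revealed after it closes, so that repeated submissions cannot be tuned against the final ranking.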
We encourage participants to explore the following topics (non-exhaustive list) and to submit papers to this workshop discussing their approaches and result analysis (publication is also a requirement for prize eligibility on the Kaggle competition):
Researchers are invited to submit any papers involving research, experimentation, or applications on the YouTube-8M dataset. Paper submissions will be reviewed by the workshop organizers and accepted papers will be invited for oral or poster presentations at the workshop.
We encourage participants to explore any relevant topics of interest using the YouTube-8M dataset, including but not limited to:
Submission to this track does not require participation in the challenge task, but must be related to the YouTube-8M dataset. We welcome new applications that we didn't think of! Paper submissions are expected to have 4 to 8 pages (no strict page limit) in the CVPR formatting style. Demo paper submissions are also welcome.
Google Cloud sponsors awards for the top-performing challenge participants, who agree to:
Note that publication and open-sourcing are not required to participate in the challenge. We welcome all participation, and will score and rank all submissions, regardless of how they are generated or whether they are published. However, only submissions that meet the above requirements will be eligible for award recognition and cash prizes.
The total prize pool for this competition is $100,000. For more details on prizes and eligibility, refer to the Kaggle competition pages.
Congratulations to the winners!
All submissions will be handled electronically; we request a publicly available URL where we can access the paper. We recommend uploading your paper to arXiv, but other paper hosting arrangements are acceptable (e.g., a technical report at your institution, your own website, etc.). There is no strict limit on the number of pages; we recommend 4 to 8 pages, in the CVPR formatting style. Supplementary material will not be reviewed or considered. Please refer to the files in the Author Guidelines page at the CVPR 2017 website for formatting instructions.
Submitted papers will be reviewed by the organizing committee members, and a subset will be selected for oral or poster presentation. Submissions will be evaluated in terms of potential impact (e.g. performance on the classification challenge), technical depth & scalability, novelty, and presentation.
We do not require blind submissions; author names and affiliations may be shown. We do not restrict submissions of relevant work that is under review or will be published elsewhere. Previously published work is also acceptable as long as it is retargeted towards YouTube-8M. The accepted papers will be linked on the workshop website and will not appear in the official CVPR proceedings.
Challenge Submissions Deadline | June 2, 2017 |
Paper Submission and Open-Sourcing Deadline | June 28, 2017 (Extended) |
Paper Acceptance & Awards Notification | June 30, 2017 |
Paper Camera-Ready Deadline | July 14, 2017 |
Workshop date (co-located with CVPR'17) | July 26, 2017 |
All deadlines are at 11:59 PM UTC/GMT.
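Because every deadline falls at 11:59 PM UTC, the local cut-off can land on a different clock time, or even a different calendar day, depending on where you are. The following is a minimal sketch for converting the deadlines listed above to a fixed UTC offset. The helper names (`deadline_utc`, `to_offset`) are hypothetical, and the UTC-10 offset for Honolulu is used only as an example.

```python
from datetime import datetime, timedelta, timezone

# Dates copied from the table above; each deadline is at 23:59 UTC.
DEADLINES = {
    "Challenge submissions": (2017, 6, 2),
    "Paper submission and open-sourcing": (2017, 6, 28),
    "Paper camera-ready": (2017, 7, 14),
}

# Example offset: Hawaii Standard Time is UTC-10 and has no daylight saving.
HONOLULU_OFFSET_HOURS = -10


def deadline_utc(year: int, month: int, day: int) -> datetime:
    """Return the 23:59 UTC deadline on the given date as an aware datetime."""
    return datetime(year, month, day, 23, 59, tzinfo=timezone.utc)


def to_offset(moment: datetime, hours: float) -> datetime:
    """Express an aware datetime in a fixed offset from UTC."""
    return moment.astimezone(timezone(timedelta(hours=hours)))


if __name__ == "__main__":
    for name, ymd in DEADLINES.items():
        local = to_offset(deadline_utc(*ymd), HONOLULU_OFFSET_HOURS)
        print(f"{name}: {local:%Y-%m-%d %H:%M} (UTC-10)")
```

A fixed offset keeps the sketch free of time zone database dependencies; for a location that observes daylight saving time, substitute the offset in effect on the deadline date.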
Apostol (Paul) Natsev |
Rahul Sukthankar |
Joonseok Lee |
George Toderici |
Sami Abu-El-Haija |
Anja Hauth |
Nisarg Kothari |
Hanhan Li |
Sobhan Naderi Parizi |
Balakrishnan Varadarajan |
Sudheendra Vijayanarasimhan |
Jiang Wang |