Download
We offer the YouTube8M dataset for download as TensorFlow Record files. We provide downloader
script that fetches the dataset in shards and stores them in the current directory (output of
pwd). It can be restarted if the connection drops. In which case, it
only downloads shards that haven't been downloaded yet. We also provide html index pages
listing all shards, if you'd like to manually download them.
There are two versions of the features: frame-level and video-level features.
The dataset is made available by Google Inc. under a
Creative Commons
Attribution 4.0 International (CC BY 4.0) license.
Starter Code
Starter code for the dataset can be found on our
GitHub page.
In addition to training code, you will also find python scripts for evaluating standard metrics for
comparisons between models.
Frame-level features dataset
Frame-level features are stored as
tensorflow.SequenceExample protocol buffers. A tensorflow.SequenceExample proto is reproduced here in text format:
context: {
feature: {
key : "video_id"
value: {
bytes_list: {
value: [""] # Empty string
}
}
}
feature: {
key : "labels"
value: {
int64_list: {
value: [1, 522, 11, 172] # The meaning of the labels can be found here.
}
}
}
}
feature_lists: {
feature_list: {
key : "rgb"
value: {
feature: {
bytes_list: {
value: [1024 8bit quantized features]
}
}
feature: {
bytes_list: {
value: [1024 8bit quantized features]
}
}
... # Repeated for every second of the video, up to 300
}
feature_list: {
key : "audio"
value: {
feature: {
bytes_list: {
value: [128 8bit quantized features]
}
}
feature: {
bytes_list: {
value: [128 8bit quantized features]
}
}
}
... # Repeated for every second of the video, up to 300
}
}
The total size of the frame-level features is 1.71 Terabytes.
You can download the 2017 Dataset
here.
Video-level features dataset
Video-level features are stored as
tensorflow.Example protocol buffers. A tensorflow.Example proto is reproduced here in text format:
features: {
feature: {
key : "video_id"
value: {
bytes_list: {
value: [""] # Empty string
}
}
}
feature: {
key : "labels"
value: {
int64_list: {
value: [1, 522, 11, 172] # The meaning of the labels can be found here.
}
}
}
feature: {
key : "mean_rgb" # Average of all 'rgb' features for the video
value: {
float_list: {
value: [1024 float features]
}
}
}
feature: {
key : "mean_audio" # Average of all 'audio' features for the video
value: {
float_list: {
value: [128 float features]
}
}
}
}
The total size of the video-level features is 31 Gigabytes.
You can download the 2017 Dataset
here.