Research in Machine Perception tackles the hard problems of understanding images, sounds, music and video, as well as providing more powerful tools for image capture, compression, processing, creative expression, and augmented reality.
Our technology powers products across Alphabet, including image understanding in Search and Google Photos, camera enhancements for the Pixel Phone, handwriting interfaces for Android, optical character recognition for Google Drive, video understanding and summarization for YouTube, Google Cloud, Google Photos and Nest, as well as mobile apps including Motion Stills, PhotoScan and Allo.
We actively contribute to the open source and research communities. Our pioneering deep learning advances, such as Inception and Batch Normalization, are available in TensorFlow. Further, we have released several large-scale datasets for machine learning, including: AudioSet (audio event detection); AVA (human action understanding in video); Open Images (image classification and object detection); and YouTube-8M (video labeling).