Vincent Vanhoucke

Vincent Vanhoucke is a Distinguished Scientist and Senior Director for Robotics at Google DeepMind. Before that, he led Google Brain's vision and perception research and the speech recognition quality team for Google Search by Voice. He holds a Ph.D. in Electrical Engineering from Stanford University and a Diplôme d'Ingénieur from École Centrale Paris.

Authored Publications

Robotic Table Tennis: A Case Study into a High Speed Learning System

David B. D'Ambrosio

Jon Abelian

Saminda Abeyruwan

Michael Ahn

Alex Bewley

Justin Boyd

Krzysztof Choromanski

Erwin Johan Coumans

Tianli Ding

Omar Escareno

Wenbo Gao

Laura Graesser

Atil Iscen

Navdeep Jaitly

Deepali Jain

Juhana Kangaspunta

Satoshi Kataoka

Gus Kouretas

Yuheng Kuang

Nevena Lazic

Corey Lynch

Reza Mahjourian

Sherry Moore

Thinh Nguyen

Ken Oslund

Barney J. Reed

Krista Reymann

Pannag Sanketi

Anish Shankar

Pierre Sermanet

Vikas Sindhwani

Avi Singh

Vincent Vanhoucke

Grace Vesom

Peng Xu

Robotics: Science and Systems (2023)

Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items

Anthony G. Francis

Brandon Kinman

Krista Ann Reymann

Laura Downs

Nathan Koenig

Ryan M. Hickman

Thomas B. McHugh

Vincent Olivier Vanhoucke

(2022)

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

Andy Zeng

Maria Attarian

Brian Ichter

Krzysztof Marcin Choromanski

Adrian Wong

Stefan Welker

Federico Tombari

Aveek Purohit

Michael Ryoo

Vikas Sindhwani

Johnny Chung Lee

Vincent Olivier Vanhoucke

Pete Florence

arXiv (2022)

Learning to Fold Real Garments with One Arm: A Case Study in Cloud-Based Robotics Research

Ryan Hoque

Kaushik Shivakumar

Shrey Aeron

Gabriel Deza

Aditya Ganapathi

Adrian Wong

Johnny Lee

Andy Zeng

Vincent Vanhoucke

Ken Goldberg

IEEE International Conference on Intelligent Robots and Systems (IROS) (2022) (to appear)

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

Alex Irpan

Alexander Herzog

Alexander Toshkov Toshev

Andy Zeng

Anthony Brohan

Brian Andrew Ichter

Byron David

Carolina Parada

Chelsea Finn

Clayton Tan

Diego Reyes

Dmitry Kalashnikov

Eric Victor Jang

Fei Xia

Jarek Liam Rettinghouse

Jasmine Chiehju Hsu

Jornell Lacanlale Quiambao

Julian Ibarz

Kanishka Rao

Karol Hausman

Keerthana Gopalakrishnan

Kuang-Huei Lee

Kyle Alan Jeffrey

Linda Luu

Mengyuan Yan

Michael Soogil Ahn

Nicolas Sievers

Nikhil J Joshi

Noah Brown

Omar Eduardo Escareno Cortes

Peng Xu

Peter Pastor Sampedro

Pierre Sermanet

Rosario Jauregui Ruano

Ryan Christopher Julian

Sally Augusta Jesmonth

Sergey Levine

Steve Xu

Ted Xiao

Vincent Olivier Vanhoucke

Yao Lu

Yevgen Chebotar

Yuheng Kuang

Conference on Robot Learning (CoRL) (2022)

Mechanical Search on Shelves using LAX-RAY: Lateral Access X-RAY

Huang Huang

Marcus Dominguez-Kuhne

Vishal Satish

Michael Danielczuk

Kate Sanders

Jeff Ichnowski

Andrew Lee

Anelia Angelova

Vincent Olivier Vanhoucke

Ken Goldberg

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2021)

Differentiable Mapping Networks: Learning Structured Map Representations for Sparse Visual Localization

Peter Karkus

Anelia Angelova

Vincent Vanhoucke

Rico Jonschkowski

International Conference on Robotics and Automation (ICRA) (2020)

X-Ray: Mechanical Search for an Occluded Object by Minimizing Support of Learned Occupancy Distributions

Michael Danielczuk

Anelia Angelova

Vincent Olivier Vanhoucke

Ken Goldberg

International Conference on Intelligent Robots and Systems (IROS) (2020)

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

Coline Manon Devin

Eric Jang

Sergey Levine

Vincent Vanhoucke

CoRL (2018)

Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping

Konstantinos Bousmalis

Alex Irpan

Paul Wohlhart

Yunfei Bai

Matthew Kelcey

Mrinal Kalakrishnan

Laura Downs

Julian Ibarz

Peter Pastor Sampedro

Kurt Konolige

Sergey Levine

Vincent Vanhoucke

ICRA (2018)

Sim-to-Real: Learning Agile Locomotion For Quadruped Robots

Jie Tan

Tingnan Zhang

Erwin Coumans

Atil Iscen

Yunfei Bai

Danijar Hafner

Steven Bohez

Vincent Vanhoucke

RSS (2018)

Classification of crystallization outcomes using deep convolutional neural networks

Andrew E. Bruno

Patrick Charbonneau

Janet Newman

Edward H. Snell

David Richard So

Vincent Vanhoucke

Christopher J. Watkins

Shawn Williams

Julie Wilson

PLOS One (2018)

Policies Modulating Trajectory Generators

Atil Iscen

Ken Caluwaerts

Jie Tan

Tingnan Zhang

Erwin Coumans

Vikas Sindhwani

Vincent Vanhoucke

2nd Annual Conference on Robot Learning, CoRL 2018, PMLR, pp. 916-926

QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation

Dmitry Kalashnikov

Alex Irpan

Peter Pastor Sampedro

Julian Ibarz

Alexander Herzog

Eric Jang

Deirdre Quillen

Ethan Holly

Mrinal Kalakrishnan

Vincent Vanhoucke

Sergey Levine

CoRL (2018)

TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow

Danijar Hafner

James Davidson

Vincent Vanhoucke

arXiv preprint arXiv:1709.02878 (2017)

YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Dataset for Object Detection in Video

Esteban Real

Jon Shlens

Stefano Mazzocchi

Vincent Vanhoucke

Xin Pan

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7464-7473

Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning

Christian Szegedy

Sergey Ioffe

Vincent Vanhoucke

Alex A. Alemi

ICLR 2016 Workshop

Rethinking the Inception Architecture for Computer Vision

Christian Szegedy

Vincent Vanhoucke

Sergey Ioffe

Jonathon Shlens

Zbigniew Wojna

Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (2016)

Pedestrian Detection with a Large-Field-Of-View Deep Network

Anelia Angelova

Alex Krizhevsky

Vincent Vanhoucke

Proceedings of ICRA 2015

Going Deeper with Convolutions

Christian Szegedy

Wei Liu

Yangqing Jia

Pierre Sermanet

Scott Reed

Dragomir Anguelov

Dumitru Erhan

Vincent Vanhoucke

Andrew Rabinovich

Computer Vision and Pattern Recognition (CVPR) (2015)

Real-Time Pedestrian Detection With Deep Network Cascades

Anelia Angelova

Alex Krizhevsky

Vincent Vanhoucke

Abhijit Ogale

Dave Ferguson

Proceedings of BMVC 2015

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Martín Abadi

Ashish Agarwal

Paul Barham

Eugene Brevdo

Zhifeng Chen

Craig Citro

Greg Corrado

Andy Davis

Jeffrey Dean

Matthieu Devin

Sanjay Ghemawat

Ian Goodfellow

Andrew Harp

Geoffrey Irving

Michael Isard

Yangqing Jia

Rafal Jozefowicz

Lukasz Kaiser

Manjunath Kudlur

Josh Levenberg

Dan Mané

Rajat Monga

Sherry Moore

Derek Murray

Chris Olah

Mike Schuster

Jonathon Shlens

Benoit Steiner

Ilya Sutskever

Kunal Talwar

Paul Tucker

Vincent Vanhoucke

Vijay Vasudevan

Fernanda Viégas

Oriol Vinyals

Pete Warden

Martin Wattenberg

Martin Wicke

Yuan Yu

Xiaoqiang Zheng

tensorflow.org (2015)

Autoregressive Product of Multi-frame Predictions Can Improve the Accuracy of Hybrid Models

Navdeep Jaitly

Vincent Vanhoucke

Geoffrey Hinton

Proceedings of Interspeech 2014

Asynchronous Stochastic Optimization for Sequence Training of Deep Neural Networks

Georg Heigold

Erik McDermott

Vincent Vanhoucke

Andrew Senior

Michiel Bacchiani

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Firenze, Italy (2014)

On Rectified Linear Units For Speech Processing

M.D. Zeiler

M. Ranzato

R. Monga

M. Mao

K. Yang

Q.V. Le

P. Nguyen

A. Senior

V. Vanhoucke

J. Dean

G.E. Hinton

38th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver (2013)

Multiframe Deep Neural Networks for Acoustic Modeling

Vincent Vanhoucke

Matthieu Devin

Georg Heigold

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Vancouver, CA (2013)

Multilingual acoustic models using distributed deep neural networks

Georg Heigold

Vincent Vanhoucke

Andrew Senior

Patrick Nguyen

Marc'aurelio Ranzato

Matthieu Devin

Jeff Dean

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Vancouver, CA (2013)

Investigations on Exemplar-Based Features for Speech Recognition Towards Thousands of Hours of Unsupervised, Noisy Data

Georg Heigold

Patrick Nguyen

Mitchel Weintraub

Vincent Vanhoucke

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Kyoto, Japan (2012), pp. 4437-4440

Deep Neural Networks for Acoustic Modeling in Speech Recognition

Geoffrey Hinton

Li Deng

Dong Yu

George Dahl

Abdel-rahman Mohamed

Navdeep Jaitly

Andrew Senior

Vincent Vanhoucke

Patrick Nguyen

Tara Sainath

Brian Kingsbury

Signal Processing Magazine (2012)

Application Of Pretrained Deep Neural Networks To Large Vocabulary Speech Recognition

Navdeep Jaitly

Patrick Nguyen

Andrew Senior

Vincent Vanhoucke

Proceedings of Interspeech 2012

Improving the speed of neural networks on CPUs

Vincent Vanhoucke

Andrew Senior

Mark Z. Mao

Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011

Unsupervised Discovery and Training of Maximally Dissimilar Cluster Models

Francoise Beaufays

Vincent Vanhoucke

Brian Strope

Proc Interspeech (2010)

Reading Text in Consumer Digital Photographs

Vincent Vanhoucke

S. Burak Gokturk

Proceedings of SPIE DRR XIV (2007)

Automatic Training Set Segmentation For Multi-Pass Speech Recognition

Mark Z. Mao

Vincent Vanhoucke

Brian Strope

Proceedings of ICASSP 2005

Confidence Scoring and Rejection using Multi-Pass Speech Recognition

Vincent Vanhoucke

Proceedings of Interspeech 2005

Mixtures of Inverse Covariances

Vincent Vanhoucke

Ananth Sankar

IEEE Transactions on Speech and Audio Processing, vol. 13 (2004), pp. 250-264

Design of Compact Acoustic Models through Clustering of Tied-Covariance Gaussians

Mark Z. Mao

Vincent Vanhoucke

Proceedings of ICSLP 2004

Variable Length Mixtures of Inverse Covariances

Vincent Vanhoucke

Ananth Sankar

Proceedings of Eurospeech 2003

Interpretability in Multidimensional Classification

Vincent Vanhoucke

Rosaria Silipo

Interpretability Issues in Fuzzy Modeling, Springer-Verlag (2003), pp. 193-217

Mixtures of Inverse Covariances: Covariance Modeling for Gaussian Mixtures with Applications to Automatic Speech Recognition

Vincent Vanhoucke

Ph.D. Thesis, Stanford University (2003)

Mixtures of Inverse Covariances

Vincent Vanhoucke

Ananth Sankar

Proceedings of ICASSP 2003, also in Proceedings of NNSP 2003

Effects of Prompt Style when Navigating through Structured Data

Vincent Vanhoucke

W. Lawrence Neeley

Maria Mortati

Michael J. Sloan

Clifford Nass

Proceedings of INTERACT 2001, Eighth IFIP TC.13 Conference on Human Computer Interaction, pp. 530-536

Speaker-Trained Recognition using Allophonic Enrollment Models

Vincent Vanhoucke

Michael M. Hochberg

Christopher J. Leggetter

Proceedings of ASRU 2001
