Jeffrey Dean
Co-Authors
- Alexander Lloyd
- Andrei Z. Broder
- Andrew Rabinovich
- Andrew W. Senior
- Andy Davis
- Benoit Steiner
- Boulos Harb
- Chris Frost
- Christopher Olah
- Ciprian Chelba
- Craig Chambers
- Craig Citro
- Dale Woodford
- Derek G. Murray
- Dumitru Erhan
- Eugene Brevdo
- Eugene Ie
- Fay W. Chang
- Fernanda Viegas
- Geoffrey E. Hinton
- Geoffrey Irving
- Georg Heigold
- Greg Corrado
- James C. Corbett
- Jianmin Chen
- Jonathon Shlens
- Josh Levenberg
- Kai Chen
- Ke Yang
- Kunal Talwar
- Luiz André Barroso
- Lukasz Kaiser
- Manjunath Kudlur
- Martin Wattenberg
- Martin Wicke
- Martín Abadi
- Matthieu Devin
- Michael Burrows
- Michael Isard
- Mike Schuster
- Mohammad Norouzi
- Oriol Vinyals
- Paul A. Tucker
- Paul Barham
- Peng Xu
- Pete Warden
- Quoc V. Le
- Rajat Monga
- Robert E. Gruber
- Samy Bengio
- Sanjay Ghemawat
- Sean Quinlan
- Sherry Moore
- Shun-Tak Leung
- Tushar Deepak Chandra
- Urs Hölzle
- Vijay Vasudevan
- Vincent Vanhoucke
- Wilson C. Hsieh
- Xiaoqiang Zheng
- Yangqing Jia
- Yoram Singer
- Yuan Yu
- Zhifeng Chen
- The design and implementation of the initial version of Google's advertising serving system.
- The design and implementation of five generations of our crawling, indexing, and query serving systems, covering two and three orders of magnitude growth in number of documents searched, number of queries handled per second, and frequency of updates to the system. I recently gave a talk at WSDM'09 about some of the issues involved in building large-scale retrieval systems (slides).
- The initial development of Google's AdSense for Content product (involving both the production serving system design and implementation as well as work on developing and improving the quality of ad selection based on the contents of pages).
- The development of Protocol Buffers, a way of encoding structured data in an efficient yet extensible format, and a compiler that generates convenient wrappers for manipulating the objects in a variety of languages. Protocol Buffers are used extensively at Google for almost all RPC protocols, and for storing structured information in a variety of persistent storage systems. A version of the protocol buffer implementation has been open-sourced and is available at http://code.google.com/p/protobuf/.
- Some of the initial production serving system work for the Google News product, working with Krishna Bharat to move the prototype system he put together into a deployed system.
- Some aspects of our search ranking algorithms, notably improved handling of off-page signals such as anchor text.
- The design and implementation of the first generation of our automated job scheduling system for managing a cluster of machines.
- The design and implementation of prototyping infrastructure for rapid development and experimentation with new ranking algorithms.
- The design and implementation of MapReduce, a system for simplifying the development of large-scale data processing applications. A paper about MapReduce appeared in OSDI'04.
- The design and implementation of BigTable, a large-scale semi-structured storage system used underneath a number of Google products. A paper about BigTable appeared in OSDI'06.
- Some of the production system design for Google Translate, our statistical machine translation system. In particular, I designed and implemented a system for distributed high-speed access to very large language models (too large to fit in memory on a single machine).
- Some internal tools to make it easy to rapidly search our internal source code repository. Many of the ideas from this internal tool were incorporated into our Google Code Search product, including the ability to use regular expressions for searching large corpora of source code.
- The design and implementation of two generations of systems for large-scale training and deployment of deep learning models: DistBelief, and TensorFlow. TensorFlow is now an open source project, hosted on GitHub.
I received a Ph.D. in Computer Science from the University of Washington in 1996, working with Craig Chambers on whole-program optimization techniques for object-oriented languages. I received a B.S., summa cum laude, in Computer Science & Economics from the University of Minnesota in 1990. From 1996 to 1999, I worked for Digital Equipment Corporation's Western Research Lab in Palo Alto, where I worked on low-overhead profiling tools, the design of profiling hardware for out-of-order microprocessors, and web-based information retrieval. From 1990 to 1991, I worked for the World Health Organization's Global Programme on AIDS, developing software for statistical modelling, forecasting, and analysis of the HIV pandemic.
In 2009, I was elected to the National Academy of Engineering, and I was also named a Fellow of the Association for Computing Machinery (ACM) and a Fellow of the American Association for the Advancement of Science (AAAS).
Selected slides from talks:
- Berkeley AMPLab Cloud Seminar talk, March, 2012: Achieving Rapid Response Times in Large Online Services
- Stanford Computer Science Department Distinguished Computer Scientist Lecture, November, 2010: Building Software Systems at Google and Lessons Learned
- Symposium on Cloud Computing (SOCC) keynote, June, 2010: Evolution and Future Directions of Large-scale Storage and Computation Systems at Google
- Web Search and Data Mining Conference (WSDM) keynote, February, 2009: Challenges in Building Large-Scale Information Retrieval Systems
- Google Faculty Summit talk, July, 2008: Some Potential Areas for Future Research
- Stanford CS295 class lecture, Spring, 2007: Software Engineering Advice from Building Large-Scale Distributed Systems
Personal:
I've lived in lots of places in my life: Honolulu, HI; Manila, the Philippines; Boston, MA; West Nile District, Uganda; Boston (again); Little Rock, AR; Hawaii (again); Minneapolis, MN; Mogadishu, Somalia; Atlanta, GA; Minneapolis (again); Geneva, Switzerland; Seattle, WA; and (currently) Palo Alto, CA. I'm hard-pressed to pick a favorite, though: each place has its plusses and minuses.
One of my life goals is to play soccer and basketball on every continent. So far, I've done so in North America, South America, Europe, Asia, and Africa. I'm worried that Antarctica might be tough, though.
Google Publications
-
Large-Scale Deep Learning For Building Intelligent Computer Systems
WSDM (2016), pp. 1
-
TensorFlow: A system for large-scale machine learning
Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng
Google Brain (2016)
-
The Beckman report on database research
Daniel Abadi, Rakesh Agrawal, Anastasia Ailamaki, Magdalena Balazinska, Philip A. Bernstein, Michael J. Carey, Surajit Chaudhuri, Jeffrey Dean, AnHai Doan, Michael J. Franklin, Johannes Gehrke, Laura M. Haas, Alon Y. Halevy, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Donald Kossmann, Samuel Madden, Sharad Mehrotra, Tova Milo, Jeffrey F. Naughton, Raghu Ramakrishnan, Volker Markl, Christopher Olston, Beng Chin Ooi, Christopher Ré, Dan Suciu, Michael Stonebraker, Todd Walter, Jennifer Widom
Commun. ACM, vol. 59 (2016), pp. 92-99
-
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, Xiaoqiang Zheng
tensorflow.org (2015)
-
The rise of cloud computing systems
SOSP History Day (2015), 12:1-12:40
-
Distilling the Knowledge in a Neural Network
Geoffrey Hinton, Oriol Vinyals, Jeffrey Dean
NIPS Deep Learning and Representation Learning Workshop (2014)
-
Tsinghua University (2014)
-
The Beckman Report on Database Research
Daniel J. Abadi, Rakesh Agrawal, Anastasia Ailamaki, Magdalena Balazinska, Philip A. Bernstein, Michael J. Carey, Surajit Chaudhuri, Jeffrey Dean, AnHai Doan, Michael J. Franklin, Johannes Gehrke, Laura M. Haas, Alon Y. Halevy, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Donald Kossmann, Samuel Madden, Sharad Mehrotra, Tova Milo, Jeffrey F. Naughton, Raghu Ramakrishnan, Volker Markl, Christopher Olston, Beng Chin Ooi, Christopher Ré, Dan Suciu, Michael Stonebraker, Todd Walter, Jennifer Widom
SIGMOD Record, vol. 43 (2014), pp. 61-70
-
Zero-Shot Learning by Convex Combination of Semantic Embeddings
Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg Corrado, Jeffrey Dean
International Conference on Learning Representations (2014)
-
DeViSE: A Deep Visual-Semantic Embedding Model
Andrea Frome, Greg Corrado, Jonathon Shlens, Samy Bengio, Jeffrey Dean, Marc’Aurelio Ranzato, Tomas Mikolov
Neural Information Processing Systems (NIPS) (2013)
-
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean
Neural Information Processing Systems (NIPS) (2013)
-
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov, Kai Chen, Greg S. Corrado, Jeffrey Dean
International Conference on Learning Representations (2013)
-
Multilingual acoustic models using distributed deep neural networks
Georg Heigold, Vincent Vanhoucke, Andrew Senior, Patrick Nguyen, Marc'Aurelio Ranzato, Matthieu Devin, Jeff Dean
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Vancouver, CA (2013)
-
On Rectified Linear Units For Speech Processing
M.D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q.V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, G.E. Hinton
38th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver (2013)
-
Spanner: Google's Globally Distributed Database
James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson C. Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford
ACM Trans. Comput. Syst., vol. 31 (2013), pp. 8
-
Jeffrey Dean, Luiz André Barroso
Communications of the ACM, vol. 56 (2013), pp. 74-80
-
Using Web Co-occurrence Statistics for Improving Image Categorization
Samy Bengio, Jeffrey Dean, Dumitru Erhan, Eugene Ie, Quoc Le, Andrew Rabinovich, Jonathon Shlens, Yoram Singer
arXiv (2013)
-
Achieving Rapid Response Times in Large Online Services
Talk given at Berkeley AMPLab Cloud Seminar, March 26, 2012 (2012)
-
Building high-level features using large scale unsupervised learning
Quoc Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg Corrado, Jeff Dean, Andrew Ng
International Conference on Machine Learning (2012)
-
Large Scale Distributed Deep Networks
Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng
NIPS (2012)
-
Spanner: Google's Globally-Distributed Database
James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Dale Woodford, Yasushi Saito, Christopher Taylor, Michal Szymaniak, Ruth Wang
OSDI (2012)
-
Evolution and Future Directions of Large-scale Storage and Computation Systems at Google
Keynote talk given at 1st Symposium on Cloud Computing (SOCC), ACM, pp. 1-1
-
Evolution and future directions of large-scale storage and computation systems at Google
SoCC '10: Proceedings of the 1st ACM symposium on Cloud computing, ACM, New York, NY, USA (2010), pp. 1-1
-
MapReduce: a flexible data processing tool
Commun. ACM, vol. 53 (2010), pp. 72-77
-
Back-off Language Model Compression
Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat
Proceedings of Interspeech 2009, International Speech Communication Association (ISCA), pp. 325-355
-
Challenges in building large-scale information retrieval systems: invited talk
WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining, ACM, New York, NY, USA (2009), pp. 1-1
-
MapReduce: Simplified Data Processing on Large Clusters
Communications of the ACM, vol. 51, no. 1 (2008), pp. 107-113
-
Distributed Programming with MapReduce
Beautiful Code, O'Reilly (2007), Chapter 23
-
Large Language Models in Machine Translation
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858-867
-
Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber
7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), USENIX (2006), pp. 205-218
-
Experiences with MapReduce, an abstraction for large-scale computation
Proc. 15th International Conference on Parallel Architectures and Compilation Techniques, ACM, Seattle, WA (2006), pp. 1
-
MapReduce: Simplified Data Processing on Large Clusters
OSDI'04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA (2004), pp. 137-150
-
Web Search for a Planet: The Google Cluster Architecture
Luiz Andre Barroso, Jeffrey Dean, Urs Hölzle
IEEE Micro, vol. 23 (2003), pp. 22-28
-
A Comparison of Techniques to Find Mirrored Hosts on the WWW
Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, Monika Rauch Henzinger
JASIS, vol. 51 (2000), pp. 1114-1122
Previous Publications
-
MapReduce and Other Building Blocks for Large-Scale Distributed Systems at Google
USENIX Annual Technical Conference (2007)
-
Bigtable: A Distributed Storage System for Structured Data (Awarded Best Paper!)
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert Gruber
OSDI (2006), pp. 205-218
-
LPI Linux certification - in a nutshell: a desktop quick reference: pass the LPIC-1 and LPIC-2 exams, 2nd Edition
Steven Pritchard, Bruno Gomes Pessanha, Nicolai Langfeldt, James Stanger, Jeffrey Dean
O'Reilly (2006), I-XVIII, 1-961
-
LPI Linux certification in a nutshell - a desktop quick reference: covers exams 101 102 for LPI level 1
O'Reilly (2001), I-XVI, 1-551
-
A Comparison of Techniques to Find Mirrored Hosts on the WWW
Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, Monika Rauch Henzinger
IEEE Data Eng. Bull., vol. 23 (2000), pp. 21-26
-
The Swift Java Compiler: Design and Implementation
Daniel J. Scales, Keith H. Randall, Sanjay Ghemawat, Jeffrey Dean
HP Labs Technical Reports (2000), pp. 26
-
A Comparison of Techniques to Find Mirrored Hosts on the WWW
Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, Monika Rauch Henzinger
WOWS (1999), pp. 2-12
-
Control of Walking in the Stick Insect: From Behavior and Physiology to Modeling
Jeffrey Dean, Thomas Kindermann, Josef Schmitz, Michael Schumm, Holk Cruse
Auton. Robots, vol. 7 (1999), pp. 271-288
-
Finding Related Pages in the World Wide Web
Jeffrey Dean, Monika Rauch Henzinger
Computer Networks, vol. 31 (1999), pp. 1467-1479
-
Hardware Support for Out-of-Order Instruction Profiling on Alpha 21264a
J. Anderson, L. Berc, Jeffrey Dean, Sanjay Ghemawat, S. Leung, M. Lichtenberg, M. Vandevoorde, G. Verns, C. Waldspurger, W. Weihl, J. White
HOTCHIPS 99, IEEE (1999)
-
Transparent, Low-Overhead Profiling on Modern Processors
Jennifer Anderson, Lance Berc, George Chrysos, Jeffrey Dean, Sanjay Ghemawat, Jamey Hicks, Shun-Tak Leung, Mitch Lichtenberg, Mark Vandevoorde, Carl A. Waldspurger, William E. Weihl
Workshop on Profile and Feedback-Directed Compilation, Paris (1998)
-
ProfileMe: hardware support for instruction-level profiling on out-of-order processors
Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Chrysos
MICRO 30: Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, IEEE Computer Society, Washington, DC, USA (1997), pp. 292-302
-
ProfileMe: Hardware Support for Instruction-Level Profiling on Out-of-Order Processors
Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Z. Chrysos
MICRO (1997), pp. 292-302
-
Call Graph Construction in Object-Oriented Languages
David Grove, Greg DeFouw, Jeffrey Dean, Craig Chambers
OOPSLA (1997), pp. 108-124
-
Continuous Profiling: Where Have All the Cycles Gone?
Jennifer-Ann M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika Rauch Henzinger, Shun-Tak Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger, William E. Weihl
ACM Transactions on Computer Systems, vol. 15 (1997), pp. 357-390
-
ProfileMe: Hardware Support for Instruction-Level Profiling on Out-of-Order Processors
Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Chrysos
Proc. 30th Annual Symposium on Microarchitecture (1997)
-
Expressive, Efficient Instance Variables
Jeffrey Dean, David Grove, Craig Chambers, Vassily Litvinov
University of Washington (1996)
-
Simplifying Neural Networks for Controlling Walking by Exploiting Physical Properties
Holk Cruse, Christian Bartling, Jeffrey Dean, Thomas Kindermann, Josef Schmitz, Michael Schumm, Hendrik Wagner
ICANN (1996), pp. 433-438
-
Vortex: An Optimizing Compiler for Object-Oriented Languages
Jeffrey Dean, Greg DeFouw, David Grove, Vassily Litvinov, Craig Chambers
OOPSLA, San Jose, CA (1996), pp. 83-100
-
Whole-program optimization of object-oriented languages
Ph.D. Thesis, University of Washington (1996)
-
A Framework for Selective Recompilation in the Presence of Complex Intermodule Dependencies
Craig Chambers, Jeffrey Dean, David Grove
ICSE, Seattle, Washington (1995), pp. 221-230
-
Optimization of Object-Oriented Programs Using Static Class Hierarchy Analysis
Jeffrey Dean, David Grove, Craig Chambers
ECOOP (1995), pp. 77-101
-
Profile-Guided Receiver Class Prediction
David Grove, Jeffrey Dean, Charles Garrett, Craig Chambers
OOPSLA, Austin, TX (1995), pp. 108-123
-
Selective Specialization for Object-Oriented Languages
Jeffrey Dean, Craig Chambers, David Grove
PLDI, La Jolla, CA (1995), pp. 93-102
-
Identifying Profitable Specialization in Object-Oriented Languages
Jeffrey Dean, Craig Chambers, David Grove
Workshop on Partial Evaluation & Semantics-based Program Manipulation, Orlando, FL (1994), pp. 85-96
-
Towards Better Inlining Decisions Using Inlining Trials
Proceedings of the 1994 Conference on Lisp and Functional Programming (L&FP'94), Orlando, FL, pp. 273-282
-
Epi Info: A General-purpose Microcomputer Program for Public Health Information Systems
Andrew Dean, Jeffrey Dean, Anthony Burton, Richard Dicker
American Journal of Preventive Medicine, vol. 7 (1991), pp. 178-182
-
Software for Data Management and Analysis in Epidemiology
A. H. Burton, Jeffrey Dean, Andrew Dean
Journal of the World Health Forum, vol. 11, no. 1 (1990), pp. 75-77
