Jeffrey Dean

I joined Google in mid-1999, and I'm currently a Google Fellow in the Systems Infrastructure Group. My areas of interest include large-scale distributed systems, performance monitoring, compression techniques, information retrieval, application of machine learning to search and other related problems, microprocessor architecture, compiler optimizations, and development of new products that organize existing information in new and interesting ways. While at Google, I've worked on the following projects:
  • The design and implementation of the initial version of Google's advertising serving system.
  • The design and implementation of five generations of our crawling, indexing, and query serving systems, covering two and three orders of magnitude growth in number of documents searched, number of queries handled per second, and frequency of updates to the system. I recently gave a talk at WSDM'09 about some of the issues involved in building large-scale retrieval systems (slides).
  • The initial development of Google's AdSense for Content product (involving both the production serving system design and implementation as well as work on developing and improving the quality of ad selection based on the contents of pages).
  • The development of Protocol Buffers, a way of encoding structured data in an efficient yet extensible format, and a compiler that generates convenient wrappers for manipulating the objects in a variety of languages. Protocol Buffers are used extensively at Google for almost all RPC protocols, and for storing structured information in a variety of persistent storage systems. A version of the protocol buffer implementation has been open-sourced and is available at http://code.google.com/p/protobuf/.
  • Some of the initial production serving system work for the Google News product, working with Krishna Bharat to move the prototype system he put together into a deployed system.
  • Some aspects of our search ranking algorithms, notably improved handling for dealing with off-page signals such as anchortext.
  • The design and implementation of the first generation of our automated job scheduling system for managing a cluster of machines.
  • The design and implementation of prototyping infrastructure for rapid development and experimentation with new ranking algorithms.
  • The design and implementation of MapReduce, a system for simplifying the development of large-scale data processing applications. A paper about MapReduce appeared in OSDI'04.
  • The design and implementation of BigTable, a large-scale semi-structured storage system used underneath a number of Google products. A paper about BigTable appeared in OSDI'06.
  • Some of the production system design for Google Translate, our statistical machine translation system. In particular, I designed and implemented a system for distributed high-speed access to very large language models (too large to fit in memory on a single machine).
  • Some internal tools to make it easy to rapidly search our internal source code repository. Many of the ideas from these internal tools were incorporated into our Google Code Search product, including the ability to use regular expressions for searching large corpora of source code.
I enjoy developing software with great colleagues, and I've been fortunate to have worked with many wonderful and talented people on all of my work here at Google. To help ensure that Google continues to hire people with excellent technical skills, I've also been fairly involved in our engineering hiring process.

I received a Ph.D. in Computer Science from the University of Washington in 1996, working with Craig Chambers on whole-program optimization techniques for object-oriented languages. I received a B.S., summa cum laude, in Computer Science & Economics from the University of Minnesota in 1990. From 1996 to 1999, I worked for Digital Equipment Corporation's Western Research Lab in Palo Alto, where I worked on low-overhead profiling tools, design of profiling hardware for out-of-order microprocessors, and web-based information retrieval. From 1990 to 1991, I worked for the World Health Organization's Global Programme on AIDS, developing software to do statistical modelling, forecasting, and analysis of the HIV pandemic.

In 2009, I was elected to the National Academy of Engineering, and I was also named a Fellow of the Association for Computing Machinery (ACM).

Personal:

I've lived in lots of places in my life: Honolulu, HI; Manila, the Philippines; Boston, MA; West Nile District, Uganda; Boston (again); Little Rock, AR; Hawaii (again); Minneapolis, MN; Mogadishu, Somalia; Atlanta, GA; Minneapolis (again); Geneva, Switzerland; Seattle, WA; and (currently) Palo Alto, CA. I'm hard-pressed to pick a favorite, though: each place has its pluses and minuses.

One of my life goals is to play soccer and basketball on every continent. So far, I've done so in North America, South America, Europe, Asia, and Africa. I'm worried that Antarctica might be tough, though.

Previous Publications

  •  

    A Comparison of Techniques to Find Mirrored Hosts on the WWW

    Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, Monika Rauch Henzinger

    IEEE Data Eng. Bull., vol. 23 (2000), pp. 21-26

  •   

    The Swift Java Compiler: Design and Implementation

    Daniel J. Scales, Keith H. Randall, Sanjay Ghemawat, Jeffrey Dean

    HP Labs Technical Reports (2000), pp. 26

  •  

    A Comparison of Techniques to Find Mirrored Hosts on the WWW

    Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, Monika Rauch Henzinger

    WOWS (1999), pp. 2-12

  •   

    Finding Related Pages in the World Wide Web

    Jeffrey Dean, Monika Rauch Henzinger

    Computer Networks, vol. 31 (1999), pp. 1467-1479

  •   

    Hardware Support for Out-of-Order Instruction Profiling on Alpha 21264a

    J. Anderson, L. Berc, Jeffrey Dean, Sanjay Ghemawat, S. Leung, M. Lichtenberg, M. Vandevoorde, G. Verns, C. Waldspurger, W. Weihl, J. White

    HOTCHIPS 99, IEEE (1999)

  •   

    Transparent, Low-Overhead Profiling on Modern Processors

    Jennifer Anderson, Lance Berc, George Chrysos, Jeffrey Dean, Sanjay Ghemawat, Jamey Hicks, Shun-Tak Leung, Mitch Lichtenberg, Mark Vandevoorde, Carl A. Waldspurger, William E. Weihl

    Workshop on Profile and Feedback-Directed Compilation, Paris (1998)

  •  

    ProfileMe: hardware support for instruction-level profiling on out-of-order processors

    Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Chrysos

    MICRO 30: Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, IEEE Computer Society, Washington, DC, USA (1997), pp. 292-302

  •   

    Call Graph Construction in Object-Oriented Languages

    David Grove, Greg DeFouw, Jeffrey Dean, Craig Chambers

    OOPSLA (1997), pp. 108-124

  •   

    Continuous Profiling: Where Have All the Cycles Gone?

    Jennifer-Ann M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika Rauch Henzinger, Shun-Tak Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger, William E. Weihl

    ACM Transactions on Computer Systems, vol. 15 (1997), pp. 357-390

  •   

    Expressive, Efficient Instance Variables

    Jeffrey Dean, David Grove, Craig Chambers, Vassily Litvinov

    University of Washington (1996)

  •   

    Vortex: An Optimizing Compiler for Object-Oriented Languages

    Jeffrey Dean, Greg DeFouw, David Grove, Vassily Litvinov, Craig Chambers

    OOPSLA, San Jose, CA (1996), pp. 83-100

  •   

    Whole-program optimization of object-oriented languages

    Jeffrey Adgate Dean

    Ph.D. Thesis, University of Washington (1996)

  •   

    A Framework for Selective Recompilation in the Presence of Complex Intermodule Dependencies

    Craig Chambers, Jeffrey Dean, David Grove

    ICSE, Seattle, Washington (1995), pp. 221-230

  •   

    Optimization of Object-Oriented Programs Using Static Class Hierarchy Analysis

    Jeffrey Dean, David Grove, Craig Chambers

    ECOOP (1995), pp. 77-101

  •   

    Profile-Guided Receiver Class Prediction

    David Grove, Jeffrey Dean, Charles Garrett, Craig Chambers

    OOPSLA, Austin, TX (1995), pp. 108-123

  •   

    Selective Specialization for Object-Oriented Languages

    Jeffrey Dean, Craig Chambers, David Grove

    PLDI, La Jolla, CA (1995), pp. 93-102

  •   

    Identifying Profitable Specialization in Object-Oriented Languages

    Jeffrey Dean, Craig Chambers, David Grove

    Workshop on Partial Evaluation & Semantics-based Program Manipulation, Orlando, FL (1994), pp. 85-96

  •   

    Towards Better Inlining Decisions Using Inlining Trials

    Jeffrey Dean, Craig Chambers

    Proceedings of the 1994 Conference on Lisp and Functional Programming (L&FP'94), Orlando, FL, pp. 273-282

  •   

    Epi Info: A General-purpose Microcomputer Program for Public Health Information Systems

    Andrew Dean, Jeffrey Dean, Anthony Burton, Richard Dicker

    American Journal of Preventive Medicine, vol. 7 (1991), pp. 178-182

  •   

    Software for Data Management and Analysis in Epidemiology

    A. H. Burton, Jeffrey Dean, Andrew Dean

    Journal of the World Health Forum, vol. 11, no. 1 (1990), pp. 75-77