Underlying the many products and services offered by Google is a collection of
systems and tools that simplify the storage and processing of large-scale data
sets. These systems are designed to work well in Google's computational environment,
which consists of large numbers of commodity machines connected by commodity networking hardware.
Our systems handle issues like storage reliability and availability in the face of
machine failures, and our processing tools make it relatively easy to write robust
computations that run reliably and efficiently on thousands of machines. In this
talk I'll highlight some of the systems we have built and are currently developing,
and discuss some challenges and future directions for new systems.