Distributed Systems and Parallel Computing

No matter how powerful individual computers become, there are still reasons to harness the power of multiple computational units, often spread across large geographic areas. Sometimes this is motivated by the need to collect data from widely dispersed locations (e.g., web pages from servers, or sensors for weather or traffic). Other times it is motivated by the need to perform enormous computations that simply cannot be done by a single CPU.

From its beginning, Google has had to address both issues in pursuit of organizing the world’s information and making it universally accessible and useful. We continue to face many exciting distributed systems and parallel computing challenges in areas such as concurrency control, fault tolerance, algorithmic efficiency, and communication. Thanks to our hybrid research model, some of our research answers fundamental theoretical questions, while other researchers and engineers build systems that operate at the largest possible scale.

Recent Publications

Load is not what you should balance: Introducing Prequal. Bobby Kleinberg, Bartek Wydrowski, Steve Rumble (2024)
Federated Variational Inference: Towards Improved Personalization and Generalization. Elahe Vedadi, Karan Singhal, Arash Afkanpour, Warren Morningstar, Philip Mansfield, Josh Dillon. AAAI Federated Learning on the Edge Symposium (2024)
Thesios: Synthesizing Accurate Counterfactual I/O Traces from I/O Samples. Soroush Ghodrati, Mangpo Phothilimthana, Selene Moon. ASPLOS, Association for Computing Machinery (2024)
BigLake: BigQuery’s Evolution toward a Multi-Cloud Lakehouse. Anoop Johnson, Justin Levandoski, Prem Ramanathan, Garrett Casto, Yuri Volobuev, Vidya Shanmugam, Dawid Kurzyniec, Jeff Johnson, Thibaud Hottelier, Amir Hormati, Gaurav Saxena, Mingge Deng, Rushabh Desai. SIGMOD (2024)