Distributed Systems and Parallel Computing
No matter how powerful individual computers become, there are still reasons to harness the power of multiple computational units, often spread across large geographic areas. Sometimes this is motivated by the need to collect data from widely dispersed locations (e.g., web pages from servers, or sensors for weather or traffic). Other times it is motivated by the need to perform enormous computations that simply cannot be done by a single CPU.
From the beginning, Google has had to deal with both issues in our pursuit of organizing the world’s information and making it universally accessible and useful. We continue to face many exciting distributed systems and parallel computing challenges in areas such as concurrency control, fault tolerance, algorithmic efficiency, and communication. Thanks to our hybrid research model, some of our researchers work on fundamental theoretical questions, while other researchers and engineers build systems that operate at the largest possible scale.
174 Publications
-
Brendan Burns, Brian Grant, David Oppenheimer, Eric Brewer, John Wilkes
ACM Queue, vol. 14 (2016), pp. 70-93
-
Design patterns for container-based distributed systems
Brendan Burns, David Oppenheimer
The 8th Usenix Workshop on Hot Topics in Cloud Computing (HotCloud '16) (2016)
-
DieHard: reliable scheduling to survive correlated failures in cloud data centers
Mina Sedaghat, Eddie Wadbro, John Wilkes, Sara De Luna, Oleg Seleznjev, Erik Elmroth
International Symposium on Cluster, Cloud and Grid Computing (CCGrid), IEEE/ACM, Cartagena, Colombia (2016), pp. 52-59
-
Eric Brewer, Lawrence Ying, Lawrence Greenfield, Robert Cypher, Theodore T'so
Google (2016), pp. 1-16
-
Distributed Balanced Partitioning via Linear Embedding
Kevin Aydin, Mohammadhossein Bateni, Vahab Mirrokni
WSDM 2016: Ninth ACM International Conference on Web Search and Data Mining, ACM (to appear)
-
Improving Resource Efficiency at Scale with Heracles
David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, Christos Kozyrakis
ACM Transactions on Computer Systems (TOCS), vol. 34 (2016), 6:1-6:33
-
Maglev: A Fast and Reliable Software Network Load Balancer
Daniel E. Eisenbud, Cheng Yi, Carlo Contavalli, Cody Smith, Roman Kononov, Eric Mann-Hielscher, Ardas Cilingiroglu, Bin Cheyney, Wentao Shang, Jinnah Dylan Hosein
13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), USENIX Association, Santa Clara, CA (2016), pp. 523-535
-
Modular Composition of Coordination Services
Kfir Lev-Ari, Edward Bortnikov, Idit Keidar, Alexander Shraer
USENIX Annual Technical Conference (ATC) (2016)
-
Optimizing Distributed Actor Systems for Dynamic Interactive Services
Andrew Newell, Gabriel Kliot, Ishai Menache, Aditya Gopalan, Soramichi Akiyama, Mark Silberstein
EuroSys 2016, ACM – Association for Computing Machinery (to appear)
-
Revisiting Distributed Synchronous SGD
Jianmin Chen, Rajat Monga, Samy Bengio, Rafal Jozefowicz
International Conference on Learning Representations Workshop Track (2016) (to appear)
-
Robust Large-Scale Machine Learning in the Cloud
Steffen Rendle, Dennis Fetterly, Eugene J. Shekita, Bor-yiing Su
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, San Francisco, CA, USA (2016) (to appear)
-
Robust and Probabilistic Failure-Aware Placements
Madhukar Korupolu, Rajmohan Rajaraman
ACM Symposium on Parallel Algorithms and Architectures (SPAA), California, USA (2016)
-
PASC16, EPFL, Lausanne, Switzerland (2016)
-
TensorFlow: A system for large-scale machine learning
Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng
Google Brain (2016)
-
Can Traditional Programming Bridge the Ninja Performance Gap for Parallel Computing Applications?
Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind Girkar, Pradeep Dubey
Communications of the ACM, vol. 58 (2015), pp. 77-86
-
Computing weak consistency in polynomial time
Wojciech Golab, Xiaozhou (Steve) Li, Alejandro López-Ortiz, Naomi Nishimura
Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, ACM, New York, NY, USA, pp. 395-404
-
Continuous Pipelines at Google
SRECon Europe 2015, USENIX, Dublin, Ireland, pp. 12
-
Dynamic iSCSI at Scale: Remote Paging at Google
Linux Plumbers Conference 2015
-
SIAM Journal on Scientific Computing, vol. 37(1) (2015)
-
Federated Optimization: Distributed Optimization Beyond the Datacenter
Jakub Konečný, H. Brendan McMahan, Daniel Ramage
NIPS Optimization for Machine Learning Workshop (2015), pp. 5
-
Heracles: Improving Resource Efficiency at Scale
David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, Christos Kozyrakis
Proceedings of the 42nd Annual International Symposium on Computer Architecture (2015)
-
High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads
Workshop on Business Intelligence for the Real Time Enterprise (BIRTE), Springer (2015) (to appear)
-
Kubernetes - Scheduling the Future at Cloud Scale
O'Reilly and Associates, 1005 Gravenstein Highway North, Sebastopol, CA 95472
-
Large-scale cluster management at Google with Borg
Abhishek Verma, Luis Pedrosa, Madhukar R. Korupolu, David Oppenheimer, Eric Tune, John Wilkes
Proceedings of the European Conference on Computer Systems (EuroSys), ACM, Bordeaux, France (2015)
-
Poster Paper: Automatic Reconfiguration of Distributed Storage
Artyom Sharov, Alexander Shraer, Arif Merchant, Murray Stokely
The 12th International Conference on Autonomic Computing, IEEE (2015), pp. 133-134
-
RFC7535 - AS112 Redirection Using DNAME
Warren Kumari, Joe Abley, Brian Dickson, George Michaelson
IETF RFCs, Internet Engineering Task Force (2015), pp. 16
-
RFC7706 - Decreasing Access Time to Root Servers by Running One on Loopback
Warren Kumari, Paul Hoffman
IETF RFCs, Internet Engineering Task Force (2015), pp. 12
-
RSSAC003 - RSSAC Report on Root Zone TTLs
ICANN Root Server System Advisory Committee (RSSAC) Reports and Advisories, Internet Corporation for Assigned Names and Numbers (ICANN) (2015), pp. 35
-
Randomized Composable Core-sets for Distributed Submodular Maximization
Vahab S. Mirrokni, Morteza Zadimoghaddam
STOC (2015), pp. 153-162
-
Randomized Composable Core-sets for Distributed Submodular Maximization
Vahab S. Mirrokni, Morteza Zadimoghaddam
CoRR, vol. abs/1506.06715 (2015)
-
Take me to your leader! Online Optimization of Distributed Storage Configurations
Artyom Sharov, Alexander Shraer, Arif Merchant, Murray Stokely
Proceedings of the 41st International Conference on Very Large Data Bases, VLDB Endowment (2015), pp. 1490-1501
-
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, Xiaoqiang Zheng
tensorflow.org (2015)
-
Tyler Akidau, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael J. Fernández-Moctezuma, Reuven Lax, Sam McVeety, Daniel Mills, Frances Perry, Eric Schmidt, Sam Whittle
Proceedings of the VLDB Endowment, vol. 8 (2015), pp. 1792-1803
-
The rise of cloud computing systems
SOSP History Day (2015), 12:1-12:40
-
Timely Dataflow: A Model
FORTE (2015), pp. 131-145
-
Tunable Performance and Consistency Tradeoffs for Geographically Replicated Cloud Services (COLOR)
Wenbo Zhu, C. Murray Woodside
Cyber Security and Cloud Computing (CSCloud), 2015 IEEE 2nd International Conference on, IEEE, pp. 457-463
-
Author Retrospective for A NUCA Substrate for Flexible CMP Cache Sharing
Jaehyuk Huh, Changkyu Kim, Hazim Shafi, Lixin Zhang, Doug Burger, Stephen W. Keckler
ICS 25th Anniversary Volume, ACM SIGARCH (2014)
-
IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2014), pp. 458-467
-
Connected Components in MapReduce and Beyond
Raimondas Kiveris, Silvio Lattanzi, Vahab Mirrokni, Vibhor Rastogi, Sergei Vassilvitskii
SOCC 2014
-
Coupled and k-Sided Placements: Generalizing Generalized Assignment
Madhukar Korupolu, Adam Meyerson, Rajmohan Rajaraman, Brian Tagiku
Integer Programming and Combinatorial Optimization (IPCO) (2014)
-
Diff-Index: Differentiated Index in Distributed Log-Structured Data Stores
Wei Tan, Sandeep Tata, Yuzhe Tang, Liana Fong
EDBT (2014) (to appear)
-
Distributed Balanced Clustering via Mapping Coresets
Mohammadhossein Bateni, Aditya Bhaskara, Silvio Lattanzi, Vahab Mirrokni
NIPS, Neural Information Processing Systems Foundation (2014)
-
Evaluating job packing in warehouse-scale computing
Abhishek Verma, Madhukar Korupolu, John Wilkes
IEEE Cluster, Madrid, Spain (2014)
-
Eventually consistent: Not what you were expecting?
Wojciech Golab, Muntasir R. Rahman, Alvin AuYoung, Kimberly Keeton, Xiaozhou (Steve) Li
Communications of the ACM, vol. 57, no. 3 (2014), pp. 38-44
-
Charles Johnson, Kimberly Keeton, Charles B. Morrey III, Craig A. N. Soules, Alistair Veitch, Stephen Bacon, Oskar Batuner, Marcelo Condotta, Hamilton Coutinho, Patrick J. Doyle, Rafael Eichelberger, Hugo Kiehl, Guilherme Magalhaes, James McEvoy, Padmanabhan Nagarajan, Patrick Osborne, Joaquim Souza, Andy Sparkes, Mike Spitzer, Sebastien Tandel, Lincoln Thomas, Sebastian Zangaro
Proceedings of the 12th USENIX Conference on File and Storage Technologies (FAST 2014), USENIX
-
Long-term SLOs for reclaimed cloud computing resources
Marcus Carvalho, Walfredo Cirne, Francisco Brasileiro, John Wilkes
ACM Symposium on Cloud Computing (SoCC), ACM, Seattle, WA, USA (2014), 20:1-20:13
-
Low-Overhead Network-on-Chip Support for Location-Oblivious Task Placement
Gwangsun Kim, M. M.-J. Lee, John Kim, Dennis Abts, Michael R. Marty
IEEE Transactions on Computers, vol. 63, no. 6 (2014), pp. 1487-1500
-
MPIDepQBF: Towards Parallel QBF Solving without Knowledge Sharing
Charles Jordan, Lukasz Kaiser, Florian Lonsing, Martina Seidl
SAT (2014), pp. 430-437
-
Macaroons: Cookies with Contextual Caveats for Decentralized Authorization in the Cloud
Arnar Birgisson, Joe Gibbs Politz, Úlfar Erlingsson, Ankur Taly, Michael Vrable, Mark Lentczner
Network and Distributed System Security Symposium, Internet Society (2014)
-
Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing
Ashish Gupta, Fan Yang, Jason Govig, Adam Kirsch, Kelvin Chan, Kevin Lai, Shuo Wu, Sandeep Dhoot, Abhilash Kumar, Ankur Agiwal, Sanjay Bhansali, Mingsheng Hong, Jamie Cameron, Masood Siddiqi, David Jones, Jeff Shute, Andrey Gubarev, Shivakumar Venkataraman, Divyakant Agrawal
VLDB (2014)
-
Near-Data Processing: Insights from a MICRO-46 Workshop
Rajeev Balasubramonian, Jichuan Chang, Troy Manning, Jaime H. Moreno, Richard Murphy, Ravi Nair, Steven Swanson
IEEE Micro (Special Issue on Big Data), vol. 34 (2014), pp. 36-43
-
Software Defined Networking at Scale
Light Reading (2014), pp. 22
-
TRAM: Optimizing Fine-grained Communication with Topological Routing and Aggregation of Messages
Lukasz Wesolowski, Ramprasad Venkataraman, A Gupta, Jae-Seung Yeom, Keith Bisset, Yanhua Sun, Pritish Jetley, Thomas Quinn, Laxmikant Kale
International Conference on Parallel Processing (2014)
-
The wisdom of clouds
Chemistry World, vol. 11 (2014), pp. 38
-
AGILE: elastic distributed resource scaling for Infrastructure-as-a-Service
Hiep Nguyen, Zhiming Shen, Xiaohui Gu, Sethuraman Subbiah, John Wilkes
10th International Conference on Autonomic Computing (ICAC), USENIX, San Jose, CA, USA (2013), pp. 69-82
-
Brief Announcement: Consistency and Complexity Tradeoffs for Highly-Available Multi-Cloud Store
Gregory Chockler, Dan Dobre, Alexander Shraer
The International Symposium on Distributed Computing (DISC) (2013)
-
Ensuring Connectivity via Data Plane Mechanisms
10th USENIX Symposium on Networked Systems Design and Implementation (2013)
-
EventWave: Programming Model and Runtime Support for Tightly-Coupled Elastic Cloud Applications
Wei-Chiu Chuang, Bo Sang, Sunghwan Yoo, Rui Gu, Charles Killian, Milind Kulkarni
Proceedings of the 2013 ACM Symposium on Cloud Computing, ACM, Santa Clara, CA, USA
-
F1: A Distributed SQL Database That Scales
Jeff Shute, Radek Vingralek, Bart Samwel, Ben Handy, Chad Whipkey, Eric Rollins, Mircea Oancea, Kyle Littlefield, David Menestrina, Stephan Ellner, John Cieslewicz, Ian Rae, Traian Stancescu, Himani Apte
VLDB (2013)
-
Fast Data Processing with Spark
Packt (2013)
-
MillWheel: Fault-Tolerant Stream Processing at Internet Scale
Tyler Akidau, Alex Balikov, Kaya Bekiroglu, Slava Chernyak, Josh Haberman, Reuven Lax, Sam McVeety, Daniel Mills, Paul Nordstrom, Sam Whittle
Very Large Data Bases (2013), pp. 734-746
-
Minimizing weighted flowtime on capacitated machines
Kyle Fox, Madhukar Korupolu
ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013)
-
Omega: flexible, scalable schedulers for large compute clusters
Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, John Wilkes
SIGOPS European Conference on Computer Systems (EuroSys), ACM, Prague, Czech Republic (2013), pp. 351-364
-
On the k-atomicity-verification problem
Wojciech Golab, Jeremy Hurwitz, Xiaozhou Li
The 33rd International Conference on Distributed Computing Systems, IEEE (2013)
-
Online, Asynchronous Schema Change in F1
Ian Rae, Eric Rollins, Jeff Shute, Sukhdeep Sodhi, Radek Vingralek
VLDB (2013)
-
Optical Interconnects for Scale-Out Data Centers
Hong Liu, Ryohei Urata, Amin Vahdat
Optical Interconnects for Future Data Center Networks, Springer, Avenel, NJ (2013), pp. 17-31
-
Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams
Rajagopal Ananthanarayanan, Venkatesh Basker, Sumit Das, Ashish Gupta, Haifeng Jiang, Tianhao Qiu, Alexey Reznichenko, Deomid Ryabkov, Manpreet Singh, Shivakumar Venkataraman
SIGMOD '13: Proceedings of the 2013 international conference on Management of data, ACM, New York, NY, USA, pp. 577-588
-
Spanner: Google's Globally Distributed Database
James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson C. Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford
ACM Trans. Comput. Syst., vol. 31 (2013), pp. 8
-
Luiz André Barroso, Jimmy Clidaras, Urs Hölzle
Morgan & Claypool Publishers (2013)
-
Jeffrey Dean, Luiz André Barroso
Communications of the ACM, vol. 56 (2013), pp. 74-80
-
Verifying Cloud Services: Present and Future
Sara Bouchenak, Gregory Chockler, Hana Chockler, Gabriela Gheorghe, Nuno Santos, Alexander Shraer
Operating Systems Review (2013)
-
Walfredo Cirne, Eitan Frachtenberg
Lecture Notes in Computer Science, vol. 7698 (2013)
-
CPI^2: CPU performance isolation for shared compute clusters
Xiao Zhang, Eric Tune, Robert Hagmann, Rohit Jnagal, Vrigo Gokhale, John Wilkes
SIGOPS European Conference on Computer Systems (EuroSys), ACM, Prague, Czech Republic (2013), pp. 379-391
-
A Guided Tour of Datacenter Networking
Communications of the ACM - ACM Queue, vol. 55, no. 6 (2012), pp. 44-51
-
Achieving Rapid Response Times in Large Online Services
Talk given at Berkeley AMPLab Cloud Seminar, March 26, 2012 (2012)
-
An approach to Distributed Virtual Environment performance modeling: Addressing system complexity and user behavior
H. Lally Singh, Denis Gracanin
Proceedings of the 2012 IEEE Virtual Reality, IEEE Computer Society, Washington, DC, USA, pp. 71-72
-
Characterization and Comparison of Cloud versus Grid Workloads
Sheng Di, Derrick Kondo, Walfredo Cirne
IEEE Cluster 2012
-
CloudRAMSort: fast and efficient large-scale distributed RAM sort on shared-nothing cluster
Changkyu Kim, Jongsoo Park, Nadathur Satish, Hongrae Lee, Pradeep Dubey, Jatin Chhugani
Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, ACM, New York, NY, USA, pp. 841-850
-
DIPLOMA: Consistent and Coherent Shared Memory over Mobile Phones
30th IEEE International Conference on Computer Design (2012)
-
F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business
Jeff Shute, Mircea Oancea, Stephan Ellner, Ben Handy, Eric Rollins, Bart Samwel, Radek Vingralek, Chad Whipkey, Xin Chen, Beat Jegerlehner, Kyle Littlefield, Phoenix Tong
SIGMOD (2012)
-
Finding Connected Components in Map-reduce in Logarithmic Rounds
Vibhor Rastogi, Ashwin Machanavajjhala, Laukik Chitnis, Anish Das Sarma
ICDE, IEEE (2012) (to appear)
-
Hostload prediction in a Google compute cloud with a Bayesian model
Sheng Di, Derrick Kondo, Walfredo Cirne
Supercomputing 2012
-
JANUS: exploiting parallelism via hindsight
Omer Tripp, Roman Manevich, John Field, Mooly Sagiv
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation, ACM, New York, NY, USA (2012), pp. 145-156
-
Obfuscatory obscanturism: making workload traces of commercially-sensitive systems safe to release
Charles Reiss, John Wilkes, Joseph L. Hellerstein
CloudMAN, IEEE, Maui, HI, USA (2012)
-
Optimistic Scheduling with Geographically Replicated Services in the Cloud Environment (COLOR)
Wenbo Zhu, C. Murray Woodside
Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on, IEEE CONFERENCE PUBLICATIONS, pp. 735-740
-
Orchestrating the deployment of computations in the cloud with conductor
Alexander Wieder, Pramod Bhatotia, Ansley Post, Rodrigo Rodrigues
Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, USENIX Association, Berkeley, CA, USA (2012), pp. 27-27
-
Overlapping clusters for distributed computation
Reid Andersen, David Gleich, Vahab Mirrokni
ACM Conference on Web Search and Data Mining (WSDM) (2012)
-
Processing a Trillion Cells per Mouse Click
Alex Hall, Olaf Bachmann, Robert Buessow, Silviu-Ionut Ganceanu, Marc Nunkesser
PVLDB, vol. 5 (2012), pp. 1436-1446
-
Projecting Disk Usage Based on Historical Trends in a Cloud Environment
Murray Stokely, Amaan Mehrabian, Christoph Albrecht, Francois Labelle, Arif Merchant
ScienceCloud 2012 Proceedings of the 3rd International Workshop on Scientific Cloud Computing, ACM, pp. 63-70
-
Recursion in Scalable Protocols via Distributed Data Flows
Languages for Distributed Algorithms (2012) (to appear)
-
Resource-bounded multicore emulation using Beefarm
Oriol Arcas, Nehir Sonmez, Gokhan Sayilar, Satnam Singh, Osman S. Unsal, Adrian Cristal, Ibrahim Hur, Mateo Valero
Microprocessors and Microsystems (2012)
-
Joseph L Hellerstein, Kai Kohlhoff, David E. Konerding
IEEE Internet Computing (2012)
-
Spanner: Google's Globally-Distributed Database
James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Dale Woodford, Yasushi Saito, Christopher Taylor, Michal Szymaniak, Ruth Wang
OSDI (2012)
-
Trickle: Rate Limiting YouTube Video Streaming
Monia Ghobadi, Yuchung Cheng, Ankur Jain, Matt Mathis
Proceedings of the USENIX Annual Technical Conference (2012), pp. 6
-
Uncertainty in Aggregate Estimates from Sampled Distributed Traces
Nate Coehlo, Arif Merchant, Murray Stokely
2012 Workshop on Managing Systems Automatically and Dynamically, USENIX
-
Upper and Lower Bounds on the Cost of a Map-Reduce Computation
Foto Afrati, Anish Das Sarma, Semih Salihoglu, Jeffrey Ullman
Arxiv (2012)
-
Vision Paper: Towards an Understanding of the Limits of Map-Reduce Computation
Foto Afrati, Anish Das Sarma, Semih Salihoglu, Jeffrey Ullman
CloudFutures Workshop (2012)
-
A Tight Unconditional Lower Bound on Distributed Random Walk Computation
Danupon Nanongkai, Atish Das Sarma, Gopal Pandurangan
ACM Symposium on Principles of Distributed Computing (PODC) (2011)
-
Characterizing Task Usage Shapes in Google Compute Clusters
Qi Zhang, Joseph Hellerstein, Raouf Boutaba
Proceedings of the 5th International Workshop on Large Scale Distributed Systems and Middleware (2011)
-
CloudScale: elastic resource scaling for multi-tenant cloud systems
Zhiming Shen, Sethuraman Subbiah, Xiaohui Gu, John Wilkes
Symposium on Cloud Computing (SoCC), ACM, Cascais, Portugal (2011)
-
Design and Implementation of FAITH, an Experimental System to Intercept and Manipulate Online Social Informatics
Ruaylong Lee, Roozbeh Nia, Jason Hsu, Karl N. Levitt, Jeff Rowe, S. Felix Wu, Shaozhi Ye
International Conference on Advances in Social Networks Analysis and Mining, IEEE (2011), pp. 195-202
-
Diagnosing Latency in Multi-Tier Black-Box Services
Krzysztof Ostrowski, Gideon Mann, Mark Sandler
5th Workshop on Large Scale Distributed Systems and Middleware (LADIS 2011) (to appear)
-
Exploiting Service Usage Information for Optimizing Server Resource Management
Alexander Totok, Vijay Karamcheti
ACM Transactions on Internet Technology (TOIT), vol. 11 (2011), pp. 1-26
-
FAWN: a fast array of wimpy nodes: technical perspective
Communications of the ACM, vol. 54 (2011), pp. 100-100
-
HTAF: Hybrid Testing Automation Framework to Leverage Local and Global Computing Resources
Keun Soo Yim, David Hreczany, Ravishankar K. Iyer
Lecture Notes in Computer Science, vol. 6784 (2011), pp. 479-494
-
Megastore: Providing Scalable, Highly Available Storage for Interactive Services
Jason Baker, Chris Bond, James C. Corbett, JJ Furman, Andrey Khorlin, James Larson, Jean-Michel Leon, Yawei Li, Alexander Lloyd, Vadim Yushprakh
Proceedings of the Conference on Innovative Data system Research (CIDR) (2011), pp. 223-234
-
Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters
Victor Chudnovsky, Rasekh Rifaat, Joseph Hellerstein, Bikash Sharma, Chita Das
Symposium on Cloud Computing, ACM (2011)
-
Modeling the Parallel Execution of Black-Box Services
Gideon Mann, Mark Sandler, Darja Krushevskaja, Sudipto Guha, Eyal Even-Dar
HotCloud, Usenix (2011)
-
Perspectives on cloud computing: interviews with five leading scientists from the cloud community
Gordon Blair, Fabio Kon, Walfredo Cirne, Dejan Milojicic, Raghu Ramakrishnan, Dan Reed, Dilma Silva
Journal of Internet Services and Applications (2011)
-
PowerNap: An Energy Efficient MAC Layer for Random Routing in Wireless Sensor Networks
Onur Soysal, Sami Ayyorgun, Murat Demirbas
IEEE SECON 2011
-
Tenzing A SQL Implementation On The MapReduce Framework
Biswapesh Chattopadhyay, Liang Lin, Weiran Liu, Sagar Mittal, Prathyusha Aragonda, Vera Lychagina, Younghee Kwon, Michael Wong
Proceedings of VLDB, VLDB Endowment (2011), pp. 1318-1327
-
The Emerging Optical Data Center
Amin Vahdat, Hong Liu, Xiaoxue Zhao, Chris Johnson
OFC 2011, OTuH2
-
Thialfi: A Client Notification Service for Internet-Scale Applications
Atul Adya, Gregory Cooper, Daniel Myers, Michael Piatek
Proc. 23rd ACM Symposium on Operating Systems Principles (SOSP) (2011), pp. 129-142
-
Warehouse-scale Computing: entering the teenage decade
Association for Computing Machinery (2011)
-
Analyzing and enhancing the parallel sort operation on multithreaded architectures
Layali K. Rashid, Wessam Hassanein, Moustafa A. Hammad
The Journal of Supercomputing, vol. 53 (2010), pp. 293-312
-
Anti-Omega: the weakest failure detector for set agreement
Distributed Computing, vol. 22 (2010), pp. 335-348
-
Availability in Globally Distributed Storage Systems
Daniel Ford, Francois Labelle, Florentina Popovici, Murray Stokely, Van-Anh Truong, Luiz Barroso, Carrie Grimes, Sean Quinlan
Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation, USENIX (2010)
-
Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
Benjamin H. Sigelman, Luiz André Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan, Chandan Shanbhag
Google, Inc. (2010)
-
Luiz André Barroso, Parthasarathy Ranganathan
IEEE Micro, vol. 30 (2010), pp. 6-7
-
Dremel: Interactive Analysis of Web-Scale Datasets
Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis
Proc. of the 36th Int'l Conf on Very Large Data Bases (2010), pp. 330-339
-
Evolution and Future Directions of Large-scale Storage and Computation Systems at Google
Keynote talk given at 1st Symposium on Cloud Computing (SOCC), ACM, pp. 1-1
-
FlumeJava: Easy, Efficient Data-Parallel Pipelines
Craig Chambers, Ashish Raniwala, Frances Perry, Stephen Adams, Robert Henry, Robert Bradshaw, Nathan
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), ACM New York, NY 2010, 2 Penn Plaza, Suite 701 New York, NY 10121-0701 (2010), pp. 363-375
-
Large-scale Incremental Processing Using Distributed Transactions and Notifications
Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation, USENIX (2010)
-
Robin Anil, Sean Owen, Ted Dunning, Ellen Friedman
Manning, Manning Publications Co. Sound View Ct. #3B Greenwich, CT 06830 (2010), pp. 350
-
MapReduce: a flexible data processing tool
Commun. ACM, vol. 53 (2010), pp. 72-77
-
Optimizing Utilization of Resource Pools in Web Application Servers
Alexander Totok, Vijay Karamcheti
Concurrency and Computation: Practice and Experience, vol. 22 (2010), pp. 2421-2444
-
PRESS: PRedictive Elastic ReSource Scaling for cloud systems
Zhenhuan Gong, Xiaohui Gu, John Wilkes
6th IEEE/IFIP International Conference on Network and Service Management (CNSM 2010), Niagara Falls, Canada
-
Warehouse Scale Computing - A keynote address to SIGMOD'10
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data (2010)
-
A unified format for traces of peer-to-peer systems
Boxun Zhang, Alexandru Iosup, Pawel Garbacki, Johan Pouwelse
LSAP '09: Proceedings of the 1st ACM workshop on Large-Scale system and application performance, ACM, New York, NY, USA (2009), pp. 27-34
-
Causeway: a message-oriented distributed debugger
Terry Stanley, Tyler Close, Mark S. Miller
HP Labs (2009)
-
Do you know your IQ? A research agenda for information quality in systems
Kimberly Keeton (HP Labs), Pankaj Mehra (HP Labs), John Wilkes
HotMETRICS'09 (2009)
-
Machine Learning-Based Prefetch Optimization for Data Center Applications
Shih-wei Liao, Tzu-Han Hung, Donald Nguyen, Chinyen Chou, Chiaheng Tu, Hucheng Zhou
Proceedings of Supercomputing (2009)
-
MapReduce: The programming model and practice
Jerry Zhao, Jelena Pjesivac-Grbovic
SIGMETRICS (2009)
-
Parallel algorithms for mining large-scale rich-media data
Edward Y. Chang, Hongjie Bai, Kaihua Zhu
MM '09: Proceedings of the seventeenth ACM international conference on Multimedia, ACM, New York, NY, USA (2009), pp. 917-918
-
Prefetch optimizations on large-scale applications via parameter value prediction
Shih-wei Liao, Tzu-Han Hung, Donald Nguyen, Hucheng Zhou, Chinyen Chou, Chiaheng Tu
ICS (2009), pp. 519-520
-
Pregel: A System for Large-Scale Graph Processing
Grzegorz Malewicz, Matthew H. Austern, Aart J.C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski
28th ACM Symposium on Principles of Distributed Computing (2009), pp. 6-6
-
The Best of CCGrid'2007: A Snapshot of an 'Adolescent' Area
Walfredo Cirne, Bruno Schulze
Concurrency and Computation: Practice and Experience, vol. 21 (2009)
-
Using a Market Economy to Provision Compute Resources Across Planet-wide Clusters
Murray Stokely, Jim Winget, Ed Keyes, Carrie Grimes, Benjamin Yolken
Proceedings for the International Parallel and Distributed Processing Symposium 2009, IEEE, pp. 1-8
-
Why Locally-Fair Maximal Flows in Client-Server Networks Perform Well
Chad Yoshikawa, Ken Berman
Computing and Combinatorics, Springer Berlin Heidelberg, 12715 NE 81st PL (2009), pp. 368-377
-
Anti-Omega: the weakest failure detector for set agreement
27th ACM Symposium on Principles of Distributed Computing (PODC 2008)
-
Enhancing Community Authorization Services
Kumar Abhishek, Kumar Kapil
16th Euromicro International Conference on Parallel, Distributed and network-based Processing, IEEE Computer Society (2008) (to appear)
-
Extending IC-Scheduling via the Sweep Algorithm
Gennaro Cordasco, Grzegorz Malewicz, Arnold L. Rosenberg
16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (2008), pp. 366-373
-
MapReduce: Simplified Data Processing on Large Clusters
Communications of the ACM, vol. 51, no. 1 (2008), pp. 107-113
-
Yangqiu Song, Wen-Yen Chen, Hongjie Bai, Chih-Jen Lin, Edward Chang
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Springer (2008), pp. 374-389
-
Physics Aware Programming Paradigm: Approach and Evaluation
Salim Hariri, Yaser Jararweh, Yeliang Zhang, Talal Moukabary
Proc. 6th International Workshop on Challenges of Large Applications in Distributed Environments, ACM, Boston (2008), pp. 1-6
-
RaWMS - Random Walk based Lightweight Membership Service for Wireless Ad Hoc Networks
Ziv Bar-Yossef, Roy Friedman, Gabi Kliot
ACM Transactions on Computer Systems, vol. 26 (2008), pp. 1-66
-
Age-based Packet Arbitration in Large k-ary n-cubes
SC (2007)
-
Applying IC-Scheduling Theory to Familiar Classes of Computations
Gennaro Cordasco, Grzegorz Malewicz, Arnold L. Rosenberg
Workshop on Large-Scale and Volatile Desktop Grids in conjunction with IPDPS'07 (2007), pp. 1-8
-
Architect's dream or developer's nightmare?
Gregor Hohpe
Proc. 2007 inaugural international conference on distributed event-based systems, ACM, Toronto, pp. 188-188
-
Distributed Programming with MapReduce
Beautiful Code, O'Reilly (2007), Chapter 23
-
Engineering Reliability into Web Sites: Google SRE
Alexander R. Perry
Proceedings of LinuxWorld (2007)
-
Large Language Models in Machine Translation
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858-867
-
Let's Have a Conversation
Gregor Hohpe
IEEE Internet Computing, vol. 11, no. 3 (2007), pp. 78-81
-
MRPSO: MapReduce Particle Swarm Optimization
Andrew W. McNabb, Christopher K. Monson, Kevin D. Seppi
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2007), IEEE
-
Parallel Approximate Matrix Factorization for Kernel Methods
Kaihua Zhu, Hang Cui, Hongjie Bai, Jian Li, Zhihuan Qiu, Hao Wang, Hui Xu, Edward Y. Chang
IEEE International Conference on Multimedia and Expo (ICME) (2007)
-
Parallel PSO Using MapReduce
Andrew W. McNabb, Christopher K. Monson, Kevin D. Seppi
Proceedings of the IEEE Congress on Evolutionary Computation, IEEE Press (2007), pp. 7-14
-
Parallelizing Support Vector Machines on Distributed Computers
Edward Y. Chang, Kaihua Zhu, Hao Wang, Hongjie Bai, Jian Li, Zhihuan Qiu, Hang Cui
Neural Information Processing Systems (NIPS) (2007)
-
Paxos Made Live - An Engineering Perspective (2006 Invited Talk)
Tushar Deepak Chandra, Robert Griesemer, Joshua Redstone
Proceedings of the 26th Annual ACM Symposium on Principles of Distributed Computing, ACM press (2007)
-
A Tool for Prioritizing DAGMan Jobs and Its Evaluation
Grzegorz Malewicz, Ian Foster, Arnold Rosenberg, Michael Wilde
Proceedings of the IEEE International Symposium on High-Performance Distributed Computing (HPDC06), Paris, France (2006), pp. 156-167
-
An Autonomic Routing Framework for Sensor Networks
Yu He, Cauligi S. Raghavendra, Steven Berson, Robert Braden
Cluster Computing, Special Issue on Autonomic Computing (Kluwer Academic Publishers), vol. 9 (2006), pp. 191-200
-
An Experimental Study of the Skype Peer-to-Peer VoIP System
Saikat Guha, Neil Daswani, Ravi Jain
Proceedings of The 5th International Workshop on Peer-to-Peer Systems (IPTPS '06), Santa Barbara, CA (2006)
-
Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber
7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), USENIX (2006), pp. 205-218
-
Data Management for Internet-Scale Single-Sign-On
Sharon E. Perl, Margo Seltzer
Proceedings of the 3rd Workshop on Real, Large Distributed Systems, Usenix (2006)
-
Experiences with MapReduce, an abstraction for large-scale computation
Proc. 15th International Conference on Parallel Architectures and Compilation Techniques, ACM, Seattle, WA (2006), pp. 1
-
Java Concurrency in Practice
Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, Doug Lea
Addison-Wesley, Boston, MA (2006)
-
Minimizing the Stretch when Scheduling Flows of Biological Requests
Arnaud Legrand, Alan Su, Frédéric Vivien
Proceedings of the 18th ACM Symposium on Parallelism in Algorithms and Architectures (2006)
-
On Scheduling Expansive and Reductive Dags for Internet-Based Computing
Gennaro Cordasco, Grzegorz Malewicz, Arnold L. Rosenberg
26th IEEE International Conference on Distributed Computing Systems (2006), pp. 29
-
Simple Efficient Load-Balancing Algorithms for Peer-to-Peer Systems
David R. Karger, Matthias Ruhl
Theory of Computing Systems, vol. 39, no. 6 (2006), pp. 787-804
-
The Chubby lock service for loosely-coupled distributed systems
7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), USENIX (2006)
-
Decentralized algorithms using both local and random probes for P2P load balancing
Krishnaram Kenthapadi, Gurmeet Singh Manku
SPAA 2005 (17th ACM Symposium on Parallelism in Algorithms and Architectures), pp. 135-144
-
Interpreting the Data: Parallel Analysis with Sawzall
Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan
Scientific Programming Journal, vol. 13 (2005), pp. 277-298
-
Papillon: Greedy Routing in Rings
Ittai Abraham, Dahlia Malkhi, Gurmeet Singh Manku
DISC (2005), pp. 514-515
-
MapReduce: Simplified Data Processing on Large Clusters
OSDI'04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA (2004), pp. 137-150
-
Web Search for a Planet: The Google Cluster Architecture
Luiz André Barroso, Jeffrey Dean, Urs Hölzle
IEEE Micro, vol. 23 (2003), pp. 22-28
