Take me to your leader! Online Optimization of Distributed Storage Configurations
Venue
Proceedings of the 41st International Conference on Very Large Data Bases, VLDB Endowment (2015), pp. 1490-1501
Publication Year
2015
Authors
Artyom Sharov, Alexander Shraer, Arif Merchant, Murray Stokely
Abstract
The configuration of a distributed storage system typically includes, among other
parameters, the set of servers and their roles in the replication protocol.
Although mechanisms for changing the configuration at runtime exist, it is usually
left to system administrators to manually determine the “best” configuration and
periodically reconfigure the system, often by trial and error. This paper describes
a new workload-driven optimization framework that dynamically determines the
optimal configuration at runtime. We focus on optimizing leader- and quorum-based
replication schemes and divide the framework into three optimization tiers,
dynamically optimizing different configuration aspects: 1) leader placement, 2)
roles of different servers in the replication protocol, and 3) replica locations.
We showcase our optimization framework by applying it to a large-scale distributed
storage system used internally at Google and demonstrate that most client
applications significantly benefit from using our framework, reducing average
operation latency by up to 94%.
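
To make the first tier (leader placement) concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simplified cost model in which every operation costs one round trip from the client's location to the leader, and it picks the candidate that minimizes the operation-weighted mean latency. The function name `pick_leader`, the location labels, and all latency and workload figures are hypothetical.

```python
from typing import Dict, List


def pick_leader(
    candidates: List[str],
    rtt_ms: Dict[str, Dict[str, float]],
    ops: Dict[str, int],
) -> str:
    """Return the candidate with the lowest operation-weighted mean RTT.

    rtt_ms[client][candidate] is the round-trip time in milliseconds from a
    client location to a candidate leader; ops[client] is the number of
    operations that client location issued in the observation window.
    Assumption: each operation costs exactly one client-to-leader round trip.
    """
    total_ops = sum(ops.values())
    if total_ops == 0:
        raise ValueError("workload is empty; nothing to optimize for")

    def weighted_latency(candidate: str) -> float:
        # Mean latency over all operations if `candidate` were the leader.
        return sum(n * rtt_ms[client][candidate] for client, n in ops.items()) / total_ops

    return min(candidates, key=weighted_latency)


# Hypothetical measurements: three locations that are both clients and candidates.
RTT_MS = {
    "us": {"us": 5.0, "eu": 90.0, "asia": 150.0},
    "eu": {"us": 90.0, "eu": 5.0, "asia": 200.0},
    "asia": {"us": 150.0, "eu": 200.0, "asia": 5.0},
}
OPS = {"us": 100, "eu": 800, "asia": 100}  # workload skewed toward "eu"

leader = pick_leader(["us", "eu", "asia"], RTT_MS, OPS)
print(leader)
```

Re-running such a selection as the observed workload shifts is one simple way to read "workload-driven" optimization. A real system would also have to account for quorum round trips between replicas and for the cost of moving the leader, both of which this sketch ignores.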
