Omega: flexible, scalable schedulers for large compute clusters
Venue
SIGOPS European Conference on Computer Systems (EuroSys), ACM, Prague, Czech Republic (2013), pp. 351-364
Publication Year
2013
Authors
Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, John Wilkes
Abstract
Increasing scale and the need for rapid response to changing requirements are hard
to meet with current monolithic cluster scheduler architectures. This restricts the
rate at which new features can be deployed, decreases efficiency and utilization,
and will eventually limit cluster growth. We present a novel approach to address
these needs using parallelism, shared state, and lock-free optimistic concurrency
control. We compare this approach to existing cluster scheduler designs, evaluate
how much interference between schedulers occurs and how much it matters in
practice, present some techniques to alleviate it, and finally discuss a use case
highlighting the advantages of our approach -- all driven by real-life Google
production workloads.
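
The abstract's combination of shared state and optimistic concurrency control can be illustrated with a small sketch. This is not the paper's implementation: the names `CellState`, `Machine`, `try_commit`, and `schedule` are hypothetical, per-machine version counters are an assumed conflict-detection scheme, and the two schedulers are simulated sequentially in one thread. A genuinely lock-free version would need an atomic compare-and-swap in place of the plain check in `try_commit`.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    free_cpus: int
    version: int = 0  # bumped on every successful commit touching this machine


@dataclass
class CellState:
    """Hypothetical shared cluster state that every scheduler can read."""
    machines: dict = field(default_factory=dict)

    def snapshot(self):
        # Each scheduler plans against a private copy of the shared state.
        return {name: (m.free_cpus, m.version) for name, m in self.machines.items()}

    def try_commit(self, name, seen_version, cpus):
        """Apply a placement only if the machine is unchanged since the snapshot."""
        m = self.machines[name]
        if m.version != seen_version or m.free_cpus < cpus:
            return False  # conflict: the caller must re-sync and retry
        m.free_cpus -= cpus
        m.version += 1
        return True


def schedule(cell, snap, cpus):
    """Pick the first machine in the snapshot with enough room, then try to commit."""
    for name, (free, version) in snap.items():
        if free >= cpus:
            return name, cell.try_commit(name, version, cpus)
    return None, False


cell = CellState({"m1": Machine(free_cpus=4), "m2": Machine(free_cpus=4)})

# Two schedulers take snapshots of the same state before either commits.
snap_a = cell.snapshot()
snap_b = cell.snapshot()

result_a = schedule(cell, snap_a, 3)          # commits on m1
result_b = schedule(cell, snap_b, 3)          # stale view of m1, so the commit is rejected
retry_b = schedule(cell, cell.snapshot(), 3)  # fresh snapshot, lands on m2

print(result_a, result_b, retry_b)
```

In this sketch neither scheduler holds a lock while it plans. Interference shows up only at commit time as a rejected transaction, and the losing scheduler re-syncs and retries.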