Optimizing Distributed Actor Systems for Dynamic Interactive Services
Venue
EuroSys 2016, ACM – Association for Computing Machinery (to appear)
Publication Year
2016
Authors
Andrew Newell, Gabriel Kliot, Ishai Menache, Aditya Gopalan, Soramichi Akiyama, Mark Silberstein
Abstract
Distributed actor systems are widely used for developing interactive scalable cloud
services, such as social networks and on-line games. By modeling an application as
a dynamic set of lightweight communicating “actors”, developers can easily build
complex distributed applications, while the underlying runtime system deals with
low-level complexities of a distributed environment. We present ActOp — a
data-driven, application-independent runtime mechanism for optimizing end-to-end
service latency of actor-based distributed applications. ActOp targets the two
dominant factors affecting latency: the overhead of remote inter-actor
communications across servers, and the intra-server queuing delay. ActOp
automatically identifies frequently communicating actors and migrates them to the
same server transparently to the running application. The migration decisions are
driven by a novel scalable distributed graph partitioning algorithm which does not
rely on a single server to store the whole communication graph, thereby enabling
efficient actor placement even for applications with rapidly changing graphs (e.g.,
chat services). Further, each server autonomously reduces the queuing delay by
learning an internal queuing model and configuring threads according to
instantaneous request rate and application demands. We prototype ActOp by
integrating it with Orleans – a popular open-source actor system [4, 13].
Experiments with realistic workloads show latency improvements of up to 75% for the
99th percentile and up to 63% for the mean, along with an increase of up to 2x in
peak system throughput.
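
To make the co-location idea concrete, the following is a minimal illustrative sketch, not ActOp's algorithm: the abstract does not specify the partitioning procedure, so this uses a simple single-machine greedy pass as a stand-in. The names `remote_traffic`, `greedy_colocate`, and the `capacity` parameter are hypothetical, and the message counts are made-up example data.

```python
from collections import Counter, defaultdict


def remote_traffic(placement, traffic):
    """Total message count on edges whose endpoints sit on different servers."""
    return sum(w for (a, b), w in traffic.items() if placement[a] != placement[b])


def greedy_colocate(placement, traffic, capacity):
    """One greedy pass (illustrative stand-in, not the paper's algorithm):
    move each actor to the server it exchanges the most messages with,
    provided that server holds fewer than `capacity` actors."""
    placement = dict(placement)          # work on a copy
    load = Counter(placement.values())   # actors per server
    for actor in sorted(placement):
        # Sum this actor's traffic toward each server hosting one of its peers.
        pull = defaultdict(int)
        for (a, b), w in traffic.items():
            if a == actor:
                pull[placement[b]] += w
            elif b == actor:
                pull[placement[a]] += w
        if not pull:
            continue
        home = placement[actor]
        target = max(sorted(pull), key=lambda s: pull[s])
        # Move only if it strictly reduces remote traffic and there is room.
        if target != home and pull[target] > pull.get(home, 0) and load[target] < capacity:
            load[home] -= 1
            load[target] += 1
            placement[actor] = target
    return placement


# Hypothetical example: four actors on two servers, with observed message counts.
placement = {"a": 0, "b": 1, "c": 0, "d": 1}
traffic = {("a", "b"): 10, ("c", "d"): 8, ("a", "c"): 1}

before = remote_traffic(placement, traffic)
after_placement = greedy_colocate(placement, traffic, capacity=3)
after = remote_traffic(after_placement, traffic)
```

In this toy input the heavily communicating pairs end up on the same server, leaving only the light `a`-`c` edge crossing servers. A centralized pass like this needs the whole communication graph in one place, which is exactly the limitation the abstract says the distributed algorithm avoids.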
