MillWheel: Fault-Tolerant Stream Processing at Internet Scale
Venue
Very Large Data Bases (2013), pp. 734-746
Publication Year
2013
Authors
Tyler Akidau, Alex Balikov, Kaya Bekiroglu, Slava Chernyak, Josh Haberman, Reuven Lax, Sam McVeety, Daniel Mills, Paul Nordstrom, Sam Whittle
Abstract
MillWheel is a framework for building low-latency data-processing applications that
is widely used at Google. Users specify a directed computation graph and
application code for individual nodes, and the system manages persistent state and
the continuous flow of records, all within the envelope of the framework's
fault-tolerance guarantees. This paper describes MillWheel's programming model as
well as its implementation. A case study of a continuous anomaly detector in use
at Google illustrates how many of MillWheel's features are used. MillWheel's
programming model provides a notion of logical time, making it simple to write
time-based aggregations. MillWheel was designed from the outset with fault
tolerance and scalability in mind. In practice, we find that MillWheel's unique
combination of scalability, fault tolerance, and a versatile programming model
lends itself to a wide variety of problems at Google.
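
The notion of logical time mentioned in the abstract can be illustrated with a small sketch. The Python below is not MillWheel's API; the function names, the record shape (key, timestamp), and the 60-second window width are all assumptions made for illustration. It shows only the general idea that when each record carries its own timestamp, a time-based aggregation can be computed from that timestamp and does not depend on arrival order.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical window width, chosen for illustration


def window_start(timestamp, width=WINDOW_SECONDS):
    """Return the start of the fixed window containing a logical timestamp."""
    return timestamp - (timestamp % width)


def count_per_window(records, width=WINDOW_SECONDS):
    """Count records per (key, window) using each record's own timestamp."""
    counts = defaultdict(int)
    for key, timestamp in records:
        counts[(key, window_start(timestamp, width))] += 1
    return dict(counts)


# Records as (key, logical timestamp in seconds), deliberately out of order.
records = [("query_a", 125), ("query_a", 61), ("query_b", 59), ("query_a", 119)]
result = count_per_window(records)
```

Because windows are assigned from the record's timestamp, feeding the same records in sorted order yields the same counts. A real streaming system additionally has to decide when a window is complete and how to persist its state across failures, which this sketch does not attempt.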