
Ubiq: A Scalable and Fault-tolerant Log Processing Infrastructure

Alexander Smolyanov
Divy Agrawal
Haifeng Jiang
Manish Bhatia
Monica Chawathe Lenart
Namit Sikka
Navin Melville
Scott Holzer
Shan He
Shivakumar Venkataraman
Tianhao Qiu
Venkatesh Basker
Vinny Ganeshan
Yuri Vasilevski
Workshop on Business Intelligence for the Real Time Enterprise (BIRTE), Springer (2016)

Abstract

Most of today’s Internet applications are data-centric and generate vast amounts of data (typically in the form of event logs) that must be processed and analyzed to support detailed reporting, enhance user experience, and increase monetization. In this paper, we describe the architecture of Ubiq, a geographically distributed framework for processing continuously growing log files in real time with high scalability, high availability, and low latency. The Ubiq framework fully tolerates infrastructure degradation and datacenter-level outages without any manual intervention. It also guarantees exactly-once semantics for application pipelines that process logs in the form of event bundles. Ubiq has been in production in Google’s advertising system for many years and has served as a critical log processing framework for hundreds of pipelines. Our production deployment demonstrates linear scalability with machine resources, extremely high availability even when the underlying infrastructure fails, and an end-to-end latency of under a minute.
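
To make the exactly-once guarantee for event bundles concrete, the following is a minimal sketch of one common way to get that behavior: record each bundle's identity once its effects are applied, and skip any retry of the same bundle. This is an illustration under stated assumptions, not Ubiq's actual design. The names `EventBundle` and `BundleTracker`, the choice of a byte range as the bundle key, and the in-memory state are all hypothetical; a real deployment would need replicated, transactional state to survive datacenter outages.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EventBundle:
    """Hypothetical unit of work: a contiguous byte range of one log file."""
    log_file: str
    start_offset: int
    end_offset: int
    events: tuple


@dataclass
class BundleTracker:
    """Toy in-memory stand-in for durable state (an assumption of this sketch)."""
    committed: set = field(default_factory=set)
    totals: dict = field(default_factory=dict)

    def process(self, bundle: EventBundle) -> bool:
        """Apply a bundle's events once; return False if it was already applied."""
        key = (bundle.log_file, bundle.start_offset, bundle.end_offset)
        if key in self.committed:
            # A retry or duplicate dispatch: the effects are already recorded.
            return False
        for event in bundle.events:
            self.totals[event] = self.totals.get(event, 0) + 1
        # In a real system the output and this marker would be committed atomically.
        self.committed.add(key)
        return True


tracker = BundleTracker()
bundle = EventBundle("clicks.log", 0, 128, ("click", "click", "view"))
first = tracker.process(bundle)   # applies the bundle
second = tracker.process(bundle)  # simulated retry after a failure
```

Here the second call is a no-op, so the per-event counts reflect each event a single time even though the bundle was dispatched twice.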