Shasta: Interactive Reporting at Scale
SIGMOD, San Francisco, CA (2016) (to appear)
Gokul Nath Babu Manoharan, Stephan Ellner, Karl Schnaitter, Sridatta Chegu, Alejandro Estrella-Balderrama, Stephan Gudmundson, Apurv Gupta, Ben Handy, Bart Samwel, Chad Whipkey, Larysa Aharkava, Himani Apte, Nitin Gangahar, Jun Xu, Shivakumar Venkataraman, Divy Agrawal, Jeffrey D. Ullman
Designed as a layer on top of Google’s F1 RDBMS and Mesa data warehouse, Shasta combines language and system techniques to meet these requirements. To help with expressing complex view specifications, we developed a query language called RVL, with support for modularized view templates that can be dynamically compiled into SQL. To execute these SQL queries with low latency at scale, we leveraged and extended F1’s distributed query engine with facilities such as safe execution of C++ and Java UDFs. To reduce latency and increase read parallelism, we extended F1 storage with a distributed read-only in-memory cache. The system we describe is in production at Google, powering critical applications used by advertisers and internal sales teams. Shasta has significantly improved system scalability and software engineering efficiency compared to the middleware solutions it replaced.