Tenzing is a query engine built on top of MapReduce for ad hoc analysis of Google
data. Tenzing offers a mostly complete SQL implementation (with several
extensions) together with several key characteristics: heterogeneity, high
performance, scalability, reliability, metadata awareness, low latency, support for
columnar storage and structured data, and easy extensibility. Tenzing is currently
used internally at Google by 1,000+ employees and serves 10,000+ queries per day over
1.5 petabytes of compressed data. In this paper, we describe the architecture and
implementation of Tenzing, and present benchmarks of typical analytical queries.