This article is a follow-up to Vivek Rau's chapter "Eliminating Toil" in Site
Reliability Engineering: How Google Runs Production Systems. We begin by recapping
Vivek's definition of toil and Google's approach to balancing operational work with
engineering project work. The Bigtable SRE case study then presents a concrete
example of how one team at Google went about reducing toil. Finally, we leave
readers with a set of best practices that should help reduce toil regardless of
an organization's size or makeup.