(Un)Reliability Budgets: Finding Balance between Innovation and Reliability
Venue
;login:, vol. 40, No. 4 (2015), pp. 26-28
Publication Year
2015
Authors
Mark D. Roth
Abstract
Google is constantly changing our software to implement new, useful features for
our users. Unfortunately, making changes is inherently risky. Google services are
quite complex, and any new feature might accidentally cause problems for users. In
fact, most outages of Google services are the result of deploying a change. As a
consequence, there is an inherent tension between the desire to innovate quickly
and the need to keep the site reliable. Google manages this tension by using a metrics-based
approach called an unreliability budget, which provides an objective metric to
guide decisions involving tradeoffs between innovation and reliability.
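
The abstract does not define the metric, so the following is only a minimal sketch of how such a budget might be computed. It assumes the budget is the share of requests allowed to fail under an availability target over some period. The class name, field names, and the 99.9% target are hypothetical and do not come from the article.

```python
from dataclasses import dataclass


@dataclass
class UnreliabilityBudget:
    """Hypothetical budget: how many requests may fail in one period."""

    availability_target: float  # assumed target, e.g. 0.999 (not from the article)
    total_requests: int         # requests served in the period
    failed_requests: int        # requests that failed in the period

    @property
    def allowed_failures(self) -> float:
        # The budget is the complement of the availability target.
        return (1.0 - self.availability_target) * self.total_requests

    @property
    def remaining(self) -> float:
        # Budget left after subtracting the failures already observed.
        return self.allowed_failures - self.failed_requests

    def can_launch(self) -> bool:
        # Assumed policy: risky changes proceed only while budget remains.
        return self.remaining > 0


if __name__ == "__main__":
    budget = UnreliabilityBudget(
        availability_target=0.999,
        total_requests=1_000_000,
        failed_requests=400,
    )
    print(f"allowed failures: {budget.allowed_failures:.0f}")
    print(f"remaining budget: {budget.remaining:.0f}")
    print(f"can launch: {budget.can_launch()}")
```

Under these assumptions, a service that has spent its budget would pause risky launches until the next period, which is one way a single number could guide the tradeoff the abstract describes.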