Operating distributed systems at scale requires an unusual set of skills—problem
solving, programming, system design, networking, and OS internals—which are
difficult to find in one person. At Google, we’ve found ways to hire Site
Reliability Engineers who blend software and systems skills, and to keep a
high standard for new SREs across our many teams and sites. These include standardizing
the format of our interviews and the unusual practice of making hiring decisions by
committee. Adopting similar practices can help your SRE or DevOps team grow by
consistently hiring excellent coworkers.