Jeff Ryan

32%
We find that (1) operator error is the largest single cause of failures in two of the three services, (2) operator errors often take a long time to repair, (3) configuration errors are the largest category of operator errors, (4) failures in custom-written front-end software are significant, and (5) more extensive online testing and more thoroughly exposing and detecting component failures would reduce failure rates in at least one service.
The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2