Mindaugas Mozūras

Managing service reliability is largely about managing risk, and managing risk can be costly. 100% is probably never the right reliability target: not only is it impossible to achieve, it’s typically more reliability than a service’s users want or notice. Match the profile of the service to the risk the business is willing to take. An error budget aligns incentives and emphasizes joint ownership between SRE and product development. Error budgets make it easier to decide the rate of releases and to effectively defuse discussions about outages with stakeholders, and allow multiple teams to ...
Site Reliability Engineering: How Google Runs Production Systems
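
The arithmetic behind an error budget can be sketched in a few lines. This is a minimal illustration, not code from the book: it assumes a request-based availability target, treats the budget as the share of requests allowed to fail (1 minus the target), and uses hypothetical names (`BudgetStatus`, `error_budget_status`) and made-up traffic numbers. It rejects a 100% target, since that leaves no budget to spend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetStatus:
    """Hypothetical summary of an error budget over one measurement window."""
    allowed_failures: float    # failures the target tolerates in the window
    remaining_failures: float  # budget left; negative means overspent
    consumed_fraction: float   # share of the budget already used

    @property
    def exhausted(self) -> bool:
        # Once the budget is spent, a team might pause risky releases.
        return self.remaining_failures <= 0


def error_budget_status(slo_target: float, total_requests: int,
                        failed_requests: int) -> BudgetStatus:
    """Compute budget usage for a request-based availability target.

    slo_target is a fraction such as 0.999 (assumed, illustrative value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0 or failed_requests < 0:
        raise ValueError("request counts must be positive / non-negative")

    # The error budget is whatever the target does not require to succeed.
    allowed = (1.0 - slo_target) * total_requests
    return BudgetStatus(
        allowed_failures=allowed,
        remaining_failures=allowed - failed_requests,
        consumed_fraction=failed_requests / allowed,
    )


if __name__ == "__main__":
    # Illustrative numbers only: 99.9% target, 1M requests, 400 failures.
    status = error_budget_status(0.999, 1_000_000, 400)
    print(f"allowed failures:   {status.allowed_failures:.0f}")
    print(f"remaining failures: {status.remaining_failures:.0f}")
    print(f"budget consumed:    {status.consumed_fraction:.0%}")
    print(f"budget exhausted:   {status.exhausted}")
```

With a shared number like this, the decision to ship or hold a release rests on the remaining budget rather than on negotiation between teams.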