Imagine you woke up one day and found yourself responsible for a Site Reliability Engineering team. By 10 a.m., you’ve downloaded a free copy of the SRE book and are starting to get the hang of things. Then an incident strikes: oh no! Folks rally to mitigate user impact, then diagnose and remediate the underlying cause. The team's response was amazing, but your users depend on you, and you feel like you let them down today. Your shoulders are a bit heavier than they were just a few hours ago...
Published on November 05, 2019 05:00