SRE has found that roughly 70% of outages are due to changes in a live system. Best practices in this domain use automation to accomplish the following:

- Implementing progressive rollouts
- Quickly and accurately detecting problems
- Rolling back changes safely when problems arise
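
As a rough illustration of how these three practices can fit together, the sketch below widens a change in stages, checks a health signal after each stage, and rolls back as soon as the signal crosses a threshold. Everything in it is an assumption made for illustration: the function name `progressive_rollout`, the callbacks `deploy`, `error_rate`, and `rollback`, the stage fractions, and the 1% error threshold are hypothetical, not taken from any particular rollout tool.

```python
from typing import Callable, Sequence


def progressive_rollout(
    stages: Sequence[float],
    deploy: Callable[[float], None],
    error_rate: Callable[[], float],
    rollback: Callable[[], None],
    max_error_rate: float = 0.01,  # assumed threshold, tune per service
) -> bool:
    """Push a change out stage by stage.

    Returns True if every stage completed, or False if the change
    was rolled back because the error rate exceeded the threshold.
    """
    for fraction in stages:
        # Progressive rollout: expose the change to a larger share of traffic.
        deploy(fraction)
        # Detection: check the health signal before widening further.
        if error_rate() > max_error_rate:
            # Safe rollback: revert as soon as a problem is detected.
            rollback()
            return False
    return True


if __name__ == "__main__":
    # Hypothetical stand-ins for real deployment and monitoring hooks.
    observed = iter([0.001, 0.002, 0.2])  # error rate seen after each stage
    completed = progressive_rollout(
        stages=[0.01, 0.10, 1.00],
        deploy=lambda f: print(f"deploying to {f:.0%} of traffic"),
        error_rate=lambda: next(observed),
        rollback=lambda: print("rolling back"),
    )
    print("rollout completed" if completed else "rollout aborted")
```

In a real system the health check would typically wait for a soak period and compare against a baseline rather than read a single value, but the control flow stays the same: small exposure first, an automated check, and an automated revert.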