Mindaugas Mozūras

Novice pilots are taught that their first responsibility in an emergency is to fly the airplane [Gaw09]; troubleshooting is secondary to getting the plane and everyone on it safely onto the ground. This approach is also applicable to computer systems: for example, if a bug is leading to possibly unrecoverable data corruption, freezing the system to prevent further failure may be better than letting this behavior continue.
Site Reliability Engineering: How Google Runs Production Systems