Effective Monitoring and Alerting: For Web Operations

Rating: 3.66 (13 reviews)
ISBN: 9781449333485

Slawek Ligus

With this practical book, you’ll discover how to catch complications in your distributed system before they develop into costly problems. Based on his extensive experience in systems ops at large technology companies, author Slawek Ligus describes an effective data-driven approach for monitoring and alerting that enables you to maintain high availability and deliver a high quality of service. Learn methods for measuring state changes and data flow in your system, and set up alerts to help you recover quickly from problems when they do arise. If you’re a system operator waging the daily battle to provide the best performance at the lowest cost, this book is for you.

Genres: Technology, Computer Science, Technical, Nonfiction, Reference

166 pages, ebook

First published November 22, 2012

31 people are currently reading

119 people want to read

About the author

Slawek Ligus

1 book, 1 follower

Community Reviews

5 stars: 16 (21%)
4 stars: 24 (32%)
3 stars: 28 (37%)
2 stars: 5 (6%)
1 star: 1 (1%)

Displaying 1 - 13 of 13 reviews

Ieva Gr

185 reviews, 33 followers

February 2, 2020

Was it easy to read: Not really. The ideas weren’t complex. But they were somehow formulated in a technical manner that made me re-read a lot of sentences to finally understand them.

What I liked about it: It gives some practical, actionable tips, e.g. on plotting and understanding graphs (the typical quantities metrics represent, the summary statistics best suited to those quantities), setting up alerts, and calculating thresholds. I also think this book really shows what a mature monitoring culture should look like: monitor extensively – alert selectively.

What I disliked: The author is a DevOps engineer, so naturally the book is written from a DevOps perspective, and most examples are about CPU usage and the like. As a developer, I would have been happier with application monitoring examples. Plus, the language was more complex than it needed to be, as I mentioned before.

Ideas/ Quotes:
“Monitor extensively and alert selectively: identify what metrics drive your business and work top-down to set up alarms around timeseries behind KPIs”

“Ideally, monitoring should enable operators to drill down from high level overview into the fine levels of details, granular enough to point at specifics”

“Flawed assumption: the ticket generated from an alert is a unit of work rather than an indication of a problem in the system”

“All alarms that trigger on non-issues should be done away with if there is no evidence that the resulting alerts are actionable. If this policy is not followed, false alarms will cause more harm than good. There are only 2 ways one can respond to non-issues: ignore them or overreact”

“Measuring quality must not be effortful, otherwise quality assessment will come at a very high cost and with dubious credibility”

huydx

33 reviews, 14 followers

February 22, 2017

A short book, but it comes with some useful information and some best practices for building an effective monitoring system.

Bradley White

182 reviews

September 1, 2025

Excellent and concise practical book on setting up monitoring of your IT services. Full of technical advice without getting bogged down with any particular monitoring systems and software.

Recommended.

2025

Romain

908 reviews, 55 followers

November 30, 2018

A short note about this book, which I used in my work. First of all, two good points. The first is that it deals with monitoring, alerting and reporting in general, that is to say independently of the tools used. This is both a strong point and a weak point, since it would have been useful to identify the families of tools suited to each use. This step back is not so common and makes it possible to introduce higher-level concepts, for example the organization of monitoring in stacks, which is absolutely crucial, but also notions and general definitions applicable in all circumstances - or almost. And so we come to the second strong point: definitions. In a professional context it is essential to rely on precise definitions that frame concepts most people have an unfortunate tendency to confuse, such as monitoring and alerting.

Among the weak points, it lacks background and practical cases. If we do not know the subject well, we will finish reading about as badly off as we started -- I exaggerate, we will at least be armed with definitions and concepts, and that's already a lot. The writing is completely devoid of soul: no humor, no anecdotes, which makes for quite boring reading.

The most problematic point is the structure of the book, which is really unclear and makes it hard to come back to the book to look something up. More importantly, it lacks structuring elements for implementing a solution, like the 4 golden signals.

> - Latency: The time it takes to service a request, with a focus on distinguishing between the latency of successful requests and the latency of failed requests.
> - Traffic: A measure of how much demand is being placed on the service. This is measured using a high-level service-specific metric, like HTTP requests per second in the case of an HTTP REST API.
> - Errors: The rate of requests that fail. The failures can be explicit (e.g., HTTP 500 errors) or implicit (e.g., an HTTP 200 OK response with a response body having too few items).
> - Saturation: How “full” is the service. This is a measure of the system utilization, emphasizing the resources that are most constrained (e.g., memory, I/O or CPU). Services degrade in performance as they approach high saturation.

Or the 5 golden signals, if we add the measure of availability -- I wrote an article about it. Despite these reservations, it is still useful to have on hand and refer to from time to time, but it is not a must-have -- far from it. I would rather read a more recent book from the same publisher: Practical Monitoring: Effective Strategies for the Real World.
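
As a rough illustration of the four signals quoted above, here is a minimal Python sketch that summarizes one observation window of request records. It is not taken from the book; the `Request` record, the `golden_signals` function, and the choice of CPU utilization as the saturation measure are all assumptions made for the example. It also counts only explicit failures (HTTP 5xx), not the implicit ones the definition mentions.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Request:
    """Hypothetical request record, invented for this example."""
    duration_ms: float  # time taken to serve the request
    status: int         # HTTP status code of the response


def golden_signals(requests, window_s, cpu_utilization):
    """Summarize one observation window into the four signals.

    Assumption: only explicit failures (status >= 500) count as errors.
    """
    ok = [r.duration_ms for r in requests if r.status < 500]
    failed = [r.duration_ms for r in requests if r.status >= 500]
    return {
        # Latency, kept separate for successful and failed requests
        "latency_ok_ms": median(ok) if ok else None,
        "latency_failed_ms": median(failed) if failed else None,
        # Traffic: requests per second over the window
        "traffic_rps": len(requests) / window_s,
        # Errors: fraction of requests that failed
        "error_rate": len(failed) / len(requests) if requests else 0.0,
        # Saturation: utilization of the most constrained resource,
        # passed in by the caller (assumed here to be CPU, 0.0 to 1.0)
        "saturation": cpu_utilization,
    }


sample = [Request(120, 200), Request(80, 200), Request(100, 200), Request(900, 500)]
signals = golden_signals(sample, window_s=2, cpu_utilization=0.75)
```

Keeping the latency of failed requests apart from that of successful ones follows the definition quoted above: a fast error and a slow success are different symptoms, and a single combined median would hide both.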

https://www.back2code.me/2018/11/effe...
http://aubonroman.com/2018/10/effecti...

c_s_ops

Daniel

12 reviews, 3 followers

January 7, 2016

I was really looking forward to this book, as I've heard good things about it and thought it would round out what I already knew about the topic. However, right from the start it felt rather awkward. The author tries to maintain an abstract, high-level view of monitoring and alerting and not go into specific implementations. This makes for an awkward combination with it being basically a 101/introductory book on the topic. A lot of the formal descriptions of monitoring and alerting feel forced, don't hold up very well in the abstract, and are too high level to be practical. He also talks about operations in an almost romantic, heroic way, which I didn't enjoy. In addition to that, the book includes some final chapters on outage handling and organizational and cultural setups. The terms human error, root cause analysis, and "5 Whys" are thrown around a lot, with no acknowledgement that they are actually harmful to learning according to modern research in the field of systems safety. Definitely not a book I would recommend.


Mark McGranaghan

25 reviews, 20 followers

February 19, 2014

A very solid overview of monitoring and alerting for online services. The book covers both the engineering aspects of setting up and configuring monitors and alerts as well as the management aspects of collecting data over time, using it to implement continuous improvement, etc.

I appreciated that the book covered monitoring and alerting in general and wasn't specific to a particular system or toolchain.

Highly recommended for both beginners and experienced engineers dealing with monitoring and alerting in production.

Craig Demyanovich

2 reviews, 3 followers

November 5, 2014

Something about the writing style made it hard to stay interested along the way. It might have been that I felt like I was so often reading that an idea was about to be broken down into a number of smaller components. Some long chapters didn't help, either.

Another gripe is the large number of grammatical errors. The editor(s) should've done a better job with these.

In the printed book, the graphs and figures were often very hard to read. The colors used were hard to distinguish compared to the PDF or ebook.

Leandro López

70 reviews, 11 followers

July 12, 2015

Good introduction to monitoring. Lacks some real examples.

I was expecting a few more examples from this book, both technical and non-technical. Still, it's a very good introduction to monitoring and alerting, but I don't know if I would recommend it to someone who's starting out on the subject.

Jacek

8 reviews

May 13, 2017

A practical book, but not so accurate right now (2017). The technology it covers has not been kept up to date and has been replaced by other projects that this book does not cover.

However, it can give you a strong background on what monitoring is and should be, what types of monitoring we can use, and for what purposes.

Charles Baker

416 reviews, 24 followers

May 16, 2013

Great book. Filled my head with awesome ideas. Now to implement some!

computers personal-development

Goncalo

45 reviews

April 24, 2014

great intro to the subject

Dmitry

87 reviews, 5 followers

June 16, 2016

Did you know there is a solid theoretical background behind monitoring (sufficient for a couple of PhD theses)? I didn't, but there actually is, and the book proves it.