Cloud Observability in Action

Generate actionable insights about your cloud native systems. This book teaches you how to set up an observability system that learns from a cloud application’s signals, logging, and monitoring using free and open source tools.

In Cloud Observability in Action you will learn how


Cloud native, serverless, and containerized applications are made up of hundreds of moving parts. When something goes wrong, it’s not enough to just know there is a problem—you need to know where it is, what it is, and even how to fix it. Cloud Observability in Action shows you how to go beyond the traditional monitoring and build observability systems that turn application telemetry into actionable insight.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
A well-designed observability system provides insight into bugs and performance issues in cloud native applications. Often, observability is the difference between an error message and an explanation! You know exactly which service is affected, who’s responsible for its repair, and even how it can be optimized in the future. Best of all, observability allows you to easily automate your error handling with machine users applying fixes without any human help.

About the book
Cloud Observability in Action teaches you to apply observability practices to cloud-based serverless and Kubernetes environments. In this one-of-a-kind guide, author Michael Hausenblas shares insights from his extensive experience building, monitoring, and improving cloud native systems.

You’ll use open source tools like Prometheus and Grafana to build your own observability system without having to rely on proprietary software. Learn how to use telemetry and destinations to continuously generate and discover insights from different signals, including logs, metrics, traces, and profiles. Throughout, use cases and rigorous cost-benefit analysis make sure you’re getting a real return on your investment in observability.

About the reader
For developers and SREs who have worked with cloud native applications. This book can be used with any public cloud.

About the author
Michael Hausenblas is a Solution Engineering Lead in the AWS open source observability service team. He covers Prometheus, Grafana, and OpenTelemetry upstream and in managed services. Before Amazon, Michael worked at Red Hat, Mesosphere (now D2iQ), and MapR.

264 pages, Paperback

Published December 26, 2023

3 people are currently reading
33 people want to read

About the author

Michael Hausenblas

18 books, 12 followers

Ratings & Reviews

Community Reviews

5 stars: 5 (29%)
4 stars: 9 (52%)
3 stars: 2 (11%)
2 stars: 0 (0%)
1 star: 1 (5%)
Phil Wilkins
Author, 2 books, 4 followers
January 4, 2024
I got this as an ebook directly from the publisher, Manning: https://www.manning.com/books/cloud-o...

Cloud Observability In Action has been an easygoing and enjoyable read. Tech books can sometimes be heavy going or dry, but that's not the case here. Firstly, Michael goes back to first principles, drawing the distinction between Observability and monitoring, something that often gets muddied (and I've been guilty of this myself; the latter is a subset of the former). Observability doesn't roll off the tongue as smoothly as monitoring (although I rather like the trend of using O11y). This distinction is helpful, particularly if you're still finding your feet in this space. What is more important, though, is stepping back and asking what we should be observing and why we need to observe it. Plus, one of my pet points when presenting on the subject: we all have different observability needs, whether as a developer, an ops person, a security specialist, or an auditor.

Next is Michael's interesting take on how much O11y code is enough. Historically, I've taken the perspective that enough is a factor of code complexity: more complex code warrants more O11y or logging, as this is where bugs are most likely to manifest themselves. Secondly, I've looked at transaction and service boundaries. The problem is that this approach can sometimes generate chatty code. I've certainly had to deal with chatty apps and had to separate the wheat from the chaff. So Michael's approach of cost/benefit, measured using his B2I ratio (how much code is addressing the business problems over how much is instrumentation), was a really fresh perspective and presented in a very practical manner, with warnings about using such a measure too rigidly. It's a really good perspective as well if you're working on hyperscaling solutions, where a couple of percentage points of improvement can save tens of thousands of dollars. Pretty good going, and we're only a couple of chapters into the book.
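
As a rough illustration of the B2I idea described above, the following Python sketch counts business-logic lines against instrumentation lines in a snippet. It takes the description in this review at face value (business code over instrumentation code). The helper names `count_loc` and `b2i_ratio`, the marker list, and the sample snippet are all hypothetical, and the book's exact definition may differ.

```python
from typing import Tuple

# Assumed markers: any line containing one of these is treated as instrumentation.
INSTRUMENTATION_MARKERS = ("log.", "metrics.", "tracer.")

# Hypothetical Go-like handler used only as sample input.
SAMPLE = """
func handle(order Order) error {
    log.Info("handling order")
    span := tracer.Start("handle")
    total := price(order)
    metrics.Observe(total)
    return save(order, total)
}
"""


def count_loc(source: str) -> Tuple[int, int]:
    """Return (business_lines, instrumentation_lines), skipping blanks and // comments."""
    business = 0
    instrumentation = 0
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue
        if any(marker in line for marker in INSTRUMENTATION_MARKERS):
            instrumentation += 1
        else:
            business += 1
    return business, instrumentation


def b2i_ratio(business: int, instrumentation: int) -> float:
    """Business-to-instrumentation ratio; infinite when nothing is instrumented."""
    if instrumentation == 0:
        return float("inf")
    return business / instrumentation


if __name__ == "__main__":
    biz, instr = count_loc(SAMPLE)
    print(f"business={biz} instrumentation={instr} B2I={b2i_ratio(biz, instr):.2f}")
```

A crude line classifier like this is only a starting point; as the review notes, the measure comes with warnings about applying it too rigidly.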

The book gets into the underlying ideas and concepts that inform OpenTelemetry, such as traces and spans, metrics, and how these relate to Observability. Some of the classic mistakes are called out, such as dimensioning metrics with high cardinality and why this will present real headaches for you.
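
To see why high-cardinality dimensions cause headaches, it can help to count the time series a single metric may produce. This is a minimal sketch, assuming each distinct combination of label values becomes its own series (as in Prometheus-style systems). The label names and value counts are made up for illustration, and `max_series` is a hypothetical helper.

```python
from math import prod
from typing import Dict


def max_series(distinct_values_per_label: Dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of distinct values per label."""
    return prod(distinct_values_per_label.values())


# Assumed, bounded labels for an HTTP request counter.
bounded = {"method": 5, "status": 6, "endpoint": 20}

# Same metric with an unbounded label added (assumed 100,000 distinct users).
unbounded = dict(bounded, user_id=100_000)

if __name__ == "__main__":
    print("bounded labels:  ", max_series(bounded))
    print("with user_id:    ", max_series(unbounded))
```

Adding one unbounded label multiplies the series count by the number of distinct values it can take, which is why identifiers such as user IDs usually belong in logs or traces rather than in metric labels.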

Once the data is understood, particularly metrics, you can start to think about how to identify what is normal and what is abnormal or an outlier. That then leads to developing Service Level Objectives (SLOs), such as an acceptable level of latency in the solution or how many errors can be tolerated.
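
One common way to make "how many errors can be tolerated" concrete is an error budget derived from the SLO target. The sketch below is an illustration under assumed numbers, not the book's implementation; `error_budget` and `budget_remaining` are hypothetical helper names.

```python
def error_budget(total_requests: int, slo_target: float) -> float:
    """Number of failed requests the SLO tolerates over the measurement window."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    return total_requests * (1.0 - slo_target)


def budget_remaining(total_requests: int, failed_requests: int, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    allowed = error_budget(total_requests, slo_target)
    if allowed == 0:
        # A 100% target leaves no budget at all.
        return 1.0 if failed_requests == 0 else 0.0
    return 1.0 - failed_requests / allowed


if __name__ == "__main__":
    # Assumed window: 1,000,000 requests, 99.9% availability target, 250 failures so far.
    print("allowed failures:", round(error_budget(1_000_000, 0.999)))
    print("budget remaining:", round(budget_remaining(1_000_000, 250, 0.999), 3))
```

Tracking the remaining fraction rather than a raw error count makes it easier to alert when the budget is being spent faster than expected.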

The book isn't all theory. The ideas are illustrated with small Go applications, which are instrumented to generate metrics, traces, and logs. Rather than using a technology such as Fluentd or Fluent Bit, Michael starts by keeping things simple and directly connecting the gathering of the metrics into tools such as Prometheus, Zipkin, Jaeger, and so on. In later chapters, the complexity of agents, aggregators, and collectors is addressed. Then come the choices and considerations for different backend solutions, including cloud vendor-provided services, OpenSearch, Elasticsearch, Splunk, Instana, and so on. Finally, the front-end visualization of the data is explored with tools such as Grafana, Kibana, cloud-provided tools, and so on.

As the book progresses, the chapters drill down into more detail, such as the differences and approaches for measuring containerized solutions vs. serverless implementations such as Lambda, and the kinds of measures you may want. The book isn't tied only to technologies typically associated with modern Cloud Native solutions; more traditional things like relational databases are also taken into account.

The closing chapters address questions such as how to approach alerting, incident management, and implementing SLOs. They also show how these techniques and tools can help inform development processes, not just production.

So I would recommend the book if you're trying to understand Observability (whether or not you're working with a cloud solution). If you're trying to advance from more traditional logging to a fuller capability, then this book is a great guide, showing what, why, and how to evaluate the value of doing so.
Yifan Yang
45 reviews, 7 followers
June 29, 2024
The content is very disappointing to me. The book touches on most, if not all, areas of observability in today's cloud landscape, providing a broad overview of the industry and the different technologies specific to each area. But that's where it ends.

I was expecting some in-depth knowledge from an experienced engineer, so every time the basic introduction ended in each chapter and I thought the substantial content was about to come, it never did. Most of the time, the author just briefly introduced some buzzwords without providing concrete examples, and often referred to other books or blogs with statements like, "if you are interested in details on this topic, I recommend reading ...". As a result, I found that what I've learned from this book is even less than what is available in the "Concept" section of the official OpenTelemetry documentation.

In addition to the shallow knowledge, I found it difficult to follow everything the author wanted to cover. Some important definitions are not thoroughly explained before being used throughout the rest of the book. For example, instrumentation is a critical concept in observability, but the author never explained what it means before using it everywhere. In Chapter 10, the author explained the differences among SLI, SLO, and SLA. But when it comes to SLO, before SLA has been explained, the author says "(SLO) in contrast to SLAs...", so I think it would be hard to understand for people without prior knowledge of SLAs. I also suggest adding more architectural diagrams alongside the practical examples to help readers understand how everything connects.

Overall, my impression of this book is that it resembles a poorly written blog without much valuable insight, and I would not recommend it to engineers of any level.
PABLO CHACIN
2 reviews
January 4, 2024
The author strikes a good balance between presenting fundamental concepts in detail and offering practical advice and examples. The book will be of most use to beginners and intermediate readers who want to develop their understanding of observability. More advanced readers will still find it useful, as the author does a great job of synthesizing important concepts.

The main reason I'm giving it 4 stars is that the content feels somewhat disconnected from chapter to chapter. I'm missing an introduction with the big picture of the observability infrastructure that is developed throughout the book. Also, I find it a little disconcerting to come across random screen captures of cloud vendor observability solutions (mostly AWS) that are not really explained in detail.
Raflezja
3 reviews
January 3, 2025
Decent overview, but what I find most useful here are the links to various resources (like public dashboards and PromQL queries).