Data Pipelines with Apache Airflow

A successful pipeline moves data efficiently, minimizing pauses and blockages between tasks, keeping every process along the way operational. Apache Airflow provides a single customizable environment for building and managing data pipelines, eliminating the need for a hodge-podge collection of tools, snowflake code, and homegrown processes. Using real-world scenarios and examples, Data Pipelines with Apache Airflow teaches you how to simplify and automate data pipelines, reduce operational overhead, and smoothly integrate all the technologies in your stack.

480 pages, Paperback

Published January 1, 2020

49 people are currently reading
272 people want to read

Ratings & Reviews

Community Reviews

5 stars: 49 (44%)
4 stars: 52 (46%)
3 stars: 9 (8%)
2 stars: 1 (<1%)
1 star: 0 (0%)
Displaying 1 - 12 of 12 reviews
Sebastian Gebski
1,190 reviews, 1,343 followers
June 13, 2021
The best book on the topic. Which, of course, doesn't mean it's perfect.

Very good composition; the content is clear, approachable, and illustrated with enough practical examples. I was a bit surprised to find that the actual Airflow architecture is covered near the end of the book, but in the end that turned out not to be an issue.

The book covers a proper intro to Airflow, describes its conceptual elements (like task groups), integrations, deployment options (all the major cloud ones), securing, testing, etc.

What would I improve in this book?
1. I'd add a chapter with architecture advice - how to organize idempotent processing pipelines at scale (esp. data and its split - conceptually Airflow doesn't bother with those and some conventions/best practices have to be put in place).
2. Airflow alternatives - where Airflow shines and which scenarios it is not the best equipped to address (e.g. comparison with NiFi or Luigi)
3. The AWS scenario covers manual deployment, not a managed service.

Solid 4.5 stars. Recommended.
Philip
6 reviews
June 2, 2021
I used this book to ramp up as a contributor to an existing Airflow implementation. Solid explanations with supporting examples, from beginner topics through advanced setups. In fact, I found myself referencing this book more often than the official documentation, which is often scattered and incomplete. By contrast, this book is well organized and useful both as an introductory text and as a reference. Kudos to the authors (and editors, presumably) for managing to strike that difficult balance!
Omid Milanifard
387 reviews, 42 followers
July 27, 2024
It is a comprehensive guide to mastering the orchestration of data workflows using Apache Airflow. The book starts with foundational concepts and progresses to advanced techniques, covering Airflow's architecture and how its components interact. It includes practical examples and step-by-step tutorials for installation, configuration, and deployment.
The author provides real-world scenarios, demonstrating how Airflow can be applied to various data engineering tasks. Key topics include best practices for designing and managing data pipelines, handling dependencies, retries, and error handling. The book also covers extending Airflow with custom operators and sensors, as well as performance optimization and scaling.
Evan Oman
31 reviews, 2 followers
May 14, 2021
This book was precisely what I needed to get up to speed with Airflow quickly. It covers core principles, best practices, testing patterns, productionization considerations, cloud deployment patterns, and much more. Furthermore, this book had some of the best example problems I've seen in a technical text: complex enough to be useful, interesting enough to grab your attention, but scoped small enough to be understood.

Their code had a few strange implementation quirks to prove a point or show a use case, but these were few and far between.

Overall an easy 5/5.
Antoni Heba
11 reviews
July 17, 2025
The only book about Airflow out there. It's decent and full of useful information, but covers too much ground and has an outdated cloud part.

After reading it I can certainly say that the main goal has been achieved: I learned about Airflow. The book starts with a good introduction to the basics, followed by more advanced topics like best practices for developing and testing, and the inner workings of Airflow. The style is practical, and there are lots of code examples and drawings that make understanding easier.

Unfortunately, the authors wanted to cover too much. That's why some topics are glossed over while others are superfluous, like explaining LDAP. Other topics are mentioned twice, like SLAs. I suspect the two authors did not coordinate enough on some points. The last part about Airflow in the cloud is largely outdated. While for AWS it does mention MWAA, to my surprise it goes on to describe a Fargate deployment and is thus almost useless.
Leo
325 reviews, 26 followers
August 15, 2023
It's a detailed guide, and if you want a deeper understanding of Apache Airflow, it probably has no competition at all.
Sometimes it might be too detailed, so I'd advise readers to skip or skim the topics they're not especially interested in. Treat it as a "companion" while you're ramping up with Airflow, rather than trying to learn it from A to Z before writing your first pipeline.
Krzysztof
19 reviews, 2 followers
July 25, 2022
Extensive and practical Airflow guide, which definitely helps to organise one's knowledge of the topic.

The only downside is that the code provided with the book is not entirely up to date or tested, so you could run into problems when playing with some of the example projects.
12 reviews
May 2, 2021
It's an excellent book for learning Apache Airflow, from the beginning through to deploying and maintaining your DAGs and Airflow.
Alex Ott
Author, 3 books, 207 followers
December 25, 2021
Good intro to Apache Airflow, from concepts to pipeline examples, and on to administration and best practices.
11 reviews
October 10, 2022
What a good intro to DE, but before you read it and can understand it, you should have a foundation in programming (especially Python): working with files, dicts, ...
Read it on O'Reilly
Nickolai
893 reviews, 8 followers
February 4, 2025
The guide is very sensibly written. The presentation of the material is clear and much more accessible than in Airflow's own manual. I discovered a couple of interesting points that I can use in my work.