Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications

Name: Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications
Rating: 4.18 (9 reviews)
ISBN: 9781491974292

Fabian Hueske, Vasiliki Kalavri

Get started with Apache Flink, the open source framework that powers some of the world’s largest stream processing applications. With this practical book, you’ll explore the fundamental concepts of parallel stream processing and discover how this technology differs from traditional batch data processing. Longtime Apache Flink committers Fabian Hueske and Vasia Kalavri show you how to implement scalable streaming applications with Flink’s DataStream API and continuously run and maintain these applications in operational environments. Stream processing is ideal for many use cases, including low-latency ETL, streaming analytics, and real-time dashboards as well as fraud detection, anomaly detection, and alerting. You can process continuous data of any kind, including user interactions, financial transactions, and IoT data, as soon as you generate them.

Genres: Programming, Technology, Computer Science, Technical, Software

308 pages, Paperback

Published April 30, 2019

77 people are currently reading

230 people want to read

About the author

Fabian Hueske

1 book, 6 followers

Community Reviews

5 stars: 39 (39%)
4 stars: 43 (43%)
3 stars: 16 (16%)
2 stars: 1 (1%)
1 star: 1 (1%)

Ian Wagner

70 reviews, 3 followers

February 27, 2022

Probably the best book available on the subject, which is a bit unfortunate. It wasn't awful or anything, but I found myself frequently stopping, reading the docs, and/or Googling in frustration because explanations and warnings simply prompted further questions that seemed obvious to me, but were not adequately explored. Not the easiest ecosystem to break into, admittedly, but I would still say this is probably the best organized intro available.

In particular, one of my largest criticisms (applies to the JVM ecosystem as a whole *way* more than most for some reason) is the amount of (IMO) unreasonable assumptions made. Terms are frequently thrown around without proper treatment. I actually got through the entire book without feeling like I had a complete understanding of what an operator was. The Flink glossary wasn't all that much more helpful, but at least had something. Do yourself a favor and read the Google Dataflow Model paper first and you'll get a *much* more thorough introduction to some of the crucial terms.

Finally, through no fault of the authors, this book is old. Flink has evolved significantly since this was written, some of the APIs are deprecated, and some of the other cautions are either inaccurate or difficult to verify (I still can't figure out whether the limitation re: parallelism settings and savepoints is still valid... the book claims it was written for Flink 1.7, but the only limitations I can find in the official docs reference version 1.2).

computer-science

Leo Fischer

72 reviews, 2 followers

November 17, 2020

A very comprehensive book on the ins-and-outs of Flink and I read it cover to cover. I have found myself flipping through it as a reference on many occasions when I am curious about some specific implementation detail. I give it 4 stars only because several important releases have been made since its publication in 2019 and it is dated as some of the most important new features are not included.

Praveen Gorthy

3 reviews, 3 followers

June 22, 2019

A great resource for anyone interested in the concepts of stream processing and an in-depth tour of Flink.

Raymond Lewis

176 reviews

November 25, 2023

Seems like a great resource for learning Flink from high level to implementation.

Senjin Hajrulahovic

55 reviews

February 21, 2025

Solid and concise introduction to Apache Flink.

Marcin Kuthan

15 reviews, 10 followers

October 13, 2022

Just documentation collected into a book, perhaps partially outdated now. The best part of the book covers general streaming challenges and trade-offs, watermarks, source/sink design, and state management. I found this book interesting even for those who develop streaming pipelines using different frameworks (Beam, Kafka Streams), just to compare the APIs, capabilities, and limitations.

I was really surprised that there is not a single page about automated tests. When I evaluate a new framework, excellent support for automated tests is a must. I don’t understand why such an important aspect was totally ignored.

0vai5

28 reviews, 5 followers

November 14, 2019

- An approachable and practical introduction with nice examples throughout the book!
- It first presents the overall architecture, and the later chapters then cover the DataStream API, each focusing on one aspect.
- I enjoyed the chapter on integrating Flink with other systems such as Kafka and Cassandra, and on how event guarantees are affected depending on the source and sink. This gives an overall idea of how such systems are deployed, especially for beginners who might not have a holistic picture of distributed systems.
- One thing that could have made this book awesome: a smallish hands-on project at the end covering many of the concepts presented throughout. I think this in itself is quite a big task and perhaps deserves its own book.

Łukasz Słonina

124 reviews, 25 followers

July 3, 2019

Very good introduction to stream processing and Flink itself (only the DataStream part).

The only thing I would change is the examples: maybe they're more concise in Scala, but in Java they would be much more readable and easier to follow.

Mandatory reading for Flink users.