Ory

61%
Flag icon
In this chapter we explored the topic of batch processing. We started by looking at Unix tools such as awk, grep, and sort, and we saw how the design philosophy of those tools is carried forward into MapReduce and more recent dataflow engines. Some of those design principles are that inputs are immutable, outputs are intended to become the input to another (as yet unknown) program, and complex problems are solved by composing small tools that “do one thing well.” In the Unix world, the uniform interface that allows one program to be composed with another is files and pipes; in MapReduce, that ...more
This highlight has been truncated due to consecutive passage length restrictions.
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Rate this book
Clear rating
Open Preview