Scalable Big Data Architecture: A Practitioner's Guide to Choosing Relevant Big Data Architecture by Bahaaldine Azarmi

Most people think that Big Data projects start directly with the deployment of large distributed clusters running heavy MapReduce jobs, whereas reality shows that there is no unique or perfect solution to problems involving large volumes of data. By knowing the different Big Data integration patterns, you will understand why, most of the time, you will have to deploy a heterogeneous architecture that fulfills different needs, and what the limits of each pattern are, which may lead you to choose effective alternatives.

We will go through real, concrete industry use cases that leverage these patterns, such as a REST API that requests large amounts of data stored in NoSQL stores like Couchbase and Elasticsearch. We will see how massive data processing can be done in such NoSQL databases without diving deep into Big Data. But when the volume is too high and the data structures get too complex, this kind of pattern reaches its limits, and that’s when we can start thinking of delegating complex data processing jobs to, for example, a Hadoop-based Big Data architecture. The difficulty is then to choose a relevant combination of the Big Data technologies available within the Hadoop ecosystem.

We will focus on processing long jobs, architecture, stream data patterns, log analysis, and real-time analytics. Every pattern will be illustrated with practical examples that use different Apache projects such as Avro, Spark, Kafka, and so on. Traditional Big Data infrastructures are built for digesting large amounts of data and rendering synthesis and analytics from it. This book will also help you understand why you should consider using machine learning algorithms early in the project, before being overwhelmed by the constraints of dealing with a high throughput of Big Data.

Scalable Big Data Architecture is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant architecture or pattern for a Big Data project, and of which tools and projects should be integrated into that pattern.

Paperback

2 people are currently reading
24 people want to read

Ratings & Reviews

Community Reviews

5 stars: 4 (33%)
4 stars: 5 (41%)
3 stars: 1 (8%)
2 stars: 1 (8%)
1 star: 1 (8%)
Displaying 1 - 3 of 3 reviews
Łukasz Słonina
124 reviews, 26 followers
September 6, 2017
This book is in fact a compilation of tutorials on how to set up some technologies to work together.
There is no clear explanation of why particular decisions were made or what the alternatives are.
This is definitely not a book about architecture or for architects, but rather a description of one solution (there is even no description of the problem the author is trying to solve).
Jascha
151 reviews
July 30, 2016
In recent years we have moved from a business model where data had to be processed within days to one where data must be processed in near real time, since it drives business decisions. This key role assumed by data has changed the requirements: companies now need infrastructures that can scale in and out smoothly, stay highly available, and survive partitioning. This new scenario opens a NN . Scalable Big Data Architecture

Released in 2015, Scalable Big Data Architecture is a short but pleasant read for anyone interested in data infrastructure. I usually refer to this kind of book as a soft read: a technical book that does not require you to think hard or focus on code or formulas, the kind you can enjoy on the train on your way back home.

Definitely well written, this title provides a broad overview of several big data scenarios and of how to build a scalable infrastructure that solves each specific problem. Mind that, and more on this in a minute, the author never goes into details. The open-source components are well described, but neither installation nor configuration steps are provided, either for the individual components or for the stack as a whole. The author focuses on discussing, step by step, why this or that specific solution would fit a given role in a specific scenario, carefully elaborating his choices and highlighting the pros and cons.

For example, when discussing NoSQL the author starts with the good old three-tier applications that relied on transactional databases. He then explains the need for a NoSQL solution and the different models and, finally, he gives the reader several possible NoSQL alternatives to solve the original problem. No configuration, no real examples. Plenty of colorful diagrams, though.

Before tying it all up, a quick note on the description that comes with the book. Scalable Big Data Architecture is presented to the potential buyer as a book that covers real-world, concrete industry use cases. It also refers multiple times to Big Data patterns.

Neither of these claims is correct. In particular, this title is not about (Big Data) patterns, and this is probably the worst flaw of the whole book. I read the book looking specifically for patterns. Unfortunately, despite being a nice book to read, it did not deliver them.

Overall, a pleasant and up-to-date read about Big Data. Not recommended for those interested specifically in Big Data infrastructure patterns.

As usual, you can find more reviews on my personal blog: books.lostinmalloc.com. Feel free to pass by and share your thoughts!
Jae Ho
6 reviews, 8 followers
June 13, 2016
Easy to read and nice diagrams.