Scalable Big Data Architecture: A practitioners guide to choosing relevant Big Data architecture

This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data", from the usage of No-SQL databases to the deployment of stream analytics architecture, machine learning, and governance.

Scalable Big Data Architecture covers real-world, concrete industry use cases that leverage complex distributed applications, which involve web applications, RESTful APIs, and high throughput of large amounts of data stored in highly scalable NoSQL data stores such as Couchbase and Elasticsearch. This book demonstrates how data processing can be done at scale, from the usage of NoSQL datastores to the combination of Big Data distributions.

When the data processing is too complex and involves different processing topologies, such as long-running jobs, stream processing, correlation of multiple data sources, and machine learning, it’s often necessary to delegate the load to Hadoop or Spark and use the NoSQL store to serve processed data in real time.

This book shows you how to choose a relevant combination of big data technologies available within the Hadoop ecosystem. It focuses on processing long jobs, architecture, stream data patterns, log analysis, and real-time analytics. Every pattern is illustrated with practical examples that use different open source projects such as Logstash, Spark, Kafka, and so on.

Traditional data infrastructures are built for digesting and rendering data synthesis and analytics from large amounts of data. This book helps you to understand why you should consider using machine learning algorithms early on in the project, before being overwhelmed by the constraints imposed by dealing with the high throughput of Big Data.

Scalable Big Data Architecture is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a Big Data project and which tools to integrate into that pattern.

266 pages, Kindle Edition

Published December 31, 2015

2 people are currently reading
24 people want to read

Ratings & Reviews

Community Reviews

5 stars: 4 (33%)
4 stars: 5 (41%)
3 stars: 1 (8%)
2 stars: 1 (8%)
1 star: 1 (8%)
Displaying 1 - 3 of 3 reviews
Łukasz Słonina
124 reviews, 26 followers
September 6, 2017
This book is in fact a compilation of tutorials on how to set up some technologies to work together.
There is no clear explanation of why particular decisions were made or what the alternatives are.
This is definitely not a book about architecture or for architects, but rather a description of one solution (and even then, there is no description of what problem the author is trying to solve).
Jascha
151 reviews
July 30, 2016
In recent years we have moved from a business model where data had to be processed within days to one where data must be processed in near real time, since it drives business decisions. This key role assumed by data has changed the requirements: companies now need infrastructures that can scale in and out smoothly while remaining highly available and surviving partitioning. This new scenario opens a NN . Scalable Big Data Architecture

Released in late 2015, Scalable Big Data Architecture is a short but pleasant read for anyone interested in data infrastructure. I usually refer to this kind of book as a soft read: a technical book that does not require you to think or focus on code or formulas. These are the books you can enjoy on the train on your way back home.

Definitely well written, this title provides a broad overview of several big data scenarios and how to build a scalable infrastructure that solves each specific problem. Mind that, and more on this in a minute, the author never goes into details. The open-source components are well described, but no installation or configuration steps are provided, either for the individual components or for the stack as a whole. The author focuses on discussing, step by step and carefully elaborating his choices, why this or that specific solution would fit a given role in a specific scenario, highlighting the pros and cons.

For example, when discussing NoSQL the author starts with the good old three-tier applications that relied on transactional databases. He then explains the need for a NoSQL solution and the different models and, finally, gives the reader different possible (NoSQL) alternatives to solve the original problem. No configuration, no real examples. Plenty of colorful schemes, though.

Before tying it all up, a quick note on the description that comes with the book. Scalable Big Data Architecture is presented to the potential buyer as a book that covers real-world, concrete industry use cases. It also refers multiple times to Big Data patterns.

Neither of these is correct. In particular, this title is not about (Big Data) patterns, and this is probably the worst mark against the whole book. I personally read the book looking specifically for patterns. Unfortunately, despite it being a nice book to read, I did not find them.

Overall, a pleasant and up-to-date read about Big Data. Not recommended for those interested specifically in Big Data infrastructure patterns.

As usual, you can find more reviews on my personal blog: books.lostinmalloc.com. Feel free to pass by and share your thoughts!
Jae Ho
6 reviews, 8 followers
June 13, 2016
Easy to read and nice diagrams.