Hadoop: The Definitive Guide

3.94  ·  527 Ratings  ·  42 Reviews
Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for a
Kindle Edition, 626 pages
Published (first published May 1st 2009)

Community Reviews

Ahmed Attyah
May 28, 2012 Ahmed Attyah marked it as to-read  ·  review of another edition
Shelves: programming
I got really interested in Hadoop, which is why I started reading this book :). There are only 3 books about Hadoop, and from the reviews I read, it looks like this one is the best.
Todd N
Apr 23, 2012 Todd N rated it it was amazing  ·  review of another edition
Shelves: kindle, big-data
This is the single best reference guide to Hadoop and related projects, and it's the only O'Reilly book I have read cover to cover.

Here is the way I recommend reading it: Read through the first two chapters including the tutorial walk through with the weather examples, then jump ahead and read the introduction for each of the related projects Pig (chapter 11), Hive (12), HBase (13), Zookeeper (14), Sqoop (15). Then read the case studies in the last chapter. Then go back and read about Hadoop in
Veselin Nikolov
Aug 16, 2010 Veselin Nikolov rated it it was amazing  ·  review of another edition
Explains some NoSQL concepts, as well as the ideology of Hadoop, in places going into detail beyond my interests.

If I compare it with "Hadoop Pro", which was not useful to me at all, this one deserves 5 stars. Even so, it has some gaps: for example, there is no information about Hive, and the coverage of HBase is limited.

For the purposes of my thesis and a first acquaintance with the technology, the book is more than enough, and there is no alternative yet.
Alex Ott
Very good book that gives a high-level overview of Hadoop, together with descriptions of related projects such as Pig, HBase, and others.
I'd recommend this book to all developers who want to learn about Hadoop, its usage, and programming for it.
Andrey Vykhodtsev
The best book on Hadoop I've seen so far. Excellent overview of all Hadoop related technologies. A go-to guide for anyone who wants to educate herself on Hadoop/Big Data.
Miêu Tặc
Jan 17, 2015 Miêu Tặc rated it it was amazing  ·  review of another edition
Recommends it for: software engineer, software architect, developer
Shelves: technology
The book opens the door to the Hadoop world and guides you to major places such as HDFS, MapReduce, Hive, Pig, ZooKeeper, HBase, and Sqoop. It not only gives a first impression of what Hadoop is, it also gives deeper knowledge about each component and related technologies. Thus, if you just want one book to rule them all, pick this one.

However, because the author's ambition is to put everything into one book, you might feel overwhelmed by the many details under the hood. It would be better to just read the int
Saul Cruz
Definitely a good way to start. I'd recommend the latest version, as many blocks are not being used anymore. However, if you really want to understand the underlying engine, this is the book to start with. MapReduce is a complex model that you will probably never tweak, but it is very important to understand completely how this model works so that you can optimize a cluster and, if you want, perhaps come up with a new data processing technology (i.e. there are some tools that work on top of ma ...
Apr 01, 2015 Sam rated it really liked it  ·  review of another edition
Shelves: tech, data, safari
This is a great overview of the various tools/technologies that make up the Hadoop ecosystem. Each chapter covering a different tool/technology gives a good overview of it. Each area is quickly gaining a slew of books of its own, but I still find this a good place to start. With a fourth edition coming soon (available in pre-release online), it's nice to see that they're trying to keep this up to date as the technology changes.
Jan 27, 2016 Amit rated it really liked it  ·  review of another edition
Shelves: computers
This is the best Hadoop book. It gives brief introductions to all the related tools, e.g. Hive/Pig/HBase/ZooKeeper/Sqoop.
1. The initial 10 chapters are devoted to Hadoop.
2. For writing Map/Reduce programs, the given online reference is enough; this book is mainly good for understanding the internals of these operations.
3. It is best to start by consulting the Apache Hadoop developer reference alongside a standalone Hadoop setup.
4. The book is helpful for getting deeper into the Hadoop logic.
Sidhartha Ray
May 15, 2014 Sidhartha Ray is currently reading it  ·  review of another edition
I've already read the following chapters:

2nd Chapter - MapReduce:
>A good starting point for the different components of a MapReduce program: Mapper, Reducer and all...
>Got a good dataset(weather dataset from NCDC) to play around...
>We can use Cloudera's distribution CDH4 for practicing the programs

7th Chapter - MapReduce Types and Formats:
Jun 18, 2016 Manzur rated it it was amazing  ·  review of another edition
Shelves: programming
This book is really fantastic! It's a complete reference on the Hadoop ecosystem, and should be the first point of contact for anyone playing with Hadoop. The content and writing style are really approachable -- I wish other technical authors could write at the same level as Tom White does.
Alex Ott
Good book on the basics of Hadoop (HDFS, MapReduce & other related technologies). It provides all the details necessary to start working with Hadoop, programming with it, administering it, etc.

I actually read the 1st edition as well, but I found many new & useful additions in the new edition.
Jan 02, 2016 Dariusz rated it liked it  ·  review of another edition
Shelves: owned, informatyka
Great as an overview of Hadoop-related technologies, exceptionally poor as a source of code examples and use cases (because "200 ways to compute the maximum temperature" is not what I was expecting).
Anatoliy Kaverin
Nov 26, 2014 Anatoliy Kaverin rated it really liked it  ·  review of another edition
Best book for diving into the Hadoop world.
Of course the Hadoop API evolves pretty fast, but with minor changes I was able to run most of the code samples.
Very handy, especially the guidance on using local/dev mode to start implementing M/R stuff immediately.
Michael Economy
Aug 19, 2012 Michael Economy rated it really liked it  ·  review of another edition
Shelves: work-related
Pretty good summary. Hadoop and its ecosystem are incredibly complex. I'd be terrified to deploy it without reading this book first. I guess I'm still pretty terrified, but markedly less so.

Some of the writing was a bit wonky, but overall really good.
The layout is confusing and non-intuitive. The writing often omits important points. And there is much space given over to specific technologies and not to general Hadoop understanding and programming.
Christopher Noyes
For those trying to learn Hadoop, Pig, Hive, and other big data technologies, it's a really useful book.
Jan 28, 2016 David rated it it was amazing  ·  review of another edition
Shelves: computers-cloud
Just about every single page of this book is useful. My copy must be heavier with all the pencil marks and notes I've made. (3rd edition)
Collin Rogowski
Sep 13, 2012 Collin Rogowski rated it it was amazing
Very thorough and easily readable introduction for the whole Hadoop ecosystem. Can be read "as is" to get an overview, but can also be used as a reference while implementing projects with Hadoop.
Harry Yeh
Dec 23, 2012 Harry Yeh rated it really liked it  ·  review of another edition
Good overview of Hadoop - Definitely a useful guide if you are looking at getting into Big Data, Map Reduce etc
Paul Childs
Sep 06, 2011 Paul Childs rated it liked it  ·  review of another edition
Shelves: computers
I found this book more helpful and detailed than the Hadoop in Action book I had read earlier. It was better at explaining the setup and the purpose of the various Hadoop services and config files.
May 14, 2013 Rob marked it as unfinished  ·  review of another edition
Recommended to Rob by: Dave Howell
Shelves: own, 2011, 2012
Picked this up as a "prize" swag item at VT Code Camp 2011. I was probably the least qualified person in the room to read it... but whatever; maybe that's why I "earned" it.
Ferouk Bouazza
May 25, 2016 Ferouk Bouazza rated it really liked it  ·  review of another edition
This is a complete guide to the Hadoop ecosystem; I recommend it for data scientists, researchers and developers.
Feb 27, 2013 Ivan rated it liked it  ·  review of another edition
Ideal for a quick intro. Has more facts and is a feature-rich version of the tutorial:
Marc Donner
Marc Donner rated it liked it
May 31, 2015
Steven Maestas
Steven Maestas rated it it was amazing
Jan 06, 2017
Shawn Hermans
Shawn Hermans rated it it was amazing
Mar 10, 2015
Inquire rated it really liked it
Jan 03, 2012
John rated it really liked it
Jan 09, 2017
Paul Broenen
Paul Broenen rated it really liked it
Dec 14, 2012
  • Hadoop Operations
  • MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems
  • Hadoop in Action
  • HBase: The Definitive Guide
  • MongoDB: The Definitive Guide
  • Mining of Massive Datasets
  • Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
  • Beautiful Data: The Stories Behind Elegant Data Solutions (Theory In Practice, #31)
  • Learning Spark
  • Natural Language Processing with Python
  • Programming Scala: Scalability = Functional Programming + Objects
  • The Art of Multiprocessor Programming
  • Machine Learning in Action
  • Scala in Depth
  • Lucene in Action
  • RESTful Web Services
  • Data Analysis with Open Source Tools
  • Big Data
