Hadoop: The Definitive Guide

3.94 of 5 stars  ·  363 ratings  ·  34 reviews

Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework, an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing datasets of any size, and administrators will

Paperback, 2nd edition, 624 pages
Published October 12th 2010 by Yahoo Press (first published May 1st 2009)

Community Reviews

May 28, 2012 Ahmed marked it as to-read
Shelves: programming
I got really interested in Hadoop, which is why I started reading this book :). There are only 3 books about Hadoop, and from the reviews I read, it looks like this one is the best.
Veselin Nikolov
It explains some NoSQL concepts as well as the ideology of Hadoop, in places going into details beyond my interests.

If I compare it to "Hadoop Pro", which was not useful to me at all, this one should have 5 stars. Even so, it has certain gaps; for example, there is no information about Hive, and the coverage of HBase is limited.

For the purposes of my thesis and a first acquaintance with the technology, the book is more than sufficient, and there is no alternative yet.
Alex Ott
Very good book that gives a high-level overview of Hadoop, together with descriptions of related projects such as Pig, HBase, and others.
I recommend this book to all developers who want to learn about Hadoop, its usage, and programming for it.
Miêu Tặc
Jan 17, 2015 Miêu Tặc rated it 5 of 5 stars  ·  review of another edition
Recommends it for: software engineer, software architect, developer
Shelves: technology
The book opens the door to the Hadoop world and guides you to major places such as HDFS, MapReduce, Hive, Pig, ZooKeeper, HBase, and Sqoop. It not only gives a first impression of what Hadoop is, it also gives deeper knowledge of each component and related technologies. Thus, if you just want one book to rule them all, pick this one.

However, because the ambition of the author is to put all into one book, you might feel overwhelmed with many details under the hood. It should be better you just read the int
Todd N
This is the single best reference guide to Hadoop and related projects, and it's the only O'Reilly book I have read cover to cover.

Here is the way I recommend reading it: Read through the first two chapters, including the tutorial walkthrough with the weather examples, then jump ahead and read the introduction for each of the related projects: Pig (chapter 11), Hive (12), HBase (13), ZooKeeper (14), Sqoop (15). Then read the case studies in the last chapter. Then go back and read about Hadoop in
Sidhartha Ray
May 15, 2014 Sidhartha Ray is currently reading it  ·  review of another edition
I've already read the following chapters:

2nd Chapter - MapReduce:
>A good place to start on the different components of a MapReduce program: Mapper, Reducer and all...
>Got a good dataset (the weather dataset from NCDC) to play around with...
>We can use Cloudera's distribution CDH4 for practicing the programs

7th Chapter - MapReduce Types and Formats:
Anatoliy Kaverin
Best book to dive into the Hadoop world.
Of course the Hadoop API evolves pretty fast, but with minor changes I was able to run most of the code samples.
Very handy; in particular, it provides guidance on using local/dev mode to start implementing M/R stuff immediately.
Franklin Colorado
This seemed like a good book, but not written in much of an order.
This is a great overview of the various tools/technologies that make up the Hadoop ecosystem. Each chapter that covers a different tool/technology is a good overview of it. Each area is quickly gaining a slew of books of its own, but I still find this is a good place to start. With a fourth edition coming soon (available in pre-release online), it's nice to see that they're trying to keep this up to date as the technology changes.
Martijn Onderwater
Interesting and broad introduction into Hadoop and various related tools. Definitely worth reading!
Anton Kalyaev
The book is written in plain language and the illustrations are solid; overall, I liked the book.
Alex Ott
Good book on the basics of Hadoop (HDFS, MapReduce & other related technologies). This book provides all the details necessary to start working with Hadoop, program with it, administer it, etc.

I actually read the 1st edition as well, but I found many new & useful additions in the new edition.
Ambarish Hazarnis
A good book if you want to get started hands-on with Hadoop.
Michael Economy
Pretty good summary. Hadoop and its ecosystem are incredibly complex. I'd be terrified to deploy it without reading this book first. I guess I'm still pretty terrified, but markedly less so.

Some of the writing was a bit wonky, but overall really good.
The layout is confusing and non-intuitive. The writing often omits important points. And there is much space given over to specific technologies and not to general Hadoop understanding and programming.
Paul Childs
I found this book more helpful and detailed than the Hadoop in Action book I had read earlier. It was better at explaining the setup and the purpose of the various Hadoop services and config files.
Collin Rogowski
Very thorough and easily readable introduction for the whole Hadoop ecosystem. Can be read "as is" to get an overview, but can also be used as a reference while implementing projects with Hadoop.
May 14, 2013 Rob marked it as unfinished
Recommended to Rob by: Dave Howell
Shelves: own, 2011, 2012
Picked this up as a "prize" swag item at VT Code Camp 2011. I was probably the least qualified person in the room to read it... but whatever; maybe that's why I "earned" it.
This book is more developer oriented. Still, a fine thing to read for Hadoop beginners.
Just about every single page of this book is useful. My copy must be heavier with all the pencil marks and notes I've made. (3rd edition)
Ideal for the quick intro. Has more facts and feature rich version of the tutorial:
Jun 09, 2012 Dan is currently reading it  ·  review of another edition
I've only just started, but so far this is quite interesting. I would love to put it to work on my web log files and anything else I can think of.
Harry Yeh
Good overview of Hadoop - Definitely a useful guide if you are looking at getting into Big Data, Map Reduce etc
Christopher Noyes
For those trying to learn Hadoop, Pig, Hive, and other big data technologies, it's a really useful book.
Detailed and clear introduction to Hadoop. My main source for getting started with Hadoop.
Good book for understanding Hadoop and MapReduce, as well as HBase, Hive, and ZooKeeper.
The best book for any newbie who wants to get started in the world of Hadoop.
Paco Nathan
check out the chapter in the appendix about Cascading :)
Ravi Kumar
Jun 20, 2013 Ravi Kumar is currently reading it  ·  review of another edition
Started reading it today (20th June)...
Giovanni Pelosi
I'd have liked a Flume chapter too ...
  • Hadoop in Action
  • MongoDB: The Definitive Guide
  • MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems
  • Beautiful Data: The Stories Behind Elegant Data Solutions (Theory In Practice, #31)
  • Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
  • Lucene in Action
  • RESTful Web Services
  • Machine Learning in Action
  • Programming Scala: Scalability = Functional Programming + Objects
  • Data Analysis with Open Source Tools
  • Mining of Massive Datasets
  • Big Data
  • Natural Language Processing with Python
  • Version Control with Git
  • The Art of Multiprocessor Programming
  • Clojure Programming
  • sed & awk
  • Data Mining: Practical Machine Learning Tools and Techniques (Morgan Kaufmann Series in Data Management Systems)
