Spark: The Definitive Guide

4.50 · 18 ratings · 8 reviews
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of this open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals.

You’ll explore the basic operati...
Published October 2017 by O'Reilly Media, Inc.

Community Reviews

Johnny
Nov 30, 2018 rated it liked it
Shelves: software
This has to be the most poorly edited book I've ever read. Some examples: there is a figure with boxes that represent two different kinds of components. The way to tell the components apart is by their shading; however, the shading for all the boxes in the figure is exactly the same. There are long-running code examples that could not possibly compile and that violate basic principles of Scala programming (e.g. case classes treated as mutable objects). And there are TODO-style notes still present in ...
Alex Ott
Sep 06, 2018 rated it it was amazing
Shelves: big-data
Really between 4 and 5 stars, because of some discrepancies in the examples, etc.

But it's a really good book about the current version of Spark (2.2, with some mentions of 2.3). The book mostly concentrates on DataFrames, in contrast with other Spark books that mostly talk about RDDs.

A lot of useful information, including Structured Streaming, machine learning, and even a short description of GraphFrames.

Highly recommended.
Gourav Sengupta
Jan 05, 2019 rated it it was amazing
If you can use additional data sets from the internet, then this makes for brilliant reading. The examples are just introductory, so working through different scenarios with additional data sets will really benefit you.
Wojtekwalczak
May 27, 2018 rated it really liked it
Shelves: big-data
Breadth over depth, but as an overview of Spark and its ecosystem, the book will do.
Gavin
Jul 27, 2018 rated it liked it
Shelves: cs
It's fine, covers everything shallowly. The API changes so frequently that you probably need this book: 95% of the Google hits for a given Spark feature are now either wrong or suboptimal.
Alp Oz
Nov 23, 2018 rated it it was amazing
A must-have in terms of the root mechanisms of Spark, but take into account that all major APIs are continuously being changed, so always consider the version.
Masu
rated it it was amazing
Jun 25, 2017
Conscious Dzidzi
rated it really liked it
Jul 18, 2019
Aleksandr Danshyn
rated it it was amazing
Oct 30, 2017
Ritesh Pallod
rated it it was amazing
Aug 12, 2019
Rupesh Agarwal
Jan 26, 2019 marked it as to-read
Awesome
Umut Salih
rated it it was amazing
Feb 19, 2019
Trent Baur
rated it it was amazing
Aug 18, 2018
Ali Saad
rated it it was amazing
Jul 24, 2017
Brian
rated it really liked it
Jan 01, 2019
Cristian Orellana
rated it it was amazing
Aug 04, 2019
Delhi Irc
Location: GG5 IRC, GG6 IRC, GG7 IRC, ND6 IRC
Accession No: DL029894-903
Ítalo Sayán
rated it it was amazing
Mar 19, 2018
Vahe Hakobyan
rated it really liked it
Jul 18, 2019
Danny Guinther
rated it really liked it
May 24, 2019
Arturas
marked it as to-read
Apr 07, 2017
Amit Singh
marked it as to-read
Jun 07, 2017
Otávio
marked it as to-read
Jun 10, 2017
Thomas
marked it as to-read
Jun 16, 2017
Terry Healy
marked it as to-read
Jul 13, 2017
Ahmed Ayman
marked it as to-read
Jul 27, 2017
Alex Karpov
marked it as to-read
Aug 17, 2017
Yong Lai
marked it as to-read
Sep 08, 2017
Oleg Prozorov
marked it as to-read
Oct 08, 2017
Michele
is currently reading it
Oct 14, 2017
Kirill Perevozchikov
marked it as to-read
Oct 28, 2017
Carlos Tirado
marked it as to-read
Oct 29, 2017
Luca Cavagnoli
marked it as to-read
Nov 04, 2017
异次元骇客
marked it as to-read
Jan 05, 2018
Marco
marked it as to-read
Jan 10, 2018
Alex
marked it as to-read
Feb 04, 2018
Juanmi Grau
marked it as to-read
Feb 05, 2018
Vikas Sharma
marked it as to-read
Feb 14, 2018
Jakub
marked it as to-read
Feb 22, 2018
Bugzmanov
marked it as to-read
Feb 27, 2018
Arunoag
marked it as to-read
Mar 22, 2018
Alex
marked it as to-read
Mar 27, 2018
Patrik Herrgård
marked it as to-read
Mar 28, 2018
Kusno
is currently reading it
Apr 21, 2018
Shakeel Hussain
is currently reading it
Apr 28, 2018
Gilbert
marked it as to-read
May 02, 2018
Amit Soni
marked it as to-read
May 04, 2018
Readers also enjoyed

  • High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark
  • Programming in Scala
  • Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale
  • Bayesian Analysis with Python
  • Probabilistic Programming & Bayesian Methods for Hackers
  • Causal Inference in Statistics: A Primer
  • Alice's Adventures in Wonderland & Through the Looking-Glass
  • Woken Furies (Takeshi Kovacs, #3)
  • It Doesn't Have to Be Crazy at Work
  • Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are