High Performance Spark Quotes

High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark by Holden Karau
128 ratings, 3.98 average rating, 15 reviews

Showing 1-3 of 3

“Co-partitioning is related to but distinct from partition co-location. We say that multiple RDDs are co-partitioned if they are partitioned by the same known partitioner.”
― Holden Karau, High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark

“Co-located RDDs are RDDs with the same partitioner that reside in the same physical location in memory.”
― Holden Karau, High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark
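The two quotes above can be illustrated with a small model. This is a plain-Python sketch, not Spark code: `hash_partition` and `join_copartitioned` are hypothetical helpers that stand in for a hash partitioner and a partition-wise join. It assumes both datasets use the same partition count and the same hash function, which is what "the same known partitioner" amounts to here. Physical co-location is not modeled; the sketch only shows why matching keys land at the same partition index and can be joined without moving records between partitions.

```python
# Plain-Python model of co-partitioning; names are illustrative, not Spark APIs.
from collections import defaultdict


def hash_partition(records, num_partitions):
    """Place each (key, value) pair in partition hash(key) % num_partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def join_copartitioned(left, right):
    """Join partition i of `left` only against partition i of `right`."""
    joined = []
    for left_part, right_part in zip(left, right):
        # Index the right-hand partition by key for fast lookup.
        lookup = defaultdict(list)
        for key, value in right_part:
            lookup[key].append(value)
        for key, value in left_part:
            for other in lookup.get(key, []):
                joined.append((key, (value, other)))
    return joined


# Small integer keys keep hash() deterministic across runs.
users = hash_partition([(1, "ann"), (2, "bob"), (3, "cy")], 2)
orders = hash_partition([(1, "book"), (3, "pen"), (3, "ink")], 2)
result = sorted(join_copartitioned(users, orders))
```

Because both lists were partitioned by the same function, every key in partition i of `users` can only match keys in partition i of `orders`, so the join never has to look across partition indices.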

0 likes

“Beyond being less likely to run out of memory than groupByKey, the following four functions — reduceByKey, treeAggregate, aggregateByKey, and foldByKey — are implemented to use map-side combinations, meaning that records with the same key are combined”
― Holden Karau, High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark
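The idea of a map-side combination can be sketched without Spark. The model below is an assumption-laden illustration, not the library's implementation: `shuffle_all` and `combine_then_shuffle` are hypothetical functions, each inner list stands for one partition, and "shuffled" simply counts the records that would cross partition boundaries. The first function sends every record through the shuffle, as a groupByKey followed by a sum would; the second sums per key inside each partition first, in the style of reduceByKey.

```python
# Plain-Python model of map-side combining; names are illustrative, not Spark APIs.
from collections import defaultdict


def shuffle_all(partitions):
    """Send every record through the shuffle, then sum values per key."""
    shuffled = [record for part in partitions for record in part]
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


def combine_then_shuffle(partitions):
    """Sum values per key inside each partition, then shuffle the partial sums."""
    shuffled = []
    for part in partitions:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value
        # Only one record per key per partition crosses the shuffle.
        shuffled.extend(local.items())
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("a", 1), ("a", 1), ("b", 1)],
]
naive_totals, naive_shuffled = shuffle_all(partitions)
combined_totals, combined_shuffled = combine_then_shuffle(partitions)
```

Both approaches produce the same totals, but in this toy input the combined version shuffles four records instead of six. The saving grows with the number of repeated keys per partition.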
