Vish Agashe

The central data abstraction in PySpark is a “resilient distributed dataset” (RDD), which is just a collection of Python objects.
Data Science: The Executive Summary - A Technical Book for Non-Technical Professionals