Adam’s Kindle Notes & Highlights

Data Science, by John D. Kelleher

large volumes of data at high speeds, it can be useful from a computational and speed perspective to distribute the data across multiple servers, process queries by calculating partial results of a query on each server, and then merge these results to generate the response to the query. This is the approach taken by the MapReduce framework on Hadoop. In the MapReduce framework, the data and queries are mapped onto (or distributed across) multiple servers, and the partial results calculated on each server are then reduced (merged) together.
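
A minimal sketch of the map-then-reduce idea described in this highlight, written in plain Python. It is not Hadoop's API: the "servers" are simulated as in-memory lists, and every name here (`partition`, `map_partial`, `reduce_merge`, `run_query`) and the word-count query are hypothetical, chosen only for illustration.

```python
from collections import Counter
from functools import reduce


def partition(records, n_servers):
    """Distribute records across n_servers simulated servers (round-robin)."""
    return [records[i::n_servers] for i in range(n_servers)]


def map_partial(shard):
    """Compute the partial result on one server: word counts for its shard."""
    counts = Counter()
    for line in shard:
        counts.update(line.lower().split())
    return counts


def reduce_merge(left, right):
    """Merge two partial results by adding their counts."""
    return left + right


def run_query(records, n_servers=3):
    """Map the query onto each shard, then reduce the partial results."""
    shards = partition(records, n_servers)
    partials = [map_partial(shard) for shard in shards]  # one per server
    return reduce(reduce_merge, partials, Counter())


# Illustrative data only (an assumption, not taken from the book).
records = [
    "big data needs big clusters",
    "data moves fast",
    "clusters process data",
    "fast queries on big data",
]
total = run_query(records, n_servers=3)
```

Because the merge step is just addition of counts, the answer should not depend on how many servers the data is split across. That property is what lets each server work on its own shard independently.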