
Parallel and High Performance Computing


Summary
Complex calculations, like training deep learning models or running large-scale simulations, can take an extremely long time. Efficient parallel programming can save hours—or even days—of computing time. Parallel and High Performance Computing shows you how to deliver faster run-times, greater scalability, and increased energy efficiency to your programs by mastering parallel techniques for multicore processor and GPU hardware.

About the technology
Write fast, powerful, energy-efficient programs that scale to tackle huge volumes of data. Using parallel programming, your code spreads data-processing tasks across multiple CPUs for radically better performance. With a little help, you can create software that maximizes both speed and efficiency.
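As a taste of what that looks like in practice, here is a minimal, illustrative OpenMP sketch (not code from the book; the array names and problem size are made up): a single directive asks the compiler to spread a loop's iterations across all available cores.

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(void) {
    const size_t n = 100000000;                 /* illustrative problem size */
    double *a = malloc(n * sizeof(double));
    double *b = malloc(n * sizeof(double));
    if (!a || !b) return 1;

    for (size_t i = 0; i < n; i++) b[i] = (double)i;

    double start = omp_get_wtime();
    #pragma omp parallel for                    /* split iterations across CPU cores */
    for (size_t i = 0; i < n; i++)
        a[i] = 2.0 * b[i] + 1.0;
    double elapsed = omp_get_wtime() - start;

    printf("updated %zu elements in %.3f s on up to %d threads\n",
           n, elapsed, omp_get_max_threads());
    free(a);
    free(b);
    return 0;
}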

About the book
Parallel and High Performance Computing offers techniques guaranteed to boost your code’s effectiveness. You’ll learn to evaluate hardware architectures and work with industry standard tools such as OpenMP and MPI. You’ll master the data structures and algorithms best suited for high performance computing and learn techniques that save energy on handheld devices. You’ll even run a massive tsunami simulation across a bank of GPUs.

What's inside

Planning a new parallel project
Understanding differences in CPU and GPU architecture
Addressing underperforming kernels and loops
Managing applications with batch scheduling

About the reader
For experienced programmers proficient with a high-performance computing language like C, C++, or Fortran.

About the author
Robert Robey works at Los Alamos National Laboratory and has been active in the field of parallel computing for over 30 years. Yuliana Zamora is currently a PhD student and Siebel Scholar at the University of Chicago, and has lectured on programming modern hardware at numerous national conferences.

Table of Contents
PART 1 INTRODUCTION TO PARALLEL COMPUTING
1 Why parallel computing?
2 Planning for parallelization
3 Performance limits and profiling
4 Data design and performance models
5 Parallel algorithms and patterns
PART 2 THE PARALLEL WORKHORSE
6 FLOPs for free
7 OpenMP that performs
8 The parallel backbone
PART 3 BUILT TO ACCELERATE
9 GPU architectures and concepts
10 GPU programming model
11 Directive-based GPU programming
12 GPU languages: Getting down to basics
13 GPU profiling and tools
PART 4 HIGH PERFORMANCE COMPUTING ECOSYSTEMS
14 Truce with the kernel
15 Batch schedulers: Bringing order to chaos
16 File operations for a parallel world
17 Tools and resources for better code

704 pages, Paperback

Published June 22, 2021

Community Reviews

5 stars: 2 (22%)
4 stars: 5 (55%)
3 stars: 1 (11%)
2 stars: 1 (11%)
1 star: 0 (0%)
113 reviews · 8 followers
September 14, 2025
As a general backend/systems software engineer with an interest in performance topics, I guess I wasn't the target audience for this book. It is very much for and by scientific computing people. It is interesting as a study in how scientific computing people think, I suppose.

The book is mired in specifics. The first couple of chapters cover general approaches to parallelization projects and hardware measurement and are alright, good even, though they go heavy on enumerating particular tools for measuring your hardware. The data structures chapter is fine and makes some good points about the cache behavior of particular layouts. I had previously read Data-Oriented Design: Software Engineering for Limited Resources and Short Schedules, and this chapter's concreteness and specifics were a welcome complement to Fabian's more abstract, software engineer's descriptions.
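To make the layout point concrete for anyone who hasn't seen it, the contrast is roughly this (my own sketch with invented names, not code from the book): an array of structures drags every field through the cache even when a loop touches only one of them, while a structure of arrays streams through exactly the bytes it needs.

#include <stddef.h>

/* Array of structures: each particle's fields sit together, so a loop that
   only touches x still pulls y, z, and mass through the cache. */
typedef struct { double x, y, z, mass; } particle_aos;

void shift_x_aos(particle_aos *p, size_t n, double dx) {
    for (size_t i = 0; i < n; i++)
        p[i].x += dx;            /* strided access: 8 useful bytes out of every 32 */
}

/* Structure of arrays: each field is contiguous, so the same loop is a
   unit-stride stream and every cached byte gets used. */
typedef struct { double *x, *y, *z, *mass; } particles_soa;

void shift_x_soa(particles_soa *p, size_t n, double dx) {
    for (size_t i = 0; i < n; i++)
        p->x[i] += dx;
}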

The parallel algorithms chapter is a disaster, though, belaboring spatial hashing at length (which I would hesitate to even call hashing, but it figures prominently in the work of one of the authors) at the expense of a clear explanation of how the typical, good solution to prefix sum works. Even the diagram accompanying that explanation is unclear, with identical arrows representing different operations, leaving the reader to stare at it and figure it out. And that's about all that's in the parallel algorithms chapter: spatial hashing and prefix sum (there was also a long section on five or so approaches to correcting for the lack of associativity when adding floats, but I forget whether it was in the data structures or the algorithms chapter). I don't know much about parallel algorithms, which was part of why I was reading the darn book, but I think there might be one or two more? Maybe the takeaway here is really that high-performance computing is about using brutally simple data structures and algorithms, and consequently there really isn't much to say. Some of these applications of spatial hashing really could have used notes about expected preconditions, though; some of them don't seem to be correct for edge-case inputs. Maybe sacrificing correctness for speed is also unremarkable in HPC; it wouldn't be totally unreasonable, since an approximate answer today is better than a perfect answer never if computing the exact answer would take too long. But that makes a strange contrast with the emphasis on correcting for small errors due to the lack of associativity in floating-point math.
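For the curious, the clear explanation of prefix sum that I wanted goes something like this (my own OpenMP sketch, not the book's code): each thread scans its own block, the per-block totals are scanned serially since there is only one per thread, and then each thread adds its block's offset back in.

#include <stdlib.h>
#include <omp.h>

/* Inclusive prefix sum of a[0..n-1] into out[0..n-1]. */
void prefix_sum(const double *a, double *out, size_t n) {
    /* one slot per possible thread, plus a leading zero */
    double *block_total = calloc((size_t)omp_get_max_threads() + 1, sizeof(double));

    #pragma omp parallel
    {
        int nthreads = omp_get_num_threads();
        int t = omp_get_thread_num();
        size_t lo = n * (size_t)t / (size_t)nthreads;
        size_t hi = n * (size_t)(t + 1) / (size_t)nthreads;

        double sum = 0.0;                     /* step 1: scan this thread's block */
        for (size_t i = lo; i < hi; i++) {
            sum += a[i];
            out[i] = sum;
        }
        block_total[t + 1] = sum;

        #pragma omp barrier
        #pragma omp single                    /* step 2: scan the block totals */
        for (int k = 1; k <= nthreads; k++)
            block_total[k] += block_total[k - 1];
        /* implicit barrier at the end of the single */

        double offset = block_total[t];       /* step 3: add this block's offset */
        for (size_t i = lo; i < hi; i++)
            out[i] += offset;
    }
    free(block_total);
}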

After algorithms, the book moves into specific approaches to parallelism and becomes dominated by concrete examples from the authors' research. The initial explanations of problems and approaches are mostly unclear (likely due in part to expert blindness), leaving the example solutions to carry the text. I think this is backwards; explanations should be general, and concrete examples should then provide clarity. But here an inability to explain in general terms has left examples as the primary material from which readers must inductively reason out general principles. I estimate that about 30% of the total page area of the book is code snippets (mostly C, but some Fortran and C++). The code itself is repetitive and not very interesting.

This is typical of the authors' approach to generality: it consists primarily of enumerating examples, not extracting insights that generalize across many examples. The GPU chapter is bogged down by enumerating the terminology used by each GPU vendor, rather than picking one set of terms and sticking to it. Sample code for each of many parallelism frameworks, or a big table of vectorization-related flags for each of many compilers, seems less useful to me than a discussion of problems and approaches that cut across frameworks. Being so tightly bound to particular current frameworks also suggests to me that this book won't age very well, though maybe HPC frameworks change more slowly than I would expect. It is an awkward contrast: not trusting readers to generalize to a new framework or a new set of terms, yet requiring them to generalize approaches and principles from examples.

There are insights to be had here, like building simple a priori models of a system's expected performance, benchmarking your hardware, and plugging those numbers in. This is not something that would ever have occurred to me to do across my very heterogeneous fleet at work, but it might be relevant for personal projects at home. Spatial hashing is a neat trick, and there were some good notes on cache behavior, ghost cells, thread divergence, and dense representations of sparse 2-D arrays. But you really have to dig the good bits out here.
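To make the a priori modeling point concrete (my own toy version, not the book's; every number here is made up): measure your machine's peak flop rate and memory bandwidth once, compute the kernel's arithmetic intensity, and the smaller of the two limits tells you what run time to expect before you ever profile.

#include <stdio.h>

int main(void) {
    /* Machine numbers you would measure once with microbenchmarks. */
    double peak_gflops = 200.0;     /* e.g. from a DGEMM-style benchmark */
    double mem_bw_gbs  = 25.0;      /* e.g. from a STREAM-style benchmark */

    /* Kernel of interest: a[i] = 2*b[i] + 1 does 2 flops and moves 16 bytes
       per element (8 read, 8 written), so intensity = 2/16 flops per byte. */
    double flops_per_byte = 2.0 / 16.0;

    /* Roofline-style bound: the attainable rate is the lesser of the
       compute limit and the bandwidth limit. */
    double bw_limited = mem_bw_gbs * flops_per_byte;
    double attainable = bw_limited < peak_gflops ? bw_limited : peak_gflops;

    double n = 1.0e8;               /* elements in the illustrative kernel */
    double predicted_s = (2.0 * n) / (attainable * 1.0e9);
    printf("attainable %.2f GFLOP/s, predicted time %.3f s\n",
           attainable, predicted_s);
    return 0;
}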

In conclusion: this might be a fine and useful book to bring with you into your job in a Department of Energy SCIF where you don't have internet access to read the manuals for the frameworks you're using. It is not a book to read for lasting insights unless you're willing to put in a lot of work extracting them from the text.
119 reviews · 4 followers
December 23, 2024
Probably my favorite professional book of the year. A lot of practical information, excellent coverage of the subject.
