Using data from one season of NBA games, Basketball Data Science: With Applications in R is the perfect book for anyone interested in learning and applying data analytics in basketball. Whether assessing the spatial performance of an NBA player's shots or analyzing the impact of high-pressure game situations on the probability of scoring, this book discusses a variety of case studies and hands-on examples using a custom R package. The code is supplied so readers can reproduce the analyses themselves or create their own. Assuming basic statistical knowledge, Basketball Data Science: With Applications in R is suitable for students, technicians, coaches, data analysts and applied researchers.
As stated in the foreword by renowned coach Ettore Messina, statistics and data science can help basketball team staff members make decisions, but they cannot replace the human factor. The authors acknowledge this important fact and remind readers of it several times in the book.
The first chapter is an introduction to data science. Readers are expected to have some knowledge of statistics before reading this book, and to know the game of basketball, even though there are reminders throughout the pages.
Chapter 2 presents common statistics related to the game and to basketball players, such as pace, ratings and the four factors. Then, common graphical tools, or plots, are described. Toward the end of the chapter, variability analysis and inequality analysis are introduced (or recalled, for readers who already know them).
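To give a feel for the kind of statistics Chapter 2 covers, here is a minimal Python sketch of the four factors and a possessions estimate. It is not taken from the book or its R package: the formulas are the commonly cited box-score versions (with the usual 0.44 free-throw weight), the book's exact definitions may differ, and the `TeamBox` class, its field names and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TeamBox:
    """Hypothetical single-game team box score (field names are illustrative)."""
    fgm: int      # field goals made
    fga: int      # field goals attempted
    fg3m: int     # three-pointers made
    ftm: int      # free throws made
    fta: int      # free throws attempted
    orb: int      # offensive rebounds
    tov: int      # turnovers
    opp_drb: int  # opponent defensive rebounds


def possessions(t: TeamBox) -> float:
    # Common box-score estimate; 0.44 is the conventional free-throw weight.
    return t.fga - t.orb + t.tov + 0.44 * t.fta


def four_factors(t: TeamBox) -> dict:
    # Commonly cited definitions of the four factors (shooting, turnovers,
    # offensive rebounding, free throws).
    return {
        "efg_pct": (t.fgm + 0.5 * t.fg3m) / t.fga,
        "tov_pct": t.tov / (t.fga + 0.44 * t.fta + t.tov),
        "orb_pct": t.orb / (t.orb + t.opp_drb),
        "ft_rate": t.ftm / t.fga,
    }


# Made-up numbers, for illustration only.
game = TeamBox(fgm=40, fga=80, fg3m=10, ftm=15, fta=20, orb=10, tov=12, opp_drb=30)
factors = four_factors(game)
poss = possessions(game)
print(factors, poss)
```

Pace is usually derived from possessions by scaling to a fixed game length, which is why the possessions estimate is computed alongside the four factors here.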
The second part of the book is divided into three chapters. The third chapter reviews statistical concepts such as dependence and correlation, and shows how they can be used to find patterns between variables. Chapter 4 is dedicated to finding groups of related data, which is called clustering in data science; k-means clustering and agglomerative hierarchical clustering are both described. As an example, the authors show how data science can be used to group players into 13 types (offensive ball-handler, paint protector, 3-point rebounder, etc.) rather than the 5 traditional positions (point guard, small forward, center, etc.). The fifth chapter focuses on modeling relationships between variables with linear models or nonparametric regression. Each chapter is illustrated with plots and examples taken from basketball teams and players (mainly NBA).
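The clustering idea from Chapter 4 can be sketched in a few lines of Python. This is a plain k-means (Lloyd's algorithm) written for illustration, not the authors' method or their R package. The two features (share of shots taken from three-point range, share of available rebounds grabbed), the six data points and the fixed starting centroids are all made up.

```python
import math


def kmeans(points, init_centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in init_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical players: (three-point shot share, rebound share).
players = [
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.85),  # interior-style profiles
    (0.80, 0.20), (0.90, 0.10), (0.85, 0.15),  # perimeter-style profiles
]
labels, centroids = kmeans(players, init_centroids=[players[0], players[3]])
print(labels)
print(centroids)
```

With real data, the features would normally be standardized first and the number of clusters chosen by comparing several values, which is presumably how a grouping as fine as 13 player types emerges.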
The last part is composed of a single chapter, which explains the R package used by the authors and how to use it. But as the authors mention, there are a lot of packages and tools available on the Internet. Readers interested in doing the same tasks on their own data sets should reuse these tools, with a few changes if necessary, rather than redeveloping everything from scratch and reinventing the wheel.
Overall, this book is a clear introduction to, and a good overview of, data science applied to basketball. I would recommend it to basketball fans, as data science is playing an increasingly important role in professional sports and the media are giving fans more and more (sometimes too many) statistics. However, this book does not go into much depth.