A detailed and up-to-date introduction to machine learning, presented through the unifying lens of probabilistic modeling and Bayesian decision theory.
This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory. The book covers mathematical background (including linear algebra and optimization), basic supervised learning (including linear and logistic regression and deep neural networks), as well as more advanced topics (including transfer learning and unsupervised learning). End-of-chapter exercises allow students to apply what they have learned, and an appendix covers notation.
Probabilistic Machine Learning grew out of the author's 2012 book, Machine Learning: A Probabilistic Perspective. More than just a simple update, this is a completely new book that reflects the dramatic developments in the field since 2012, most notably deep learning. In addition, the new book is accompanied by online Python code, using libraries such as scikit-learn, JAX, PyTorch, and TensorFlow, which can be used to reproduce nearly all the figures; this code can be run inside a web browser using cloud-based notebooks, and provides a practical complement to the theoretical topics discussed in the book. This introductory text will be followed by a sequel that covers more advanced topics, taking the same probabilistic approach.
Kevin P. Murphy is a Research Scientist at Google. Previously, he was Associate Professor of Computer Science and Statistics at the University of British Columbia.
It would be fair to say that if you understand most of the material in this volume, you are nearly at the level of a graduate student in the field of machine learning, and reading current arXiv or JMLR papers should be within your reach. Having said that, I must also add that this volume, at least in some chapters, is broad rather than deep, in the sense that there are entire textbooks dedicated to topics to which this book devotes only a paragraph or a few pages. I find this understandable; otherwise, the author would have needed to write and publish at least four volumes instead of two.
Writing such a comprehensive book in a field undergoing a massive boom like machine learning and artificial intelligence is no easy task, and needless to say, the author deserves all the praise he gets. But I must add that having different authors write some of the chapters, no matter how good they are in their respective sub-fields, leads to an unbalanced 'voice', if I may say so. You can see that parts of the book are rushed, not every chapter goes into adequate detail, some chapters contain no exercises, etc. On the other hand, some chapters include a balanced set of exercises that stretch your understanding and make connections to relevant fields; for example, I liked the exercise on the newsvendor problem, among others (a small sketch of it follows below). Nevertheless, I'm sure the author could have done an even better pedagogical job had he spent more time dealing with forward references, redundancies, and the like.
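(For readers who haven't met it: the newsvendor problem asks how many units of a perishable good to stock when demand is uncertain, and Bayesian decision theory gives a closed-form answer, the critical-fractile quantile of the demand distribution. Here is a minimal Python sketch checking that formula against a Monte Carlo grid search; the prices and the Normal demand model are hypothetical illustration values I chose, not the ones from the book's exercise.)

```python
import numpy as np
from scipy.stats import norm

# Hypothetical parameters for illustration (not from the book's exercise).
price, cost = 5.0, 2.0    # selling price p, unit cost c, no salvage value
mu, sigma = 100.0, 20.0   # demand ~ Normal(mu, sigma)

# Critical fractile: underage cost Cu = p - c, overage cost Co = c,
# so the optimal order quantity is the Cu/(Cu+Co) quantile of demand.
critical_ratio = (price - cost) / price
q_star = norm.ppf(critical_ratio, loc=mu, scale=sigma)

# Monte Carlo sanity check: expected profit over a grid of order quantities.
rng = np.random.default_rng(0)
demand = rng.normal(mu, sigma, size=100_000)
grid = np.arange(50, 151)
profits = [np.mean(price * np.minimum(q, demand) - cost * q) for q in grid]
q_mc = grid[int(np.argmax(profits))]

print(f"analytic Q* = {q_star:.1f}, Monte Carlo argmax ~ {q_mc}")
```

Running it, the simulated optimum lands right on the analytic critical-fractile answer (around 105 units here), which is exactly the kind of theory-meets-code connection the exercise rewards.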
There's also an accidental Easter egg of sorts, where the author writes that such-and-such figure is similar to the figures on the book cover, but he must have either copy-pasted that from the previous edition of ten years ago or been referring to the second volume. ;)
Also, my complaint to MIT Press: you should have done better editing and proofreading; the author and the book deserve that much.
Do I recommend this book? Well, if you want a realistic picture of what it'll take to be a researcher in modern machine learning, you can't go wrong with this volume. On top of that, I definitely want to get my hands on the recently published second volume, because, like a movie trailer, this first volume is full of forward references to it, creating suspense and excitement!