Allen Downey is a professor of Computer Science at Olin College and the author of a series of open-source textbooks related to software and data science, including Think Python, Think Bayes, and Think Complexity, which are also published by O’Reilly Media. His blog, Probably Overthinking It, features articles on Bayesian probability and statistics. He holds a Ph.D. in computer science from U.C. Berkeley, and M.S. and B.S. degrees from MIT. He lives near Boston, MA with his wife and two daughters.
Science has been described as simply “a collection of successful recipes.” In Think Bayes, Allen B. Downey attempts just that, presenting a set of instructional tutorials for teaching Bayesian methods with Python. In essence, it is an instructional book whose examples are meant to be straightforward: it gives you a simple set of rules and applies them to progressively more complex problems. The book also makes a few deliberate style choices, setting aside continuous distributions to focus on discrete ones, which keeps the math more straightforward. Successful recipes need not be complex in every instance, and this book illustrates that effectively.
An important caveat for this book is that it is supplemental material for teaching statistics. Instead of the mathematical notation used in many other statistics books, it sticks, for the most part, to Python code, because its main goal is to be an applied educational text. In a sense, this book adapts basic mathematical principles to the needs of programmers who want to do statistics, not statisticians who want to code. As someone who has struggled with the mathematical notation of statistics, I found it a clear and dynamic guide to Bayesian statistics, starting off very well on page 8 with an astute treatment of the Monty Hall problem in Python code. The Monty Hall problem, that famous example of conditional probability, serves as a backbone of the book; Downey returns to it in examples such as Cookie2.py, which he makes freely available on his website, as he does the book itself in PDF form.
In thinking through P(D|H), where D is the data that Monty opens a door revealing no car and H is the hypothesis that the car is behind a particular door, I was able to use insights from the book to further my understanding of the mathematical truth of why we should always “switch.” In a narrow window of three doors it is much harder to see that, once Monty opens a door to show us a zonk, the car is 2/3 likely to be behind the remaining door. If we widen that window to a thousand doors, choose one, and watch Monty open 998 of the others to show us 998 zonks, we can almost feel the weight of the probability funneling into that one remaining door, showing us that we should indeed switch.
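To make that intuition concrete, here is a minimal simulation sketch (my own code, not an example from the book; the function name switch_wins is hypothetical). With N doors, Monty opens every door except our pick and one other, so switching wins exactly when the first pick was wrong, with probability (N-1)/N.

    import random

    def switch_wins(n_doors, trials=100_000):
        # Switching wins exactly when the first pick missed the car,
        # because Monty leaves only one other door unopened.
        wins = 0
        for _ in range(trials):
            car = random.randrange(n_doors)
            pick = random.randrange(n_doors)
            if pick != car:
                wins += 1
        return wins / trials

    print(switch_wins(3))     # ~0.667: the classic game
    print(switch_wins(1000))  # ~0.999: the thousand-door version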
In chapter 4, the “Euro” problem is explored: when spun on edge 250 times, a Belgian one-euro coin came up heads 140 times and tails 110; is it likely the coin is biased? In this example, Downey introduces the concept of “swamping the priors”: with enough data, people who start with different priors will tend to converge on the same posterior. Even starting from substantially different priors, Downey shows that the posterior distributions come out very similar; the medians and the credible intervals are identical, and the means differ by less than 0.5%. This example was a highlight for me in particular because it showed that with enough data, reasonable people converge.
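To see what that convergence looks like, here is a rough grid-approximation sketch in the spirit of the chapter (my own NumPy/SciPy version, not the book's Pmf-based code): two quite different priors, updated with the same likelihood for 140 heads in 250 spins, land on nearly the same posterior mean.

    import numpy as np
    from scipy.stats import binom

    xs = np.linspace(0, 1, 101)        # candidate values for P(heads)
    uniform = np.ones(101)             # a flat prior
    triangle = np.minimum(xs, 1 - xs)  # an informed prior peaked at 0.5

    like = binom.pmf(140, 250, xs)     # likelihood of 140 heads in 250 spins
    for prior in (uniform, triangle):
        post = prior * like
        post /= post.sum()             # normalize to get the posterior
        print(post @ xs)               # posterior mean: ~0.56 under both priors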
In conclusion, Think Bayes creates opportunities for learning subject matter that enable you not only to know, but to use what you know in the varied contexts of statistics. It is a divide-and-conquer strategy, which pairs well with Bayesian statistics. On the math side, to return to the Monty Hall problem as Downey so often does himself: if P(H|D) is hard to deal with directly, but P(H|D) = P(D|H)P(H)/P(D) and the right-hand side is computable, then you have an algorithm on your hands. The examples in this book tend to break down after four to five dimensions, as they are meant as instruction in one- and two-dimensional problems. For that reason it is a great introduction to Bayesian statistics; for more insight, it is recommended to learn Markov chain Monte Carlo methods and to use this book as a supplement on the way to those more complex concepts.
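As a sketch of that algorithm (my own, with door labels A, B, and C chosen for illustration): suppose we pick door A and Monty opens door B to reveal a zonk. Multiplying each hypothesis's prior P(H) by its likelihood P(D|H) and normalizing by the sum, which is P(D), yields the posterior.

    priors = {'A': 1/3, 'B': 1/3, 'C': 1/3}  # P(H): car equally likely anywhere
    likelihoods = {'A': 1/2,  # car behind A: Monty opens B or C at random
                   'B': 0,    # car behind B: he never reveals the car
                   'C': 1}    # car behind C: door B is his only option
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}
    p_d = sum(unnorm.values())  # the normalizer is P(D) = 1/2
    posterior = {h: p / p_d for h, p in unnorm.items()}
    print(posterior)  # {'A': 1/3, 'B': 0, 'C': 2/3} -> switch to C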
A very good Bayesian introduction, especially because it's light on mathematics and full of practical content. I searched for this kind of content for a long time, and was surprised to find it in a book like this.
This book is great in terms of providing a wide range of examples and exercises through which we can understand more about how to "think Bayes." However, it still lacks detailed explanation, and the mixture of Python code and math does not make it easier to understand.
Stats education is kind of a mess, or at least it was back when I was a grad student. Every branch of science needs statistics on a fundamental level, but exactly what gets filtered down to the next generation of students varies widely from field to field, each of which has its own lore, notation, and epistemological schisms. I got some formal education in various theorems and whatnot, but when presented with actual research data -- when I actually had to DO something with real data -- I mostly just learned the ropes from colleagues and advisors.
Anyway. Think Bayes is probably not the first or only book you need on Bayesian methods, but it does fill an important niche. It's not a heavy formal treatise with proofs and theorems and specialty distributions, but neither is it just someone throwing you the manual for R or Matlab. The book explains the basic method and intuition of Bayesian statistics (prior > likelihood > posterior > repeat) by building up models from scratch using the tools of modern Python (i.e., numpy / scipy / pandas, etc.). It starts simple and works its way up to useful modern tools like MCMC, and has enough variety that if you're faced with a new problem, there's probably a good, practical jumping-off point to be found here.
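As an illustration of that prior > likelihood > posterior > repeat loop (a minimal sketch of my own, not the book's classes): each observation's likelihood multiplies the current distribution, which is then normalized and becomes the prior for the next observation.

    import numpy as np

    xs = np.linspace(0, 1, 101)  # hypotheses: the coin's probability of heads
    pmf = np.ones(101) / 101     # start from a flat prior

    for flip in ['H', 'H', 'T', 'H']:         # the data arrive one at a time
        like = xs if flip == 'H' else 1 - xs  # likelihood of this observation
        pmf = pmf * like                      # prior times likelihood...
        pmf = pmf / pmf.sum()                 # ...normalized is the posterior,
                                              # which is the next flip's prior

    print(xs[np.argmax(pmf)])  # MAP estimate after 3 heads, 1 tail: 0.75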
While the methodology behind the framework of the code examples wasn't always obvious (and seemed occasionally overwrought), I think the core statistical concepts come through clearly enough that they could be reimplemented in whatever fashion made most sense to the reader. Generally fairly concise, and generous with graphical outputs as well, which helped solidify conceptual aspects of distributions and their properties.
I like the applied ethos and the no-nonsense, fun example problems; the somewhat casual style is also refreshing. However, I think the author went overboard: relying on Python alone is redundant and simplistic, and a few mathematical expressions would not have hurt anyone.
A good and practical introduction to Bayesian statistics using Python. While it won't really teach you how to think Bayes, it offers a number of good and practical examples with good discussion.
Interesting examples and a nice overview of Bayesian modelling. The undocumented Python code snippets and lack of mathematical rigour make it hard to use as a reference.
The second edition of this book is updated for Python 3, mostly applying the PyData stack (NumPy, Pandas, Matplotlib, etc.) while not relying on higher abstractions such as PyMC3 or Pyro.
Good introductory book with interesting example problems. The example code layers abstractions on top of the previously introduced ones from chapter to chapter. Over time it gets hard to comprehend the examples due to class-based polymorphism with multiple levels of inheritance. If not for this annoyance, it would be a great book.
Good introduction to Bayesian analysis. I didn't take the time this time through to do all of the code samples and exercises, but I still got a decent overview. One of the best parts was the first really good explanation of the Monty Hall problem that I've seen; I finally understand it!
I'm not giving this up because I didn't find it interesting. I'm putting it on hold because there are some technical books that I need to read first (for work purposes).