Allen Downey is a Professor Emeritus at Olin College and the author of a series of free textbooks related to software and data science, including Think Python, Think Bayes, and Think Complexity, which are also published by O’Reilly Media. His blog, Probably Overthinking It, features articles on Bayesian probability and statistics. He holds a Ph.D. in computer science from U.C. Berkeley, and M.S. and B.S. degrees from MIT.
I have to admit I did not do the web crawler project, as I was only interested in the concepts. A novice might find the exercises helpful, but I didn’t. Unfortunately, the author chose to devote much of the content to tools specific to the web crawler project. I enjoyed the first chapters and skimmed the last few.