As telescopes, detectors, and computers grow ever more powerful, the volume of data at the disposal of astronomers and astrophysicists will enter the petabyte domain, providing accurate measurements for billions of celestial objects. This book provides a comprehensive and accessible introduction to the cutting-edge statistical methods needed to efficiently analyze complex data sets from astronomical surveys such as the Panoramic Survey Telescope and Rapid Response System, the Dark Energy Survey, and the upcoming Large Synoptic Survey Telescope. It serves as a practical handbook for graduate students and advanced undergraduates in physics and astronomy, and as an indispensable reference for researchers.
"Statistics, Data Mining, and Machine Learning in Astronomy" presents a wealth of practical analysis problems, evaluates techniques for solving them, and explains how to use various approaches for different types and sizes of data sets. For all applications described in the book, Python code and example data sets are provided. The supporting data sets have been carefully selected from contemporary astronomical surveys (for example, the Sloan Digital Sky Survey) and are easy to download and use. The accompanying Python code is publicly available, well documented, and follows uniform coding standards. Together, the data sets and code enable readers to reproduce all the figures and examples, evaluate the methods, and adapt them to their own fields of interest.

Describes the most useful statistical and data-mining methods for extracting knowledge from huge and complex astronomical data sets. Features real-world data sets from contemporary astronomical surveys. Uses a freely available Python codebase throughout. Ideal for students and working astronomers.
Excellent book. It comes with a companion website where you can find the source code for every single textbook figure. How cool is that! There is also an active GitHub repo for the textbook errata. Highly recommended for 1) all astronomy graduate students and postdocs, 2) anyone interested in data-intensive applications, and 3) professors teaching a modern astronomy methods course. This is simply a must-have book for 2015.
Pretty good balance of depth and breadth. Having the astroML code be open source is great. I wish there were more on neural networks (e.g., CNNs, GANs), which are mentioned in the book as if they had been explained but are not actually covered. Also, cross-validation is mentioned many times before the term is ever defined. Overall, a very solid stats and ML reference.
Practical indeed: it contains intuitive explanations of (astro)statistical analysis and data-mining techniques and presents a wealth of tips about trade-offs in practice.