This textbook establishes a theoretical framework for understanding deep learning models of practical relevance. With an approach that borrows from theoretical physics, Roberts and Yaida provide clear and pedagogical explanations of how realistic deep neural networks actually work. To make results from the theoretical forefront accessible, the authors eschew the subject's traditional emphasis on intimidating formality without sacrificing accuracy. Straightforward and approachable, this volume balances detailed first-principles derivations of novel results with insight and intuition for theorists and practitioners alike. This self-contained textbook is ideal for students and researchers interested in artificial intelligence; with only minimal prerequisites of linear algebra, calculus, and informal probability theory, it can easily fill a semester-long course on deep learning theory. For the first time, the exciting practical advances in modern artificial intelligence capabilities can be matched with a set of effective principles, providing a timeless blueprint for theoretical research in deep learning.
Dan Roberts is currently a Research Affiliate at the Center for Theoretical Physics at MIT, an Affiliate of the NSF AI Institute for Artificial Intelligence and Fundamental Interactions, and a Principal Researcher at Salesforce. Previously, he was Co-Founder and CTO of Diffeo, a collaborative AI company acquired by Salesforce; a research scientist at Facebook AI Research (FAIR) in NYC; and a Member of the School of Natural Sciences at the Institute for Advanced Study in Princeton, NJ. Dan received a Ph.D. from MIT, funded by a Hertz Foundation Fellowship and an NDSEG Fellowship, and he studied at Cambridge and Oxford as a Marshall Scholar.
Dan's research has centered on the interplay between physics and computation; previously, he focused on the relationships among black holes, quantum chaos, computational complexity, and randomness, and on how the laws of physics relate to the fundamental limits of computation.
Excellent formulation of practical deep learning in terms of statistical field theory, though it does get bogged down in trying to show every step. Tedious at times, but full of excellent insights.