This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). DNNs are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, this accuracy comes at the cost of high computational complexity. Therefore, techniques that enable efficient processing of deep neural networks to improve metrics such as energy efficiency, throughput, and latency, without sacrificing accuracy or increasing hardware costs, are critical to enabling the wide deployment of DNNs in AI systems. The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field as well as a formalization and organization of key concepts from contemporary works that provide insights that may spark new ideas.
Despite not having a computer architecture background, I was able to get through most of it (with the help of ChatGPT by my side), and it was a pretty digestible introduction to this area.
There were parts where I got lost (most of the advanced technologies chapter went over my head, and a significant portion of the exploiting sparsity chapter as well), but for such a technical book, that’s to be expected.
Chapters 4-7 were very nice (I really liked the section about the Eyexam framework), but still a bit dense. The only reason I’m giving 4 stars and not 5 is that the tone wasn’t very conversational, but I could be feeling that way due to my unfamiliarity with the subject matter.