Modern Computer Vision with PyTorch: A practical roadmap from deep learning fundamentals to advanced applications and Generative AI

The definitive book on computer vision is back and updated with the latest machine learning architecture, including 70+ pages on diffusion models

Purchase of the print or Kindle book includes a free eBook in PDF format.

Key Features
Understand the inner workings of neural network architectures and their implementation, including transformers
Build solutions to real-world computer vision applications using PyTorch
Get to grips with CLIP and Stable Diffusion, and test their applications, such as in- and out-painting

Book Description
The second edition of Modern Computer Vision with PyTorch is fully updated on top of the comprehensive coverage in the first edition to explain and provide practical examples of the latest multimodal models, CLIP and Stable Diffusion.

Whether you’re a beginner or are looking to progress in your computer vision career, this book guides you through the fundamentals of neural networks (NNs) and PyTorch and shows you how to implement state-of-the-art architectures for real-world examples. You’ll discover the best practices for working with images, tweaking hyperparameters, and moving models into production. As you progress, you'll implement multiple use cases of 2D and 3D multi-object detection, segmentation, and human pose detection by learning about the R-CNN family, SSD, YOLO, U-Net architectures, and the Detectron2 platform. You’ll enter the world of generative AI, with facial generation and manipulation, and discover the impressive capabilities of diffusion models with image creation and in- and out-painting. Finally, you'll move your NN model to production on the AWS Cloud.

By the end, you'll be able to leverage modern NN architectures to solve over 30 real-world CV problems confidently.

What you will learn
Train a NN from scratch with NumPy and PyTorch
Implement 2D and 3D multi-object detection and segmentation
Implement few-shot and zero-shot learning for vision tasks
Combine CV with NLP to perform OCR, image captioning, and object detection
Combine CV with reinforcement learning to build agents that play Pong and self-drive a car
Manipulate images using CycleGAN, Pix2PixGAN, StyleGAN2, and SRGAN
Learn about and implement diffusion models to harness the power of multimodal generative AI
Discover the benefits of diffusion models over GANs

Who This Book Is For
This book is for beginners to PyTorch and intermediate-level machine learning practitioners who want to master computer vision techniques using deep learning and PyTorch. It's especially useful for those who are just getting started with neural networks, as it will enable you to learn from real-world use cases accompanied by notebooks on GitHub. Basic knowledge of the Python programming language and machine learning is all you need to get started with this book. For more experienced computer vision scientists, this book takes you through more advanced models from chapter 8 onward.

Table of Contents
1. Artificial Neural Network Fundamentals
2. PyTorch Fundamentals
3. Building a Deep Neural Network with PyTorch
4. Introducing Convolutional Neural Networks
5. Transfer Learning for Image Classification
6. Practical Aspects of Image Classification
7. Basics of Object Detection
8. Advanced Object Detection
9. Image Segmentation

1299 pages, Kindle Edition

Published June 10, 2024

3 people are currently reading
1 person wants to read

Community Reviews

5 stars: 3 (75%)
4 stars: 0 (0%)
3 stars: 1 (25%)
2 stars: 0 (0%)
1 star: 0 (0%)
Displaying 1 of 1 review