Conceptual fundamentals and practical guidance from industry experts to pretrain the large vision and language models of the future.

Large models have forever changed machine learning. From BERT to GPT-3, Vision Transformers to DALL-E, when billions of parameters are combined with large datasets and hundreds to thousands of GPUs, the result is nothing short of record-breaking. The recommendations, advice, and code samples in this book will help you pretrain your large models from scratch on AWS and Amazon SageMaker and apply them to hundreds of use cases across your organization.

With advice from seasoned AWS ML expert Emily Webber, this book provides everything you need to go from project ideation through dataset preparation, training, evaluation, and deployment for large language, vision, and multimodal models. With step-by-step explanations of essential concepts and practical examples, you’ll go all the way from mastering the concept of pretraining itself to preparing your dataset and model, configuring your environment, training, evaluating, and deploying your models. From applying the scaling laws to distributing your model and dataset over multiple GPUs, you’ll learn how to successfully train, evaluate, and deploy your model on Amazon SageMaker.

By the end of this book, you will have everything you need to embark on your own project to pretrain the large language models of the future, purpose-built for your organization.

If you’re a machine learning enthusiast or researcher who wants to get started on your very own large modeling project, this book is for you. Applied scientists, data scientists, machine learning engineers, solution architects, product managers, and students will all enjoy the material. Basic Python is a must, and introductory concepts around cloud computing will be very helpful. We’ll assume some level of deep learning fundamentals but will explain advanced topics.
I did almost 7 hrs 🤯 of reading Emily Webber’s book with a great Bosphorus view 🤗. 2 reasons why the book is excellent:
1-Breadth 🛫: Provides super distilled info for gen AI enthusiasts who don’t have time to read 30+ papers
2-Depth ⛏️: Contains advice from a seasoned practitioner not only about what & how, but also why.