
Transformers for Natural Language Processing: Build, train, and fine-tune deep neural network architectures for NLP with Python, Hugging Face, and OpenAI's GPT-3, ChatGPT, and GPT-4

OpenAI's GPT-3, ChatGPT, and GPT-4, plus Hugging Face transformers, for language tasks in one book. Get a taste of the future of transformers, including computer vision tasks, code writing, and coding assistance.



Purchase of the print or Kindle book includes a free eBook in PDF format

Key Features
- Improve your productivity with OpenAI's ChatGPT and GPT-4, from prompt engineering to creating and analyzing machine learning models
- Pretrain a BERT-based model from scratch using Hugging Face
- Fine-tune powerful transformer models, including OpenAI's GPT-3, to learn the logic of your data

Book Description
Transformers are...well...transforming the world of AI. There are many platforms and models out there, but which ones best suit your needs?

Transformers for Natural Language Processing, 2nd Edition, guides you through the world of transformers, highlighting the strengths of different models and platforms, while teaching you the problem-solving skills you need to tackle model weaknesses.

You'll use Hugging Face to pretrain a RoBERTa model from scratch, from building the dataset to defining the data collator to training the model.

If you're looking to fine-tune a pretrained model, including GPT-3, then Transformers for Natural Language Processing, 2nd Edition, shows you how with step-by-step guides.

The book investigates machine translation, speech-to-text, text-to-speech, question answering, and many more NLP tasks. It provides techniques to solve hard language problems and may even help with fake news anxiety (read chapter 13 for more details).

You'll see how cutting-edge platforms, such as OpenAI, have taken transformers beyond language into computer vision tasks and code creation using DALL-E 2, ChatGPT, and GPT-4.

By the end of this book, you'll know how transformers work and how to implement them and resolve issues like an AI detective.

What you will learn
- Discover new techniques to investigate complex language problems
- Compare and contrast the results of GPT-3 against T5, GPT-2, and BERT-based transformers
- Carry out sentiment analysis, text summarization, casual speech analysis, machine translations, and more using TensorFlow, PyTorch, and GPT-3
- Find out how ViT and CLIP label images (including blurry ones!) and create images from a sentence using DALL-E
- Learn the mechanics of advanced prompt engineering for ChatGPT and GPT-4

Who this book is for
If you want to learn about and apply transformers to your natural language (and image) data, this book is for you.

You'll need a good understanding of Python and deep learning and a basic understanding of NLP to benefit most from this book. Many platforms covered in this book provide interactive user interfaces, which allow readers with a general interest in NLP and AI to follow several chapters. And don't worry if you get stuck or have questions; this book gives you direct access to our AI/ML community to help guide you on your transformers journey!

Table of Contents
1. What are Transformers?
2. Getting Started with the Architecture of the Transformer Model
3. Fine-Tuning BERT Models
4. Pretraining a RoBERTa Model from Scratch
5. Downstream NLP Tasks with Transformers
6. Machine Translation with the Transformer
7. The Rise of Suprahuman T

564 pages, Kindle Edition

Published March 25, 2022


About the author

Denis Rothman

18 books · 12 followers

Ratings & Reviews



Community Reviews

5 stars: 9 (37%)
4 stars: 7 (29%)
3 stars: 5 (20%)
2 stars: 2 (8%)
1 star: 1 (4%)
Displaying 1 - 5 of 5 reviews
Adam
190 reviews · 11 followers
February 19, 2023
Good useful (though basic) info but padded with so much gratuitous philosophizing and boilerplate code that it's hard to find the morsels of actual content.
4 reviews · 6 followers
April 6, 2022
Worth every penny. If you are looking for an expert's take on NLP then look no further.

I've read everything I can get my hands on re: GPT-3/Hugging Face. Prof. Rothman has raised the ante for what should be considered acceptable discourse re: GPT-3/transformers and the new NLP-driven world that we live in.

Buy the book; it has the feel of an experienced mentor giving you the tools and, more importantly, the judgement to navigate these new waters.
Kyle
27 reviews
September 24, 2024
I didn’t get much out of this.

TL;DR: Don’t follow the author’s example for how to write code. Read the seminal papers instead. A lot of the business-y angles of the book haven’t aged well. Watch 3blue1brown.

If the code examples are representative of the author’s work, this man writes cursed code. I am a software engineer who builds infrastructure, and I am aware the standard I hold my code to does not fit most science coding situations. But if the examples in this book are anything to go by, I would not trust an ML pipeline built by the author. And if he could demonstrate that it worked satisfactorily despite the state of the code, I would not want to be the poor soul who has to re-write all the author’s code later when something inevitably breaks.

His code examples often don't go much beyond what a "ChatGPT API README" would give you. If you have any experience with Python, installing packages, and reading documentation, the code sections are often pointless. What he calls "classical coding" (IIRC) is few and far between in this book; the exception is the early BERT sections, if memory serves.

Again, a symptom of my line of work, but PLEASE use poetry and a pyproject.toml. This will greatly improve your reproducibility and dependency management, which is vital for ML tasks, especially when transformer models are evolving so fast. A Jupyter notebook is great for poking around, but as an ML data scientist turned software engineer, my grad work would have gone a lot smoother if I had focused on reproducibility through modular code design instead of "just throw it together". I created a lot of extra work for myself, and I see those kinds of design patterns on display in several of the code examples in this textbook.
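A minimal sketch of the dependency setup the reviewer is advocating (the project name, package list, and version pins below are illustrative assumptions, not taken from the book):

```toml
# pyproject.toml -- hypothetical, minimal Poetry project for transformer experiments
[tool.poetry]
name = "transformer-experiments"
version = "0.1.0"
description = "Reproducible environment for NLP transformer experiments"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.10"
# Pins are placeholders; pin to the versions the examples were actually tested against.
transformers = "4.30.2"
torch = "2.0.1"
datasets = "2.13.1"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```

Running `poetry install` resolves these constraints and writes the exact solved versions to `poetry.lock`, so a notebook that works today still runs against the same dependency set months later.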

I also found the book poorly organized. It gave the impression that it was a first draft of a text. The author repeats himself a lot. On the one hand, I value: “tell the audience what you’ll tell them, tell them, then tell them what you just told them”. On the other hand, it often felt like the text didn’t flow from one chapter to the next. Nor did it feel like there was a cohesive vision.

I also chafe at any “business bro” lingo. It felt like “Industry 4.0” was the author’s hobby horse idea, and I really got bored of being reminded of it.

His opinions on where the field is going, the future role of the "AI specialist", etc., also don't feel like they have aged well. I recognize I don't have much exposure to prompt engineering in an industry context. But in my experience, it has not been hard to get ChatGPT (version 4 and on) to do what I want; and when it is, it's because I'm asking something of it that it cannot yet handle, not because of a poorly structured prompt.

ChatGPT is certainly not the only model available, but on more than one occasion the author demonstrates himself that the “cool new model” on the street, although innovative, can’t one-up the scale of data that has gone into training ChatGPT, and using ChatGPT ends up being more effective for many tasks. In my experience as an end-user, prompt engineering has only been absolutely essential when working with Stable Diffusion.

Overall, I think you're better off reading the seminal papers that the author mentions (e.g. Attention Is All You Need) and watching 3blue1brown's transformer series on YouTube.
Ita Cirovic Donev
12 reviews
September 27, 2024
I expected so much more from this book given its other reviews; however, I was deeply disappointed. At some points this book feels as if it were written by some not-so-great LLM, given the structure, choice, and sequence of its sentences. Many sentences are repeated with only slight variation, as if there were no editor.

The topics are covered only marginally, with plenty of space given to extra-large images rather than words. Wording in some chapters is repetitive, as if the chapters were written ages apart and the author forgot what was said in previous ones.

The explanation of the transformer architecture is very hand-wavy. The author tries to maintain a structured flow; however, the concepts are explained in haste and without proper attention to detail. The examples are overly simplistic to the point of limiting their usefulness. No notation is presented, nor any guidance on how to implement things for a more complex case.
The author writes as if a novice is reading, but I cannot imagine a novice following this flow of explanation and examples.

Generally, it provides basic examples and shows how to run the code, without going into interpretive detail.
10 reviews
March 13, 2023
Has useful information, but the examples are difficult to get running because the library versions are out of date. I guess that's inevitable in such a fast-moving field.
