AI has acquired startling new language capabilities in just the past few years. Driven by the rapid advances in deep learning, language AI systems are able to write and understand text better than ever before. This trend enables the rise of new features, products, and entire industries. With this book, Python developers will learn the practical tools and concepts they need to use these capabilities today.
You'll learn how to use the power of pre-trained large language models for use cases like copywriting and summarization; create semantic search systems that go beyond keyword matching; build systems that classify and cluster text to enable scalable understanding of large amounts of text documents; and use existing libraries and pre-trained models for text classification, search, and clustering.
This book also shows you how to:

- Build advanced LLM pipelines to cluster text documents and explore the topics they belong to
- Build semantic search engines that go beyond keyword search with methods like dense retrieval and rerankers (see the sketch after this list)
- Learn various use cases where these models can provide value
- Understand the architecture of underlying Transformer models like BERT and GPT
- Get a deeper understanding of how LLMs are trained
- Understand how different methods of fine-tuning optimize LLMs for specific applications (generative model fine-tuning, contrastive fine-tuning, in-context learning, etc.)
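For a flavor of what "beyond keyword search" looks like in code, here is a minimal dense-retrieval sketch using the sentence-transformers library; the model name and the toy corpus are illustrative choices, not taken from the book.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative embedding model; any sentence-transformers checkpoint works.
model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "How to fine-tune a language model",
    "Recipe for chocolate chip cookies",
    "Semantic search with dense embeddings",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Embed the query and retrieve the closest documents by cosine similarity.
query_embedding = model.encode("finding documents by meaning", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]

for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])
```

Note that the query shares no keywords with the top result; the match comes from the embedding space, which is the point of dense retrieval.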
Needless to say, the field of Large Language Model (LLM) based AI systems is moving at breakneck speed, and the biggest technology companies are in cutthroat competition to push the state of the art and provide smarter systems at ever-decreasing prices, which, in turn, fuels even more research into better systems. To give a concrete example, even before I finished the book, I learned about a new development in which two researchers from Cornell University introduced "Contextual Document Embeddings". At this pace, some parts of the book will probably be outdated within a few years, so practitioners who build and integrate LLM-powered AI systems still have some time to put this valuable knowledge to real-life work.
As final remarks, I also appreciate that the authors dedicated some chapters to the excellent Sentence Transformers (a.k.a. SBERT) Python library and its practical use cases, as well as the SetFit framework for efficient few-shot learning with Sentence Transformers. I also liked the nod to the venerable spaCy in one of the diagrams! ;)
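To give a taste of why SetFit is appealing, here is a minimal few-shot classification sketch; it follows the classic setfit quickstart API (which may differ across library versions), and the texts and labels are made up for illustration.

```python
from datasets import Dataset
from setfit import SetFitModel, SetFitTrainer

# A small SBERT backbone; any sentence-transformers checkpoint can be used.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

# Few-shot training data: just a handful of labeled examples.
train_dataset = Dataset.from_dict({
    "text": ["loved it", "great read", "waste of money", "very disappointing"],
    "label": [1, 1, 0, 0],
})

trainer = SetFitTrainer(model=model, train_dataset=train_dataset)
trainer.train()

print(model.predict(["an excellent hands-on guide"]))  # e.g. [1]
```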
This book strikes a good balance in how deeply the subject is presented. It is not as shallow as what we usually see in blog posts, nor as deep as an academic paper. I think this level of detail is ideal for developers and engineering managers who want to gain a better understanding of LLMs without delving too deeply into the mathematics. Additionally, it includes a lot of code examples.
The book also features numerous diagrams that help explain the topics.
However, it has some drawbacks. Not all subjects are well explained. At times, it presents diagrams without further explanation, as if they were self-explanatory—but that is not always the case. Furthermore, the chapter on LLM fine-tuning contains significantly fewer code examples than the others.
Overall, it is a very good book on a topic and at a level of detail that is not well covered by other books. However, there is room for improvement.
Hands-On Large Language Models by Jay Alammar and Maarten Grootendorst is a very practical and accessible guide for those with a solid background in applied math and some hands-on Python experience. The book stands out for its clear, well-designed diagrams that help demystify key concepts and make learning faster. However, some sections—particularly on image generation—lack clarity and could use more depth. It’s also a bit frustrating that the book doesn’t explore the inner workings of transformers in more detail, especially since Jay Alammar has already created excellent visual content on this topic in his blog. A bonus chapter on advanced techniques like GQA or Mixture of Experts would have been a great addition. Still, I give it 4 stars because it allowed me to progress quickly without diving too deep into GPU or devops topics, which made it very effective for my learning needs.
I picked up a copy of this book just before Christmas 2024. Several AI-related YouTube channels I follow had recommended it, and for good reason. This book is exactly what the title says, "Hands-On," and is a great follow-along way to get up to speed on running and using LLMs. The visual illustrations are stunning - very detailed and rich. One of the best ways to learn about technology is through visual intuitions, and this book doesn't disappoint.
Although the book came out in October 2024 (which may seem like a long time ago in AI), it has aged very well and is still very relevant. The authors share all of their code on GitHub, so you can be assured of updates to the material as advances are made.
I was so inspired by this book that I decided to run an LLM locally on my MacBook. You can read about that in my blog post - https://blog.marketingdatascience.ai/...
So if you want to dabble in the technology and know a little Python, this book is for you. It's not a reference (you can find those online), nor is it a tutorial (you have YouTube for that). It's a great hands-on experience. Just reading about AI doesn't quite make it all sink in; sometimes you have to do AI, and that's what this book is for. Enjoy.
This is not my typical book, but I don't particularly mind reading about concepts I have little knowledge of to get a better understanding of them.
It was definitely very helpful in understanding what Large Language Models are and how they are used. I started this book after I read an article about Transformers. The article attempted to break down the complex subject in a way that made it accessible to a person with no knowledge of LLMs and Transformers, what they do, and how we use them every day.
This is definitely an interesting book, and I would recommend it if you're into technology.
There's so much to love about this book! The illustrations fortify one's intuition across the extensive and expansive knowledge base of rapid AI development: timely and apt! Bravo to Alammar & Grootendorst!
As a junior in the field of Large Language Models (LLMs), I found this book to be incredibly informative and well-structured. It provided a comprehensive overview of key concepts, including LLM architecture, embeddings, prompt engineering, and fine-tuning. The explanations were clear and accessible, making complex topics easier to grasp for someone still building their foundational knowledge.
What stood out to me was the practical approach the authors took, offering actionable insights and examples that helped solidify my understanding.
I have read quite a few books on data science and machine learning because I teach professional education courses at Eindhoven University of Technology.
This book is currently at the top of the list of standard books that I use in my class. It is well-written and practical, and it does an excellent job of providing intuitive, visual explanations of how large language models work, without going all out with linear algebra and intricate math.
I liked the balance this book struck between delving into the theory of the models/architectures and their applications. It's probably the most up-to-date book on the field of NLP out there right now, and it does a good job of getting readers up to speed with the latest applications. My only gripe is that the last chapter felt a little rushed, given how important supervised fine-tuning (SFT) is to making LLM applications work.
One of the best books I've read. If you want to understand the basic concepts of LLMs and also get hands-on writing and trying out code, look no further. The book is structured to explain each concept first and then work through hands-on exercises on that specific concept, almost like a training course. Also, the illustrations are simply great, and there are lots of them, which helps the material sink in much better. Worth the investment!
This book brings so much clarity to many LLM terms: from the building blocks of the Transformer architecture (tokenizers, embeddings, self-attention) and fine-tuning, all the way to various real use cases of applying LLMs to solve a variety of problems. I really recommend this amazing book. Concise and easy to comprehend.
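To make one of those building blocks concrete, here is a tiny tokenizer demo using Hugging Face transformers; the checkpoint choice is mine, not the book's.

```python
from transformers import AutoTokenizer

# Load the tokenizer that ships with a pre-trained model.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Text becomes subword tokens, then integer IDs that the model's
# embedding layer consumes.
tokens = tokenizer.tokenize("Large language models are surprisingly capable.")
ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)
print(ids)
```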
A very beginner-friendly book on the practical use of LLMs. If you have some experience with LLMs and NLP, along with some problems at hand, this will be a quick guide to get you started. It combines well an overall understanding of the topic with practical examples via the presented Python code. The authors use the transformers library for their examples.
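For anyone curious about the level of entry, a one-liner with the transformers pipeline API is roughly where the hands-on material starts; the task and input below are my own illustration, not an example from the book.

```python
from transformers import pipeline

# Downloads a small default model the first time it runs.
classifier = pipeline("sentiment-analysis")
print(classifier("This book makes LLMs approachable."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```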
Illustrations were useful in many cases, but also misleading in some. Overall, it is a bit too theoretical with concepts; for instance, LLM fine-tuning is explained using very "artificial" examples instead of real scenarios.
Focused, concise, and to the point. Good structure. Well-chosen subjects. I hope the book will continue to be updated and expanded to cover more ground as the field evolves.
I really liked the illustrations and working examples in this book that make working with LLMs more approachable. I did feel that the illustrations sometimes interrupted the flow of the material. Overall, a very good read for anyone wanting to play with the guts of LLMs.
To date, this is one of the best books on Large Language Models (LLMs) I have read. It is well-structured, easy to follow, and includes excellent diagrams that enhance the reader's understanding of each concept. A must-read for anyone eager to explore the field.