Jeffrey

learn from carefully hand-labeled data. Quite often the quality of the AI’s predictions depends on the quality of the labels in the training data. However, a key ingredient of the LLM revolution is that for the first time very large models could be trained directly on raw, messy, real-world data, without the need for carefully curated and human-labeled data sets. As a result, almost all textual data on the web became useful. The more the better. Today’s LLMs are trained on trillions of words. Imagine digesting Wikipedia wholesale, consuming all the subtitles and comments on YouTube, reading ...
The Unlabeled Feast: How LLMs Devour the Internet
The Coming Wave: AI, Power, and Our Future