Matt Parker

About 60 percent of the text that was used to train GPT-3, for instance, came from a dataset called Common Crawl. This is a free, massive, and regularly updated database that researchers use to collect raw web page data and text from billions of web pages.
Supremacy: AI, ChatGPT, and the Race that Will Change the World