For GPT-2, Radford had been selective about what went into the training data. He scraped the text of articles and web pages whose links had been shared on Reddit and had received at least three upvotes there. The result was a forty-gigabyte trove of some eight million documents, which he named WebText.

