To solve OpenAI’s data bottleneck, Brockman turned to a new source: YouTube. OpenAI had previously avoided this option: scraping YouTube to train its models, YouTube’s CEO would later confirm, violated the platform’s terms of service. But under the new existential pressure for more data, the question became whether YouTube, or its parent, Google, would enforce those terms. If Google cracked down, it could jeopardize its own ability to scrape other websites for its large language model development. Brockman was willing to take the risk.

