Dan Pfeiffer

8%
Flag icon
Many of the AI companies keep the source text they train from, called training corpuses, secret, but a typical example of training data largely consists of text pulled from the internet, public domain books and research articles, and assorted other free sources of content that researchers can find. Actually looking into these sources in detail reveals some odd material. For example, the entire email database of Enron, shut down for corporate fraud, is used as part of the training material for many AIs, simply because it was made freely available to AI researchers. Similarly, there is a ...more
Co-Intelligence: Living and Working with AI
Rate this book
Clear rating
Open Preview