Again and again we were asked about the mathematical core of the machine learning algorithm we chose—a probabilistic technique called a “Bayesian network”—but not a single question about the data we trained it on. While that wasn’t unusual—data was not so subtly dismissed as an inert commodity that only mattered to the extent that algorithms required it—I began to realize that we’d underestimated something important.

