Simply put, cross-validation means assessing not only how well a model fits the data it’s given, but how well it generalizes to data it hasn’t seen. Paradoxically, this may involve using less data. In the marriage example, we might “hold back,” say, two points at random, and fit our models only to the other eight. We’d then take those two test points and use them to gauge how well our various functions generalize beyond the eight “training” points they’ve been given. The two held-back points function as canaries in the coal mine: if a complex model nails the eight training points but wildly
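To make the hold-back idea concrete, here is a minimal sketch in Python. It assumes NumPy is available, and the ten data points are synthetic stand-ins invented for illustration, not the figures from the marriage example. The polynomial degrees (1, 2, and 7) are likewise just a hypothetical range from simple to complex. Two points are held back at random, each model is fit to the remaining eight, and the error on both sets is reported.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Hypothetical data: ten noisy points around a gentle downward trend.
# These values are made up for illustration only.
x = np.arange(1, 11, dtype=float)
y = 10.0 - 0.5 * x + rng.normal(scale=0.5, size=x.size)

# Hold back two points at random; the other eight are the training set.
test_idx = rng.choice(x.size, size=2, replace=False)
train_mask = np.ones(x.size, dtype=bool)
train_mask[test_idx] = False

x_train, y_train = x[train_mask], y[train_mask]
x_test, y_test = x[~train_mask], y[~train_mask]


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return float(np.mean((model(xs) - ys) ** 2))


# Fit models of increasing complexity to the eight training points only.
# A degree-7 polynomial can pass through all eight of them exactly.
results = {}
for degree in (1, 2, 7):
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (
        mse(model, x_train, y_train),  # how well it fits what it was given
        mse(model, x_test, y_test),    # how well it generalizes
    )

for degree, (train_err, test_err) in results.items():
    print(f"degree {degree}: train MSE = {train_err:.4g}, test MSE = {test_err:.4g}")
```

The number to watch is the gap between the two columns. Training error can only shrink as the degree goes up, so by itself it says nothing about which model to trust; the error on the two held-back points is the part that reflects generalization.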