Azka’s Kindle Notes & Highlights

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, by Foster Provost

what about supervised segmentations for regression problems — problems with a numeric target variable? Looking at reducing the impurity of the child subsets still makes intuitive sense, but information gain is not the right measure, because entropy-based information gain is based on the distribution of the properties in the segmentation. Instead, we would want a measure of the purity of the numeric (target) values in the subsets.