Doug Lautzenheiser

22%
Flag icon
To avoid the drawback of simple random sampling, you can first divide your population into the groups that you care about and sample from each group separately. For example, to sample 1% of data that has two classes, A and B, you can sample 1% of class A and 1% of class B. This way, no matter how rare class A or B is, you’ll ensure that samples from it will be included in the selection. Each group is called a stratum, and this method is called stratified sampling. One drawback of this sampling method is that it isn’t always possible, such as when it’s impossible to divide all samples into ...more
Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications
Rate this book
Clear rating
Open Preview