Data scientists have well-established tools for estimating how much data a study needs, given the size of the effect they expect to find and how confident they need to be in the result. These tools are called “power calculations,” and they tell you how many units you need to analyze to have a good chance of detecting that effect and so generate a useful prediction.
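As a rough illustration of what a power calculation involves, the sketch below estimates the number of units needed per group to compare two group means. It uses the standard normal approximation, n = 2 * ((z_alpha + z_power) / d)^2, which is one common textbook formula and not necessarily the method any particular tool uses. The function name `sample_size_per_group`, the standardized effect size `d`, and the defaults of a 5% significance level and 80% power are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate units needed per group for a two-sided, two-group comparison of means.

    effect_size: expected difference in means divided by the standard deviation
                 (a standardized effect; hypothetical input for this sketch).
    alpha:       acceptable false-positive rate (assumed default 0.05).
    power:       desired probability of detecting a real effect (assumed default 0.80).
    """
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must be between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = standard_normal.inv_cdf(power)          # quantile matching the desired power

    # Normal-approximation formula; exact t-based methods give slightly larger numbers.
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # A medium standardized effect (0.5) needs far fewer units than a small one (0.2).
    for d in (0.2, 0.5, 0.8):
        print(f"effect size {d}: about {sample_size_per_group(d)} units per group")
```

The point to notice is how quickly the requirement grows as the expected effect shrinks: halving the effect size roughly quadruples the number of units needed.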