Even without the trial-and-error of classic p-hacking, then, scientists who don’t come to their data with a proper plan can end up analysing themselves into an unreplicable corner. Why unreplicable? Because at each fork in the path, they are being strung along by the data: making a choice that looks like it might lead to p < 0.05 in that dataset, but that won’t necessarily do the same in others. This is the trouble with all kinds of p-hacking, whether explicit or otherwise: they cause the analysis to – using the technical term – overfit the data.73 In other words, the