Kindle Notes & Highlights
The choice of 0.05 isn’t because of any special logical or statistical reasons, but it has become scientific convention through decades of common use.
Kenneth Rothman, an associate editor at the American Journal of Public Health in the mid-1980s, began returning submissions with strongly worded letters: “All references to statistical hypothesis testing and statistical significance should be removed from the paper. I ask that you delete p values as well as comments about statistical significance. If you do not agree with my standards (concerning the inappropriateness of significance tests), you should feel free to argue the point, or simply ignore what you may consider to be my misguided view, by publishing elsewhere.” [12] During Rothman’s ...
Curiously, the problem of underpowered studies has been known for decades, yet it is as prevalent now as it was when first pointed out. In 1960 Jacob Cohen investigated the statistical power of studies published in the Journal of Abnormal and Social Psychology [8] and discovered that the average study had a power of only 0.48 for detecting medium-sized effects.
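To give a feel for what a power of about 0.48 means, here is a minimal sketch that approximates the power of a two-sided, two-sample comparison using the normal approximation. The function name `approx_power`, the sample sizes, and the reading of "medium-sized" as a standardized effect of 0.5 (Cohen's usual convention) are assumptions for illustration; they are not taken from the highlighted passage.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    This uses the normal approximation, so it slightly overstates
    the power of a t-test at small sample sizes.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing beyond either critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Assumed "medium" effect of d = 0.5 at several hypothetical group sizes.
for n in (20, 30, 64, 100):
    print(f"n per group = {n:3d}  power ~ {approx_power(0.5, n):.2f}")
```

Under these assumptions, roughly 30 subjects per group gives power near one half, while reaching the commonly recommended 0.8 takes about twice as many.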
Top-ranked journals, such as Nature and Science, prefer to publish studies with groundbreaking results, meaning large effect sizes in novel fields with little prior research.
Statistical techniques do not magically eliminate dependence between measurements or allow you to obtain good results with poor experimental design. They merely provide ways to quantify dependence so you can correctly interpret your data. (This means they usually give wider confidence intervals and larger p values than the naive analysis.)
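The parenthetical point about wider intervals can be illustrated with a small simulation. This is a sketch under invented assumptions: 10 hypothetical subjects measured 20 times each, with a shared per-subject offset that makes repeated measurements dependent. The naive analysis treats all 200 values as independent. The alternative shown here, averaging within each subject first, is just one simple way to account for the dependence, not the only one.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n_clusters = 10    # hypothetical number of subjects
per_cluster = 20   # hypothetical repeated measurements per subject

clusters = []
for _ in range(n_clusters):
    # Offset shared by every measurement on the same subject.
    cluster_effect = random.gauss(0, 1)
    clusters.append(
        [cluster_effect + random.gauss(0, 1) for _ in range(per_cluster)]
    )

# Naive analysis: pretend all 200 measurements are independent.
all_values = [v for c in clusters for v in c]
naive_se = statistics.stdev(all_values) / len(all_values) ** 0.5

# Dependence-aware analysis: one mean per subject, so n is 10, not 200.
cluster_means = [statistics.mean(c) for c in clusters]
cluster_se = statistics.stdev(cluster_means) / n_clusters ** 0.5

print(f"naive standard error:   {naive_se:.3f}")
print(f"cluster standard error: {cluster_se:.3f}")
```

The cluster-based standard error comes out larger, so the corresponding confidence interval is wider. The data did not get worse; the naive figure was simply overconfident.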
When news came from the Large Hadron Collider that physicists had discovered evidence for the Higgs boson, a long-theorized fundamental particle, every article tried to quote a probability: “There’s only a 1 in 1.74 million chance that this result is a fluke,” or something along those lines. But every news source quoted a different number. Not only did they ignore the base rate and misinterpret the p value, but they couldn’t calculate it correctly either.
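One plausible reason the quoted numbers disagreed is that the same significance level converts to different "1 in N" figures depending on whether one tail or both tails of the normal distribution are counted. The sketch below assumes the customary five-sigma discovery threshold of particle physics, which the passage itself does not state, and the function name `tail_probability` is hypothetical.

```python
import math


def tail_probability(sigma, two_sided=False):
    """Probability of a standard normal value at least `sigma` away from 0.

    One-sided counts only the upper tail; two-sided counts both tails.
    """
    one_tail = 0.5 * math.erfc(sigma / math.sqrt(2))
    return 2 * one_tail if two_sided else one_tail


for two_sided in (False, True):
    p = tail_probability(5, two_sided=two_sided)
    label = "two-sided" if two_sided else "one-sided"
    print(f"5 sigma, {label}: p = {p:.3g}, about 1 in {1 / p / 1e6:.2f} million")
```

The two-sided figure is about 1 in 1.74 million, matching the number quoted above, while the one-sided figure is about 1 in 3.5 million. Either way, the value is the probability of data at least this extreme if there were no particle, not the probability that the result is a fluke.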