I highly recommend this book to anyone planning (or considering) to do science, whether a bachelor's, a master's, or more. It's a great overview of how science is actually practiced, and of how easily it can go wrong. I also recommend it to current scientists, because it's a humbling reminder of what we're doing wrong, and a quick update on things we may have been taught as facts that have since been disproven.
The book is exceptionally well structured, with very clear, engaging writing: it gives just as much information as needed to understand a given concept, then compelling examples, then a discussion of why it matters, what people might object to, and so on. Really, really good.
However, the author fails to give proper due to the main strength of science: its ability to self-correct. This book is described as an "exposé", but in reality everything he mentions has been known for decades, and in fact every single example he gives of fraud, negligence, bias, or unwarranted hype was uncovered not by external journalists but by other scientists. It was peers who read papers that looked suspicious and did more digging, or whole careers built around developing software and tools for automatically detecting plagiarism, statistical errors, etc. It was psychology itself that "wrote the book" on the biases that were fundamental to exposing the biases of scientists themselves. And more often than not, it was simply a later study, one that tried something better that should have worked but didn't, that disproved a flimsy hypothesis. Sure, fraud, hype, bias, and negligence are dragging science down, but science isn't "broken"; it's just inefficient. Wasting money on bad experiments and bad scientists needs to be avoided, but in the end, a better truth tends to bubble up regardless. Anyone who has had to defend science against religious diehards will be particularly aware of this.
Also missing is proper consideration as to why these seemingly blindingly obvious problems have been going on for so long. As an insider, here are some of my answers:
- All this p-hacking (trying different analyses until something is significant). Scientists are not neatly divided into those who immediately find their results because of how fantastically well they planned their study, and those who desperately try to make their rotten data significant. Every. Single. Study. has to fine-tune its analysis once the data are in, not before. Unless you are in fact replicating something, you have absolutely no idea what the data will look like, or what the most meaningful way to look at them is. This means you can't just tell scientists "stop p-hacking!"; you need an approach that acknowledges this critical step. Fortunately, an honest one exists that can be borrowed from machine learning: splitting your data into a "training" and a "testing" dataset, where you fine-tune your analysis pipeline on a small subset, then rely on the results from the larger one, using only and exactly the pipeline you previously developed, without further tweaking (see the first sketch after this list).
- The file drawer problem (null results not getting published). I think especially in the field of psychology, statistics courses are to blame for this; we don't reeeally understand how the stats work, so we rely on Important Things To Remember that we're taught by statisticians, and one of these is that "you can't prove a null hypothesis". In practice this ends up being interpreted as "null results are not real results, because nothing was proven". We are actively discouraged from interpreting "absence of evidence as evidence of absence", but sometimes that is exactly what we should be doing; certainly not with the same confidence and in the same way with which we interpret statistically significant positive results, but at some point, a study that should have found something but didn't is a meaningful indication that that thing might not in fact be there. A useful tool for breaking through this narrow-minded focus on positive results is equivalence testing (sometimes described as similarity testing), where you test not whether two groups are different but whether they are statistically "the same", within bounds you declare in advance (also sketched after the list). This is a huge shift in mindset for many psychologists, who suddenly learn that you can in fact have a legitimate result showing that there was no difference to be found. I suspect that knowing this will make people less wary of null results in general.
- Proper randomization (and generally the practicalities of data collection). The author at some point calls it a mistake that a Mediterranean Diet trial assigned the same diet to members of the same family unit, thus breaking the randomization. For the love of God, does he not know how families work? You cannot honestly ask members of the same family to eat differently! Sure, the authors should have implemented proper statistical corrections for this (one such correction is sketched after the list), but sometimes you have to design experiments for reality, not a spherical world.
- Reviewers nudging authors to cite them. This may look like blatant self-promotion, but it's worth remembering that peer reviewers are specifically selected as members of the EXACT SAME FIELD, so the odds are good that they have in fact published relevant work, and even better that they are familiar enough with it to recommend it. That is not to say that none of it is about racking up citations, but it shouldn't be presumed to be until proven otherwise, because legitimate alternative explanations exist.
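To make the training/testing idea from the p-hacking point concrete, here is a minimal sketch of an exploratory/confirmatory split using pandas and scikit-learn. The file name, column names, and the outlier rule are invented for illustration, not taken from any real study:

```python
import pandas as pd
from scipy import stats
from sklearn.model_selection import train_test_split

# Hypothetical dataset: one row per participant, with a condition label
# and an outcome measure (column names are made up for illustration).
df = pd.read_csv("experiment_data.csv")

# Hold most of the data back; only ~20% is used for exploring the analysis.
explore, confirm = train_test_split(df, test_size=0.8, random_state=42)

# --- Exploration phase: tweak filters, outlier rules, transforms, etc. ---
# e.g. decide here that reaction times beyond 3 SD of the mean get dropped.
cutoff = explore["reaction_time"].mean() + 3 * explore["reaction_time"].std()

# --- Confirmation phase: apply the frozen pipeline, exactly once. ---
clean = confirm[confirm["reaction_time"] < cutoff]
group_a = clean[clean["condition"] == "A"]["reaction_time"]
group_b = clean[clean["condition"] == "B"]["reaction_time"]
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"Confirmatory test: t = {t_stat:.2f}, p = {p_value:.4f}")
```

The point is simply that all the tinkering happens on the small `explore` set, and the result you report comes from the untouched `confirm` set.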
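And for the equivalence-testing point: a minimal sketch of the TOST (two one-sided tests) procedure via statsmodels. The simulated data and the ±0.5 equivalence bounds are made up; in a real study the bounds would be justified in advance as the smallest difference you'd care about:

```python
import numpy as np
from statsmodels.stats.weightstats import ttost_ind

rng = np.random.default_rng(0)

# Two hypothetical groups whose true means are essentially the same.
group_a = rng.normal(loc=10.0, scale=2.0, size=120)
group_b = rng.normal(loc=10.1, scale=2.0, size=120)

# A standard t-test asks: "are the means different?"
# TOST asks the opposite: "is the difference small enough to call equivalent?"
# Here the equivalence region is declared as -0.5 to +0.5 units.
p_value, lower_test, upper_test = ttost_ind(group_a, group_b, low=-0.5, upp=0.5)

print(f"TOST p-value: {p_value:.4f}")
# A small p-value here is positive evidence that the groups are practically
# equivalent, i.e. a "null result" that actually supports a conclusion.
```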
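As for what "proper statistical corrections" could look like when whole families share a diet, one common option is a mixed model with a random intercept per family, so that people in the same household are not treated as independent. A rough sketch with invented file and column names, using statsmodels:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical trial data: one row per participant, with the assigned diet,
# an outcome (say, a cardiovascular risk score), and a family identifier.
df = pd.read_csv("diet_trial.csv")

# Random intercept per family: participants within the same household are
# allowed to be correlated, so the diet effect isn't overstated.
model = smf.mixedlm("risk_score ~ diet", data=df, groups=df["family_id"])
result = model.fit()
print(result.summary())
```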
Another little detail not mentioned by the author is that good science is f*cking hard. For my current experiment, I need a passing understanding of electrical engineering to run the recording equipment, a basic understanding of signal processing and matrix mathematics to clean and analyze the data, a good understanding of psychology for experimental design, a deep understanding of neuroscience for the actual field I'm experimenting in, a solid grasp of statistics, sufficient English writing skills, separate coding skills for experimental tasks and data analysis in two different languages, and now, suddenly, a passing understanding of hospital-grade hygiene practices to deal with COVID! There's just SO MUCH that can go wrong, and a failure at any point will ruin everything else. It's exhausting to juggle all that, and honestly, it's amazing that we have any valid results coming out at all. The only real solution is larger teams and less focus on individual achievements. The more eyes on the scripts, the fewer bugs; the more assistants available to collect data, the fewer mishaps; the more people reading the paper beforehand, the fewer mistakes slip through. We need publications from labs, not author lists; the exact contribution of each person can be specified somewhere, but science needs to move away from this model of venerating the individual, because this is not the 19th century anymore: the best science comes from groups. On CVs, we shouldn't write lists of publications, we should write project descriptions (and cite the paper as "further reading", not as an end in and of itself).
~~~
Scientists need the wake-up call from this book. Journalists and interested laypeople will also greatly benefit from understanding why a healthy dose of scepticism is needed towards any single scientific result, and how scientists are human too. But the take-home message that could transpire from this book, and which is not actually true, is that scientists are incompetent, dishonest, or both. The author repeatedly bashes poor science and poor science communication for eroding public trust in science, yet ironically this book highlights those failings in neon letters and does its own share of eroding that trust. To some extent that is warranted, but the author could have done more to defend the institution where it deserves defending, and, as an insider, could have done more to talk about the realities an individual scientist faces when they make these poor decisions. It's worth mentioning that science has not gotten worse: we're still making discoveries, still disproving our colleagues, and still improving quality of life. We could just be doing it more efficiently.