Kindle Notes & Highlights
Read between April 27 - May 26, 2020
So let’s put all this together in Bayes’ theorem, which simply says that

initial odds for a hypothesis × likelihood ratio = final odds for the hypothesis.

For the doping example, the initial odds for the hypothesis ‘the athlete is doping’ are 1/49, and the likelihood ratio is 19, so Bayes’ theorem says the final odds are given by 1/49 × 19 = 19/49. These odds of 19/49 can be transformed to a probability of 19/(19 + 49) = 19/68 ≈ 28%. So this probability, which was obtained from the expected frequency tree in a rather simple way, can also be derived from the general equation for Bayes’ …
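The odds form of Bayes’ theorem in this passage is easy to check numerically. Here is a minimal sketch using the passage’s numbers (initial odds of 1/49, likelihood ratio of 19); the function names are illustrative, not from the book.

```python
# Odds form of Bayes' theorem: final odds = initial odds x likelihood ratio.
# Numbers (1/49 and 19) come from the doping example in the passage.
from fractions import Fraction

def posterior_odds(prior_odds, likelihood_ratio):
    """Final odds for the hypothesis after seeing the evidence."""
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds a/b into the probability a/(a + b)."""
    return odds / (1 + odds)

prior = Fraction(1, 49)            # initial odds that the athlete is doping
lr = Fraction(19, 1)               # likelihood ratio from the test evidence
final = posterior_odds(prior, lr)  # 19/49
prob = odds_to_probability(final)  # 19/68, about 28%
print(final, float(prob))
```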
This value of 3/7 may seem odd, as the intuitive estimate might be 2/5 – the proportion of red balls landing to the left of the line. Instead Bayes showed that in these circumstances we should estimate the position as

(number of red balls landing to the left + 1) / (number of red balls thrown + 2).

This means, for example, that before any red balls are thrown at all, we can estimate the position to be (0 + 1)/(0 + 2) = ½, whereas the intuitive approach might suggest that we could not give any answer since there is not yet any data. Essentially Bayes is making use of the information about how the position of the line has been initially decided, since we know it is picked at random …
This highlight has been truncated due to consecutive passage length restrictions.
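The estimate Bayes derived for this setup is the (successes + 1)/(trials + 2) rule, which reproduces both of the passage’s values: ½ before any data, and 3/7 when 2 of 5 red balls land to the left. A quick sketch, with an illustrative function name:

```python
# Bayes' estimate of the line's position:
# (red balls landing to the left + 1) / (red balls thrown + 2).
from fractions import Fraction

def bayes_position_estimate(left_count, total_thrown):
    """Estimated position of the line after observing the red balls."""
    return Fraction(left_count + 1, total_thrown + 2)

print(bayes_position_estimate(0, 0))  # 1/2 -- an answer even before any data
print(bayes_position_estimate(2, 5))  # 3/7 -- not the intuitive 2/5
```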
The Bayesian response to this problem is known as multi-level regression and post-stratification (MRP). The basic idea is to break down all possible voters into small ‘cells’, each comprising a highly homogeneous group of people – say living in the same area, with the same age, gender, past voting behaviour, and other measurable characteristics. We can use background demographic data to estimate the number of people in each of these cells, and these are all assumed to have the same probability of voting for a certain party. The problem is working out what this probability is, when our …
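The post-stratification step of MRP can be sketched numerically: once each cell has an estimated probability, the overall figure is a population-weighted average across cells. The cell counts and probabilities below are invented for illustration; in real MRP the per-cell probabilities come from a multi-level regression model fitted to survey data.

```python
# Post-stratification: weight each cell's estimated probability by its
# share of the population. All figures here are hypothetical.

cells = [
    # (population count in cell, estimated probability of voting for the party)
    (12_000, 0.55),  # e.g. one area/age/gender combination
    (15_000, 0.40),
    (9_000, 0.62),
]

total_population = sum(count for count, _ in cells)
overall_estimate = sum(count * p for count, p in cells) / total_population
print(round(overall_estimate, 3))
```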
Robert Kass and Adrian Raftery are two renowned Bayesian statisticians who proposed a widely used scale for Bayes factors, shown in Table 11.3. Note the contrast with the scale in Table 11.2 for verbally interpreting likelihood ratios in legal cases, where a likelihood ratio of 10,000 was required to declare the evidence ‘very strong’, whereas a scientific hypothesis needs only a Bayes factor greater than 150. This perhaps reflects the need to establish criminal guilt ‘beyond reasonable doubt’, whereas scientific claims are made on weaker evidence, with many being overturned on …
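One way to see such a scale in use: the thresholds below follow the commonly quoted Kass–Raftery categories, since Table 11.3 itself is not reproduced in this excerpt – treat them as an assumption rather than the book’s exact wording. Only the ‘greater than 150’ threshold is confirmed by the passage.

```python
# A hedged sketch of a verbal scale for Bayes factors, using the
# commonly quoted Kass-Raftery thresholds (assumed, not from Table 11.3).

def describe_bayes_factor(bf):
    """Map a Bayes factor to a verbal strength-of-evidence category."""
    if bf > 150:
        return "very strong"
    if bf > 20:
        return "strong"
    if bf > 3:
        return "positive"
    return "not worth more than a bare mention"

print(describe_bayes_factor(200))  # very strong
print(describe_bayes_factor(10))   # positive
```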
Activities that are intended to create statistically significant results have come to be known as ‘P-hacking’. Although the most obvious technique is to carry out multiple tests and report the most significant, there are many more subtle ways in which researchers can exercise their degrees of freedom. Does listening to the Beatles’ song ‘When I’m Sixty-Four’ make you younger? You might feel fairly confident about the correct answer to this question, which makes it all the more impressive that Simonsohn and colleagues managed, admittedly by some fairly devious means, to get a significant …
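The most obvious P-hacking technique mentioned – running multiple tests and reporting only the most significant – can be demonstrated with a small simulation. Even when every null hypothesis is true, cherry-picking the best of 20 tests produces a ‘significant’ result far more often than 5% of the time. All numbers here are illustrative.

```python
# Simulate P-hacking: run k tests of a true null hypothesis and keep
# only the smallest p-value, then see how often it falls below 0.05.
import math
import random

random.seed(1)

def p_value_two_sided(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def best_of_k_tests(k):
    """Smallest p-value from k independent tests of a true null."""
    return min(p_value_two_sided(random.gauss(0, 1)) for _ in range(k))

trials = 10_000
false_positive_rate = sum(best_of_k_tests(20) < 0.05 for _ in range(trials)) / trials
print(false_positive_rate)  # close to 1 - 0.95**20, i.e. about 0.64, not 0.05
```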
Ten Questions to Ask When Confronted by a Claim Based on Statistical Evidence

HOW TRUSTWORTHY ARE THE NUMBERS?

How rigorously has the study been done? For example, check for ‘internal validity’, appropriate design and wording of questions, pre-registration of the protocol, taking a representative sample, using randomization, and making a fair comparison with a control group.

What is the statistical uncertainty/confidence in the findings? Check margins of error, confidence intervals, statistical significance, sample size, multiple comparisons, systematic bias.

Is the summary appropriate? …
These ‘rules’ should be fairly self-evident, and rather neatly summarize the issues tackled in this book.

Statistical methods should enable data to answer scientific questions: ask ‘why am I doing this?’, rather than focusing on which particular technique to use.

Signals always come with noise: it is trying to separate out the two that makes the subject interesting. Variability is inevitable, and probability models are useful as an abstraction.

Plan ahead, really ahead: this includes the idea of pre-specification in confirmatory experiments – avoiding researcher degrees of freedom.

Worry about …