The Art of Statistics: Learning from Data
Read between January 12, 2021 and December 8, 2023
49%
a confidence interval is the range of population parameters for which our observed statistic is a plausible consequence.
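As a minimal sketch (not from the book, with invented data), this estimates a 95% confidence interval for a mean using the usual normal approximation of roughly ±2 standard errors:

```python
import numpy as np

# Hypothetical sample of 50 measurements (invented data, for illustration only)
rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=50)

mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(len(sample))   # standard error of the mean

# Approximate 95% confidence interval (normal approximation: +/- 1.96 standard errors)
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se
print(f"mean = {mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```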
51%
A hypothesis can be defined as a proposed explanation for a phenomenon.
51%
The term apophenia describes the capacity to see patterns where they do not exist, and it has been suggested that this tendency might even confer an evolutionary advantage – those ancestors who ran away from rustling in the bushes without waiting to find out whether it was definitely a tiger may have been more likely to survive.
52%
While this attitude may be fine for hunter-gatherers, it cannot work in science – indeed, the whole scientific process is undermined if claims are just figments of our imagination. There must be a way of protecting us against false discoveries, and hypothesis testing attempts to fill that role.
53%
A P-value is the probability of getting a result at least as extreme as we did, if the null hypothesis (and all other modelling assumptions) were really true.
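A simulation-style reading of that definition, assuming a made-up example of 60 heads in 100 coin tosses, with a fair coin as the null hypothesis:

```python
import numpy as np

# Observed: 60 heads in 100 tosses. Null hypothesis: the coin is fair (p = 0.5).
observed_heads = 60
n_tosses = 100

rng = np.random.default_rng(1)
# Simulate the distribution of the head count if the null hypothesis were really true
sims = rng.binomial(n=n_tosses, p=0.5, size=100_000)

# Two-sided P-value: probability of a result at least as extreme as 60 heads,
# i.e. at least as far from the expected 50 in either direction
extreme = np.abs(sims - 50) >= abs(observed_heads - 50)
print(f"Approximate two-sided P-value: {extreme.mean():.3f}")
```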
75%
Bayesian approach: the approach to statistical inference in which probability is used not only for aleatory uncertainty, but also for epistemic uncertainty about unknown facts. Bayes’ theorem is then used to revise these beliefs in the light of new evidence.
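A small worked application of Bayes' theorem, with invented numbers, showing a prior belief about a condition being revised after a positive screening test:

```python
# Bayes' theorem: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
# All numbers below are invented for illustration.
prior = 0.01          # epistemic belief before the test: 1% prevalence
sensitivity = 0.90    # P(positive | disease)
false_positive = 0.05 # P(positive | no disease)

p_positive = sensitivity * prior + false_positive * (1 - prior)
posterior = sensitivity * prior / p_positive
print(f"P(disease | positive test) = {posterior:.3f}")   # about 0.154
```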
75%
Central Limit Theorem: the tendency for the sample mean of a set of random variables to have a normal sampling distribution, regardless (with certain exceptions) of the shape of the underlying distribution of the random variable. If n independent observations each have mean μ and variance σ², then under broad assumptions their sample mean is an estimator of μ, and has an approximately normal distribution with mean μ, variance σ²/n, and standard deviation σ/√n (also known as the standard error of the estimator).
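A quick simulation of that claim, using an invented skewed (exponential) variable: the means of repeated samples of size n cluster around μ with spread close to σ/√n.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100                      # sample size
mu, sigma = 1.0, 1.0         # an exponential(1) variable has mean 1 and standard deviation 1

# Draw 10,000 samples of size n and take each sample's mean
sample_means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)

print(f"mean of sample means: {sample_means.mean():.3f}  (theory: {mu})")
print(f"sd of sample means  : {sample_means.std(ddof=1):.3f}  (theory sigma/sqrt(n): {sigma/np.sqrt(n):.3f})")
```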
77%
interactions: when multiple explanatory variables combine to produce an effect different from that expected from their individual contributions.
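A tiny invented illustration: in this 2 × 2 setup the combined effect of two variables is larger than the sum of their separate effects, which is the signature of an interaction.

```python
import numpy as np

# Invented mean outcomes for a 2 x 2 design (rows: drug no/yes, columns: exercise no/yes)
means = np.array([[10.0, 12.0],    # no drug: exercise adds 2
                  [13.0, 20.0]])   # drug:    exercise adds 7

effect_exercise_alone = means[0, 1] - means[0, 0]   # 2
effect_drug_alone = means[1, 0] - means[0, 0]       # 3
combined = means[1, 1] - means[0, 0]                # 10

# With no interaction, the combined effect would be roughly 2 + 3 = 5
print(f"additive prediction: {effect_exercise_alone + effect_drug_alone}, observed: {combined}")
```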
77%
Law of Large Numbers: the process by which the sample mean of a set of random variables tends towards the population mean.
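A short simulation of that tendency, using simulated rolls of a fair die whose population mean is 3.5:

```python
import numpy as np

rng = np.random.default_rng(3)
rolls = rng.integers(1, 7, size=100_000)          # fair six-sided die, population mean 3.5

running_mean = np.cumsum(rolls) / np.arange(1, len(rolls) + 1)
for n in (10, 100, 10_000, 100_000):
    print(f"after {n:>7} rolls: running mean = {running_mean[n - 1]:.3f}")
```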
78%
machine learning: procedures for extracting algorithms, say for classification, prediction or clustering, from complex data.
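A minimal sketch of the classification case, using scikit-learn's bundled iris data; the choice of library and of a k-nearest-neighbours classifier is mine, not the book's.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Learn a classification rule from labelled data, then check it on held-out cases
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```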
80%
prosecutor’s fallacy: when a small probability of the evidence, given innocence, is mistakenly interpreted as the probability of innocence, given the evidence.
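A worked contrast between the two probabilities being confused, using invented numbers: a forensic trait shared by 1 in 10,000 people, a city of a million, and exactly one culprit.

```python
# Invented numbers, for illustration only.
p_evidence_given_innocent = 1 / 10_000
population = 1_000_000
innocent_matches = (population - 1) * p_evidence_given_innocent   # about 100 innocent matches
guilty_matches = 1

# Probability of innocence given the evidence, assuming everyone was equally likely a priori
p_innocent_given_evidence = innocent_matches / (innocent_matches + guilty_matches)
print(f"P(evidence | innocent) = {p_evidence_given_innocent:.4f}")   # 0.0001
print(f"P(innocent | evidence) = {p_innocent_given_evidence:.3f}")   # about 0.990
```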
81%
signal and the noise: the idea that observed data arises from two components: a deterministic signal which we are really interested in, and random noise that comprises the residual error. The challenge of statistical inference is to appropriately identify the two, and not be misled into thinking that noise is actually a signal.
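A small sketch of that decomposition with invented data: a straight-line signal plus random noise, recovered by a least-squares fit whose residuals stand in for the noise.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 50)
signal = 2.0 + 0.5 * x                          # deterministic signal (invented)
y = signal + rng.normal(scale=1.0, size=x.size) # observed data = signal + noise

# Fit a straight line; fitted values estimate the signal, residuals estimate the noise
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
residuals = y - fitted
print(f"estimated signal: y = {intercept:.2f} + {slope:.2f}x")
print(f"residual standard deviation (noise estimate): {residuals.std(ddof=2):.2f}")
```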
81%
Simpson’s paradox: when an apparent relationship reverses its sign when a confounding variable is taken into account.
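An invented numerical example of such a reversal: treatment A does better within each severity group yet worse overall, because A was mostly given to the severe cases.

```python
# Invented counts: (successes, patients) by treatment and case severity
results = {
    ("A", "mild"):   (9, 10),
    ("A", "severe"): (27, 90),
    ("B", "mild"):   (72, 90),
    ("B", "severe"): (2, 10),
}

for severity in ("mild", "severe"):
    for t in ("A", "B"):
        s, n = results[(t, severity)]
        print(f"{t} {severity:<6}: {s}/{n} = {s/n:.0%}")

for t in ("A", "B"):
    s = sum(results[(t, sev)][0] for sev in ("mild", "severe"))
    n = sum(results[(t, sev)][1] for sev in ("mild", "severe"))
    print(f"{t} overall: {s}/{n} = {s/n:.0%}")
# Within each group A beats B (90% vs 80%, 30% vs 20%), yet overall B beats A (74% vs 36%).
```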
81%
skewed distribution: when a sample or population distribution is highly asymmetric, and has a long left- or right-hand tail. This might typically occur for variables such as income and sales of books, when there is extreme inequality. Standard measures such as means and standard deviations can be very misleading for such distributions.
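A brief illustration with an invented right-skewed (log-normal) sample of "incomes": the mean is dragged far above the median.

```python
import numpy as np

rng = np.random.default_rng(5)
incomes = rng.lognormal(mean=10, sigma=1, size=10_000)   # invented, heavily right-skewed

print(f"mean   : {incomes.mean():,.0f}")
print(f"median : {np.median(incomes):,.0f}")   # typically far below the mean here
```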
98%
the ‘law of the transposed conditional’, which sounds delightfully obscure, but simply means that the probability of A given B is confused with the probability of B given A.