How was confounding defined then, and how should it be defined? Armed with what we now know about the logic of causality, the answer to the second question is easier. The quantity we observe is the conditional probability of the outcome given the treatment, P(Y | X). The question we want to ask of Nature has to do with the causal relationship between X and Y, which is captured by the interventional probability P(Y | do(X)). Confounding, then, should simply be defined as anything that leads to a discrepancy between the two: P(Y | X) ≠ P(Y | do(X)). Why all the fuss? Unfortunately, things were
...more