Let’s say that we want to ask the ECLS data a fundamental question about parenting and education: does having a lot of books in your home lead your child to do well in school? Regression analysis can’t quite answer that question, but it can answer a subtly different one: does a child with a lot of books in his home tend to do better in school than a child with no books? The difference between the first and second questions is the difference between causality (question 1) and correlation (question 2). A regression analysis can demonstrate correlation, but it can’t prove causation.
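
To see how the two questions can come apart, here is a minimal sketch in Python. It does not use the ECLS data: every number in it is invented, and the hidden `background` factor is a hypothetical stand-in for whatever else might differ between homes. By construction, books have no effect at all on scores in this toy world, yet a regression on the observed data still finds a positive slope.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least-squares line predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate(n=5000, seed=0):
    """Return (observed_slope, randomized_slope) for a made-up population."""
    rng = random.Random(seed)

    # Hypothetical hidden factor (an assumption, not an ECLS variable).
    background = [rng.gauss(0, 1) for _ in range(n)]

    # Scores depend ONLY on the hidden factor: books cause nothing here.
    scores = [60 + 8 * b + rng.gauss(0, 5) for b in background]

    # Observed world: homes with a higher background value own more books.
    books_observed = [50 + 20 * b + rng.gauss(0, 10) for b in background]

    # Randomized world: books handed out by lottery, unrelated to background.
    books_assigned = [rng.uniform(0, 100) for _ in range(n)]

    return ols_slope(books_observed, scores), ols_slope(books_assigned, scores)


observed_slope, randomized_slope = simulate()
print(f"slope in observed data:   {observed_slope:.3f}")
print(f"slope in randomized data: {randomized_slope:.3f}")
```

The first slope is clearly positive, which answers question 2: children with more books tend to score higher. The second slope is close to zero, which answers question 1 for this invented world: giving a child books does not change the score. A regression run only on the observed data cannot tell these two situations apart.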

