Section 12 · Lesson 12.4

Correlation vs Causation

What correlations can and cannot tell you.

Correlation (Pearson's, specifically) measures the strength of linear association between two variables. Causation says that intervening to change one variable would change the other. They are not the same, and confusing them is one of the most expensive mistakes in finance and elsewhere.

A non-zero correlation between X and Y can arise from any of:

- X causes Y.
- Y causes X.
- Both are caused by a confounder Z.
- Sample selection or survivorship bias.
- Pure chance, especially in small samples.
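The confounder case is easy to reproduce in a simulation. The sketch below is illustrative only: the variable names, sample size, seed, and noise levels are assumptions chosen for the example. X and Y are each built from a shared driver Z plus independent noise, so neither causes the other, yet they come out clearly correlated. Removing Z's contribution makes the correlation vanish.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # illustrative sample size

z = [rng.gauss(0, 1) for _ in range(n)]   # confounder Z
x = [zi + rng.gauss(0, 1) for zi in z]    # X depends on Z only, not on Y
y = [zi + rng.gauss(0, 1) for zi in z]    # Y depends on Z only, not on X

# Raw correlation: by construction the population value is 0.5.
corr_xy = pearson(x, y)

# Strip out Z's contribution (its coefficient is 1 by construction here;
# with real data it would have to be estimated, e.g. by regression).
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
corr_given_z = pearson(x_resid, y_resid)

print(f"corr(X, Y)            = {corr_xy:.3f}")
print(f"corr(X, Y | Z removed) = {corr_given_z:.3f}")
```

The first number should land near 0.5 and the second near zero. In practice the hard part is that Z is often unobserved, which is why controlling for it is not always an option.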

Distinguishing among these requires either a controlled experiment (randomization breaks the back-door paths to confounders) or careful causal modeling — instrumental variables, difference-in-differences, or natural experiments.
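To see why randomization helps, consider a minimal simulation in which the treatment has no effect at all. Everything here is an assumption made for illustration: the data-generating process, the names `obs_d` and `rct_d`, the seed, and the sample size. In the observational arm, units with a high confounder value are more likely to be treated, so a naive comparison of means picks up the confounder. In the randomized arm, a coin flip assigns treatment, so treatment is independent of the confounder.

```python
import random

def diff_in_means(treated, outcome):
    """Mean outcome of treated units minus mean outcome of control units."""
    t = [y for d, y in zip(treated, outcome) if d]
    c = [y for d, y in zip(treated, outcome) if not d]
    return sum(t) / len(t) - sum(c) / len(c)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10_000              # illustrative sample size
TRUE_EFFECT = 0.0       # the treatment does nothing, by construction

z = [rng.gauss(0, 1) for _ in range(n)]  # confounder Z

# Observational data: high-Z units are more likely to end up treated.
obs_d = [zi + rng.gauss(0, 1) > 0 for zi in z]
obs_y = [TRUE_EFFECT * d + zi + rng.gauss(0, 1) for d, zi in zip(obs_d, z)]

# Randomized experiment: a coin flip decides treatment, independent of Z.
rct_d = [rng.random() < 0.5 for _ in range(n)]
rct_y = [TRUE_EFFECT * d + zi + rng.gauss(0, 1) for d, zi in zip(rct_d, z)]

obs_estimate = diff_in_means(obs_d, obs_y)
rct_estimate = diff_in_means(rct_d, rct_y)

print(f"observational estimate: {obs_estimate:.3f}")
print(f"randomized estimate:    {rct_estimate:.3f}")
```

The observational estimate should be well above zero even though the true effect is zero, while the randomized estimate should sit close to zero. Instrumental variables and difference-in-differences aim to recover something like the second number when an actual experiment is not available.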

In trading, the correlation between two assets last year is no guarantee they'll co-move next year. Spurious correlations are everywhere when you data-mine across thousands of factors. The best correction is theory and out-of-sample validation, not just bigger R^2.
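The data-mining problem can be sketched with pure noise. In the example below, the number of candidate factors, the window lengths, and the seed are all illustrative assumptions, and the "factors" are just independent random series with no relationship to the target. Picking the factor with the strongest in-sample correlation still produces an impressive-looking number, which then fails to hold up on data that was not used for the selection.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(2)  # fixed seed so the run is repeatable
N_FACTORS = 1000        # candidate factors, all pure noise
N_IN = 60               # in-sample periods used to pick the "best" factor
N_OUT = 500             # held-out periods used only for validation
total = N_IN + N_OUT

# Target series and candidate factors are mutually independent noise.
target = [rng.gauss(0, 1) for _ in range(total)]
factors = [[rng.gauss(0, 1) for _ in range(total)] for _ in range(N_FACTORS)]

# Data-mine: keep the factor with the largest in-sample |correlation|.
in_corrs = [pearson(f[:N_IN], target[:N_IN]) for f in factors]
best_idx = max(range(N_FACTORS), key=lambda i: abs(in_corrs[i]))
best_in_sample = in_corrs[best_idx]

# Validate the selected factor on data that played no part in the selection.
out_of_sample = pearson(factors[best_idx][N_IN:], target[N_IN:])

print(f"best in-sample corr:    {best_in_sample:+.3f}")
print(f"same factor out-sample: {out_of_sample:+.3f}")
```

The in-sample winner is selected precisely because it got lucky, so its correlation overstates any real relationship. The held-out correlation is an honest estimate, and for noise factors it should be close to zero.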