Quant GT
Section 11 · Lesson 11.2

Multiple Linear Regression

Many predictors, one response — and the interpretations get subtle.

Multiple linear regression generalizes simple regression to many predictors:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \epsilon$$

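Each $\hat{\beta}_j$ estimates the effect of $x_j$ on $y$ holding the other predictors fixed: the expected change in $y$ for a one-unit increase in $x_j$ with every other $x_k$ held constant. This conditional interpretation is what makes multiple regression powerful, and it's also what gets confused most often.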
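To make the fitting step concrete, here is a minimal sketch of ordinary least squares on synthetic data using NumPy. The helper name `fit_ols`, the sample size, and the "true" coefficients are all assumptions chosen for illustration; the lesson itself does not prescribe an implementation.

```python
import numpy as np


def fit_ols(X, y):
    """Return OLS estimates [b0, b1, ..., bp] for y = b0 + X @ b + noise."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta_hat


# Synthetic data with known (made-up) coefficients: intercept 1.0, slopes 2.0 and -0.5.
rng = np.random.default_rng(0)
n = 5_000
X = rng.normal(size=(n, 2))
true_beta = np.array([1.0, 2.0, -0.5])
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.1, size=n)

beta_hat = fit_ols(X, y)
print("estimated coefficients:", np.round(beta_hat, 3))
```

Each slope in `beta_hat` is read conditionally: `beta_hat[1]` is the estimated change in `y` per unit of the first predictor with the second held fixed.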

Three traps to know:

Multicollinearity. If predictors are highly correlated, the variances of the individual coefficient estimates inflate and their standard errors balloon. The model can still predict well, but you can't reliably interpret individual coefficients.

Omitted variable bias. Leaving out a relevant predictor that correlates with included ones biases the coefficients of the included predictors.

Interaction effects. Adding the term $\beta_{12} x_1 x_2$ lets the effect of $x_1$ depend on the value of $x_2$, which matters whenever effects are non-additive.
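The three traps can be seen in a small simulation. This is a sketch on synthetic data, not a diagnostic recipe: the helper `ols`, the coefficients, the noise levels, and the correlation strengths are all assumptions picked so the effects are easy to see.

```python
import numpy as np


def ols(X, y):
    """OLS with intercept; returns (coefficients, standard errors)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (len(y) - design.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)      # classical OLS covariance
    return beta, np.sqrt(np.diag(cov))


rng = np.random.default_rng(42)
n = 2_000
x1 = rng.normal(size=n)
noise = rng.normal(size=n)

# 1) Multicollinearity: compare an independent x2 with a near-copy of x1.
x2_indep = rng.normal(size=n)
x2_coll = x1 + 0.05 * rng.normal(size=n)  # correlation with x1 close to 1
_, se_indep = ols(np.column_stack([x1, x2_indep]), 1 + 2 * x1 + 3 * x2_indep + noise)
_, se_coll = ols(np.column_stack([x1, x2_coll]), 1 + 2 * x1 + 3 * x2_coll + noise)
print("SE of x1 coefficient, independent vs collinear:", se_indep[1], se_coll[1])

# 2) Omitted variable bias: z drives y and is correlated with x1 (slope 0.8).
z = 0.8 * x1 + 0.6 * rng.normal(size=n)
y_ovb = 1 + 2 * x1 + 3 * z + noise
b_full, _ = ols(np.column_stack([x1, z]), y_ovb)
b_short, _ = ols(x1.reshape(-1, 1), y_ovb)  # z left out
# The short regression absorbs z's effect: roughly 2 + 3 * 0.8 = 4.4.
print("x1 coefficient, full vs z omitted:", b_full[1], b_short[1])

# 3) Interaction: the slope in x1 depends on x2 through the product term.
y_int = 1 + 2 * x1 + 0.5 * x2_indep + 1.5 * x1 * x2_indep + noise
b_int, _ = ols(np.column_stack([x1, x2_indep, x1 * x2_indep]), y_int)


def slope_in_x1(x2_value):
    """Estimated effect of x1 on y at a given x2: b1 + b12 * x2."""
    return b_int[1] + b_int[3] * x2_value


print("slope in x1 at x2 = 0 and x2 = 1:", slope_in_x1(0.0), slope_in_x1(1.0))
```

In the collinear case the fit is about as good as in the independent case, but the standard error on the `x1` coefficient is far larger, which is why prediction survives while interpretation does not.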

In quant trading, multiple regressions of portfolio returns on factor returns are the foundation of factor models. The coefficients are the loadings: how much your portfolio moves with the market, with size, with value, and so on.
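As a rough illustration of estimating loadings, the sketch below regresses a simulated portfolio return series on three simulated factor return series. Everything here is synthetic: the factor volatilities, the "true" loadings, the alpha, and the idiosyncratic noise level are invented for the example and do not come from any real data set or named factor model.

```python
import numpy as np

rng = np.random.default_rng(7)
n_days = 2_500
factor_names = ["market", "size", "value"]

# Synthetic daily factor returns with made-up volatilities per factor.
factors = rng.normal(scale=[0.010, 0.005, 0.005], size=(n_days, 3))

# Hypothetical portfolio: known loadings plus a small alpha and idiosyncratic noise.
true_loadings = np.array([1.1, 0.3, -0.2])
true_alpha = 0.0001
portfolio = (
    true_alpha
    + factors @ true_loadings
    + rng.normal(scale=0.004, size=n_days)
)

# Regress portfolio returns on an intercept and the factor returns.
design = np.column_stack([np.ones(n_days), factors])
coef, *_ = np.linalg.lstsq(design, portfolio, rcond=None)
alpha_hat, loadings_hat = coef[0], coef[1:]

for name, loading in zip(factor_names, loadings_hat):
    print(f"{name:>6} loading: {loading:+.2f}")
```

The intercept `alpha_hat` is the part of the average return not explained by the factors, and each entry of `loadings_hat` is a conditional sensitivity in the sense described above: the market loading is measured with size and value held fixed.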