Quant GT
Section 11 · Lesson 11.1

Ordinary Least Squares

The line that minimizes squared residuals.

Given data $(x_i, y_i)$, OLS picks coefficients $(\alpha, \beta)$ to minimize the sum of squared residuals $\sum_i (y_i - \alpha - \beta x_i)^2$. The closed-form solutions are

$$\hat{\beta} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\, \bar{x}$$
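As a quick check on the closed-form solutions, here is a minimal sketch in plain Python. The function name `simple_ols` and the data are made up for illustration; the points lie exactly on $y = 1 + 2x$, so the fit should recover those coefficients.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (alpha_hat, beta_hat) for the fit y ~ alpha + beta * x."""
    x_bar, y_bar = mean(x), mean(y)
    # The 1/n (or 1/(n-1)) factors in Cov and Var cancel in the ratio,
    # so raw sums of deviations are enough.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x has zero variance; the slope is undefined")
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Illustrative data lying exactly on y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
alpha_hat, beta_hat = simple_ols(x, y)
print(alpha_hat, beta_hat)  # 1.0 2.0
```

The zero-variance guard reflects the fact that the slope formula divides by $\mathrm{Var}(x)$: if every $x_i$ is identical, no line is identified.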

In matrix form, with response vector $y$ and design matrix $X$:

$$\hat{\beta} = (X^\top X)^{-1} X^\top y$$
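The matrix formula can be sketched with NumPy as follows. This assumes the design matrix carries a leading column of ones for the intercept; the helper name `ols_matrix` and the data are illustrative only. The sketch solves the normal equations $(X^\top X)\,\hat{\beta} = X^\top y$ rather than forming the inverse explicitly, which is generally the more stable choice.

```python
import numpy as np

def ols_matrix(X, y):
    """Solve the normal equations (X'X) b = X'y for b."""
    XtX = X.T @ X
    Xty = X.T @ y
    # Solving the linear system avoids computing an explicit inverse
    return np.linalg.solve(XtX, Xty)

# Illustrative data, roughly on y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Design matrix: intercept column first, then the regressor
X = np.column_stack([np.ones_like(x), x])

beta_hat = ols_matrix(X, y)

# Cross-check against NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat, beta_lstsq)
```

With an intercept column, the first entry of `beta_hat` plays the role of $\hat{\alpha}$ and the second the role of the slope, so the result agrees with the scalar formulas. For ill-conditioned design matrices, `np.linalg.lstsq` is usually the safer call.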

OLS rests on five assumptions: linearity, independent errors, homoscedasticity (constant error variance), no perfect multicollinearity, and (for classical inference) Normally distributed errors. Under the Gauss-Markov conditions (the first four), OLS is BLUE, the best linear unbiased estimator, even without Normality. Normality just buys you exact $t$- and $F$-distributions for inference.
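To make the inference step concrete, here is a minimal sketch of classical standard errors and $t$-statistics. It assumes homoscedastic errors, estimates the error variance as $\hat{\sigma}^2 = \mathrm{RSS}/(n-k)$, and uses $\hat{\sigma}^2 (X^\top X)^{-1}$ as the coefficient covariance. The function name `ols_t_stats` and the data are illustrative, not part of the lesson.

```python
import numpy as np

def ols_t_stats(X, y):
    """Return (beta_hat, standard errors, t-statistics) under homoscedasticity."""
    n, k = X.shape
    XtX = X.T @ X
    beta_hat = np.linalg.solve(XtX, X.T @ y)
    resid = y - X @ beta_hat
    # Error-variance estimate with n - k degrees of freedom
    sigma2_hat = resid @ resid / (n - k)
    # Classical covariance of the coefficient estimates
    cov_beta = sigma2_hat * np.linalg.inv(XtX)
    se = np.sqrt(np.diag(cov_beta))
    return beta_hat, se, beta_hat / se

# Illustrative data, roughly on y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
X = np.column_stack([np.ones_like(x), x])

beta_hat, se, t_stats = ols_t_stats(X, y)
print(beta_hat, se, t_stats)
```

If the errors are Normal, each $t$-statistic follows a $t$-distribution with $n - k$ degrees of freedom under the null that the coefficient is zero. Without Normality, that distribution holds only approximately in large samples.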

In quant finance, OLS is the workhorse of factor models: regressing returns on market, size, value, and other factors gives you estimates of a portfolio's exposures.
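A factor regression of this kind can be sketched on simulated data. Everything below is hypothetical: the factor returns are random draws rather than real market data, and the "true" exposures are chosen only so the estimates have something to recover.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days = 2000

# Simulated daily factor returns (columns: market, size, value); not real data
factors = rng.normal(0.0, 0.01, size=(n_days, 3))

# Hypothetical alpha and exposures used to generate the portfolio returns
true_alpha = 0.0001
true_betas = np.array([1.1, 0.3, -0.2])
noise = rng.normal(0.0, 0.005, size=n_days)
portfolio_returns = true_alpha + factors @ true_betas + noise

# Regress portfolio returns on an intercept plus the three factors
X = np.column_stack([np.ones(n_days), factors])
coef, *_ = np.linalg.lstsq(X, portfolio_returns, rcond=None)
alpha_hat, betas_hat = coef[0], coef[1:]

for name, b in zip(["market", "size", "value"], betas_hat):
    print(f"{name:>6} exposure: {b:.3f}")
```

The estimated slopes are the portfolio's factor exposures and the intercept is its alpha relative to those factors. With real data you would replace the simulated arrays with aligned return series, typically in excess of the risk-free rate.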