Section 11 · Lesson 11.4

Regularization

Ridge, Lasso, and Elastic Net — taming overfitting.

Regularization adds a penalty term to the OLS loss to discourage large coefficients, accepting some bias in exchange for lower variance.

Ridge regression uses an $L_2$ penalty:

$$\min_\beta \|y - X\beta\|^2 + \lambda \|\beta\|_2^2$$

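It shrinks all coefficients toward zero and has a closed-form solution, $\hat\beta = (X^\top X + \lambda I)^{-1} X^\top y$. It is especially useful when predictors are correlated, because the $\lambda I$ term keeps $X^\top X + \lambda I$ well conditioned.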
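The closed form for Ridge, $\hat\beta = (X^\top X + \lambda I)^{-1} X^\top y$, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a production fit: the function name `ridge_fit`, the synthetic data, and the choice of `lam=10.0` are all hypothetical, and the model has no intercept.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for beta (no intercept term)."""
    n_features = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_features)
    # Solving the linear system avoids forming the matrix inverse explicitly.
    return np.linalg.solve(gram, X.T @ y)

# Synthetic data with two nearly collinear predictors (illustrative only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=200)

beta_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to OLS
beta_ridge = ridge_fit(X, y, lam=10.0)  # lam > 0 shrinks the coefficients

print("OLS:  ", beta_ols)
print("Ridge:", beta_ridge)
```

With correlated columns like these, the OLS coefficients tend to be unstable, while the Ridge coefficients have a smaller norm for any positive `lam`.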

Lasso uses an $L_1$ penalty:

$$\min_\beta \|y - X\beta\|^2 + \lambda \|\beta\|_1$$

Lasso drives some coefficients to exactly zero, doubling as variable selection. The optimization is convex but lacks a closed form (use coordinate descent).
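To make the coordinate descent remark concrete, here is a minimal sketch of cyclic coordinate descent for the Lasso objective as written in this lesson (squared error without a factor of 1/2, no intercept). Each coordinate update is a soft-thresholding step with threshold $\lambda/2$, which is what produces exact zeros. The names `soft_threshold` and `lasso_cd`, the fixed number of sweeps, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0 when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Cyclic coordinate descent for ||y - X b||^2 + lam * ||b||_1."""
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0)   # squared norm of each column
    residual = y - X @ beta         # equals y at the start
    for _ in range(n_sweeps):
        for j in range(n_features):
            # Inner product of column j with the residual that excludes feature j.
            rho = X[:, j] @ residual + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam / 2.0) / col_sq[j]
            # Keep the residual in sync with the updated coefficient.
            residual += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta

# Synthetic data: only the first two of five predictors matter (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=100)

beta_hat = lasso_cd(X, y, lam=50.0)
print(beta_hat)
```

A fixed sweep count keeps the sketch short; a real solver would typically stop when the coefficients change by less than a tolerance.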

Elastic Net combines $L_1$ and $L_2$ penalties. It tends to work better than Lasso when groups of correlated predictors should enter or leave the model together.
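For reference, one common way to write the Elastic Net objective, in the same notation as the Ridge and Lasso problems above, uses a separate weight on each penalty. The symbols $\lambda_1$ and $\lambda_2$ are assumptions here, since the lesson does not fix a parameterization; libraries often use a single overall strength plus a mixing ratio instead.

$$\min_\beta \|y - X\beta\|^2 + \lambda_1 \|\beta\|_1 + \lambda_2 \|\beta\|_2^2$$

Setting $\lambda_1 = 0$ recovers Ridge, and setting $\lambda_2 = 0$ recovers Lasso.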

Regularization trades a small bias for a much larger reduction in variance, which usually improves out-of-sample MSE. That's exactly what matters in trading: prediction quality on tomorrow's data, not in-sample $R^2$.
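The gap between in-sample fit and out-of-sample error can be seen in a small simulation. The sketch below fits Ridge at several penalty strengths on synthetic data with many predictors and a weak signal, then reports MSE on the training set and on a held-out set. Everything here is an assumption chosen for illustration: the data-generating process, the sample sizes, the grid of `lam` values, and the helper names `fit_ridge` and `mse`. Exact numbers depend on the random seed.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge coefficients (no intercept)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(X, y, beta):
    """Mean squared prediction error of beta on (X, y)."""
    return float(np.mean((y - X @ beta) ** 2))

rng = np.random.default_rng(42)
n_train, n_test, p = 60, 1000, 40

# Weak signal: only three of the forty predictors have nonzero coefficients.
true_beta = np.zeros(p)
true_beta[:3] = [0.5, -0.3, 0.2]

X_train = rng.normal(size=(n_train, p))
X_test = rng.normal(size=(n_test, p))
y_train = X_train @ true_beta + rng.normal(size=n_train)
y_test = X_test @ true_beta + rng.normal(size=n_test)

results = {}
for lam in (0.0, 1.0, 10.0, 100.0):
    beta = fit_ridge(X_train, y_train, lam)
    results[lam] = (mse(X_train, y_train, beta), mse(X_test, y_test, beta))
    print(f"lam={lam:6.1f}  in-sample MSE={results[lam][0]:.3f}  "
          f"out-of-sample MSE={results[lam][1]:.3f}")
```

In-sample MSE can only get worse as `lam` grows, because OLS (`lam=0`) minimizes it by construction. In a setup like this one, with nearly as many predictors as observations, the out-of-sample column is where a positive `lam` typically pays off.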