Quant GT
Section 11 · Lesson 11.5

Logistic and Generalized Linear Models

Regression when the response isn't normally distributed.

When $y$ is binary, a count, or non-negative, OLS gives nonsense (negative probabilities, fractional counts). Generalized Linear Models (GLMs) fix this by linking the mean of $y$ to $X\beta$ through a link function.
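To make the problem concrete, here is a minimal sketch that fits ordinary least squares to a binary response. The toy data and the helper name `ols_fit` are assumptions made up for illustration, not part of the lesson.

```python
def ols_fit(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data: a binary outcome that switches from 0 to 1 as x grows.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

intercept, slope = ols_fit(xs, ys)
fitted = [intercept + slope * x for x in xs]

# Read as probabilities, the fitted values should stay inside [0, 1].
print("smallest fitted value:", min(fitted))
print("largest fitted value:", max(fitted))
```

On this data the fitted line dips below 0 at the low end of `xs` and rises above 1 at the high end, so its values cannot be read as probabilities.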

Logistic regression handles binary $y \in \{0, 1\}$:

$$P(y = 1 \mid x) = \sigma(X\beta) = \frac{1}{1 + e^{-X\beta}}$$

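The model is fit by maximum likelihood. Each coefficient has a clean interpretation: it is the change in the log-odds of $y = 1$ for a one-unit increase in the corresponding predictor, holding the others fixed.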
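One way to carry out the maximum-likelihood fit is Newton-Raphson on the log-likelihood. The sketch below assumes a single predictor plus an intercept so the 2x2 linear solve can be written by hand. The function name `fit_logistic` and the toy data are hypothetical, and a real analysis would normally use a statistics library instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, n_iter=50, tol=1e-10):
    """Newton-Raphson MLE for P(y=1 | x) = sigmoid(b0 + b1 * x)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # negative Hessian (information matrix)
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * d = gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical toy data. The classes overlap, so the MLE is finite.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
probs = [sigmoid(b0 + b1 * x) for x in xs]

print("slope (change in log-odds per unit x):", b1)
print("odds ratio per unit x:", math.exp(b1))
```

Here `b1` is the change in log-odds per unit of `x`, so `math.exp(b1)` is the corresponding odds ratio. Every fitted probability lies strictly between 0 and 1 by construction. If the two classes were perfectly separable by `x`, the likelihood would have no finite maximizer and this iteration would not settle.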

Other widely used GLMs:

- Poisson regression for count data uses $\log E[y] = X\beta$.
- Gamma regression for positive continuous data uses $1/E[y] = X\beta$.
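As an illustration of the Poisson case, the following sketch fits $\log E[y] = b_0 + b_1 x$ by Newton-Raphson on the Poisson log-likelihood. The single-predictor setup, the function name `fit_poisson`, and the count data are assumptions chosen for brevity.

```python
import math


def fit_poisson(xs, ys, n_iter=50, tol=1e-10):
    """Newton-Raphson MLE for log E[y | x] = b0 + b1 * x."""
    b0 = math.log(sum(ys) / len(ys))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # negative Hessian
        for x, y in zip(xs, ys):
            mu = math.exp(b0 + b1 * x)   # fitted mean, always positive
            g0 += y - mu
            g1 += (y - mu) * x
            h00 += mu
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical count data that grows roughly exponentially in x.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 2, 2, 4, 6, 11]

b0, b1 = fit_poisson(xs, ys)
mus = [math.exp(b0 + b1 * x) for x in xs]

print("multiplicative change in expected count per unit x:", math.exp(b1))
```

Because of the log link, the fitted means `mus` are positive for any coefficient values, and `math.exp(b1)` is the factor by which the expected count changes per unit of `x`. A Gamma fit would follow the same pattern with a different link and variance function.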

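The general recipe is: pick an exponential-family distribution for $y$ and a link function relating its mean to $X\beta$. Interpretation becomes nonlinear on the scale of the mean, because coefficients act on the link scale, but residual diagnostics, regularization, and inference all carry over from OLS in adapted form (for example deviance residuals, penalized likelihood, and Wald or likelihood-ratio tests).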
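Written side by side, the three models in this lesson are instances of one template. The symbols $g$ (link function) and $\mu$ (conditional mean) are notation introduced here for the summary. The inverse forms on the right follow by solving each link for $\mu$.

```latex
\begin{aligned}
\text{Template:} \quad & g(\mu) = X\beta, \qquad \mu = E[y \mid x] \\
\text{Logistic (Bernoulli):} \quad & g(\mu) = \log\frac{\mu}{1-\mu}
  \;\Longleftrightarrow\; \mu = \frac{1}{1 + e^{-X\beta}} \\
\text{Poisson:} \quad & g(\mu) = \log \mu
  \;\Longleftrightarrow\; \mu = e^{X\beta} \\
\text{Gamma:} \quad & g(\mu) = \frac{1}{\mu}
  \;\Longleftrightarrow\; \mu = \frac{1}{X\beta}
\end{aligned}
```

The logit and log links keep $\mu$ in its valid range for any $X\beta$. The inverse link for the Gamma model does not, since it needs $X\beta > 0$ to give a positive mean.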