Loss Functions
MSE, cross-entropy, Huber, and when to pick each.
The loss function defines what the network is being trained to do. Common choices:
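Notation: $y_i$ is the target, $\hat{y}_i$ the prediction, and $N$ the number of samples.
Mean Squared Error (MSE): $L = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$. Standard for regression. Sensitive to outliers, because large residuals are squared.
Mean Absolute Error (MAE) / Huber: $L = \frac{1}{N}\sum_{i=1}^{N}|y_i - \hat{y}_i|$, or a smooth blend of the two: for a residual $r$ and threshold $\delta$, the Huber loss is $\frac{1}{2}r^2$ if $|r| \le \delta$ and $\delta(|r| - \frac{1}{2}\delta)$ otherwise. More robust to outliers, but harder to optimize than MSE: the MAE gradient has constant magnitude and is undefined at zero, which Huber fixes by being quadratic near zero.
Cross-entropy: for classification, $L = -\sum_{c} y_c \log p_c$, where $y_c$ is the one-hot target and $p_c$ the predicted probability of class $c$. Softmax + cross-entropy is the standard for multi-class.
Binary cross-entropy: $L = -[y \log p + (1 - y)\log(1 - p)]$, where $y \in \{0, 1\}$ and $p$ is the predicted probability of the positive class.
Custom losses: Sharpe-aware loss, drawdown-penalized loss, asymmetric loss for asymmetric costs of over- vs. under-prediction.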
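To make the standard losses listed above concrete, here is a minimal sketch in plain Python (standard library only). The function names, the `delta` default of 1.0, and the `eps` clipping constant are illustrative assumptions, not anything prescribed by the text; a real training loop would use a framework's built-in, differentiable versions.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(y_true)

def softmax(logits):
    """Softmax with the max subtracted for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target_index):
    """Softmax + cross-entropy for one sample with a one-hot target."""
    return -math.log(softmax(logits)[target_index])

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

# Three small errors plus one outlier prediction (hypothetical data).
y_true = [0.0, 0.0, 0.0, 0.0]
y_pred = [0.1, -0.1, 0.1, 10.0]

print("MSE  :", mse(y_true, y_pred))
print("MAE  :", mae(y_true, y_pred))
print("Huber:", huber(y_true, y_pred))
print("CE   :", cross_entropy([0.0, 0.0, 0.0], 1))
print("BCE  :", binary_cross_entropy([1, 0], [0.9, 0.2]))
```

On this toy data the single outlier dominates the MSE, while MAE and Huber grow only linearly with it, which is the robustness trade-off described in the list.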
Pick the loss to match the cost structure of mistakes. In trading, the cost of predicting the wrong direction usually dwarfs the cost of misjudging the size of the move, so direction-aware losses (rank loss, sign prediction) often outperform pure MSE on returns.