Cost Functions
1. Mean Squared Error (MSE)
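The most common cost function for regression trees.
At each split, the tree chooses the feature and threshold that minimize the weighted MSE of the resulting child nodes, which is equivalent to minimizing the variance of the target values within them.
\[ \text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y})^2 \]
Here: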
\(y_i\) = actual value
\(\hat{y}\) = predicted value (mean of the target values of the samples in the node)
\(n\) = number of samples in the node
👉 Minimizing MSE means nodes will group data where target values are close together.
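To make the split search concrete, here is a minimal sketch for a single numeric feature. The helper names `node_mse` and `best_mse_split` and the toy data are illustrative assumptions, not scikit-learn's implementation, which is considerably more optimized.

```python
from statistics import fmean

def node_mse(ys):
    """MSE of a node that predicts the mean of its target values."""
    y_hat = fmean(ys)
    return sum((y - y_hat) ** 2 for y in ys) / len(ys)

def best_mse_split(xs, ys):
    """Return (threshold, weighted child MSE) for the best split on one feature."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # Weight each child's MSE by its share of the samples.
        cost = (len(left) * node_mse(left) + len(right) * node_mse(right)) / n
        if cost < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, cost)
    return best

# Toy data (made up): two clearly separated groups of target values.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
threshold, cost = best_mse_split(xs, ys)
print(threshold, cost, node_mse(ys))
```

On this toy data the chosen threshold falls between the two groups, and the weighted child MSE is far smaller than the MSE of the unsplit node.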
2. Mean Absolute Error (MAE)
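Measures the average absolute difference between each target value and the node's prediction:
\[ \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}| \]
The prediction that minimizes it is the median of the target values in the node (instead of the mean).
Because the errors are not squared, it is more robust to outliers than MSE.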
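The effect of an outlier can be seen in a small sketch. The helper `node_mae` and the sample values are made up for illustration.

```python
from statistics import fmean, median

def node_mae(ys, prediction):
    """Mean absolute error of a node that outputs a single prediction."""
    return sum(abs(y - prediction) for y in ys) / len(ys)

# Toy node (made up): four similar targets and one outlier.
ys = [10.0, 11.0, 12.0, 13.0, 100.0]

mean_pred = fmean(ys)      # pulled towards the outlier
median_pred = median(ys)   # stays with the bulk of the data

mae_with_mean = node_mae(ys, mean_pred)
mae_with_median = node_mae(ys, median_pred)
print(mean_pred, median_pred, mae_with_mean, mae_with_median)
```

The median stays at 12.0 while the mean is dragged towards the outlier, and the median prediction gives the lower absolute error.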
3. Friedman’s Mean Squared Error (Friedman MSE)
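A variation of MSE available in scikit-learn.
It scores candidate splits with Friedman's improvement score, which weights the squared difference between the two child-node means by the child sizes. It is especially useful in gradient boosting trees.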
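As a rough illustration, Friedman's improvement score for a split with unweighted samples can be written as \(\frac{n_L n_R}{n_L + n_R} (\bar{y}_L - \bar{y}_R)^2\). The sketch below assumes that unweighted form and uses made-up data; scikit-learn's own implementation also handles sample weights and may normalize the score differently.

```python
from statistics import fmean

def sse(ys):
    """Sum of squared errors around the node mean."""
    m = fmean(ys)
    return sum((y - m) ** 2 for y in ys)

def friedman_improvement(left, right):
    """Friedman's improvement score for one split (unweighted samples assumed)."""
    n_l, n_r = len(left), len(right)
    diff = fmean(left) - fmean(right)
    return n_l * n_r / (n_l + n_r) * diff ** 2

# Toy child nodes (made up) produced by some candidate split.
left = [5.0, 6.0, 7.0]
right = [20.0, 21.0, 22.0, 25.0]

improvement = friedman_improvement(left, right)
# For unweighted samples, the score equals the drop in total squared error.
sse_reduction = sse(left + right) - sse(left) - sse(right)
print(improvement, sse_reduction)
```

The score needs only the child sizes and child means, so it can be evaluated without recomputing the squared error of each child.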
4. Poisson (for count regression)
For target values that represent counts (non-negative integers).
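The cost function is based on the Poisson deviance, shown here up to a constant factor that does not affect which split is chosen:
\[ D = \frac{2}{n} \sum_{i=1}^{n} \left( y_i \log\frac{y_i}{\hat{y}} - y_i + \hat{y} \right) \]
where \(\hat{y}\) is the mean of the target values in the node, and the term \(y_i \log(y_i / \hat{y})\) is taken as 0 when \(y_i = 0\).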
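A minimal sketch of the node cost, assuming the standard mean Poisson deviance with its factor of 2 (libraries may drop that constant). The function name `poisson_deviance` and the counts are illustrative.

```python
import math
from statistics import fmean

def poisson_deviance(ys, y_hat):
    """Mean Poisson deviance of counts ys under one prediction y_hat > 0.

    Computes 2/n * sum(y * log(y / y_hat) - y + y_hat).
    """
    total = 0.0
    for y in ys:
        # y * log(y / y_hat) tends to 0 as y -> 0, so that limit is used for y == 0.
        log_term = y * math.log(y / y_hat) if y > 0 else 0.0
        total += log_term - y + y_hat
    return 2.0 * total / len(ys)

# Toy count data (made up) for a single node.
counts = [0, 1, 1, 2, 6]
node_prediction = fmean(counts)  # the node predicts the mean count

deviance = poisson_deviance(counts, node_prediction)
deviance_off = poisson_deviance(counts, 3.0)  # a deliberately worse prediction
print(node_prediction, deviance, deviance_off)
```

The deviance is zero when every count equals the prediction, and for a fixed node it is smallest when the prediction is the mean count.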
Summary
MSE → default, sensitive to outliers, good for general regression.
MAE → robust to outliers, gives median-based predictions.
Friedman MSE → specialized, often used in ensembles.
Poisson → best for count-based data.