Decision Tree Regressor#
A Decision Tree Regressor is the regression version of decision trees.
Instead of predicting a class label (like “Iris-setosa” or “Iris-versicolor”), it predicts a continuous value (like house price, temperature, sales).
The dataset is recursively split based on features, but instead of maximizing classification purity (Gini/Entropy), we minimize the variance (or mean squared error) in the target values.
How it Works (Step by Step)#
Start at the root node (whole dataset).
At each split:
Choose the feature and threshold that minimize a cost function.
Common cost functions for regression:
Mean Squared Error (MSE)
\[ MSE = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{y})^2 \]
Mean Absolute Error (MAE)
\[ MAE = \frac{1}{n}\sum_{i=1}^n |y_i - \hat{y}| \]
Here, \(\hat{y}\) is the mean (for MSE) or median (for MAE) of the target values in that node.
Split recursively until a stopping criterion is met (max depth, minimum samples per leaf, etc.).
Prediction: For a new sample, traverse the tree and return the mean value of the leaf node it falls into.
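The split search in steps 1–3 can be sketched as a brute-force scan over features and candidate thresholds, picking the split with the lowest total squared error. This is a minimal illustrative sketch (the names `mse_cost` and `best_split` are hypothetical, not from any library):

```python
import numpy as np

def mse_cost(y):
    # Cost of a node: sum of squared deviations from the node mean.
    return np.sum((y - y.mean()) ** 2) if len(y) else 0.0

def best_split(X, y):
    # Exhaustively try every feature and every midpoint between
    # consecutive unique values as a threshold; keep the split whose
    # two children have the lowest combined squared error.
    best = (None, None, np.inf)  # (feature, threshold, cost)
    for f in range(X.shape[1]):
        values = np.unique(X[:, f])
        for t in (values[:-1] + values[1:]) / 2:  # candidate thresholds
            left, right = y[X[:, f] <= t], y[X[:, f] > t]
            cost = mse_cost(left) + mse_cost(right)
            if cost < best[2]:
                best = (f, t, cost)
    return best

# Toy data: the target jumps when the single feature exceeds ~5,
# so the best split should land between 3 and 6.
X = np.array([[1.0], [2.0], [3.0], [6.0], [7.0], [8.0]])
y = np.array([1.0, 1.2, 0.9, 5.0, 5.1, 4.9])
feature, threshold, cost = best_split(X, y)
print(feature, threshold)  # → 0 4.5
```

A real implementation applies this search recursively to each child node until a stopping criterion is reached, then stores the mean of each leaf as its prediction.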
Example Use Cases#
Predicting house prices from size, location, number of bedrooms.
Forecasting stock prices (though single trees are prone to overfitting on noisy financial data).
Estimating energy consumption from temperature & household data.
Advantages#
Simple and interpretable.
Captures non-linear relationships.
Handles both numerical and categorical features.
Disadvantages#
Prone to overfitting (deep trees fit noise).
Piecewise constant predictions (not smooth).
Sensitive to small changes in the training data (high variance).
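The overfitting risk above is easy to demonstrate: an unconstrained tree memorizes training noise, while limiting `max_depth` forces a smoother fit. A minimal sketch, assuming scikit-learn is installed (the noisy sine data here is made up for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Noisy sine curve: a deep tree memorizes the noise, a shallow one smooths it.
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 5, 80))[:, None]
y = np.sin(X).ravel() + rng.normal(0, 0.2, 80)

deep = DecisionTreeRegressor(random_state=0).fit(X, y)            # unlimited depth
shallow = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

print(deep.score(X, y))     # ~1.0: fits the training noise exactly
print(shallow.score(X, y))  # lower training R^2, but generalizes better
```

In practice, tuning `max_depth` or `min_samples_leaf` via cross-validation, or switching to an ensemble (random forests, gradient boosting), is the usual remedy.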