Performance Metrics#
1. Mean Absolute Error (MAE)#
Definition: The average of the absolute differences between the predicted and actual values.
\[\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|\]
where:
\(y_i\) = actual value
\(\hat{y}_i\) = predicted value
\(n\) = number of samples
Interpretation:
MAE is in the same units as the target variable.
Lower MAE → predictions are closer to actual values.
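Example:
If the true prices are [100, 200, 300] and the predictions are [110, 190, 310]:
\[\text{MAE} = \frac{|100 - 110| + |200 - 190| + |300 - 310|}{3} = \frac{10 + 10 + 10}{3} = 10\]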
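The price example above can be checked in a few lines. This is a minimal sketch in plain Python (no libraries), so each term of the average stays visible. The variable names are illustrative and do not come from the original notebook.

```python
# Worked MAE example using the prices from the text.
y_true = [100, 200, 300]
y_pred = [110, 190, 310]

# Absolute error for each sample: |y_i - y_hat_i|
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]

# MAE is the mean of the absolute errors.
mae = sum(abs_errors) / len(abs_errors)

print(abs_errors)  # [10, 10, 10]
print(mae)         # 10.0
```

Every prediction is off by 10, so the MAE is 10, in the same units as the prices.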
2. Mean Squared Error (MSE)#
Definition: The average of the squared differences between the predicted and actual values.
\[\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2\]
Interpretation:
Penalizes larger errors more than MAE (due to squaring).
Useful if you want to heavily penalize large deviations.
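Example: Using the same prices and predictions as above:
\[\text{MSE} = \frac{(100 - 110)^2 + (200 - 190)^2 + (300 - 310)^2}{3} = \frac{100 + 100 + 100}{3} = 100\]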
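To make the "penalizes larger errors" point concrete, the sketch below compares the predictions from the text with a second, hypothetical set of predictions that has the same total absolute error concentrated in one sample. The helper names `mae_of` and `mse_of` and the list `y_pred_one_big_miss` are assumptions introduced only for this illustration.

```python
def mae_of(y_true, y_pred):
    """Mean of |y_i - y_hat_i|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_of(y_true, y_pred):
    """Mean of (y_i - y_hat_i)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [100, 200, 300]
y_pred = [110, 190, 310]               # from the text: every error is 10
y_pred_one_big_miss = [100, 200, 330]  # hypothetical: a single error of 30

print(mae_of(y_true, y_pred), mse_of(y_true, y_pred))
# 10.0 100.0
print(mae_of(y_true, y_pred_one_big_miss), mse_of(y_true, y_pred_one_big_miss))
# 10.0 300.0
```

Both prediction sets have the same MAE, but the set with one large miss has three times the MSE, because squaring weights the 30-unit error far more heavily than three 10-unit errors.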
3. Root Mean Squared Error (RMSE)#
Definition: The square root of the MSE.
\[\text{RMSE} = \sqrt{\text{MSE}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}\]
Interpretation:
RMSE is in the same units as the target variable.
Gives a sense of typical magnitude of prediction errors.
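Example: For true prices [100, 200, 300] and predictions [110, 190, 310], the MSE is 100, so:
\[\text{RMSE} = \sqrt{100} = 10\]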
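A short sketch of the RMSE calculation for the same three prices, using only the standard library. The variable names are illustrative.

```python
import math

y_true = [100, 200, 300]
y_pred = [110, 190, 310]

# MSE first: mean of the squared errors.
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# RMSE is the square root of MSE, which brings the value
# back to the units of the target (here, price units).
rmse = math.sqrt(mse)

print(mse)   # 100.0
print(rmse)  # 10.0
```

Here the RMSE equals the MAE because all three errors have the same magnitude. When error sizes differ, the RMSE is larger than the MAE.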
4. R-squared (\(R^2\))#
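Definition: The proportion of the variance in the dependent variable that is predictable from the independent variables.
\[R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}\]
where \(\bar{y}\) = mean of actual values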
Interpretation:
\(R^2 = 1\) → perfect prediction
\(R^2 = 0\) → model predicts no better than always predicting the mean of the target
\(R^2 < 0\) → model is worse than always predicting the mean
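The three cases above can be reproduced directly from the definition of \(R^2\) as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a plain-Python illustration. The function name `r_squared` and every prediction list except [110, 190, 310] are hypothetical values chosen to hit each case.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, computed from the definition."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [100, 200, 300]

perfect = r_squared(y_true, [100, 200, 300])    # predictions equal the targets
mean_only = r_squared(y_true, [200, 200, 200])  # always predict the mean
close = r_squared(y_true, [110, 190, 310])      # predictions from the MAE example
reversed_ = r_squared(y_true, [300, 200, 100])  # worse than predicting the mean

print(perfect)          # 1.0
print(mean_only)        # 0.0
print(round(close, 3))  # 0.985
print(reversed_)        # -3.0
```

The last case shows why \(R^2\) has no lower bound: when the residual sum of squares exceeds the total sum of squares, the ratio is greater than 1 and the score goes negative.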
5. Adjusted R-squared (optional)#
Useful when the model has multiple features.
Penalizes adding irrelevant features: unlike \(R^2\), it can decrease when a new feature does not improve the fit.
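The text above does not give a formula for adjusted R-squared. The standard definition is \(1 - (1 - R^2)\frac{n - 1}{n - p - 1}\), where \(n\) is the number of samples and \(p\) is the number of features. The sketch below assumes that definition. The function name `adjusted_r2` and the sample and feature counts are hypothetical, chosen only to show the penalty growing with \(p\).

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        # The formula is undefined without spare degrees of freedom.
        raise ValueError("need n_samples > n_features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Hypothetical numbers chosen only for illustration.
r2 = 0.82
few = adjusted_r2(r2, n_samples=50, n_features=2)
many = adjusted_r2(r2, n_samples=50, n_features=10)

print(f"{few:.3f}")   # 0.812
print(f"{many:.3f}")  # 0.774
```

For the same \(R^2\), the model with more features gets the lower adjusted score. Note that the formula needs \(n > p + 1\), so it cannot be evaluated on a 3-sample test set when the model uses 2 features.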
Summary Table#
| Metric | Formula | Interpretation | Units |
|---|---|---|---|
| MAE | \(\frac{1}{n}\sum \lvert y_i - \hat{y}_i \rvert\) | Average absolute error | Same as target |
| MSE | \(\frac{1}{n}\sum (y_i - \hat{y}_i)^2\) | Penalizes large errors | Squared units |
| RMSE | \(\sqrt{\text{MSE}}\) | Typical error magnitude | Same as target |
| R² | \(1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}\) | Explained variance | Unitless |
# Step 1: Import Libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# Step 2: Create a sample dataset
data = {
    'Feature1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Feature2': [5, 3, 6, 2, 7, 8, 5, 9, 4, 10],
    'Target': [10, 12, 15, 14, 18, 20, 19, 22, 21, 25]
}
df = pd.DataFrame(data)
# Features and target
X = df[['Feature1', 'Feature2']]
y = df['Target']
# Step 3: Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Step 4: Train Decision Tree Regressor
dtr = DecisionTreeRegressor(random_state=42)
dtr.fit(X_train, y_train)
# Step 5: Make predictions
y_pred = dtr.predict(X_test)
# Step 6: Evaluate performance
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)
# Step 7: Display results
print("Predictions:", y_pred)
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
print(f"R-squared (R²): {r2:.2f}")
Predictions: [22. 10. 18.]
Mean Absolute Error (MAE): 1.67
Mean Squared Error (MSE): 3.00
Root Mean Squared Error (RMSE): 1.73
R-squared (R²): 0.82