Performance Metrics

1. Mean Absolute Error (MAE)

Definition: The average of the absolute differences between the predicted and actual values.

\[ MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| \]
  • \(y_i\) = actual value

  • \(\hat{y}_i\) = predicted value

  • \(n\) = number of samples

Interpretation:

  • MAE is in the same units as the target variable.

  • Lower MAE → predictions are closer to actual values.

Example: If the true prices are [100, 200, 300] and predictions are [110, 190, 310]:

\[ MAE = \frac{|100-110| + |200-190| + |300-310|}{3} = \frac{10+10+10}{3} = 10 \]
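The arithmetic above can be checked in a few lines of plain Python. This is a minimal sketch that applies the formula directly; the variable names are illustrative and not part of any library.

```python
# Worked example from the text: true prices and predictions.
y_true = [100, 200, 300]
y_pred = [110, 190, 310]

# MAE: mean of the absolute differences |y_i - y_hat_i|.
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
mae = sum(abs_errors) / len(abs_errors)

print(abs_errors)  # [10, 10, 10]
print(mae)         # 10.0
```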

2. Mean Squared Error (MSE)

Definition: Average of the squared differences between predicted and actual values.

\[ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \]

Interpretation:

  • Penalizes larger errors more than MAE (due to squaring).

  • Useful when large deviations are especially costly. Note that MSE is expressed in squared units of the target.

Example: Using the same prices and predictions as in the MAE example:

\[ MSE = \frac{(100-110)^2 + (200-190)^2 + (300-310)^2}{3} = \frac{100+100+100}{3} = 100 \]
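To illustrate how squaring penalizes large errors, the sketch below compares two sets of predictions with the same total absolute error. The second set of predictions is a hypothetical variation added for illustration, and the helper function names are illustrative only.

```python
def mean_abs_error(y_true, y_pred):
    # Mean of |y_i - y_hat_i|.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_sq_error(y_true, y_pred):
    # Mean of (y_i - y_hat_i)^2.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [100, 200, 300]
even_errors = [110, 190, 310]    # errors of 10, 10, 10 (example from the text)
one_big_error = [100, 200, 330]  # hypothetical: errors of 0, 0, 30

print(mean_abs_error(y_true, even_errors), mean_sq_error(y_true, even_errors))      # 10.0 100.0
print(mean_abs_error(y_true, one_big_error), mean_sq_error(y_true, one_big_error))  # 10.0 300.0
```

Both cases have an MAE of 10, but concentrating the error in a single prediction triples the MSE.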

3. Root Mean Squared Error (RMSE)

Definition: Square root of MSE.

\[ RMSE = \sqrt{MSE} \]

Interpretation:

  • RMSE is in the same units as the target variable.

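  • Gives a sense of the typical magnitude of prediction errors, with larger errors weighted more heavily than in MAE.

Example: Using the MSE of 100 from the previous example: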

\[ RMSE = \sqrt{100} = 10 \]
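As a quick check, the same numbers can be run through the standard library. This is a minimal sketch using the example values from the text; it computes MSE first and then takes the square root to return to the units of the target.

```python
import math

y_true = [100, 200, 300]
y_pred = [110, 190, 310]

squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
mse = sum(squared_errors) / len(squared_errors)  # in squared price units
rmse = math.sqrt(mse)                            # back in price units

print(mse)   # 100.0
print(rmse)  # 10.0
```

Here RMSE equals MAE because all three errors have the same size. When the errors differ in size, RMSE is larger than MAE.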

4. R-squared (\(R^2\))

Definition: Proportion of variance in the dependent variable that is predictable from the independent variables.

\[ R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \]
  • \(\bar{y}\) = mean of actual values

Interpretation:

  • \(R^2 = 1\) → perfect prediction

  • \(R^2 = 0\) → model does no better than always predicting the mean of the target

  • Can be negative if the model is worse than predicting the mean
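The three cases above can be reproduced by applying the formula directly. The sketch below uses the price example from the earlier sections; `r_squared` is an illustrative helper, and the constant and reversed predictions are hypothetical inputs chosen to hit the 0 and negative cases.

```python
def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot

y_true = [100, 200, 300]

print(r_squared(y_true, [100, 200, 300]))            # 1.0   (perfect prediction)
print(round(r_squared(y_true, [110, 190, 310]), 3))  # 0.985 (example from the text)
print(r_squared(y_true, [200, 200, 200]))            # 0.0   (always predicts the mean)
print(r_squared(y_true, [300, 200, 100]))            # -3.0  (worse than the mean)
```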


5. Adjusted R-squared (optional)

  • Useful when comparing models that use different numbers of features.

  • Penalizes features that do not improve the fit: unlike \(R^2\), it can decrease when a feature is added.
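The section gives no formula, so the sketch below assumes the standard definition, \(1 - (1 - R^2)\frac{n - 1}{n - p - 1}\), where \(n\) is the number of samples and \(p\) the number of features. The function name and the input values are hypothetical, chosen only to show the penalty.

```python
def adjusted_r_squared(r2, n_samples, n_features):
    # Assumed standard formula: 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("adjusted R^2 requires n_samples > n_features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof

# Same R^2 and sample count, different numbers of features (hypothetical values).
print(round(adjusted_r_squared(0.90, 50, 5), 4))   # 0.8886
print(round(adjusted_r_squared(0.90, 50, 20), 4))  # 0.831
```

The more features a model needs to reach the same \(R^2\), the lower its adjusted value. Under this formula the metric is undefined when \(n \le p + 1\), which is the case for a 3-sample test set with 2 features.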


Summary Table

| Metric | Formula | Interpretation | Units |
|---|---|---|---|
| MAE | \(\frac{1}{n}\sum \lvert y_i - \hat{y}_i \rvert\) | Average absolute error | Same as target |
| MSE | \(\frac{1}{n}\sum (y_i - \hat{y}_i)^2\) | Penalizes large errors | Squared units |
| RMSE | \(\sqrt{MSE}\) | Typical error magnitude | Same as target |
| \(R^2\) | \(1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}\) | Explained variance | Unitless |

# Step 1: Import Libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Step 2: Create a sample dataset
data = {
    'Feature1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Feature2': [5, 3, 6, 2, 7, 8, 5, 9, 4, 10],
    'Target': [10, 12, 15, 14, 18, 20, 19, 22, 21, 25]
}
df = pd.DataFrame(data)

# Features and target
X = df[['Feature1', 'Feature2']]
y = df['Target']

# Step 3: Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 4: Train Decision Tree Regressor
dtr = DecisionTreeRegressor(random_state=42)
dtr.fit(X_train, y_train)

# Step 5: Make predictions
y_pred = dtr.predict(X_test)

# Step 6: Evaluate performance
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)

# Step 7: Display results
print("Predictions:", y_pred)
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
print(f"R-squared (R²): {r2:.2f}")

Output:

Predictions: [22. 10. 18.]
Mean Absolute Error (MAE): 1.67
Mean Squared Error (MSE): 3.00
Root Mean Squared Error (RMSE): 1.73
R-squared (R²): 0.82