Evaluation Metrics#

SVR is used for predicting continuous values, so we rely on regression metrics (not classification ones like accuracy or precision). These metrics tell us how close the predicted values \(\hat{y}_i\) are to the actual values \(y_i\).


Mean Absolute Error (MAE)#

\[ MAE = \frac{1}{n} \sum_{i=1}^n |y_i - \hat{y}_i| \]
  • Average of absolute errors.

  • Easy to interpret since it’s in the same units as the target.

  • Treats all errors equally (linear penalty).

  • Good when you care about robustness and don’t want outliers to dominate.
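For a quick sanity check, MAE can be computed directly from the formula and compared against scikit-learn's `mean_absolute_error` (the numbers below are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Toy values, chosen only to illustrate the formula
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.0])

# MAE by the formula: mean of the absolute residuals
mae_manual = np.mean(np.abs(y_true - y_pred))
mae_sklearn = mean_absolute_error(y_true, y_pred)

print(mae_manual, mae_sklearn)  # both 0.5
```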


Mean Squared Error (MSE)#

\[ MSE = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2 \]
  • Penalizes larger errors more heavily (quadratic penalty).

  • Useful when you want to heavily discourage large mistakes.

  • But less interpretable because it’s in squared units.


Root Mean Squared Error (RMSE)#

\[ RMSE = \sqrt{MSE} \]
  • Same information as MSE, but the square root brings the units back to the target’s original scale.

  • Sensitive to outliers.

  • Most common metric in practice.
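To make the quadratic penalty concrete, the sketch below (with made-up numbers) introduces one large error and compares how MAE, MSE, and RMSE respond:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([10.0, 12.0, 11.0, 10.0, 50.0])
y_clean = np.array([10.5, 11.5, 11.0, 10.5, 49.0])    # all errors small
y_outlier = np.array([10.5, 11.5, 11.0, 10.5, 30.0])  # one error of 20

for name, y_pred in [("clean", y_clean), ("one outlier", y_outlier)]:
    mae = mean_absolute_error(y_true, y_pred)
    mse = mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)
    print(f"{name}: MAE={mae:.2f}, MSE={mse:.2f}, RMSE={rmse:.2f}")
```

A single error of 20 multiplies the MAE by roughly 9 but inflates the MSE by more than two orders of magnitude; RMSE sits in between because the square root tempers the effect.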


R-squared (\(R^2\))#

\[ R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \]
  • Proportion of variance in \(y\) explained by the model.

  • \(R^2 = 1\): perfect prediction.

  • \(R^2 = 0\): model no better than predicting the mean.

  • \(R^2 < 0\): model worse than mean prediction.

⚠️ Limitation: \(R^2\) never decreases (and typically increases) as you add more predictors, even if they don’t improve the model.
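This is easy to demonstrate with a quick experiment (hypothetical data; ordinary least squares is used here because its training-set fit provably cannot get worse when columns are added):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.random((100, 2))
y = 4 * X[:, 0] - 3 * X[:, 1] + rng.normal(0, 0.5, 100)

# Training R^2 with the 2 informative features
r2_base = LinearRegression().fit(X, y).score(X, y)

# Append 5 columns of pure noise and refit: training R^2 cannot go down
X_junk = np.hstack([X, rng.random((100, 5))])
r2_junk = LinearRegression().fit(X_junk, y).score(X_junk, y)

print(r2_base, r2_junk)  # r2_junk >= r2_base despite the junk features
```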


Adjusted R-squared (\(R^2_{adj}\))#

\[ R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1} \]

Where \(n\) is the number of observations and \(p\) is the number of predictors (features).

  • Penalizes useless features.

  • If a new predictor improves the model, \(R^2_{adj}\) increases.

  • If a predictor doesn’t help, \(R^2_{adj}\) decreases.

  • More reliable than \(R^2\) when you have multiple predictors.
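A small worked example (with made-up values) shows the penalty at work: holding \(R^2 = 0.90\) and \(n = 50\) fixed, more predictors means a lower adjusted score:

```python
def adjusted_r2(r2, n, p):
    # Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1)
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same R^2 = 0.90 and n = 50, different numbers of predictors
print(adjusted_r2(0.90, 50, 3))   # ~0.893
print(adjusted_r2(0.90, 50, 10))  # ~0.874
```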


Mean Absolute Percentage Error (MAPE)#

\[ MAPE = \frac{100}{n} \sum_{i=1}^n \left|\frac{y_i - \hat{y}_i}{y_i}\right| \]
  • Error expressed as a percentage of actual values.

  • Easy to communicate (e.g., “on average, predictions are off by 6%”).

  • ⚠️ Breaks down when \(y_i = 0\).
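scikit-learn ships `mean_absolute_percentage_error` (since version 0.24); note that it returns a fraction rather than a percentage, so it must be scaled by 100 to match the formula above (toy numbers below):

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 330.0])

# Manual MAPE per the formula (as a percentage)
mape_manual = 100 * np.mean(np.abs((y_true - y_pred) / y_true))

# sklearn returns a fraction, so scale by 100 to get a percentage
mape_sklearn = 100 * mean_absolute_percentage_error(y_true, y_pred)

print(mape_manual, mape_sklearn)  # both ~8.33 (%)
```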


Explained Variance Score (EVS)#

\[ EVS = 1 - \frac{Var(y - \hat{y})}{Var(y)} \]
  • Similar to \(R^2\), but based on the variance of the residuals, so it ignores any constant bias in the predictions.

  • Higher is better.
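The practical difference from \(R^2\) appears with systematically biased predictions: a constant offset leaves the residuals with zero variance, so EVS stays at 1 while \(R^2\) drops (toy numbers):

```python
import numpy as np
from sklearn.metrics import explained_variance_score, r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = y_true + 1.0  # every prediction off by the same constant

evs = explained_variance_score(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

print(evs)  # 1.0 -- residuals are constant, so their variance is zero
print(r2)   # 0.2 -- the bias still counts as squared error
```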


Which Metrics to Use#

  • General evaluation: RMSE, MAE, \(R^2\).

  • When outliers matter: RMSE (heavier penalty).

  • When robustness matters: MAE.

  • When explaining variance: \(R^2\) and Adjusted \(R^2\).

  • When business users need % error: MAPE.


Example in Python#

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Sample dataset
X = np.random.rand(100, 3)  # 3 features
y = 3*X[:, 0] - 2*X[:, 1] + X[:, 2] + np.random.randn(100) * 0.1

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train SVR
svr = SVR(kernel='rbf', C=100, gamma=0.1, epsilon=0.1)
svr.fit(X_train, y_train)
y_pred = svr.predict(X_test)

# Metrics
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)

# Adjusted R^2
n, p = X_test.shape
adj_r2 = 1 - ((1 - r2) * (n - 1)) / (n - p - 1)

print(f"MAE: {mae:.3f}")
print(f"MSE: {mse:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"R^2: {r2:.3f}")
print(f"Adjusted R^2: {adj_r2:.3f}")
```

Output (values vary slightly from run to run because the dataset is generated without a fixed seed):

```
MAE: 0.084
MSE: 0.009
RMSE: 0.095
R^2: 0.994
Adjusted R^2: 0.993
```

Summary Table#

| Metric | Formula | Strength | Weakness |
|---|---|---|---|
| MAE | \(\frac{1}{n}\sum \lvert y-\hat{y}\rvert\) | Easy to interpret, robust | Ignores error direction |
| MSE | \(\frac{1}{n}\sum (y-\hat{y})^2\) | Penalizes large errors | Units are squared |
| RMSE | \(\sqrt{MSE}\) | Same scale as target | Sensitive to outliers |
| \(R^2\) | \(1 - \frac{SS_{res}}{SS_{tot}}\) | Variance explained | Never decreases with added features |
| Adjusted \(R^2\) | Penalized \(R^2\) | Detects useless predictors | Only meaningful with multiple predictors |
| MAPE | % error | Easy to explain | Undefined if \(y=0\) |
| EVS | Variance explained | Intuitive | Similar to \(R^2\), less common |