Evaluation Metrics#


Confusion Matrix (Foundation of Metrics)#

For a binary classifier, the confusion matrix is:

|                 | Predicted Positive  | Predicted Negative  |
|-----------------|---------------------|---------------------|
| Actual Positive | True Positive (TP)  | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN)  |

From this table, all evaluation metrics are derived.
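As a concrete illustration, here is a minimal sketch that recovers the four cells with scikit-learn's `confusion_matrix`; the label arrays are made-up placeholders, not data from this text:

```python
from sklearn.metrics import confusion_matrix

# Illustrative labels (placeholders): 1 = positive, 0 = negative
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

# For labels [0, 1], scikit-learn orders the matrix as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FN={fn}, FP={fp}, TN={tn}")  # TP=3, FN=1, FP=1, TN=3
```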


Accuracy#

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
  • Measures the overall correctness of predictions.

  • Good when classes are balanced.

  • Can be misleading when the dataset is imbalanced: a model that always predicts the majority class can still score high, as the sketch below shows.
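A minimal sketch of that pitfall with `accuracy_score`, assuming a made-up 90/10 imbalanced label set:

```python
from sklearn.metrics import accuracy_score

# Illustrative, heavily imbalanced labels: 90% negative, 10% positive
y_true = [0] * 90 + [1] * 10
# A useless "classifier" that always predicts the majority class
y_pred = [0] * 100

# 90% accuracy despite catching zero positives
print(accuracy_score(y_true, y_pred))  # 0.9
```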


Precision, Recall, and F1-Score#

Precision#

\[ \text{Precision} = \frac{TP}{TP + FP} \]
  • Out of predicted positives, how many are actually positive?

  • High precision → few false alarms.

Recall (Sensitivity / TPR)#

\[ \text{Recall} = \frac{TP}{TP + FN} \]
  • Out of actual positives, how many did we catch?

  • High recall → few missed detections.

F1-Score#

\[ F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]
  • Harmonic mean of precision and recall.

  • Useful when the dataset is imbalanced and a single score that balances both is needed; see the sketch after this list.
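A minimal sketch computing all three metrics with scikit-learn; the label arrays are illustrative placeholders, not data from this text:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Illustrative labels (placeholders): 1 = positive, 0 = negative
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 2 / 3
print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 2 / 4 = 0.5
print(f1_score(y_true, y_pred))         # harmonic mean ≈ 0.571
```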


ROC Curve & AUC#

  • The ROC curve plots Recall (TPR) against the False Positive Rate (FPR) as the decision threshold varies.

  • AUC (Area Under Curve):

    • Closer to 1 → better classifier.

    • 0.5 → random guessing.

For SVC, these metrics need continuous scores rather than hard labels: use decision_function, or predict_proba after fitting with probability=True, as in the sketch below.
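A minimal sketch, assuming a synthetic binary dataset from `make_classification` as a stand-in for real data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, roc_auc_score

# Synthetic binary data as a stand-in for a real dataset
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf").fit(X_train, y_train)

# Continuous scores from the SVM margin; no probability=True needed
scores = clf.decision_function(X_test)

fpr, tpr, thresholds = roc_curve(y_test, scores)
print("AUC:", roc_auc_score(y_test, scores))
```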


Metrics for Multi-class SVC#

Since SVC handles multiple classes by combining binary classifiers, One-vs-Rest (OvR) or One-vs-One (OvO, scikit-learn's internal default), per-class metrics must be averaged:

  • Macro average → averages metric across all classes equally.

  • Weighted average → averages metric weighted by class frequency.

Both averages appear in scikit-learn's classification_report, shown in the sketch below.
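A minimal sketch, using the Iris dataset as a stand-in three-class problem:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Iris as a stand-in three-class dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC().fit(X_train, y_train)

# Per-class precision/recall/F1, plus macro and weighted averages
print(classification_report(y_test, clf.predict(X_test)))
```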


Other Advanced Metrics#

  • Balanced Accuracy → adjusts accuracy for imbalanced datasets.

  • Cohen's Kappa → measures agreement beyond chance.

  • Matthews Correlation Coefficient (MCC) → robust for imbalanced data.
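All three are available in `sklearn.metrics`; here is a minimal sketch on made-up imbalanced labels:

```python
from sklearn.metrics import (balanced_accuracy_score, cohen_kappa_score,
                             matthews_corrcoef)

# Illustrative imbalanced labels (placeholders): 6 negatives, 2 positives
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]

print(balanced_accuracy_score(y_true, y_pred))  # mean of per-class recall
print(cohen_kappa_score(y_true, y_pred))        # agreement beyond chance
print(matthews_corrcoef(y_true, y_pred))        # correlation in [-1, 1]
```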


Summary Table#

| Metric    | Meaning                                                | Best Use                         |
|-----------|--------------------------------------------------------|----------------------------------|
| Accuracy  | Overall correctness                                    | Balanced data                    |
| Precision | Correct positive predictions / all predicted positives | When false alarms are costly     |
| Recall    | Correct positive predictions / all actual positives    | When missing positives is costly |
| F1-Score  | Balance of precision & recall                          | Imbalanced data                  |
| ROC-AUC   | Ranking ability                                        | Threshold selection              |
| MCC       | Correlation between predictions & truth                | Imbalanced data                  |