# Evaluation Metrics

## Confusion Matrix (Foundation of Metrics)
For a binary classifier, the confusion matrix is:

|  | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
From this table, all evaluation metrics are derived.
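As a minimal sketch of how to obtain these counts with scikit-learn (the dataset, and names like `clf`, `X_test`, `y_test`, are purely illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# Illustrative imbalanced binary dataset; any labelled data works the same way.
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.9, 0.1],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = SVC().fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows = actual class, columns = predicted class.
# With labels [0, 1] the layout is [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print(f"TP={tp}  FN={fn}  FP={fp}  TN={tn}")
```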
## Accuracy
Accuracy measures the overall correctness of predictions:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

- Works well when classes are balanced.
- Can be misleading when the dataset is imbalanced.
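A quick sketch, reusing `y_test` and `y_pred` from the confusion-matrix example above:

```python
from sklearn.metrics import accuracy_score

# Accuracy = (TP + TN) / (TP + TN + FP + FN)
print("Accuracy:", accuracy_score(y_test, y_pred))

# With ~90% of samples in the majority class (the weights above), a model
# that always predicts that class already scores roughly 0.9 accuracy,
# which is why accuracy alone can mislead on imbalanced data.
```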
## Precision, Recall, and F1-Score

### Precision
Out of all predicted positives, how many are actually positive?

Precision = TP / (TP + FP)

High precision → few false alarms.
### Recall (Sensitivity / TPR)
Out of all actual positives, how many did we catch?

Recall = TP / (TP + FN)

High recall → few missed detections.
### F1-Score
The harmonic mean of precision and recall:

F1 = 2 × (Precision × Recall) / (Precision + Recall)

Useful when the dataset is imbalanced and we need a balance between the two.
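A minimal sketch of all three metrics, again reusing `y_test` and `y_pred` from the confusion-matrix example above:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Precision = TP / (TP + FP): how trustworthy the positive predictions are.
print("Precision:", precision_score(y_test, y_pred))

# Recall = TP / (TP + FN): how many real positives were caught.
print("Recall:   ", recall_score(y_test, y_pred))

# F1 = harmonic mean of precision and recall.
print("F1-score: ", f1_score(y_test, y_pred))
```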
## ROC Curve & AUC
The ROC curve plots Recall (TPR) against the False Positive Rate (FPR) as the decision threshold varies.

AUC (Area Under the Curve):

- Closer to 1 → better classifier.
- 0.5 → random guessing.

For SVC, use decision_function, or predict_proba (which requires probability=True), to obtain the continuous scores these metrics need.
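As a sketch, reusing the `clf` and test split fitted above (`SVC` exposes `decision_function` by default; `predict_proba` only exists if the model was constructed with `probability=True`):

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Continuous scores (not hard labels) are needed to sweep the threshold.
y_scores = clf.decision_function(X_test)

fpr, tpr, thresholds = roc_curve(y_test, y_scores)
print("ROC-AUC:", roc_auc_score(y_test, y_scores))
```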
## Metrics for Multi-class SVC
Since SVC handles multi-class problems through One-vs-Rest (OvR) or One-vs-One (OvO) decompositions, per-class metrics must be averaged:

- Macro average → averages the metric across all classes equally.
- Weighted average → averages the metric weighted by class frequency (support).

Both averages are reported by classification_report in scikit-learn.
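A sketch on a three-class problem (a fresh, illustrative dataset and classifier, since the averaging only matters with more than two classes):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report, f1_score

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

multi_clf = SVC().fit(X_tr, y_tr)   # SVC handles the multi-class decomposition internally
y_hat = multi_clf.predict(X_te)

# Per-class precision/recall/F1 plus macro and weighted averages.
print(classification_report(y_te, y_hat))

# The same averages can also be requested directly:
print("Macro F1:   ", f1_score(y_te, y_hat, average="macro"))
print("Weighted F1:", f1_score(y_te, y_hat, average="weighted"))
```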
## Other Advanced Metrics
- Balanced Accuracy → adjusts accuracy for imbalanced datasets (the mean of per-class recall).
- Cohen's Kappa → measures agreement beyond chance.
- Matthews Correlation Coefficient (MCC) → robust single-number summary for imbalanced data.
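All three are available in scikit-learn; a sketch reusing `y_test` and `y_pred` from the binary example above:

```python
from sklearn.metrics import (balanced_accuracy_score, cohen_kappa_score,
                             matthews_corrcoef)

# Balanced accuracy: average recall over classes.
print("Balanced accuracy:", balanced_accuracy_score(y_test, y_pred))

# Cohen's kappa: agreement corrected for chance (1 = perfect, 0 = chance level).
print("Cohen's kappa:    ", cohen_kappa_score(y_test, y_pred))

# MCC: correlation between predictions and truth, in [-1, 1].
print("MCC:              ", matthews_corrcoef(y_test, y_pred))
```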
## Summary Table

| Metric | Meaning | Best Use |
|---|---|---|
| Accuracy | Overall correctness | Balanced data |
| Precision | Correct positive predictions / all predicted positives | When false alarms are costly |
| Recall | Correct positive predictions / all actual positives | When missing positives is costly |
| F1-Score | Balance of precision & recall | Imbalanced data |
| ROC-AUC | Ranking ability | Threshold selection |
| MCC | Correlation between predictions & truth | Imbalanced data |