Performance Metrics#
1. Classification Metrics#
When KNN is used to classify data points into categories, we evaluate how well it predicts the correct class. Common metrics:
A. Accuracy#
Simple and intuitive.
Works well when classes are balanced.
B. Confusion Matrix#
A table showing predicted vs actual labels:
| Actual \ Predicted | Class 1 | Class 2 | Class 3 |
|---|---|---|---|
| Class 1 | TP | FN | FN |
| Class 2 | FP | TP | FN |
| Class 3 | FP | FN | TP |

TP = True Positive, FP = False Positive, FN = False Negative. Diagonal cells are correct predictions; each off-diagonal cell counts a false negative for its row (actual) class and, equivalently, a false positive for its column (predicted) class.
Helps compute other metrics such as precision and recall.
C. Precision#
Of all points predicted as class X, how many are correct?
D. Recall (Sensitivity)#
Of all points actually in class X, how many did we predict correctly?
E. F1 Score#
Harmonic mean of precision and recall.
Useful when classes are imbalanced.
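The classification metrics above can be sketched with scikit-learn. This is a minimal illustration, assuming the iris dataset and a 5-neighbor classifier as stand-ins; macro averaging is one of several options for multiclass precision/recall/F1.

```python
# Sketch: evaluating a KNN classifier with accuracy, precision,
# recall, F1, and the confusion matrix (iris is just an example dataset).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, f1_score)

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = knn.predict(X_test)

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average="macro"))
print("Recall   :", recall_score(y_test, y_pred, average="macro"))
print("F1 score :", f1_score(y_test, y_pred, average="macro"))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```

With three classes, `average="macro"` computes each metric per class and takes the unweighted mean; `average="weighted"` would weight by class frequency instead.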
2. Regression Metrics#
When KNN predicts continuous values:
A. Mean Squared Error (MSE)#
Measures average squared difference between true and predicted values.
B. Root Mean Squared Error (RMSE)#
Same units as the target variable, easier to interpret.
C. Mean Absolute Error (MAE)#
Average absolute difference. Less sensitive to outliers than MSE.
D. R² Score (Coefficient of Determination)#
Measures how much variance in the target is explained by the model.
Typically ranges from 0 to 1 (higher is better); it can be negative when the model fits worse than simply predicting the mean.
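The regression metrics can be computed the same way. A minimal sketch, assuming a synthetic dataset from `make_regression` and a 5-neighbor regressor:

```python
# Sketch: MSE, RMSE, MAE, and R² for a KNN regressor
# on synthetic data (dataset and hyperparameters are illustrative).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

X, y = make_regression(n_samples=300, n_features=4, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
y_pred = knn.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)                         # back in the target's units
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"MSE={mse:.2f}  RMSE={rmse:.2f}  MAE={mae:.2f}  R²={r2:.3f}")
```

Note that MAE is never larger than RMSE on the same predictions, which is one way to see RMSE's extra sensitivity to outliers.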
3. KNN-Specific Considerations#
Choice of k strongly affects performance:
Small k → may overfit → high variance
Large k → may underfit → high bias
Distance metric affects how neighbors are chosen, impacting metrics.
Scaling features is crucial; otherwise a feature with a large numeric range can dominate the distance computation → poor performance.
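The effect of feature scaling can be demonstrated directly. A sketch, assuming the wine dataset (whose features span very different numeric ranges) and `StandardScaler` inside a pipeline; on datasets like this, scaling usually improves KNN accuracy noticeably:

```python
# Sketch: KNN accuracy with vs. without feature scaling.
# The wine dataset is illustrative: one feature (proline) is orders of
# magnitude larger than the others and dominates unscaled distances.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

raw = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
scaled = make_pipeline(
    StandardScaler(), KNeighborsClassifier(n_neighbors=5)
).fit(X_train, y_train)

print("Accuracy without scaling:", raw.score(X_test, y_test))
print("Accuracy with scaling   :", scaled.score(X_test, y_test))
```

Putting the scaler inside the pipeline also ensures the scaling parameters are fit on the training data only, avoiding leakage into the test set.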
4. Quick Summary Table#
| Task | Metric | What it Measures |
|---|---|---|
| Classification | Accuracy | Overall correct predictions |
| | Precision | Correct positive predictions |
| | Recall | Coverage of actual positives |
| | F1 Score | Balance of precision & recall |
| | Confusion Matrix | Detailed correct/misclassified counts |
| Regression | MSE / RMSE | Average squared error |
| | MAE | Average absolute error |
| | R² | Variance explained |