Hyperparameter Tuning#
KNN has no training phase in the usual sense, but it still has hyperparameters that control how predictions are made:
- **k (number of neighbors)** – how many nearby points influence the prediction
- **Distance metric** – how we measure “closeness” between points
  - Euclidean, Manhattan, Minkowski, Hamming, etc.
- **Weights of neighbors** – whether all neighbors contribute equally or closer ones count more
  - `uniform`: all neighbors have equal weight
  - `distance`: closer neighbors have higher influence
Hyperparameter tuning is the process of finding the combination of these hyperparameters that minimizes error (or maximizes accuracy).
2. Key Hyperparameters#
| Hyperparameter | Description | Effect on model |
|---|---|---|
| `n_neighbors` (k) | Number of nearest neighbors | Small k → noisy predictions, overfit; large k → smoother predictions, may underfit |
| `metric` | Distance metric used to measure closeness | Changes which points count as neighbors → affects predictions |
| `weights` | Weighting of neighbors | Can improve performance by prioritizing closer points |
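As a quick illustration of where these hyperparameters appear in code, here is a minimal sketch using scikit-learn's `KNeighborsClassifier` (the specific values are only examples):

```python
from sklearn.neighbors import KNeighborsClassifier

# All three hyperparameters are set in the constructor
knn = KNeighborsClassifier(
    n_neighbors=5,        # k: how many neighbors vote on each prediction
    metric='manhattan',   # distance metric used to rank neighbors
    weights='distance',   # closer neighbors get proportionally more influence
)
# knn.fit(X_train, y_train) then simply stores the training data as usual
```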
3. Methods for Hyperparameter Tuning#
A. Manual Search#
- Try different values of k (e.g., 1, 3, 5, …, 15)
- Evaluate performance on a validation set
- Choose the k giving the best accuracy / lowest error
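A minimal sketch of such a manual loop, assuming the data has already been split into `X_train`, `X_val`, `y_train`, `y_val` (names chosen here for illustration):

```python
from sklearn.neighbors import KNeighborsClassifier

best_k, best_acc = None, 0.0
for k in [1, 3, 5, 7, 9, 11, 13, 15]:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    acc = knn.score(X_val, y_val)   # accuracy on the held-out validation set
    if acc > best_acc:
        best_k, best_acc = k, acc

print(f"Best k = {best_k} with validation accuracy {best_acc:.3f}")
```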
B. Grid Search#
Explore a grid of hyperparameter combinations:
```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Grid of hyperparameter values to try (every combination is evaluated)
param_grid = {
    'n_neighbors': [3, 5, 7, 9],
    'weights': ['uniform', 'distance'],
    'metric': ['euclidean', 'manhattan']
}

knn = KNeighborsClassifier()

# 5-fold cross-validated search over all 4 * 2 * 2 = 16 combinations
grid = GridSearchCV(knn, param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)

print("Best hyperparameters:", grid.best_params_)
print("Best CV accuracy:", grid.best_score_)
```
C. Randomized Search#
- Instead of trying all combinations, sample a fixed number of random combinations (faster for large grids)
- Works like `GridSearchCV` but is more efficient when the search space is large
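A minimal sketch of the same idea with `RandomizedSearchCV`, assuming the same `X_train`, `y_train` as above; `n_iter` controls how many random combinations are sampled:

```python
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Candidate values; RandomizedSearchCV samples combinations instead of trying all
param_distributions = {
    'n_neighbors': list(range(1, 31)),
    'weights': ['uniform', 'distance'],
    'metric': ['euclidean', 'manhattan', 'minkowski']
}

search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions,
    n_iter=20,           # only 20 random combinations are evaluated
    cv=5,
    scoring='accuracy',
    random_state=42      # makes the random sampling reproducible
)
search.fit(X_train, y_train)

print("Best hyperparameters:", search.best_params_)
print("Best CV accuracy:", search.best_score_)
```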
D. Cross-Validation#
- Always combine tuning with cross-validation, so the chosen hyperparameters do not overfit a single validation split
- Use k-fold CV (e.g., k=5) to evaluate each hyperparameter setting
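For example, a single hyperparameter setting can be scored with 5-fold CV via `cross_val_score` (again assuming `X_train`, `y_train` exist):

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=7, weights='distance')

# One accuracy score per fold; the mean is the CV estimate for this setting
scores = cross_val_score(knn, X_train, y_train, cv=5, scoring='accuracy')
print("Fold accuracies:", scores)
print("Mean CV accuracy:", scores.mean())
```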
4. Intuition Behind Tuning k#
- Small k:
  - Captures local patterns
  - Sensitive to noise → may misclassify outliers
- Large k:
  - Smooths predictions
  - Ignores local patterns → may underfit
The optimal k is usually found by experimenting with validation scores (or, for clustering algorithms, silhouette scores).
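A small sketch of this experiment, assuming `X_train`, `y_train` as before: sweep k, compare training accuracy against cross-validated accuracy, and watch the overfitting gap shrink as k grows:

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

for k in [1, 3, 5, 11, 21, 51]:
    knn = KNeighborsClassifier(n_neighbors=k)
    # Training accuracy: with k=1 it is usually near-perfect (each point is its own neighbor)
    train_acc = knn.fit(X_train, y_train).score(X_train, y_train)
    # Cross-validated accuracy: a fairer estimate of generalization
    cv_acc = cross_val_score(knn, X_train, y_train, cv=5).mean()
    print(f"k={k:>2}  train acc={train_acc:.3f}  CV acc={cv_acc:.3f}")
```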
5. Weighted vs Uniform Neighbors#
- Uniform: all neighbors contribute equally
- Distance: closer neighbors contribute more → often improves accuracy on noisy datasets
Intuition: nearer neighbors are more likely to be similar, so weighting helps KNN “trust” the right points.
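To see whether weighting helps on a given dataset, the two settings can be compared directly with cross-validation (a sketch, assuming `X_train`, `y_train` as before):

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

for w in ['uniform', 'distance']:
    knn = KNeighborsClassifier(n_neighbors=7, weights=w)
    acc = cross_val_score(knn, X_train, y_train, cv=5).mean()
    print(f"weights={w:<9} mean CV accuracy={acc:.3f}")
```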
6. Summary Workflow#
1. Choose a range of hyperparameters (k, metric, weights)
2. Split the data (train/validation, or use cross-validation)
3. Evaluate performance for each combination
4. Select the best combination
5. Retrain KNN on the full training set with these hyperparameters
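Putting the steps together, a compact end-to-end sketch (a hypothetical `X_train`, `X_test`, `y_train`, `y_test` split is assumed):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

param_grid = {
    'n_neighbors': [3, 5, 7, 9, 11],
    'weights': ['uniform', 'distance'],
    'metric': ['euclidean', 'manhattan']
}

# Steps 1-4: search and select the best combination with 5-fold CV
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)

# Step 5: grid.best_estimator_ is already refit on the full training set
final_knn = grid.best_estimator_
print("Chosen hyperparameters:", grid.best_params_)
print("Test accuracy:", final_knn.score(X_test, y_test))
```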