Hyperparameter Tuning methods#
There are several ways to search for the best hyperparameters when tuning machine learning models. Here are the main types:
Types of Hyperparameter Search#
1. Manual Search#
Try parameters by hand based on intuition or domain knowledge.
Example: test α = 0.1, 1, 10 for Ridge.
✅ Simple, but ❌ inefficient and may miss optimal values.
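A minimal sketch of manual search, assuming a synthetic regression dataset for illustration: each candidate α is tried by hand and compared via cross-validated R²:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic data for illustration
X, y = make_regression(n_samples=200, n_features=10, noise=15, random_state=42)

# Try a few alpha values "by hand" and compare mean cross-validated R^2
for alpha in [0.1, 1, 10]:
    score = cross_val_score(Ridge(alpha=alpha), X, y, cv=5, scoring="r2").mean()
    print(f"alpha={alpha}: mean CV R2 = {score:.4f}")
```

In practice you would inspect the scores, pick a promising region, and repeat with new values.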
2. Grid Search#
Define a grid of hyperparameter values.
Try all combinations exhaustively with cross-validation.
Example:
alpha = [0.01, 0.1, 1, 10, 100]
l1_ratio = [0.1, 0.5, 0.9]
→ Tests 5 × 3 = 15 combinations.
✅ Systematic, guarantees best within grid.
❌ Expensive if grid is large.
3. Random Search#
Instead of testing all values, randomly sample combinations from given distributions.
Example:
alpha ∼ Uniform(0.001, 100)
✅ More efficient than grid; can cover large spaces.
❌ May miss the exact optimum if unlucky.
4. Bayesian Optimization#
Uses past evaluation results to model performance as a probability distribution.
Chooses new hyperparameters that are most promising.
✅ Finds optimal faster than grid/random.
❌ More complex, needs specialized libraries (optuna, scikit-optimize, hyperopt).
5. Gradient-Based Optimization (advanced)#
Uses gradients of the loss with respect to hyperparameters.
Works mainly for continuous hyperparameters.
Rare in practice because many hyperparameters (like max_depth) are discrete.
6. Evolutionary / Genetic Algorithms#
Treat hyperparameters like genes.
Randomly mutate and crossover values across generations.
✅ Can escape local optima.
❌ Slower, harder to tune.
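A toy genetic algorithm, written from scratch for illustration (the population size, mutation scale, and number of generations are arbitrary choices): each "gene" is log10(α), fitness is cross-validated R², and each generation keeps the best half, averages pairs of parents (crossover), and adds Gaussian noise (mutation):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=15, random_state=42)
rng = np.random.default_rng(42)

def fitness(log_alpha):
    # Cross-validated R^2 of Ridge with alpha = 10^log_alpha
    return cross_val_score(Ridge(alpha=10.0**log_alpha), X, y, cv=3, scoring="r2").mean()

# Population of 8 genes: log10(alpha) in [-3, 3]
pop = rng.uniform(-3, 3, size=8)
for generation in range(5):
    scores = np.array([fitness(g) for g in pop])
    parents = pop[np.argsort(scores)[-4:]]                            # selection: keep top half
    children = (rng.choice(parents, 4) + rng.choice(parents, 4)) / 2  # crossover: average parents
    children += rng.normal(0, 0.3, size=4)                            # mutation: Gaussian noise
    pop = np.concatenate([parents, children])

best = pop[np.argmax([fitness(g) for g in pop])]
print("Best alpha:", 10.0**best)
```

Real implementations (e.g. DEAP, or sklearn-genetic-opt) add tournament selection, per-parameter encodings, and parallel evaluation, but the select/crossover/mutate loop is the same.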
7. Successive Halving / Hyperband#
Start with many random hyperparameter sets.
Train each briefly.
Discard poorly performing ones early, keep only the best for longer training.
✅ Efficient, reduces wasted computation.
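The steps above are implemented in scikit-learn's `HalvingRandomSearchCV` (still marked experimental, so it must be explicitly enabled). In this sketch the "budget" is the number of training samples: all candidates start on a small subset, and only the top fraction survives to train on more data:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.experimental import enable_halving_search_cv  # noqa: F401 (enables the import below)
from sklearn.linear_model import Ridge
from sklearn.model_selection import HalvingRandomSearchCV

X, y = make_regression(n_samples=400, n_features=10, noise=15, random_state=42)

param_dist = {"alpha": np.logspace(-3, 3, 100)}
# factor=3: only the best third of candidates survives each round,
# while the per-candidate resource (here, n_samples) triples
search = HalvingRandomSearchCV(Ridge(), param_dist, factor=3, cv=5,
                               scoring="r2", random_state=42)
search.fit(X, y)
print("Best params:", search.best_params_)
print("Best CV score:", search.best_score_)
```

For models with an iteration count (e.g. gradient boosting), `resource` can instead be set to a parameter like `n_estimators`, so weak configurations are discarded after only a few boosting rounds.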
Summary Table#

| Method | Strategy | Pros | Cons |
|---|---|---|---|
| Manual Search | Trial-and-error | Simple | Not systematic |
| Grid Search | Exhaustive combinations | Guaranteed best in grid | Expensive |
| Random Search | Random sampling | Efficient, scalable | No guarantee |
| Bayesian Optimization | Probabilistic model-guided search | Fast convergence | Complex |
| Gradient-Based | Gradient descent on hyperparams | Precise for continuous vars | Rarely practical |
| Evolutionary Algorithms | Mutation + crossover | Escapes local optima | Slow |
| Hyperband / Successive Halving | Early stopping bad configs | Saves compute | Needs careful setup |
👉 In practice:
For small problems → Grid Search.
For large spaces → Random Search or Hyperband.
For serious optimization → Bayesian Optimization (e.g., Optuna).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.metrics import r2_score
# Generate synthetic regression dataset
X, y = make_regression(n_samples=200, n_features=10, noise=15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# ---------------- Grid Search ----------------
ridge = Ridge()
param_grid = {'alpha': [0.01, 0.1, 1, 10, 100, 1000]} # exhaustive list
grid_search = GridSearchCV(ridge, param_grid, cv=5, scoring='r2')
grid_search.fit(X_train, y_train)
print("GridSearchCV best params:", grid_search.best_params_)
print("GridSearchCV best CV score:", grid_search.best_score_)
# Evaluate on test data
y_pred_grid = grid_search.best_estimator_.predict(X_test)
print("GridSearchCV test R2:", r2_score(y_test, y_pred_grid))
# ---------------- Random Search ----------------
param_dist = {'alpha': np.logspace(-3, 3, 100)} # random sampling from wide range
random_search = RandomizedSearchCV(ridge, param_dist, n_iter=10, cv=5, scoring='r2', random_state=42)
random_search.fit(X_train, y_train)
print("\nRandomizedSearchCV best params:", random_search.best_params_)
print("RandomizedSearchCV best CV score:", random_search.best_score_)
# Evaluate on test data
y_pred_rand = random_search.best_estimator_.predict(X_test)
print("RandomizedSearchCV test R2:", r2_score(y_test, y_pred_rand))
GridSearchCV best params: {'alpha': 0.1}
GridSearchCV best CV score: 0.9907142472562647
GridSearchCV test R2: 0.9934316711441261
RandomizedSearchCV best params: {'alpha': 0.021544346900318846}
RandomizedSearchCV best CV score: 0.9907141636217599
RandomizedSearchCV test R2: 0.9934566226827182