Hyperparameter Tuning#
What Are Hyperparameters?#
In machine learning, hyperparameters are settings you define before training a model.
They are not learned from data (unlike model weights/coefficients).
Proper tuning of hyperparameters can improve model performance and prevent overfitting/underfitting.
Key Hyperparameters in Logistic Regression#
Regularization Parameter (C)
Definition: Controls the strength of regularization (penalty for large coefficients).
In scikit-learn, C is the inverse of regularization strength:
Smaller C → stronger regularization → reduces overfitting.
Larger C → weaker regularization → may overfit the training data.
Regularization helps prevent the model from giving too much weight to any single feature.
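For intuition, here is a minimal sketch (synthetic data from make_classification; the C values are arbitrary) that fits the same model at several settings of C and prints the total coefficient magnitude, which shrinks as C decreases:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

for C in [0.01, 1, 100]:
    model = LogisticRegression(C=C, penalty='l2', solver='lbfgs', max_iter=1000)
    model.fit(X, y)
    # Smaller C -> stronger L2 penalty -> coefficients pulled toward zero
    print(f"C={C}: sum of |coefficients| = {np.abs(model.coef_).sum():.3f}")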
Penalty Type (penalty)
Determines the type of regularization used. Common options:
'l2' → Ridge regularization (squared magnitude of coefficients)
'l1' → Lasso regularization (absolute value of coefficients, encourages sparsity)
'elasticnet' → Combination of L1 and L2 (requires solver='saga')
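The difference is easiest to see in the coefficients. The sketch below (synthetic data, illustrative settings) fits an L1- and an L2-penalized model at the same C and counts exact zeros; L1 typically produces some, L2 does not:
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data where only a few features are informative
X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           random_state=0)

l1 = LogisticRegression(penalty='l1', solver='liblinear', C=0.1).fit(X, y)
l2 = LogisticRegression(penalty='l2', solver='liblinear', C=0.1).fit(X, y)

# L1 tends to zero out uninformative coefficients; L2 only shrinks them
print("Zero coefficients with l1:", int((l1.coef_ == 0).sum()))
print("Zero coefficients with l2:", int((l2.coef_ == 0).sum()))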
Solver (solver)
Optimization algorithm used to fit the model. Common options:
'lbfgs' → Good default for small datasets, supports L2
'liblinear' → Good for small datasets, supports L1 and L2
'saga' → Supports L1, L2, and elasticnet; scales to large datasets
Choice of solver may depend on dataset size and penalty type.
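As a short sketch of the compatibility rule (synthetic data, illustrative settings): elasticnet is only accepted by solver='saga', and it additionally needs an l1_ratio mixing parameter.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# elasticnet requires solver='saga' plus l1_ratio
# (0 = pure L2, 1 = pure L1)
enet = LogisticRegression(penalty='elasticnet', solver='saga',
                          l1_ratio=0.5, C=1.0, max_iter=5000)
enet.fit(X, y)
print(enet.coef_)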
Maximum Iterations (max_iter)
Maximum number of iterations for the solver to converge.
The default is 100; if the model does not converge, increase this value.
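One way to handle this programmatically (a sketch with deliberately illustrative iteration counts) is to escalate scikit-learn's ConvergenceWarning to an error and retry with a larger max_iter:
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

with warnings.catch_warnings():
    # Turn the convergence warning into an error so we can react to it
    warnings.simplefilter("error", ConvergenceWarning)
    try:
        LogisticRegression(max_iter=5).fit(X, y)  # too few iterations on purpose
    except ConvergenceWarning:
        print("Did not converge in 5 iterations; retrying with max_iter=1000")
        LogisticRegression(max_iter=1000).fit(X, y)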
Class Weight (class_weight)
Used for imbalanced datasets.
Options:
None → all classes are treated equally
'balanced' → weights inversely proportional to class frequency
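The 'balanced' weights are computed as n_samples / (n_classes * np.bincount(y)); the short sketch below checks this on a hypothetical 90/10 label vector using compute_class_weight:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# A 90/10 imbalanced label vector
y = np.array([0] * 90 + [1] * 10)

# 'balanced' uses n_samples / (n_classes * np.bincount(y))
weights = compute_class_weight(class_weight='balanced',
                               classes=np.array([0, 1]), y=y)
print(dict(zip([0, 1], weights)))  # {0: ~0.556, 1: 5.0} -> minority weighted 9x more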
How to Tune Hyperparameters#
Grid Search
Try all possible combinations of hyperparameters in a predefined grid.
Example: tune C = [0.01, 0.1, 1, 10] and penalty = ['l1', 'l2'].
Randomized Search
Randomly select hyperparameter combinations, useful when the grid is large.
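A minimal sketch (Iris data; n_iter and the sampling distribution are illustrative choices) using scikit-learn's RandomizedSearchCV, which samples C from a continuous log-uniform distribution instead of enumerating a fixed grid:
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

# Sample C on a log scale rather than listing discrete values
param_dist = {
    'C': loguniform(1e-3, 1e2),
    'penalty': ['l1', 'l2'],
}
search = RandomizedSearchCV(LogisticRegression(solver='liblinear'),
                            param_distributions=param_dist,
                            n_iter=10, cv=5, random_state=42)
search.fit(X, y)
print(search.best_params_)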
Cross-Validation
Split training data into multiple folds.
Evaluate hyperparameter combinations using average validation performance.
Prevents overfitting to a single train-test split.
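Before combining cross-validation with a search, it helps to see it in isolation. A minimal sketch scoring a single hyperparameter setting with 5-fold cross-validation (Iris data; C and max_iter are illustrative):
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# One accuracy score per fold for a single hyperparameter setting
scores = cross_val_score(LogisticRegression(C=1.0, max_iter=1000), X, y, cv=5)
print("Fold scores:", scores)
print("Mean CV accuracy:", scores.mean())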
Example in Python#
# Import libraries
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import warnings
warnings.filterwarnings("ignore")
# Load Iris dataset (multi-class example)
data = load_iris()
X = data.data
y = data.target
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create logistic regression model
# Note: multi_class='ovr' is deprecated in scikit-learn 1.5+; liblinear
# fits one-vs-rest models for multiclass targets by default there
logreg = LogisticRegression(multi_class='ovr', solver='liblinear', max_iter=500)
# Define hyperparameter grid
param_grid = {
'C': [0.01, 0.1, 1, 10], # Regularization strength
'penalty': ['l1', 'l2'], # L1 or L2 regularization
'class_weight': [None, 'balanced'] # Handle imbalanced data
}
# Perform grid search with 5-fold cross-validation
grid_search = GridSearchCV(estimator=logreg, param_grid=param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
# Print best hyperparameters and score
print("Best Hyperparameters:", grid_search.best_params_)
print("Best Cross-Validated Accuracy:", grid_search.best_score_)
# Test set evaluation
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)
print("Test Set Accuracy:", accuracy_score(y_test, y_pred))
Best Hyperparameters: {'C': 1, 'class_weight': 'balanced', 'penalty': 'l2'}
Best Cross-Validated Accuracy: 0.9523809523809523
Test Set Accuracy: 0.9777777777777777
# Demonstration: Logistic Regression on an imbalanced dataset using StratifiedKFold
# Import libraries
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.metrics import make_scorer, f1_score, precision_score, recall_score
# Step 1: Create imbalanced dataset
X, y = make_classification(n_samples=1000, n_features=5,
n_informative=3, n_redundant=0, n_classes=2,
weights=[0.9, 0.1], random_state=42)
# Step 2: Create Logistic Regression model with class_weight='balanced'
model = LogisticRegression(solver='liblinear', class_weight='balanced')
# Step 3: Define StratifiedKFold
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
# Step 4: Evaluate using cross-validation with F1-score, Precision, Recall
f1_scores = cross_val_score(model, X, y, cv=skf, scoring=make_scorer(f1_score))
precision_scores = cross_val_score(model, X, y, cv=skf, scoring=make_scorer(precision_score))
recall_scores = cross_val_score(model, X, y, cv=skf, scoring=make_scorer(recall_score))
# Step 5: Print results
print("F1-scores for each fold:", f1_scores)
print("Mean F1-score:", f1_scores.mean())
print("\nPrecision for each fold:", precision_scores)
print("Mean Precision:", precision_scores.mean())
print("\nRecall for each fold:", recall_scores)
print("Mean Recall:", recall_scores.mean())
F1-scores for each fold: [0.61016949 0.64864865 0.7037037 0.74509804 0.6557377 ]
Mean F1-score: 0.672671517602299
Precision for each fold: [0.47368421 0.75 0.57575758 0.63333333 0.51282051]
Mean Precision: 0.5891191264875475
Recall for each fold: [0.85714286 0.57142857 0.9047619 0.9047619 0.90909091]
Mean Recall: 0.8294372294372293
Summary#
Hyperparameter tuning improves model performance.
Important hyperparameters in Logistic Regression:
C → regularization strength
penalty → L1/L2
solver → optimization algorithm
class_weight → handle imbalanced datasets
Use GridSearchCV or RandomizedSearchCV with cross-validation to find the best combination.