Hyperparameter Tuning#

Whether you use AdaBoostClassifier or AdaBoostRegressor, the core hyperparameters are mostly the same:

1. n_estimators (Number of weak learners)#

  • The maximum number of weak learners (stumps/trees) to train; boosting stops early if a perfect fit is reached.

  • Too small → underfitting (model too simple).

  • Too large → may overfit, slower training.

  • Default = 50 (often needs tuning, e.g., 50–500); see the sketch below.

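A minimal sketch, assuming a synthetic dataset, of how accuracy evolves as weak learners are added; staged_score lets one fit cover every value of n_estimators up to the budget:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit once with the largest budget; staged_score then reports test
# accuracy after 1, 2, ..., 500 boosting rounds from that single model.
model = AdaBoostClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)
for i, acc in enumerate(model.staged_score(X_test, y_test), start=1):
    if i in (10, 50, 100, 500):
        print(f"{i:>3} estimators -> test accuracy {acc:.3f}")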

2. learning_rate (Shrinkage factor)#

  • Controls how much each weak learner contributes.

  • Acts as a regularization parameter:

    • High value (e.g., 1.0) → faster learning, risk of overfitting.

    • Low value (e.g., 0.01–0.1) → slower, more robust learning, but may require higher n_estimators.

  • There is a tradeoff between n_estimators and learning_rate: lowering the rate generally requires raising the estimator count to compensate (illustrated below).

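A small sketch of the tradeoff on synthetic data: an aggressive rate, the same budget with a slow rate (which tends to underfit), and the slow rate compensated by a much larger n_estimators:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Three illustrative settings; the slow rate needs a larger budget
# to reach performance comparable to the aggressive one.
for lr, n in [(1.0, 100), (0.05, 100), (0.05, 1000)]:
    score = cross_val_score(
        AdaBoostClassifier(learning_rate=lr, n_estimators=n, random_state=0),
        X, y, cv=5).mean()
    print(f"learning_rate={lr:<4}  n_estimators={n:<4}  CV accuracy={score:.3f}")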

3. estimator (Base learner)#

  • By default, DecisionTreeClassifier(max_depth=1) for classification (decision stumps).

  • For regression, the default is DecisionTreeRegressor(max_depth=3).

  • You can tune:

    • max_depth → deeper trees capture more complex patterns but may overfit.

    • min_samples_split, min_samples_leaf → regularization to avoid overfitting (see the sketch below).

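A minimal sketch of swapping in a deeper, regularized base learner; the parameter values here are illustrative, not recommendations (the estimator argument requires scikit-learn >= 1.2):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Depth-2 trees are slightly more expressive than stumps;
# min_samples_leaf acts as a brake on overfitting.
base = DecisionTreeClassifier(max_depth=2, min_samples_leaf=10)
clf = AdaBoostClassifier(estimator=base, n_estimators=100, random_state=0)
print(f"CV accuracy: {cross_val_score(clf, X, y, cv=5).mean():.3f}")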

4. loss (AdaBoostRegressor only)#

  • Options: "linear" (default), "square", "exponential"; all three are compared in the sketch below.

  • Linear → penalizes errors proportionally.

  • Square → penalizes large errors more heavily.

  • Exponential → increases weight on hard-to-predict points aggressively (can overfit).

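A quick sketch comparing the three options on a synthetic regression problem (which loss wins depends entirely on the data and its noise structure):

from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=1000, n_features=10, noise=20.0, random_state=0)

# Same model, three ways of scaling errors when reweighting samples.
for loss in ("linear", "square", "exponential"):
    r2 = cross_val_score(
        AdaBoostRegressor(loss=loss, n_estimators=100, random_state=0),
        X, y, cv=5, scoring="r2").mean()
    print(f"loss={loss:<12} CV R²={r2:.3f}")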

Hyperparameter Tuning Strategy#

Step 1: Start with Defaults#

  • n_estimators=50, learning_rate=1.0, and the default base learner: DecisionTreeClassifier(max_depth=1) for classification, DecisionTreeRegressor(max_depth=3) for regression.

Step 2: Tune n_estimators and learning_rate#

  • Grid search over ranges like:

    n_estimators = [50, 100, 200, 500]
    learning_rate = [0.01, 0.05, 0.1, 0.5, 1.0]
    
  • Find best tradeoff: high enough to capture complexity, but not overfitting.

Step 3: Tune base learner depth#

  • Try max_depth = [1, 2, 3, 5].

  • Use shallow trees (max_depth=1) for simple datasets and deeper trees for complex ones.

Step 4: For regression, tune loss#

  • Compare "linear", "square", "exponential" based on metrics like MSE or R².

Step 5: Use Cross-Validation#

  • Perform GridSearchCV or RandomizedSearchCV with cross-validation to avoid overfitting to a single train-test split; a RandomizedSearchCV sketch follows below.

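When the grid gets large, RandomizedSearchCV samples a fixed number of configurations instead of enumerating them all. A sketch on the same kind of synthetic data as the full example at the end of this section:

from scipy.stats import loguniform, randint
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Draw 20 candidate configurations from distributions rather than a grid;
# loguniform spreads samples evenly across orders of magnitude.
search = RandomizedSearchCV(
    AdaBoostClassifier(random_state=42),
    param_distributions={
        "n_estimators": randint(50, 500),
        "learning_rate": loguniform(0.01, 1.0),
    },
    n_iter=20, cv=5, scoring="accuracy", random_state=42,
)
search.fit(X, y)
print(search.best_params_, f"CV accuracy={search.best_score_:.3f}")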

Handling Overfitting & Underfitting#

Overfitting (model too complex)#

  • Reduce n_estimators.

  • Reduce max_depth of base learner.

  • Lower learning_rate.

  • Use "linear" loss for regression.

Underfitting (model too simple)#

  • Increase n_estimators.

  • Increase max_depth of base learner.

  • Increase learning_rate.

  • Try "square" or "exponential" loss for regression.

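A quick way to tell which regime you are in is to track train vs. test accuracy as boosting proceeds: a widening gap points to overfitting, while two low, flat curves point to underfitting. A minimal sketch on noisy synthetic data:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, flip_y=0.1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = AdaBoostClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)

# Compare train/test accuracy stage by stage; the gap is the tell.
stages = zip(model.staged_score(X_train, y_train),
             model.staged_score(X_test, y_test))
for i, (tr, te) in enumerate(stages, start=1):
    if i % 100 == 0:
        print(f"{i:>3} estimators: train={tr:.3f}  test={te:.3f}  gap={tr - te:.3f}")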

In short:

  • Tune n_estimators + learning_rate first (they control learning strength).

  • Tune base learner complexity (max_depth).

  • For regression, also experiment with loss function.

  • Use cross-validation to avoid overfitting.

import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import AdaBoostClassifier, AdaBoostRegressor
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# -------------------------------
# PART 1: AdaBoost Classifier
# -------------------------------
# Create classification dataset
X_cls, y_cls = make_classification(
    n_samples=500, n_features=10, n_informative=5, n_redundant=2,
    random_state=42
)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(X_cls, y_cls, test_size=0.3, random_state=42)

# Grid search for classifier
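# NOTE: the 'estimator' key requires scikit-learn >= 1.2 (it replaced 'base_estimator')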
param_grid_cls = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.01, 0.1, 0.5, 1.0],
    "estimator": [DecisionTreeClassifier(max_depth=1),
                  DecisionTreeClassifier(max_depth=2)]
}

grid_cls = GridSearchCV(
    AdaBoostClassifier(random_state=42),
    param_grid_cls,
    cv=5, scoring="accuracy"
)
grid_cls.fit(Xc_train, yc_train)

print("🔹 AdaBoost Classifier Best Params:", grid_cls.best_params_)
print("🔹 Best CV Accuracy:", grid_cls.best_score_)
print("🔹 Test Accuracy:", grid_cls.score(Xc_test, yc_test))

# -------------------------------
# PART 2: AdaBoost Regressor
# -------------------------------
# Create regression dataset
X_reg, y_reg = make_regression(
    n_samples=500, n_features=10, noise=15.0, random_state=42
)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(X_reg, y_reg, test_size=0.3, random_state=42)

# Grid search for regressor
param_grid_reg = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.01, 0.1, 0.5, 1.0],
    "estimator": [DecisionTreeRegressor(max_depth=2),
                  DecisionTreeRegressor(max_depth=3)],
    "loss": ["linear", "square", "exponential"]
}

grid_reg = GridSearchCV(
    AdaBoostRegressor(random_state=42),
    param_grid_reg,
    cv=5, scoring="r2"
)
grid_reg.fit(Xr_train, yr_train)

print("\n🔹 AdaBoost Regressor Best Params:", grid_reg.best_params_)
print("🔹 Best CV R²:", grid_reg.best_score_)
print("🔹 Test R²:", grid_reg.score(Xr_test, yr_test))
🔹 AdaBoost Classifier Best Params: {'estimator': DecisionTreeClassifier(max_depth=2), 'learning_rate': 1.0, 'n_estimators': 200}
🔹 Best CV Accuracy: 0.9200000000000002
🔹 Test Accuracy: 0.9066666666666666

🔹 AdaBoost Regressor Best Params: {'estimator': DecisionTreeRegressor(max_depth=3), 'learning_rate': 1.0, 'loss': 'square', 'n_estimators': 200}
🔹 Best CV R²: 0.7771367499875095
🔹 Test R²: 0.7601366118098305