Assumptions

1. Weak Learners Should Perform Slightly Better than Random

  • The base learners (often decision stumps, i.e., one-level decision trees) should have an accuracy just above random guessing:

    • For binary classification → slightly better than 50% accuracy.

  • Boosting works by combining many weak rules, so if each base learner is no better than chance, AdaBoost fails.
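The "better than chance" requirement appears directly in AdaBoost's learner vote, α = ½ ln((1 − ε)/ε), where ε is the learner's weighted error. A minimal sketch:

```python
import math

def learner_weight(error):
    # AdaBoost's vote for a weak learner with weighted error `error`:
    # alpha = 0.5 * ln((1 - error) / error)
    return 0.5 * math.log((1 - error) / error)

print(learner_weight(0.4))  # slightly better than chance -> small positive vote
print(learner_weight(0.5))  # exactly chance -> zero vote, contributes nothing
```

A learner worse than chance (ε > 0.5) gets a negative weight, meaning its predictions are effectively flipped.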


2. Additive Model of Errors

  • AdaBoost assumes that errors from weak learners can be combined and corrected sequentially.

  • Misclassified samples get higher weights → future weak learners focus on them.

  • This assumes misclassification can be reduced step-by-step instead of being random noise.
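The sequential correction works through the sample-weight update: misclassified samples are up-weighted, correct ones down-weighted, then all weights are renormalised. A toy sketch (four equally weighted samples, one assumed misclassified):

```python
import math

def update_weights(weights, correct, alpha):
    # Up-weight misclassified samples, down-weight correct ones,
    # then renormalise so the weights sum to 1.
    new = [w * math.exp(-alpha if ok else alpha)
           for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new]

weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]           # sample 3 was misclassified
alpha = 0.5 * math.log((1 - 0.25) / 0.25)     # weighted error = 0.25
print(update_weights(weights, correct, alpha))  # sample 3 now carries weight 0.5
```

After the update, the next weak learner sees a distribution dominated by the samples the previous one got wrong.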


3. Data is (Relatively) Clean

  • AdaBoost is sensitive to noisy data and outliers, because:

    • Misclassified points get higher weights repeatedly.

    • Outliers that are impossible to classify correctly receive disproportionate focus.

  • Implicit assumption: dataset has low noise and few extreme outliers.
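To see why noise hurts, consider a hypothetical run where one outlier is misclassified every round while the other nine samples are always correct (a toy simulation of the weight dynamics, not real training):

```python
import math

weights = [0.1] * 10          # 9 easy samples + 1 impossible outlier (index 9)
for _ in range(5):
    error = weights[9]        # only the outlier is ever misclassified
    alpha = 0.5 * math.log((1 - error) / error)
    weights = [w * math.exp(alpha if i == 9 else -alpha)
               for i, w in enumerate(weights)]
    total = sum(weights)
    weights = [w / total for w in weights]

# The single outlier ends up carrying half the total sample weight,
# forcing every subsequent weak learner to chase it.
print(round(weights[9], 3))
```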


4. Feature Independence Isn’t Required (Unlike Naive Bayes)

  • AdaBoost does not assume independence of features.

  • It can handle correlated features, but redundant features may make training inefficient.


5. Sufficient Number of Weak Learners

  • Boosting assumes that with enough iterations (weak learners), the combined strong learner will converge to a low-error classifier.

  • Too few learners → underfitting; too many learners → risk of overfitting (though AdaBoost is surprisingly resistant to overfitting on clean data).
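The classical bound makes this concrete: AdaBoost's training error is at most ∏ₜ 2√(εₜ(1 − εₜ)), which shrinks geometrically as long as every εₜ stays bounded away from 0.5. A sketch assuming a constant weighted error of 0.4:

```python
import math

def error_bound(eps, n_learners):
    # AdaBoost training-error bound: (2 * sqrt(eps * (1 - eps))) ** T
    return (2 * math.sqrt(eps * (1 - eps))) ** n_learners

for t in (1, 10, 100):
    print(t, error_bound(0.4, t))  # decays toward 0 as learners are added
```

Note the link back to assumption 1: at ε = 0.5 the per-round factor is exactly 1, so adding learners never shrinks the bound.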


6. Weak Learners Should Be Simple

  • Base learners should be simple (e.g., decision stumps or very shallow trees).

  • If base learners are too complex (e.g., deep trees), boosting loses its purpose: it becomes just an ensemble of strong models and is more prone to overfitting.
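A decision stump is just a single threshold test on one feature. A toy 1-D sketch (hypothetical data) of fitting one by exhaustive search over thresholds and polarities:

```python
def fit_stump(xs, ys):
    # Try every observed value as a threshold, with both polarities,
    # and keep the split with the fewest misclassifications.
    best = None
    for thr in sorted(xs):
        for sign in (1, -1):
            preds = [sign if x >= thr else -sign for x in xs]
            err = sum(p != y for p, y in zip(preds, ys))
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best  # (misclassifications, threshold, polarity)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [-1, -1, 1, 1]
print(fit_stump(xs, ys))  # -> (0, 3.0, 1): threshold at 3.0 separates perfectly
```

Each stump is a weak rule on its own; AdaBoost's strength comes from weighting and combining many of them.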


Summary

AdaBoost works best under these assumptions:

  • Weak learners perform slightly better than chance.

  • Errors can be sequentially corrected.

  • Data is relatively clean (not dominated by noise or outliers).

  • Enough learners are combined to reduce bias.

  • Base learners are simple and diverse.