Assumptions#
1. Random Forest is Non-Parametric#
Non-parametric means it doesn’t assume a specific form of the relationship between features and target (no linearity assumption).
It can model complex nonlinear relationships naturally.
Implication: You don’t need to transform features to fit a line or polynomial; trees handle splits automatically.
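As a minimal sketch (assuming scikit-learn and NumPy are available), a random forest can fit a clearly nonlinear target with no manual feature transformation:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)  # nonlinear relationship

# No polynomial expansion, log transform, or scaling needed:
# the trees discover the sine shape through recursive splits.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
r2 = model.score(X, y)  # training R^2 is high despite the nonlinearity
```

A linear regression on the same raw feature would need an explicit basis expansion to achieve a comparable fit.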
2. Key Implicit Assumptions#
While RF is flexible, it still makes a few practical assumptions:
A. Observations are Independent#
Random Forest assumes that training samples are independent.
Correlated or time-dependent samples (like time series) may require special handling.
Example:
In stock price prediction, consecutive days are correlated, so a standard RF (and a randomly shuffled train/test split) ignores the temporal dependency.
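One common way to handle this at evaluation time (a sketch, assuming scikit-learn) is forward-chaining cross-validation: the model is always trained on the past and tested on the future, rather than on a random shuffle of correlated samples.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(1)
series = np.sin(np.arange(310) / 20.0) + 0.1 * rng.normal(size=310)

# Lag features: predict each value from the previous 5 observations.
X = np.column_stack([series[i:i + 300] for i in range(5)])
y = series[5:305]

# TimeSeriesSplit never tests on data that precedes the training window,
# unlike ordinary shuffled K-fold.
scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))
```

This does not make RF itself time-aware; it only prevents the evaluation from leaking future information into training.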
B. Features Should Have Some Predictive Power#
Random Forest works best if at least some features are informative.
Including completely irrelevant features usually won’t hurt much, because each split considers only a random subset of features, but a large number of noisy features can still dilute the informative ones and reduce performance.
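A hedged illustration (assuming scikit-learn): when a few informative features are mixed with many pure-noise features, the forest’s impurity-based importances still concentrate on the informative ones.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
n = 1000
informative = rng.normal(size=(n, 2))
noise = rng.normal(size=(n, 8))            # 8 purely irrelevant features
X = np.hstack([informative, noise])
y = (informative[:, 0] + informative[:, 1] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = clf.feature_importances_
informative_share = imp[:2].sum()  # bulk of importance on the two real signals
```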
C. Data Representativeness#
Training data should be representative of the population you want to predict.
Bagging (bootstrap sampling) assumes each sample is drawn from the same underlying distribution.
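The bootstrap step that bagging relies on can be sketched in a few lines (NumPy only): each tree sees a resample drawn with replacement from the same training set, which implicitly assumes all rows come from one underlying distribution.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
# Sample n row indices with replacement: one tree's bootstrap sample.
indices = rng.integers(0, n, size=n)
in_bag = np.unique(indices)
# Roughly 1/e ~ 36.8% of rows are left "out of bag" for that tree.
oob_fraction = 1 - in_bag.size / n
```

If the training rows were drawn from shifting distributions, these out-of-bag rows would no longer be a fair stand-in for unseen data.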
D. Decision Trees Assume Split Criteria are Meaningful#
RF splits nodes using measures like:
- Gini Impurity or Entropy (classification)
- Variance reduction / MSE (regression)
This assumes that splitting features can actually reduce impurity.
If all features are weak or unrelated, RF won’t perform well.
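A worked example of the Gini criterion in plain Python: a split is only useful if it lowers the weighted impurity relative to the parent node.

```python
def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

parent = ["a"] * 5 + ["b"] * 5           # 50/50 mix -> impurity 0.5
left, right = ["a"] * 5, ["b"] * 5       # perfect split -> impurity 0 each

weighted_child = 0.5 * gini(left) + 0.5 * gini(right)
impurity_drop = gini(parent) - weighted_child  # 0.5, a maximally useful split
```

When every candidate feature yields an impurity drop near zero, the trees degenerate into near-random partitions, which is exactly the failure mode described above.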
3. What Random Forest Does NOT Assume#
- No linearity: can capture nonlinear patterns
- No normality: features or target do not need to be normally distributed
- No homoscedasticity: variance of errors can vary
- No feature scaling required: trees are scale-invariant
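The scale-invariance point can be demonstrated directly (a sketch, assuming scikit-learn): multiplying the features by a constant changes the split thresholds but not which samples fall on each side, so predictions are unchanged.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 3))
y = (X[:, 0] > 0).astype(int)

# Same random_state -> same bootstraps and feature subsets in both fits;
# only the feature scale differs between the two models.
clf_raw = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
clf_scaled = RandomForestClassifier(n_estimators=50, random_state=0).fit(X * 1000.0, y)

same = np.array_equal(clf_raw.predict(X), clf_scaled.predict(X * 1000.0))
```

Contrast this with distance-based models (k-NN, SVM with RBF kernel), where rescaling one feature can change every prediction.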
4. Practical Notes#
RF is robust to outliers, missing values (some implementations), and feature correlations.
Correlated features reduce the diversity among trees, slightly decreasing the benefit of ensembling.
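A hedged illustration of the correlation caveat (assuming scikit-learn): duplicating a feature splits its importance between the two copies, so importances of correlated features should be read jointly, not individually.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
x = rng.normal(size=(500, 1))
# Columns 0 and 1 are identical copies of the signal; column 2 is noise.
X = np.hstack([x, x, rng.normal(size=(500, 1))])
y = (x[:, 0] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = clf.feature_importances_
shared = imp[0] + imp[1]   # the signal's importance is split across the copies
```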
✅ Summary#
| Aspect | Assumption? | Notes |
|---|---|---|
| Feature-target relationship | No (non-parametric) | Can capture nonlinear patterns |
| Observation independence | Yes | Samples should be independent |
| Feature informativeness | Yes (some features must help) | Random feature selection mitigates irrelevant features |
| Data representativeness | Yes | Training data should reflect population |
| Scaling / normality | No | RF is scale-invariant and distribution-free |