Assumptions#

1. Recursive Partitioning Can Capture the True Pattern#

  • The algorithm assumes the data can be separated into subgroups that are relatively homogeneous in terms of the target class.

  • Example: splitting on “Weather = Sunny” should meaningfully separate classes like “Play Tennis” vs “Don’t Play”.


2. Features Have Predictive Power#

  • At least some features must carry information about the target.

  • Otherwise, splits won’t reduce impurity, and the tree won’t learn meaningful patterns.
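To make "splits reduce impurity" concrete, here is a small Gini-based sketch (the rows and feature names are made up): an informative feature produces a large impurity drop, while a pure-noise feature produces none.

```python
def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Invented toy rows: (color, coin_flip, target).
# `color` fully determines the target; `coin_flip` is pure noise.
rows = [("red", "H", "A"), ("red", "T", "A"),
        ("blue", "H", "B"), ("blue", "T", "B")]

def impurity_drop(rows, feat, value):
    """Parent impurity minus the weighted impurity of the two children."""
    labels = [r[2] for r in rows]
    left = [r[2] for r in rows if r[feat] == value]
    right = [r[2] for r in rows if r[feat] != value]
    n = len(rows)
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(labels) - weighted

print(impurity_drop(rows, 0, "red"))  # 0.5: informative feature, children pure
print(impurity_drop(rows, 1, "H"))    # 0.0: noise feature, no reduction
```

If every feature behaved like `coin_flip`, no split would reduce impurity and the tree would have nothing to learn.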


3. No Linear/Distributional Assumptions#

  • Unlike regression models, decision trees don’t assume linearity between features and target.

  • They don’t assume normality of features or equal variance across classes.

  • ✅ This makes them non-parametric and flexible.
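This also means feature scaling is irrelevant: splits depend only on the *ordering* of a feature's values, so any monotone rescaling (multiplying by 1000, taking logs) leaves the chosen split unchanged. A small sketch with invented data:

```python
import math

def gini(labels):
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split_indices(xs, ys):
    """Greedy best binary threshold split; returns the indices sent left."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best_score, best_left = None, None
    for k in range(1, len(xs)):
        left = [ys[i] for i in order[:k]]
        right = [ys[i] for i in order[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        if best_score is None or score < best_score:
            best_score, best_left = score, set(order[:k])
    return best_left

xs = [1, 2, 3, 10, 11, 12]   # invented feature values
ys = [0, 0, 0, 1, 1, 1]      # invented labels

raw = best_split_indices(xs, ys)
logged = best_split_indices([math.log(x) for x in xs], ys)
scaled = best_split_indices([1000 * x for x in xs], ys)
print(raw == logged == scaled)  # True: the same samples go left either way
```

A linear model would need the features standardized or transformed; the tree is indifferent because only rank order matters.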


4. Features Are Independent for Splitting#

  • At each split, the algorithm greedily evaluates features one at a time and chooses the “best” one.

  • It does not assume feature independence globally, but each split considers a single feature in isolation, so feature interactions are captured only if they emerge through deeper splits.
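The XOR pattern is the classic illustration of this point: neither feature reduces impurity on its own, yet the interaction becomes separable one level deeper. A sketch in plain Python:

```python
def gini(labels):
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# XOR-style data: the label depends on the *interaction* of x1 and x2.
# Rows are (x1, x2, label).
data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
labels = [row[2] for row in data]

def split_gini(rows, feat):
    """Weighted child impurity after splitting on one feature alone."""
    left = [r[2] for r in rows if r[feat] == 0]
    right = [r[2] for r in rows if r[feat] == 1]
    n = len(rows)
    return (len(left) * gini(left) + len(right) * gini(right)) / n

print(gini(labels))          # 0.5 before any split
print(split_gini(data, 0))   # 0.5 -- splitting on x1 alone gains nothing
print(split_gini(data, 1))   # 0.5 -- same for x2

# But split on x1 first, then on x2 inside a child: the leaves become pure.
left_child = [r for r in data if r[0] == 0]
print(split_gini(left_child, 1))  # 0.0 -- the interaction surfaces deeper
```

This is also why greedy trees can miss interaction effects entirely when no single split shows any gain; in practice, ties are broken arbitrarily and deeper growth often recovers them.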


5. Sufficient Data for Each Split#

  • Assumes there’s enough data in each node to compute reliable impurity measures (Gini/Entropy).

  • Small datasets can make trees unstable (high variance).
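A quick simulation (pure Python, invented parameters) shows why: the empirical Gini of a node holding only a handful of samples is a far noisier estimate than that of a well-populated node.

```python
import random

def gini(labels):
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

random.seed(0)  # fixed seed so the simulation is repeatable

def gini_estimate_variance(n, p=0.5, trials=500):
    """Variance of the empirical Gini of a node holding n samples,
    drawn from a population with true class probability p."""
    vals = []
    for _ in range(trials):
        labels = [1 if random.random() < p else 0 for _ in range(n)]
        vals.append(gini(labels))
    mean = sum(vals) / trials
    return sum((v - mean) ** 2 for v in vals) / trials

small_node = gini_estimate_variance(5)    # tiny node
large_node = gini_estimate_variance(500)  # well-populated node
print(small_node > large_node)  # True: small nodes give noisy impurity
```

This is the intuition behind stopping criteria such as a minimum number of samples per leaf: they prevent the tree from trusting impurity estimates computed on a few points.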


6. Target Variable is Well-Defined#

  • Assumes that the target classes are mutually exclusive and exhaustive.

  • Example: A loan application is either “Approved” or “Rejected”, not both.


Summary#

  • What Trees DON’T assume: linearity, normality, equal variance, feature scaling.

  • What Trees DO assume:

    • Recursive partitioning can separate data meaningfully.

    • Some features are predictive.

    • Enough samples exist per node to make good splits.


👉 This small set of assumptions is why Decision Trees work well in practice, especially when extended into ensembles (Random Forests, Gradient Boosting).