Assumptions
Data is (approximately) separable
SVM assumes that the classes can be separated by a hyperplane, either in the original feature space (linear SVM) or in a transformed space (non-linear SVM with kernels).
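As a minimal sketch of this assumption (using scikit-learn and synthetic toy data, neither of which is named in the text): when two classes are cleanly separable, a linear SVM finds a hyperplane that classifies every training point correctly.

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated Gaussian clusters (hypothetical toy data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

# A linear SVM separates these perfectly because a hyperplane exists
clf = SVC(kernel="linear").fit(X, y)
print(clf.score(X, y))  # separable data -> perfect training accuracy
```

If no such hyperplane exists in the original space, a linear SVM can only trade off errors, which is what motivates the kernel assumption below.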
Large-margin principle
Assumes that the best decision boundary is the one that maximizes the margin between the classes.
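The large-margin principle corresponds to the standard hard-margin optimization problem: for labels \(y_i \in \{-1, +1\}\), the margin width is \(2 / \lVert \mathbf{w} \rVert\), so maximizing the margin is equivalent to

```latex
\min_{\mathbf{w},\, b} \;\; \frac{1}{2}\lVert \mathbf{w} \rVert^2
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^\top \mathbf{x}_i + b\right) \ge 1, \qquad i = 1, \dots, n
```

Minimizing \(\lVert \mathbf{w} \rVert\) widens the margin; the constraints force every training point onto the correct side of it.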
Kernel appropriateness
If the data is not linearly separable, SVM assumes the chosen kernel (RBF, polynomial, etc.) maps it into a feature space where linear separation is possible.
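A hedged illustration of why kernel choice matters (scikit-learn and the concentric-circles toy dataset are assumptions, not from the text): no line separates two concentric rings in 2-D, but an RBF kernel implicitly maps the points into a space where they become separable.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric rings: no hyperplane in the original 2-D space separates them
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)
print(linear.score(X, y), rbf.score(X, y))  # RBF scores far higher here
```

If the kernel does not match the data's structure (e.g., a polynomial kernel of the wrong degree), the implicit feature space may still not admit a good separating hyperplane.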
Independent and identically distributed (i.i.d.) data
Training and test samples come from the same distribution and are independent.
Balanced scaling of features
Assumes input features are normalized/scaled, since SVM relies on distances between samples and features with larger numeric ranges would otherwise dominate the margin.
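A short sketch of the scaling assumption (scikit-learn, a synthetic dataset, and the specific scale factor are all illustrative assumptions): one feature on a wildly different scale dominates the RBF distance computation, and standardizing the features first typically restores performance.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# shuffle=False keeps the 2 informative columns first; column 4 is pure noise
X, y = make_classification(n_samples=200, n_features=5, shuffle=False,
                           random_state=0)
X[:, 4] *= 1000.0  # blow up the scale of the noise feature

# Unscaled: the huge-scale noise feature dominates the RBF distances
unscaled = SVC(kernel="rbf").fit(X, y).score(X, y)
# Scaled: standardize features before the SVM sees them
scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y).score(X, y)
print(unscaled, scaled)  # scaling typically recovers accuracy
```

Putting the scaler inside a pipeline also prevents test-set statistics from leaking into the fit.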
Limited noise and outliers
Assumes the data is not heavily noisy, since outliers near the margin can pull the decision boundary; the soft-margin formulation tolerates some violations, but only up to a point.
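The noise sensitivity above can be sketched via the soft-margin regularization parameter C (scikit-learn and the toy blobs are assumptions, not from the text): a small C tolerates margin violations and keeps many points as support vectors, while a large C tries to classify every point, letting individual noisy points dictate the boundary.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two mildly overlapping clusters (hypothetical noisy data)
X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=0)

# Small C: wide, tolerant margin; large C: hard-margin-like, outlier-sensitive
tolerant = SVC(kernel="linear", C=0.1).fit(X, y)
strict = SVC(kernel="linear", C=1000.0).fit(X, y)
print(len(tolerant.support_), len(strict.support_))
```

The tolerant model keeps at least as many support vectors as the strict one, reflecting a boundary averaged over many points rather than pinned to a few extreme ones.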