Cost Function#

1. Hard-margin SVM (ideal separable case)#

  • Goal: maximize the margin

\[ \max_{w,b} \frac{2}{\|w\|} \]
  • Equivalent to minimizing (maximizing \(\frac{2}{\|w\|}\) is the same as minimizing \(\|w\|\), and squaring gives a smooth, convex quadratic objective)

\[ \min_{w,b} \frac{1}{2}\|w\|^2 \]
  • Constraints:

    \[ y_i (w^T x_i + b) \geq 1, \quad \forall i \]

    This means every training point lies on the correct side of the hyperplane, on or outside the margin boundary.
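
To make this concrete, here is a minimal sketch (the toy data and variable names are illustrative assumptions, not from the text) that approximates the hard-margin solution with scikit-learn's `SVC` by using a very large \(C\), then reads off the margin width \(2/\|w\|\):

```python
import numpy as np
from sklearn.svm import SVC

# Linearly separable toy data: two well-separated blobs, labels in {-1, +1}
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 0.5, (20, 2)), rng.normal(3, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

# A very large C leaves essentially no tolerance for violations,
# so the soft-margin solver behaves like a hard-margin SVM.
clf = SVC(kernel="linear", C=1e10).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("margin width 2/||w|| =", 2 / np.linalg.norm(w))
print("min y_i (w^T x_i + b) =", (y * (X @ w + b)).min())  # ~1 for support vectors
```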


2. Soft-margin SVM (real-world case, with overlap)#

Real data are noisy and rarely perfectly separable, so we relax the constraints with slack variables \(\xi_i \geq 0\), one per training point (see the sketch after this list):

\[ y_i (w^T x_i + b) \geq 1 - \xi_i \]
  • If \(\xi_i = 0\): correctly classified, on or outside the margin.

  • If \(0 < \xi_i < 1\): correctly classified, but inside the margin.

  • If \(\xi_i > 1\): misclassified.
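
Most solvers do not return the slack values directly, but they can be recovered from a fitted model as \(\xi_i = \max(0, 1 - y_i (w^T x_i + b))\). A minimal sketch, again with illustrative toy data:

```python
import numpy as np
from sklearn.svm import SVC

# Overlapping toy data: two blobs that are not linearly separable
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# xi_i = max(0, 1 - y_i * f(x_i)), where f is the decision function w^T x + b
xi = np.maximum(0.0, 1.0 - y * clf.decision_function(X))

print("xi == 0   (on/outside margin):", np.sum(xi == 0))
print("0 < xi < 1 (inside margin):   ", np.sum((xi > 0) & (xi < 1)))
print("xi > 1    (misclassified):    ", np.sum(xi > 1))
```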


3. Cost function for soft margin#

\[ \min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^n \xi_i \]
  • First term: keeps the margin large.

  • Second term: penalizes violations (misclassified or margin-crossing points).

  • \(C\): hyperparameter that controls the tradeoff:

    • Large \(C\): less tolerance for violations → narrower margin.

    • Small \(C\): more tolerance → wider margin, often better generalization (the tradeoff is illustrated in the sketch below).
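
A minimal sketch of this tradeoff (illustrative data and settings): refit the same data with several values of \(C\) and report the margin width and the number of margin violations.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]
    xi = np.maximum(0.0, 1.0 - y * clf.decision_function(X))
    print(f"C={C:>6}: margin width = {2 / np.linalg.norm(w):.3f}, "
          f"violations (xi > 0) = {np.sum(xi > 0)}")
```

As \(C\) shrinks, the margin widens and more points are allowed to violate it; as \(C\) grows, the solver trades margin width for fewer violations.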


4. Hinge loss interpretation#

The penalty for each point is given by the hinge loss:

\[ L_i = \max(0, 1 - y_i (w^T x_i + b)) \]
  • If correctly classified with margin ≥ 1 → loss = 0.

  • If close to boundary or misclassified → positive loss.

So the full objective becomes:

\[ \min_{w,b} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^n \max(0, 1 - y_i (w^T x_i + b)) \]
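
Because this form is unconstrained, it can be minimized directly. A minimal subgradient-descent sketch (the function names, learning rate, and toy data are illustrative assumptions, not a reference implementation):

```python
import numpy as np

def svm_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum of hinge losses."""
    hinge = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * w @ w + C * hinge.sum()

def train_svm_subgradient(X, y, C=1.0, lr=0.001, epochs=2000):
    """Minimize the soft-margin objective by (sub)gradient descent.

    Points with y_i (w^T x_i + b) < 1 contribute -C * y_i * x_i to the
    subgradient w.r.t. w (and -C * y_i w.r.t. b); all others contribute 0.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                       # points violating the margin
        grad_w = w - C * (y[active] @ X[active])   # regularizer + hinge terms
        grad_b = -C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data: two Gaussian blobs, labels in {-1, +1}
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

w, b = train_svm_subgradient(X, y, C=1.0)
print("objective:", svm_objective(w, b, X, y, C=1.0))
```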

5. Summary#

  • Hard margin → perfect separation, no errors.

  • Soft margin → allows errors with penalty, controlled by \(C\).

  • Slack variables \(\xi_i\) measure violations.

  • Hinge loss is the per-point error function in the soft-margin SVM (SVC) objective.