## Workflows
**Data Preparation**
- Collect labeled data \((x_i, y_i)\) with \(y_i \in \{-1, +1\}\) for binary classification.
- Scale or normalize features, since the SVM objective depends on distances between points and is sensitive to feature scales.
- Handle class imbalance (e.g., stratified sampling, class weights).
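The class-weight option can be sketched as follows. This is a minimal illustration on synthetic data (the dataset and parameters here are placeholders, not the ones used in the demonstration below): `class_weight="balanced"` rescales each class's penalty inversely to its frequency, so minority-class errors cost more during training.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.svm import SVC

# Imbalanced toy data: roughly 90% class 0, 10% class 1.
X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)

# "balanced" sets each class's effective C inversely proportional to its
# frequency, which typically improves recall on the minority class.
svc = SVC(kernel="rbf", class_weight="balanced", random_state=0)
svc.fit(X, y)

minority_recall = recall_score(y, svc.predict(X))
print(minority_recall)
```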
**Model Selection**
- Choose a linear SVM if the data is (almost) linearly separable.
- Use a kernel SVM (RBF, polynomial) for non-linear patterns.
- Set the margin's softness via C: a large C approximates a hard margin (few violations tolerated), while a small C gives a softer margin that absorbs noise.
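The effect of C can be sketched on an (almost) linearly separable toy problem: a small C buys a wide, soft margin, so many points fall inside it and become support vectors; a large C approximates a hard margin with few support vectors. The data here is illustrative.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated blobs: an (almost) linearly separable toy problem.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.0, random_state=0)

# Count support vectors at a very soft margin (C=0.01) and a near-hard
# margin (C=100): the soft margin should recruit at least as many.
n_sv = {C: len(SVC(kernel="linear", C=C).fit(X, y).support_)
        for C in (0.01, 100.0)}
print(n_sv)
```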
**Training (Optimization)**
- Solve the convex optimization problem
\[ \min_{w,b,\xi} \; \frac{1}{2}\|w\|^2 + C\sum_i \xi_i \quad \text{subject to } y_i(w^\top x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0. \]
- Identify the support vectors: the points that lie on the margin or violate it.
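The support-vector property above can be checked numerically: by the KKT conditions, every support vector has margin \(y_i f(x_i) \le 1\) (on the margin or violating it), while non-support points have margin \(\ge 1\). A minimal sketch on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
svc = SVC(kernel="linear", C=1.0).fit(X, y)

# Signed margins y_i * f(x_i) for every training point.
y_signed = np.where(y == 1, 1, -1)
margins = y_signed * svc.decision_function(X)

sv_margins = margins[svc.support_]              # support vectors
rest_margins = np.delete(margins, svc.support_) # non-support points

# Support vectors sit on the margin (= 1) or inside/beyond it (< 1);
# all other points lie strictly outside (>= 1), up to solver tolerance.
print(sv_margins.max(), rest_margins.min())
```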
**Prediction**
- Compute the decision function
\[ f(x) = \operatorname{sign}\Big(\sum_{i} \alpha_i y_i K(x_i, x) + b\Big) \]
- Predict the class from the sign of the decision value.
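The decision function can be reconstructed by hand from a fitted model: scikit-learn's `SVC` exposes \(\alpha_i y_i\) as `dual_coef_`, the support vectors as `support_vectors_`, and \(b\) as `intercept_`. A short sketch verifying the formula against `decision_function`:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
svc = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X, y)

# dual_coef_ holds alpha_i * y_i for each support vector, so the sum
# in f(x) only runs over the support vectors.
K = rbf_kernel(X, svc.support_vectors_, gamma=0.1)
manual = K @ svc.dual_coef_.ravel() + svc.intercept_[0]

print(np.allclose(manual, svc.decision_function(X)))  # True
```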
**Model Evaluation**
- Use cross-validation to estimate generalization performance.
- Metrics: accuracy, precision, recall, F1, or ROC-AUC, depending on the task.
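Cross-validation with several of these metrics at once can be sketched with `cross_validate` (illustrative data; the pipeline keeps scaling inside each fold so no test-fold statistics leak into training):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# Scaler + SVM in one pipeline: the scaler is refit on each training fold.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_validate(model, X, y, cv=5,
                        scoring=["accuracy", "f1", "roc_auc"])

# Mean of each requested metric across the 5 folds.
print({k: v.mean() for k, v in scores.items() if k.startswith("test_")})
```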
Hyperparameter Tuning
Parameters:
C: margin vs. misclassification trade-off.
γ (gamma): influence of single points in RBF.
degree: for polynomial kernels.
Tune via GridSearchCV or RandomizedSearchCV.
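The demonstration below uses GridSearchCV; the RandomizedSearchCV alternative can be sketched as follows (illustrative data and search ranges). Sampling C and gamma from log-uniform distributions covers several orders of magnitude without enumerating a grid.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# Log-uniform priors: each decade of C and gamma is equally likely.
param_dist = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-3, 1e1)}
search = RandomizedSearchCV(SVC(kernel="rbf"), param_dist,
                            n_iter=20, cv=3, random_state=0)
search.fit(X, y)
print(search.best_params_)
```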
**Deployment**
- Use the trained model to classify unseen data.
- Re-train periodically if the data distribution shifts.
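One way to ship a trained model is to persist the scaler and the SVM together, so the exact training-time preprocessing travels with it. A minimal sketch using joblib (the filename is illustrative):

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

# Bundle scaler + SVM so deployment cannot skip or mismatch the scaling.
model = make_pipeline(StandardScaler(), SVC()).fit(X, y)

path = os.path.join(tempfile.mkdtemp(), "svm_model.joblib")
joblib.dump(model, path)
restored = joblib.load(path)

# The restored pipeline reproduces the original predictions exactly.
print((restored.predict(X) == model.predict(X)).all())  # True
```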
## Demonstration
```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# 1. Data preparation: mildly imbalanced binary classification data
X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, n_classes=2, weights=[0.6, 0.4],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Feature scaling (important for SVM)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 2. Baseline model (linear kernel)
svm_linear = SVC(kernel="linear", C=1, random_state=42)
svm_linear.fit(X_train_scaled, y_train)
y_pred_baseline = svm_linear.predict(X_test_scaled)
baseline_acc = accuracy_score(y_test, y_pred_baseline)

# 3. Hyperparameter tuning with RBF kernel
param_grid = {
    "C": [0.1, 1, 10],
    "gamma": [0.01, 0.1, 1],
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
grid = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=cv,
                    scoring="accuracy", n_jobs=-1)
grid.fit(X_train_scaled, y_train)
best_svm = grid.best_estimator_
y_pred_best = best_svm.predict(X_test_scaled)
best_acc = accuracy_score(y_test, y_pred_best)

# 4. Evaluation results
baseline_report = classification_report(y_test, y_pred_baseline)
best_report = classification_report(y_test, y_pred_best)
cm_baseline = confusion_matrix(y_test, y_pred_baseline)
cm_best = confusion_matrix(y_test, y_pred_best)

results_df = pd.DataFrame({
    "Model": ["Linear SVM", "Best RBF SVM"],
    "Accuracy": [baseline_acc, best_acc],
    "Best Params": [None, grid.best_params_],
})

print("Baseline Linear SVM Classification Report:\n", baseline_report)
print("Best RBF SVM Classification Report:\n", best_report)
print("Confusion Matrix (Baseline):\n", cm_baseline)
print("Confusion Matrix (Best RBF):\n", cm_best)
results_df.head()
```
```
Baseline Linear SVM Classification Report:
               precision    recall  f1-score   support

           0       0.87      0.94      0.90        89
           1       0.91      0.79      0.84        61

    accuracy                           0.88       150
   macro avg       0.89      0.87      0.87       150
weighted avg       0.88      0.88      0.88       150

Best RBF SVM Classification Report:
               precision    recall  f1-score   support

           0       0.92      0.99      0.95        89
           1       0.98      0.87      0.92        61

    accuracy                           0.94       150
   macro avg       0.95      0.93      0.94       150
weighted avg       0.94      0.94      0.94       150

Confusion Matrix (Baseline):
 [[84  5]
  [13 48]]
Confusion Matrix (Best RBF):
 [[88  1]
  [ 8 53]]
```
| | Model | Accuracy | Best Params |
|---|---|---|---|
| 0 | Linear SVM | 0.88 | None |
| 1 | Best RBF SVM | 0.94 | {'C': 10, 'gamma': 0.1} |
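The matplotlib import in the demonstration goes otherwise unused; a decision-boundary plot is the natural use for it. The sketch below trains an RBF SVM on a two-feature version of the synthetic data (so the surface can be drawn directly; the C and gamma values are illustrative, not the tuned ones) and draws the boundary plus the \(\pm 1\) margin contours.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs headless
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Two informative features so the decision surface is directly plottable.
X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
svc = SVC(kernel="rbf", C=10, gamma=0.1).fit(X, y)

# Evaluate the decision function on a grid covering the data.
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
Z = svc.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, Z > 0, alpha=0.3)        # predicted class regions
plt.contour(xx, yy, Z, levels=[-1, 0, 1],     # margins (dashed) and boundary
            colors="k", linestyles=["--", "-", "--"])
plt.scatter(X[:, 0], X[:, 1], c=y, s=15)
plt.savefig("svm_boundary.png")
```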