Elastic Net Regression#
Elastic Net Regression is a regularized linear regression technique that combines the penalties of:
Lasso (L1) → drives some coefficients to zero (feature selection).
Ridge (L2) → shrinks coefficients (handles multicollinearity).
It is especially useful when you have many correlated features.
Elastic Net Loss Function#
For a linear regression model:

$$\hat{y}_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}$$

the Elastic Net objective function is:

$$L(\beta) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \left( \alpha \sum_{j=1}^{p} |\beta_j| + (1 - \alpha) \sum_{j=1}^{p} \beta_j^2 \right)$$

Where:
\(y_i\) → actual value
\(\hat{y}_i\) → predicted value
\(\beta_j\) → regression coefficients
\(\lambda\) → overall regularization strength
\(\alpha\) → mixing parameter between L1 and L2
If \(\alpha = 1\): becomes Lasso
If \(\alpha = 0\): becomes Ridge
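The penalty term above can be sketched in a few lines of NumPy. The function name `elastic_net_penalty` is illustrative, not from any library; note that scikit-learn reverses the naming, using `alpha` for the overall strength (\(\lambda\) here) and `l1_ratio` for the mixing parameter (\(\alpha\) here).

```python
import numpy as np

def elastic_net_penalty(beta, lam, alpha):
    """Elastic Net penalty: lam * (alpha * L1 + (1 - alpha) * L2)."""
    l1 = np.sum(np.abs(beta))       # sum of |beta_j|
    l2 = np.sum(beta ** 2)          # sum of beta_j^2
    return lam * (alpha * l1 + (1 - alpha) * l2)

beta = np.array([2.0, -1.0, 0.5])
print(elastic_net_penalty(beta, lam=0.1, alpha=1.0))  # pure L1: 0.1 * 3.5  = 0.35
print(elastic_net_penalty(beta, lam=0.1, alpha=0.0))  # pure L2: 0.1 * 5.25 = 0.525
```

Setting `alpha=1.0` recovers the Lasso penalty and `alpha=0.0` the Ridge penalty, matching the two limiting cases above.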
Why Use Elastic Net?#
Lasso issue: If features are highly correlated, it tends to pick one and ignore others → unstable.
Ridge issue: Keeps all features but doesn’t perform feature selection.
Elastic Net: Combines both strengths →
✅ Keeps the model stable with correlated features.
✅ Performs feature selection.
Example Intuition#
Suppose you’re predicting house price using:
`square_feet`, `bedrooms`, `bathrooms`, and `location_score`.
Since square_feet and bedrooms are highly correlated:
Lasso may drop `bedrooms` entirely.
Ridge will keep both but shrink their weights.
Elastic Net → keeps both but controls their weights → better balance.
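This intuition can be made concrete with a runnable scikit-learn sketch. The synthetic data below (sample sizes, coefficients, noise levels) is an assumption for illustration, with `square_feet` and `bedrooms` correlated by construction:

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 200

# bedrooms is derived from square_feet, so the two are highly correlated
square_feet = rng.normal(1500, 400, n)
bedrooms = square_feet / 500 + rng.normal(0, 0.3, n)
bathrooms = rng.normal(2, 0.5, n)
location_score = rng.normal(7, 2, n)

X = StandardScaler().fit_transform(
    np.column_stack([square_feet, bedrooms, bathrooms, location_score])
)
price = (
    100 * square_feet + 20_000 * bedrooms
    + 15_000 * bathrooms + 30_000 * location_score
    + rng.normal(0, 10_000, n)
)

# scikit-learn naming: alpha = overall strength (lambda in the formula),
# l1_ratio = L1/L2 mixing parameter (alpha in the formula)
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, price)
for name, coef in zip(
    ["square_feet", "bedrooms", "bathrooms", "location_score"], model.coef_
):
    print(f"{name}: {coef:.1f}")
```

Both correlated features typically receive nonzero weight, rather than one being arbitrarily zeroed out as Lasso tends to do.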
Pros & Cons#
✅ Handles multicollinearity
✅ Performs feature selection
✅ Works well when \(p > n\) (more features than samples)
❌ Needs tuning of two hyperparameters (\(\lambda\), \(\alpha\))
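The two-hyperparameter tuning burden is usually handled by cross-validation. A minimal sketch with scikit-learn's `ElasticNetCV` on a synthetic \(p > n\) problem (the grid values and data sizes are assumptions for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV

# p > n setting: 50 samples, 100 features, only 10 truly informative
X, y = make_regression(n_samples=50, n_features=100, n_informative=10,
                       noise=5.0, random_state=0)

# ElasticNetCV picks alpha (overall strength) automatically and searches
# the supplied l1_ratio grid (the mixing parameter) by cross-validation
model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9, 1.0], cv=5,
                     random_state=0).fit(X, y)

print("best l1_ratio:", model.l1_ratio_)
print("best alpha:", round(model.alpha_, 4))
print("nonzero coefficients:", int(np.sum(model.coef_ != 0)), "of", X.shape[1])
```

The sparsity of the fitted coefficients shows feature selection at work even with more features than samples.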