Polynomial and Interaction Terms: Linear Regression
🎯 Core Idea
Polynomial and interaction terms are feature engineering techniques that allow linear regression to model non-linear relationships and feature interactions. While they increase flexibility, they can also introduce risks like overfitting and multicollinearity, requiring careful use of regularization and validation.
🌱 Intuition & Real-World Analogy
- Polynomial Terms: Imagine trying to fit a straight stick (linear regression) onto a winding road. The stick won’t align well. By allowing the stick to bend (quadratic, cubic terms), you can follow the road’s curves.
- Interaction Terms: Think of making a cake. Flour and sugar individually matter, but when combined, they produce effects you can’t explain by looking at each alone. Interaction terms capture this “combined effect” of features.
In short:
- Polynomials = bendy stick to capture curves.
- Interactions = mixing ingredients to capture combined effects.
📐 Mathematical Foundation
- Polynomial Terms: Extend features by powers of themselves:
$$ y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \dots + \beta_d x^d + \epsilon $$
- $x^d$: higher-order polynomial features.
- $d$: degree of the polynomial.
- As $d$ increases, the curve becomes more flexible but also more unstable (see the code sketch after this list).
- Interaction Terms: Capture combined effects of features:
$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 (x_1 \cdot x_2) + \epsilon $$
- $x_1 \cdot x_2$: interaction feature.
- Extensible to higher-order interactions, e.g., $x_1^2 \cdot x_2$.
- Assumptions:
- The model is still linear in the parameters (linear in $\beta$); only the features are transformed.
- The error term $\epsilon$ must satisfy the standard regression assumptions (independence, homoscedasticity, etc.).
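A minimal sketch of both expansions, assuming scikit-learn (≥ 1.0 for `get_feature_names_out`) and NumPy are available; the data and coefficients are synthetic and purely illustrative:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Synthetic data: y depends on x1 non-linearly and on an x1*x2 interaction
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 2))
y = (1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 0] ** 2 + 3.0 * X[:, 0] * X[:, 1]
     + rng.normal(scale=0.5, size=200))

# degree=2 generates x1, x2, x1^2, x1*x2, x2^2 (the bias is handled by LinearRegression)
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

# Still ordinary linear regression, just fit on the expanded feature matrix
model = LinearRegression().fit(X_poly, y)
print(dict(zip(poly.get_feature_names_out(["x1", "x2"]), model.coef_.round(2))))
```

Setting `interaction_only=True` in `PolynomialFeatures` keeps terms like $x_1 \cdot x_2$ while dropping the pure powers, which is handy when only interactions are of interest.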
Deep Dive: Why This Is Still Linear Regression
Polynomial and interaction features curve the fitted surface, but the model remains linear regression because it is linear in the parameters: $y$ is a weighted sum of (transformed) features, and each $\beta$ enters only linearly. Since the non-linearity lives entirely in the feature map, ordinary least squares and the usual inference machinery still apply unchanged.
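To make this concrete (notation added here, not in the original notes): collect the transformed features into a design matrix $\Phi$ whose rows are, e.g., $[1, x, x^2, \dots, x^d]$ or $[1, x_1, x_2, x_1 x_2]$. The fit is then plain ordinary least squares on $\Phi$:
$$ \hat{\beta} = \arg\min_{\beta} \lVert y - \Phi \beta \rVert_2^2 = (\Phi^\top \Phi)^{-1} \Phi^\top y $$
provided $\Phi^\top \Phi$ is invertible, which is exactly what highly collinear polynomial columns can undermine.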
⚖️ Strengths, Limitations & Trade-offs
✅ Strengths
- Can model non-linear relationships without switching to more complex model classes.
- Interaction terms capture dependencies between features.
- Still interpretable compared to black-box models.
❌ Limitations
- Higher-degree polynomials lead to overfitting.
- Multicollinearity is common (e.g., $x, x^2, x^3$ are highly correlated).
- Model interpretability decreases with higher-order terms.
- Extrapolation outside training data becomes unstable.
⚖️ Trade-offs
- Bias vs Variance: Low-degree → high bias; high-degree → high variance.
- Regularization (Ridge/Lasso) is often required to keep high-degree fits stable; see the sketch below.
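A minimal sketch of pairing a deliberately high-degree expansion with Ridge, assuming scikit-learn; `degree=10` and `alpha=1.0` are illustrative choices, not recommendations:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(x).ravel() + rng.normal(scale=0.3, size=300)

# Scale after the expansion so the penalty treats x, x^2, ..., x^10 comparably
model = make_pipeline(
    PolynomialFeatures(degree=10, include_bias=False),
    StandardScaler(),
    Ridge(alpha=1.0),
)
print(cross_val_score(model, x, y, cv=5, scoring="r2").mean())
```

Tuning `alpha` (e.g., with `RidgeCV` or a grid search) is what actually manages the bias vs variance trade-off here.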
🔍 Variants & Extensions
- Polynomial Regression: Standard regression with polynomial features.
- Orthogonal Polynomials: Reduce multicollinearity by using basis transformations (illustrated after this list).
- Spline Regression: Piecewise polynomials with continuity constraints (more stable).
- Generalized Additive Models (GAMs): Extend polynomials with smooth non-linear functions.
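A quick numerical illustration of the orthogonal-basis idea, assuming NumPy; Legendre polynomials (orthogonal on $[-1, 1]$) stand in for whichever basis you choose:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, size=500)

# Raw powers: columns are almost perfectly correlated
raw = np.column_stack([x, x**2, x**3])
print(np.corrcoef(raw, rowvar=False).round(2))

# Legendre basis on x rescaled to [-1, 1]: near-zero sample correlations
x_scaled = 2 * x - 1
ortho = np.polynomial.legendre.legvander(x_scaled, 3)[:, 1:]  # drop the constant column
print(np.corrcoef(ortho, rowvar=False).round(2))
```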
🚧 Common Challenges & Pitfalls
- Overfitting: Candidates may assume higher-degree polynomials always improve the fit; beyond a point they only improve training error while worsening generalization.
- Multicollinearity: Polynomial and interaction terms inflate the VIF (Variance Inflation Factor); see the check after this list.
- Interpretability Trap: $\beta_2$ on $x^2$ is not a standalone “squared effect”; in a quadratic model the marginal effect of $x$ is $\beta_1 + 2\beta_2 x$, so coefficients must be read jointly.
- Extrapolation Risk: Outside the training range, polynomials behave wildly (e.g., a cubic trend diverges to $\pm\infty$).
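To see the VIF inflation concretely, a small check assuming statsmodels is available; exact numbers depend on the data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
x = rng.uniform(1, 10, size=200)

# Design matrix with raw polynomial terms plus an intercept column
X = sm.add_constant(np.column_stack([x, x**2, x**3]))

# VIFs for x, x^2, x^3 (skipping the constant); values far above 10 signal trouble
print([round(variance_inflation_factor(X, i), 1) for i in range(1, X.shape[1])])
```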
📚 Reference Pointers
- The Elements of Statistical Learning (Hastie, Tibshirani, Friedman) – Chapter on basis expansions.
- Wikipedia: Polynomial Regression
- Wikipedia: Interaction (Statistics)
- Koller & Friedman’s PGM text – for deeper assumptions on linearity in parameters.