Underfitting & Overfitting in ML
Machine Learning · Deep Dive

The two fundamental forces every ML model must balance — and why getting it right changes everything.

7 min read · ML Fundamentals
● Overview

Why Models Fail

Machine Learning models have one job: learn patterns from data and make accurate predictions on new, unseen data. Simple in theory — but two silent enemies lurk in every training run.

Tags: High Bias · High Variance · Bias-Variance Tradeoff · Generalization

When a model learns too little, we call it underfitting. When it learns too much — including noise and random quirks — we call it overfitting. The art is finding the sweet spot between the two.

● Underfitting

When the Model Knows Too Little

Underfitting occurs when a model is too simple to capture the real patterns in the data. It performs poorly on both training and testing data — it hasn't learned enough to be useful.

💡 Real-world example

Predicting Temperature Over the Day

Temperature rises in the morning, peaks in the afternoon, then falls — a curve. But if your model forces a straight line, it can never capture that rise-and-fall. The result? Systematically wrong predictions at every point.

[Figure: temperature (15°–35°) vs. time of day (6AM–6PM), showing the curved actual data against an underfitting straight-line fit]
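The straight-line failure is easy to reproduce. Here is a minimal NumPy sketch using made-up temperature readings (all numbers are illustrative, not real measurements):

```python
import numpy as np

# Hypothetical temperature readings (°C) from 6AM to 6PM: rise, peak, fall
hours = np.array([6.0, 9.0, 12.0, 15.0, 18.0])
temps = np.array([15.0, 22.0, 30.0, 28.0, 20.0])

# Underfit: force a degree-1 (straight-line) model onto curved data
line = np.polyfit(hours, temps, deg=1)
mse_line = np.mean((np.polyval(line, hours) - temps) ** 2)

# A degree-2 model can actually capture the rise-and-fall
quad = np.polyfit(hours, temps, deg=2)
mse_quad = np.mean((np.polyval(quad, hours) - temps) ** 2)

print(f"straight-line training MSE: {mse_line:.2f}")  # poor even on training data
print(f"quadratic training MSE:     {mse_quad:.2f}")  # far smaller
```

Note that the straight line is bad on its *own training data*: that is the signature of underfitting, as opposed to overfitting, which only shows up on unseen data.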

Causes of Underfitting

  • Model too simple
  • High regularization
  • Weak features
  • Not enough training
  • High bias
● Overfitting

When the Model Knows Too Much

Overfitting happens when a model becomes so complex it memorizes the training data — noise and all. It aces training, but collapses on real-world data it hasn't seen before.

💡 Real-world example

Predicting Shop Sales

A complex model tries to match every spike and drop in daily sales data — treating random fluctuations as meaningful patterns. It gets a perfect score on training data, but fails completely to predict next week's sales.

[Figure: sales vs. days, showing actual sales points, a wiggly overfitting curve that chases every point, and the smooth true trend]
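The sales story can be reproduced with a short NumPy sketch (the sales figures are synthetic, generated from an invented trend plus noise):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily sales: a gentle upward trend plus random noise
days = np.arange(8, dtype=float)
trend = lambda d: 100 + 2 * d
sales = trend(days) + rng.normal(0, 5, size=days.size)

# Overfit: a degree-7 polynomial passes (almost) exactly through all 8 points
overfit = np.polyfit(days, sales, deg=7)
train_err = np.mean((np.polyval(overfit, days) - sales) ** 2)

# Honest model: a straight line close to the true trend
simple = np.polyfit(days, sales, deg=1)

# "Next week": evaluate both against the true trend on unseen days
future = np.arange(8, 11, dtype=float)
overfit_err = np.mean((np.polyval(overfit, future) - trend(future)) ** 2)
simple_err = np.mean((np.polyval(simple, future) - trend(future)) ** 2)

print(f"training MSE (overfit): {train_err:.6f}")    # essentially zero
print(f"future MSE (overfit):   {overfit_err:.1f}")  # explodes off the trend
print(f"future MSE (simple):    {simple_err:.1f}")   # stays modest
```

The perfect training score is exactly the trap: the degree-7 curve has memorized the noise, so it swings wildly as soon as it leaves the training range.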

Causes of Overfitting

  • Model too complex
  • Too many features
  • Very little data
  • No regularization
  • High variance
● Bias-Variance Tradeoff

Side by Side

Understanding both problems together reveals the core tension in machine learning — known as the bias-variance tradeoff.

🔴 Underfitting

  • Model too simple
  • High bias
  • Low variance
  • Bad on train & test
  • Misses real patterns

🔵 Overfitting

  • Model too complex
  • Low bias
  • High variance
  • Great on train, bad on test
  • Memorizes noise
[Figure: three bias/variance gauges. Underfitting: high bias. Overfitting: high variance. Perfect fit: both balanced.]

y = ax² + bx + c

A balanced quadratic model: complex enough, simple enough.

This concept is the bias-variance tradeoff: underfitting gives you high bias and low variance, overfitting gives you low bias and high variance. The ideal model balances both.
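One way to see the tradeoff concretely is to sweep model complexity on synthetic data (a noisy quadratic, all values invented) and compare training error against held-out error:

```python
import numpy as np

rng = np.random.default_rng(42)

# Ground truth is quadratic; we only observe noisy samples of it
x = rng.uniform(-1, 1, size=60)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(0, 0.3, size=x.size)

x_tr, y_tr = x[:40], y[:40]   # training split
x_te, y_te = x[40:], y[40:]   # held-out split

def train_test_mse(deg):
    coefs = np.polyfit(x_tr, y_tr, deg)
    return (np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2),
            np.mean((np.polyval(coefs, x_te) - y_te) ** 2))

for deg in (1, 2, 9):
    tr, te = train_test_mse(deg)
    print(f"degree {deg}: train MSE {tr:.3f}, test MSE {te:.3f}")
# Typical pattern: degree 1 is poor on both sets (underfit, high bias),
# degree 2 is good on both (balanced), and degree 9 drives training
# error lower still while test error drifts back up (high variance).
```

Training error only ever falls as complexity grows; held-out error is U-shaped. The bottom of that U is the balance point the tradeoff describes.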

● Solutions

How to Fix Both Problems

Each fix below targets a specific root cause of one of the two failure modes.

🔴 Fix Underfitting

🧠 Use a More Complex Model (Complexity ↑)

Switch to higher-degree polynomials, deeper neural nets, or ensemble methods.

🔬 Add Relevant Features (Features ↑)

Engineer new inputs that give the model richer information to work with.

⏱️ Increase Training Time (Epochs ↑)

Allow more epochs so the model converges on the real underlying pattern.

🎛️ Reduce Regularization (Regularization ↓)

Loosen constraints that are too tight, letting the model learn more freely.
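As a concrete illustration of that last fix, here is a hypothetical ridge-regression sketch on synthetic data: with the penalty cranked far too high the model underfits even its own training data, and loosening it restores the fit. The closed-form solve is an illustrative choice, not a prescribed method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples of a simple linear relationship: y = 3x + 1
x = rng.uniform(-2, 2, size=50)
y = 3 * x + 1 + rng.normal(0, 0.3, size=x.size)
X = np.column_stack([np.ones_like(x), x])  # bias column + feature

def train_mse(alpha):
    # Closed-form ridge: w = (X^T X + alpha*I)^(-1) X^T y
    w = np.linalg.solve(X.T @ X + alpha * np.eye(2), X.T @ y)
    return np.mean((X @ w - y) ** 2)

print(f"alpha=1000: train MSE {train_mse(1000.0):.3f}")  # over-regularized: underfits
print(f"alpha=0.1:  train MSE {train_mse(0.1):.3f}")     # looser: fits the trend
```

With alpha huge, the weights are squeezed toward zero and the model cannot even track a plain straight line; relaxing the penalty lets it learn.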

🔵 Fix Overfitting

⚖️ Regularization (L1 / L2)

Penalize large weights to stop the model from memorizing noise.
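A sketch of what an L2 penalty actually does, using a closed-form ridge solve on synthetic data (not any particular library's API): the penalty pulls the weight vector toward zero, which is precisely how it stops weights from growing to chase noise.

```python
import numpy as np

rng = np.random.default_rng(7)

# Few noisy points, many polynomial features: a recipe for overfitting
x = rng.uniform(-1, 1, size=12)
y = np.sin(2 * x) + rng.normal(0, 0.1, size=x.size)
X = np.vander(x, N=10)   # degree-9 polynomial features
n = X.shape[1]

def ridge(alpha):
    # L2-regularized least squares via the augmented system:
    # minimizes ||Xw - y||^2 + alpha * ||w||^2
    A = np.vstack([X, np.sqrt(alpha) * np.eye(n)])
    b = np.concatenate([y, np.zeros(n)])
    return np.linalg.lstsq(A, b, rcond=None)[0]

w_free = ridge(0.0)   # no penalty: weights grow to chase the noise
w_reg = ridge(1.0)    # L2 penalty shrinks the weights toward zero

print(f"||w|| without penalty: {np.linalg.norm(w_free):.2f}")
print(f"||w|| with L2 penalty: {np.linalg.norm(w_reg):.2f}")
```

Smaller weights mean a smoother, less twitchy function, which is why regularization trades a little training accuracy for better generalization.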

📊 Increase Training Data (Data ↑)

Diverse data teaches real patterns rather than random training quirks.

🔄 Cross-Validation (k-Fold)

K-fold splits verify consistent performance across all data slices.
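A hand-rolled k-fold loop on synthetic data shows the idea (the dataset and the degree-9 "flexible model" are both invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic dataset: y = 2x + noise
x = rng.uniform(0, 1, size=30)
y = 2 * x + rng.normal(0, 0.3, size=x.size)

def kfold_mse(deg, k=5):
    """Average held-out MSE of a degree-`deg` polynomial fit over k folds."""
    idx = rng.permutation(x.size)
    errors = []
    for fold in np.array_split(idx, k):       # each fold is held out once
        train = np.setdiff1d(idx, fold)       # the rest trains the model
        coefs = np.polyfit(x[train], y[train], deg)
        errors.append(np.mean((np.polyval(coefs, x[fold]) - y[fold]) ** 2))
    return float(np.mean(errors))

cv_simple = kfold_mse(deg=1)
cv_flexible = kfold_mse(deg=9)
print(f"5-fold CV MSE, degree 1: {cv_simple:.3f}")
print(f"5-fold CV MSE, degree 9: {cv_flexible:.3f}")
# The flexible model tends to score worse across folds: cross-validation
# exposes overfitting that a single training score would hide.
```

Because every point serves as test data exactly once, a model that only memorizes its training fold cannot hide behind a lucky split.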

✂️ Simplify the Model (Complexity ↓)

Trim layers or features so the model can't over-memorize training samples.

🛑 Early Stopping (Stop Before Overfit)

Monitor validation loss during training. The moment it starts climbing, stop — you've hit the sweet spot before memorization kicks in.
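That rule can be sketched in a few lines of plain Python with a "patience" window (the loss numbers below are invented for illustration, not output from a real training run):

```python
# Early-stopping sketch: stop when validation loss stops improving.

def early_stop_epoch(val_losses, patience=3):
    """Return the best epoch, stopping once `patience` epochs pass
    with no improvement in validation loss."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch   # new best: keep going
        elif epoch - best_epoch >= patience:
            break                                 # stalled too long: stop
    return best_epoch

# Validation loss falls, bottoms out at epoch 4, then climbs (overfitting)
val = [1.00, 0.70, 0.50, 0.42, 0.40, 0.43, 0.48, 0.55, 0.65, 0.80]
print(early_stop_epoch(val))  # → 4
```

The patience window matters: validation loss is noisy in real training, so stopping at the first uptick would often quit too early.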

[Figure: training vs. validation loss curves, before and after applying a fix]
● Conclusion
"A model that has truly learned doesn't just memorize — it understands."

Underfitting and overfitting are two sides of the same coin. Master the bias-variance tradeoff through regularization, feature selection, and cross-validation — and your models will generalize to the real world.
