Underfitting and Overfitting in ML
Every machine learning model must find the sweet spot between learning too little and learning too much. Here's a clear, practical guide to understanding — and fixing — both problems.
Why Models Fail
Machine learning models have one goal: learn patterns from training data and make accurate predictions on data they've never seen. Simple in theory — but two problems can silently undermine even well-designed models.
When a model learns too little, it misses the real patterns in the data — this is underfitting. When it learns too much, it memorises the training data, noise and all, and collapses on new data — this is overfitting. The ideal model sits right between the two.
When the Model Knows Too Little
Underfitting happens when a model is too simple to capture the real patterns in data. It performs poorly on both training and test data — it simply hasn't learned enough to be useful.
Predicting Temperature Over the Day
Temperature rises in the morning, peaks at noon, then falls — a clear curve. A model that fits a straight line will be wrong at nearly every hour of the day.
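This can be sketched in a few lines of NumPy. The temperatures below are made up (a smooth half-sine standing in for a daily curve), but they show why a straight line underfits while even a quadratic does far better:

```python
import numpy as np

# Hypothetical hourly temperatures: rise, peak near noon, fall. A clear curve.
hours = np.arange(24, dtype=float)
temps = 15 + 10 * np.sin(np.pi * hours / 24)

# A straight line (degree 1) cannot bend, so it misses the shape: underfitting.
line = np.polyval(np.polyfit(hours, temps, 1), hours)
# A quadratic (degree 2) can at least follow the rise and fall.
curve = np.polyval(np.polyfit(hours, temps, 2), hours)

mse_line = np.mean((temps - line) ** 2)
mse_curve = np.mean((temps - curve) ** 2)
print(mse_curve < mse_line)  # prints True: the richer model fits far better
```

Because the quadratic model contains every straight line as a special case, its training error can never be worse — and on curved data it is dramatically better.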
Common Causes
- A model that's too simple for the data (e.g. a linear model for a curved relationship)
- Too few, or uninformative, input features
- Regularisation that's too strong
- Training stopped before the model has converged
When the Model Knows Too Much
Overfitting occurs when a model becomes so complex it memorises the training data — including random noise and outliers. It aces training, but fails completely on data it hasn't seen before.
Predicting Shop Sales
A complex model chases every spike and dip in past daily sales, treating random fluctuations as real patterns. It fits training data perfectly but can't predict next week at all.
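Here's a minimal sketch of that failure, using invented sales figures (a gentle trend plus random noise). A high-degree polynomial beats a straight line on the past it was trained on, then falls apart on the days it hasn't seen:

```python
import numpy as np

# Hypothetical daily sales: a gentle upward trend plus random noise.
rng = np.random.default_rng(0)
days = np.arange(30, dtype=float)
sales = 100 + 2 * days + rng.normal(0, 10, size=days.size)

x = days / days.max()  # rescale to [0, 1] to keep the polynomial fit stable
x_tr, y_tr = x[:20], sales[:20]   # the past: training data
x_te, y_te = x[20:], sales[20:]   # the future: unseen data

# A degree-9 polynomial chases every spike in the training data,
# while a straight line captures only the real trend.
wild = np.polyfit(x_tr, y_tr, deg=9)
tame = np.polyfit(x_tr, y_tr, deg=1)

def mse(coeffs, xs, ys):
    return np.mean((np.polyval(coeffs, xs) - ys) ** 2)

print(mse(wild, x_tr, y_tr) < mse(tame, x_tr, y_tr))  # better on training...
print(mse(wild, x_te, y_te) > mse(tame, x_te, y_te))  # ...far worse on new days
```

The complex model's training error looks like a triumph; its error on the next week reveals it learned the noise, not the trend.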
Common Causes
- A model with far more capacity than the data requires
- Too little (or too homogeneous) training data
- Noisy data and outliers the model treats as signal
- Training for too long without regularisation or early stopping
Finding the Perfect Balance
The ideal model sits between both extremes. It's complex enough to learn real patterns, but simple enough not to memorise noise. This is the essence of the bias-variance tradeoff.
Bias & Variance at a Glance

| Property | 🔴 Underfitting | 🔵 Overfitting |
|---|---|---|
| Model complexity | Too simple | Too complex |
| Bias | High | Low |
| Variance | Low | High |
| Training accuracy | Poor | Very high |
| Test accuracy | Poor | Poor |
| Generalises? | No | No |
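The tradeoff in the table can be seen numerically in a small sweep over model complexity. The data here is illustrative (a noisy sine, split into training and held-out points): training error keeps falling as the polynomial degree grows, but held-out error stops improving once the model starts fitting noise.

```python
import numpy as np

# Noisy samples of a smooth curve; every other point is held out for testing.
rng = np.random.default_rng(42)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.size)

x_tr, y_tr = x[::2], y[::2]    # even points: training
x_te, y_te = x[1::2], y[1::2]  # odd points: held out

def mse(deg):
    """Train and held-out error for a polynomial of the given degree."""
    coeffs = np.polyfit(x_tr, y_tr, deg)
    return (np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2),
            np.mean((np.polyval(coeffs, x_te) - y_te) ** 2))

for deg in (1, 3, 15):  # too simple, balanced, too complex
    train_err, test_err = mse(deg)
    print(deg, round(train_err, 3), round(test_err, 3))
```

Degree 1 is biased (poor everywhere), degree 15 has high variance (training error keeps shrinking while held-out error does not follow), and a moderate degree sits near the sweet spot.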
How to Fix Both Problems
The right fix depends on which problem your model has. Here are the most effective techniques for each.
🔴 Fix Underfitting
Use a More Complex Model
Switch to a deeper network, higher-degree polynomial, or ensemble method like Random Forest.
Add Relevant Features
Engineer new inputs that give the model richer information to learn from.
Increase Training Time
Allow more epochs so the model has time to converge on real patterns.
Reduce Regularisation
If regularisation is too strong, relax it so the model can learn more freely.
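"Add relevant features" is the easiest of these to demonstrate. In this sketch (same invented temperature curve as above), giving a linear least-squares model a squared-hour feature lets the *same class of model* capture curvature it previously couldn't:

```python
import numpy as np

# The underfit temperature curve again (illustrative data).
hours = np.arange(24, dtype=float)
temps = 15 + 10 * np.sin(np.pi * hours / 24)

# Plain linear model: design matrix [1, h]
X1 = np.column_stack([np.ones_like(hours), hours])
# Enriched model: design matrix [1, h, h^2] — one engineered feature added
X2 = np.column_stack([np.ones_like(hours), hours, hours ** 2])

w1, *_ = np.linalg.lstsq(X1, temps, rcond=None)
w2, *_ = np.linalg.lstsq(X2, temps, rcond=None)

mse1 = np.mean((X1 @ w1 - temps) ** 2)
mse2 = np.mean((X2 @ w2 - temps) ** 2)
print(mse2 < mse1)  # prints True: the extra feature captures the curvature
```

The fit is still "linear" in its weights; the richer information comes entirely from the engineered input.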
🔵 Fix Overfitting
Regularisation (L1 / L2)
Penalise large weights to stop the model from memorising noise.
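A minimal sketch of L2 (ridge) regularisation, using its closed form w = (XᵀX + λI)⁻¹Xᵀy on made-up data. A larger λ shrinks the learned weights, which is exactly the "penalise large weights" effect described above:

```python
import numpy as np

# Illustrative regression data: 50 samples, 10 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))
y = X @ rng.normal(size=10) + rng.normal(0, 0.5, size=50)

def ridge(X, y, lam):
    """Closed-form L2-regularised least squares: (X^T X + lam*I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

w_free = ridge(X, y, lam=0.0)   # ordinary least squares, no penalty
w_reg = ridge(X, y, lam=10.0)   # penalised: weights are pulled toward zero

print(np.linalg.norm(w_reg) < np.linalg.norm(w_free))  # prints True
```

L1 (lasso) regularisation works on the same principle but penalises absolute values, which tends to drive some weights exactly to zero.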
Increase Training Data
More diverse examples help the model generalise rather than memorise.
Cross-Validation
Use k-fold validation to verify performance is consistent across data splits.
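The splitting logic behind k-fold validation can be sketched in a few lines: rotate which slice of the data is held out, so every example is used for validation exactly once. (The "score" below is just a placeholder for whatever metric your model produces.)

```python
import numpy as np

def k_fold_indices(n_samples, k):
    """Yield (train, validation) index arrays for each of k folds."""
    folds = np.array_split(np.arange(n_samples), k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

data = np.arange(20)
scores = []
for train_idx, val_idx in k_fold_indices(len(data), k=5):
    # Fit on train_idx, evaluate on val_idx; here the "score" is a stand-in.
    scores.append(data[val_idx].mean())

print(len(scores))  # prints 5: one validation score per fold
```

If the five scores are close together, performance is consistent across splits; if one fold is much worse, the model may be sensitive to which data it sees.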
Simplify the Model
Trim layers or features so the model can't over-memorise training data.
Early Stopping
Monitor validation loss during training and stop the moment it starts rising — that's when overfitting begins. One of the simplest and most effective techniques available.
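In practice, early stopping usually waits a few epochs ("patience") before pulling the trigger, since validation loss is noisy. A minimal sketch of that mechanism, with made-up loss values:

```python
# Hypothetical validation losses per epoch: improving, then rising again.
val_losses = [0.90, 0.70, 0.55, 0.48, 0.46, 0.47, 0.49, 0.53, 0.60]

def early_stop_epoch(losses, patience=2):
    """Return the epoch with the best validation loss, stopping once the
    loss has failed to improve for `patience` consecutive epochs."""
    best, best_epoch, waited = float("inf"), 0, 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss is rising: overfitting has begun
    return best_epoch

print(early_stop_epoch(val_losses))  # prints 4: epoch 4 (loss 0.46) was best
```

In a real training loop you would also save the model's weights at each new best epoch and restore them when stopping.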
"A model that has truly learned doesn't just memorise — it understands."
Underfitting and overfitting are the two fundamental challenges in machine learning. Master the bias-variance tradeoff through regularisation, better data, and cross-validation — and your models will generalise to the real world with confidence.