How PostgreSQL Chooses a Query Plan (And Why It Gets It Wrong)

Ever run a query that was blazing fast yesterday… and painfully slow today — without changing a single line of SQL?

That isn’t PostgreSQL being random. It’s PostgreSQL doing exactly what it was designed to do.

To understand performance problems, you need to understand how PostgreSQL chooses a query plan — and why that choice is sometimes wrong.

The Planner Is Cost-Based (Not Rule-Based)

PostgreSQL doesn’t follow hard rules like “always use an index.” Instead, it uses a cost-based optimizer.

For every query, PostgreSQL:

Generates multiple possible execution plans
Estimates the cost of each plan
Chooses the cheapest one

That cost is not measured in milliseconds. It’s an internal unit based on:

Estimated rows processed
Disk I/O vs memory access
CPU work for joins, filters, and aggregates

The key word here is estimated.

Why Row Estimates Matter More Than Almost Anything

Nearly every bad plan starts with a bad row estimate.

PostgreSQL doesn’t know your data — it guesses based on statistics. If it thinks a filter returns 10 rows but actually returns 100,000, the chosen plan can be wildly inappropriate.

Example:


EXPLAIN
SELECT *
FROM rental
WHERE return_date IS NULL;

If PostgreSQL believes only a handful of rows match, it may choose an index scan.

If half the table actually matches, a sequential scan would be cheaper — but the planner doesn’t realize that.

PostgreSQL doesn’t adapt mid-query. Once the plan is chosen, it sticks with it.

Planning Time vs Execution Time

When you run EXPLAIN ANALYZE, you’ll see two important numbers:

Planning Time
Execution Time

Planning time is usually tiny — microseconds to a few milliseconds. PostgreSQL is optimizing, but not exhaustively.

Execution time is where bad estimates hurt. A plan that looked cheap on paper can be devastating in reality.

This is why a query can:

Have a fast plan but slow execution
Use an index and still perform poorly
Suddenly switch from index scan to seq scan

“It Worked Yesterday” Is Not a Contradiction

PostgreSQL query plans are not guaranteed to be stable forever.

Any of the following can change a plan — without touching the query:

Data volume growth
Data distribution changes
Vacuum or analyze timing
Statistics updates
Parameter values

That means:

“The same query with the same indexes can legitimately get a different plan tomorrow.”

PostgreSQL isn’t being inconsistent. It’s responding to a different statistical reality.

Why PostgreSQL Sometimes Chooses a Sequential Scan

Developers often panic when they see a sequential scan. That’s a mistake.

A sequential scan is often the correct choice when:

A large percentage of the table is read
The table fits in memory
The index would require random I/O

Indexes are not free. They’re only useful when they eliminate enough work.

The planner chooses a sequential scan because — based on estimates — it believes that’s cheaper.

EXPLAIN vs EXPLAIN ANALYZE

EXPLAIN shows what PostgreSQL thinks will happen.

EXPLAIN ANALYZE shows what actually happened.


EXPLAIN
SELECT *
FROM rental
WHERE return_date IS NULL;

This is where you spot problems:

Estimated rows vs actual rows
Unexpected sequential scans
Nested loops where hashes would be better

When estimates are off by orders of magnitude, the planner never had a chance.

The Real Enemy: Bad Statistics

PostgreSQL relies heavily on statistics gathered by ANALYZE. If those stats are stale or insufficient, the planner is flying blind.

This is why:

Freshly loaded tables behave strangely
Highly skewed data causes bad plans
Default statistics aren’t always enough

PostgreSQL won’t warn you when statistics are bad — but it will tell you when they were last updated.


SELECT
  relname AS table_name,
  last_analyze,
  last_autoanalyze,
  n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'rental';

If last_analyze and last_autoanalyze are NULL, PostgreSQL has never analyzed the table. If they’re old and the table changes frequently, the planner is working with stale information.

Indexes don’t fix bad statistics. Hints don’t exist. PostgreSQL can only choose correctly if it understands your data.

Conclusion

PostgreSQL doesn’t randomly choose bad plans. It makes the best decision it can — based on the information it has.

When performance degrades, don’t blame the planner. Ask:

Are row estimates accurate?
Has the data distribution changed?
Are statistics current and detailed enough?

Once you understand how PostgreSQL thinks, query plans stop being mysterious — and start becoming something you can reason about, predict, and fix!

MichaelPatrickMcCoy

Blog Post