PostgreSQL Indexes Explained: How to Choose the Right One
PostgreSQL supports multiple index types, each built for different query patterns and data distributions. Choosing the right one can dramatically improve read performance, reduce write overhead, and optimize storage. All examples below use the Pagila Sample Database so you can run every command yourself.
What An Index Actually Is
An index is a separate data structure—often a tree or hash—that lets PostgreSQL find rows quickly without scanning the entire table. Indexes make SELECT queries faster, but add overhead to INSERT, UPDATE, and DELETE.
B-Tree Indexes (Default)
B-Tree indexes are PostgreSQL’s default and the most widely used. They support equality and range filtering and are ideal for most queries in Pagila.
-- Finding a customer by email
CREATE INDEX idx_customer_email ON customer(email);
Here are some queries from the Pagila database that will benefit from this index:
-- Exact match email lookup
SELECT customer_id, first_name, last_name
FROM customer
WHERE email = 'MARY.SMITH@sakilacustomer.org';
-- Range query on email
SELECT customer_id, email
FROM customer
WHERE email >= 'M'
ORDER BY email;
-- Sorting on the indexed column
SELECT customer_id, email
FROM customer
ORDER BY email
LIMIT 20;
Use B-Tree indexes when:
- Filtering with =, <, >, <=, >=
- Sorting rows (ORDER BY)
- Enforcing uniqueness
Hash Indexes
Hash indexes support only equality comparisons. While WAL-logged and reliable in modern PostgreSQL, B-Tree indexes are typically still superior — except in narrow cases involving large text equality lookups.
-- Equality-only lookup on a text column
CREATE INDEX idx_actor_last_name_hash ON actor USING hash(last_name);
Here are some Pagila queries that benefit from this hash index:
-- Fast equality match on last_name
SELECT actor_id, first_name, last_name
FROM actor
WHERE last_name = 'DAVIS';
-- Another equality filter example
SELECT actor_id, first_name, last_name
FROM actor
WHERE last_name = 'HOPKINS';
-- Checking how many actors share the same last name
SELECT COUNT(*)
FROM actor
WHERE last_name = 'LEE';
Use Hash indexes when:
- You only query with equality
- You have very large text values
GIN Indexes
GIN indexes excel at multi-value data such as arrays, JSONB, and full-text search—common in richer schemas but still useful to demonstrate using Pagila.
-- Full-text search index on film descriptions
CREATE INDEX idx_film_description_gin
ON film USING gin(to_tsvector('english', description));
Here are some Pagila queries that benefit from this GIN full-text index:
-- Search for films that reference a "robot"
SELECT film_id, title
FROM film
WHERE to_tsvector('english', description) @@ to_tsquery('robot');
-- Search for action-themed films
SELECT film_id, title
FROM film
WHERE to_tsvector('english', description) @@ to_tsquery('fight | battle');
-- Search for films mentioning "boat" or "ship"
SELECT film_id, title
FROM film
WHERE to_tsvector('english', description) @@ to_tsquery('boat | ship');
Use GIN indexes when:
- Querying JSONB fields
- Searching arrays
- Using full-text search
GiST Indexes
GiST indexes power more complex data types such as geometric values, ranges, and PostGIS spatial data. Pagila doesn’t include geometric columns, but you can still create small examples.
-- Enable the cube extension (ships with PostgreSQL)
CREATE EXTENSION IF NOT EXISTS cube;
-- Create a small demo table linked to Pagila films
CREATE TABLE film_ratings (
film_id INT REFERENCES film(film_id),
rating_vector cube
);
-- Create a GiST index for similarity / nearest-neighbor search
CREATE INDEX idx_film_ratings_gist
ON film_ratings USING gist(rating_vector);
-- Insert some sample vectors so our queries return real results
-- These represent a hypothetical 2D rating score for each film
INSERT INTO film_ratings (film_id, rating_vector)
SELECT film_id, cube(ROUND(RANDOM()*5, 2), ROUND(RANDOM()*5, 2))
FROM film
LIMIT 50; -- insert ratings for 50 random films
Here are some similarity-search queries that benefit from the GiST index, the same kind of logic used in recommendation engines.
-- 1. Find films with a rating vector near a target vector
-- (example: two-dimensional rating metrics)
SELECT f.title, fr.rating_vector
FROM film_ratings fr
JOIN film f ON f.film_id = fr.film_id
ORDER BY fr.rating_vector <-> cube(3, 4) -- nearest to (3,4)
LIMIT 10;
-- 2. Find films within a certain distance of (3, 4)
SELECT f.title, fr.rating_vector
FROM film_ratings fr
JOIN film f ON f.film_id = fr.film_id
WHERE fr.rating_vector <-> cube(3, 4) < 2.0;
Use GiST indexes when:
- You work with geometric or spatial data
- You need nearest-neighbor searches
- You use range types, PostGIS data, or multidimensional values
BRIN Indexes
BRIN indexes are tiny and ideal for very large, naturally ordered tables. Pagila’s rental table is perfect since rental_date grows sequentially.
-- Efficient index for large, append-only tables
CREATE INDEX idx_rental_date_brin
ON rental USING brin(rental_date);
Here are some Pagila queries that take advantage of a BRIN index:
-- 1. Find rentals within a recent date range
SELECT rental_id, customer_id, rental_date
FROM rental
WHERE rental_date >= '2022-05-24'
AND rental_date < '2022-05-25';
-- 2. Look up all rentals on a specific day
SELECT rental_id, customer_id, inventory_id
FROM rental
WHERE rental_date::date = '2022-05-26';
-- 3. Fetch the most recent rentals (append-most data)
SELECT rental_id, customer_id, rental_date
FROM rental
ORDER BY rental_date DESC
LIMIT 20;
Use BRIN indexes when:
- Your table is large and mostly append-only
- Values correlate with physical order (timestamps, IDs)
- You want tiny index size
Partial Indexes
Partial indexes store only the rows that match a specific WHERE clause. This is extremely useful when you frequently query a predictable subset of a table. In the Pagila database, most customers are active, so creating an index only on inactive customers is a perfect example of reducing index size while still improving performance.
-- Index only inactive customers (active = 0)
CREATE INDEX idx_inactive_customers
ON customer(last_update)
WHERE active = 0;
Here are some Pagila queries that benefit from this partial index:
-- 1. Find all inactive customers
SELECT customer_id, first_name, last_name, active
FROM customer
WHERE active = 0;
-- 2. Recently updated inactive customers
SELECT customer_id, first_name, last_name, last_update
FROM customer
WHERE active = 0
ORDER BY last_update DESC
LIMIT 20;
-- 3. Count inactive customers
SELECT COUNT(*)
FROM customer
WHERE active = 0;
Use Partial indexes when:
- You frequently query a predictable subset of rows
- You want to reduce index size
- Your filters are consistent (e.g., active/inactive, open/closed, recent/archive)
Expression Indexes
Expression indexes let you index the result of a function or calculation instead of a raw column. This is extremely useful when your queries apply functions in the WHERE clause. In the Pagila database, a common example is performing case-insensitive searches on customer emails. Instead of storing a separate lowercase column, you can index lower(email) directly.
-- Case-insensitive email search
CREATE INDEX idx_customer_email_lower
ON customer((lower(email)));
Here are some Pagila queries that benefit from this expression index:
-- 1. Case-insensitive lookup (works regardless of capitalization)
SELECT customer_id, first_name, last_name, email
FROM customer
WHERE lower(email) = lower('MARY.SMITH@sakilacustomer.org');
-- 2. Search for all customers with email ending in a domain
SELECT customer_id, first_name, last_name, email
FROM customer
WHERE lower(email) LIKE '%@sakilacustomer.org';
-- 3. Find emails starting with a given prefix (case-insensitive)
SELECT customer_id, first_name, last_name, email
FROM customer
WHERE lower(email) LIKE 'a%';
Use Expression indexes when:
- Your queries apply functions such as lower(), upper(), date_trunc(), COALESCE(), or math expressions
- You need case-insensitive or function-based filtering
- You want indexing without storing redundant computed columns
Unique Indexes
Unique indexes ensure that no two rows contain the same value in the indexed column(s). They are used both for enforcing data integrity and for improving query performance. In the Pagila database, a good example is ensuring that each store has exactly one manager.
-- Enforce one unique manager per store
CREATE UNIQUE INDEX idx_store_manager
ON store(manager_staff_id);
Here are some Pagila queries that make use of this uniqueness:
-- 1. Lookup a store by its manager (fast due to unique index)
SELECT store_id, manager_staff_id, address_id
FROM store
WHERE manager_staff_id = 1;
-- 2. Ensure a manager does not manage multiple stores
SELECT manager_staff_id, COUNT(*)
FROM store
GROUP BY manager_staff_id
HAVING COUNT(*) > 1;
-- 3. Join staff to store using the unique manager key
SELECT s.staff_id, s.first_name, s.last_name, st.store_id
FROM staff s
JOIN store st ON st.manager_staff_id = s.staff_id;
Use Unique indexes when:
- You need to enforce data integrity (e.g., unique email, one manager per store)
- You want fast lookups on highly selective columns
- You need to enforce business rules requiring uniqueness
Conclusion
Most Pagila queries benefit from straightforward B-Tree indexes. But as your workload grows, specialized indexes like GIN, GiST, BRIN, partial, and expression indexes solve problems B-Tree can’t handle efficiently. PostgreSQL offers powerful index types designed for different workloads. Using the Pagila database makes it easy to experiment with each one in a real schema. Understanding your query patterns and matching them to the correct index type is one of the fastest ways to improve performance and scale your application. When in doubt, start with B-Tree and expand to specialized indexes as needed!