When to Use JOIN vs. Subquery
Choosing between a JOIN and a subquery is one of the most common points of confusion for SQL developers. Both approaches let you combine data from multiple tables, and often either one will work — but the right choice can improve clarity, performance, and maintainability. The examples in this article use the Pagila sample database.
What’s the Difference?
JOINs and subqueries both let you work with data from multiple tables, but they do it in very different ways. Understanding this difference makes it much easier to choose the right tool for a given problem.
JOINA JOIN combines rows from two tables into a single result set based on a matching column (such as customer_id). Once the tables are joined, you can select any columns you need from either table.
- Best for bringing tables together side-by-side.
- Lets you work with multiple columns from both tables.
- Ideal for reporting, aggregations, and multi-table analysis.
A subquery is a query inside another query. Instead of combining tables directly, it returns a value or a set of values that the outer query uses for filtering or calculation.
- Best for computing a single value, such as a SUM or COUNT.
- Great for checking conditions like “does this exist?” or “is this in that list?”
- Keeps the main query simpler by hiding complex logic inside the subquery.
In short: JOINs combine data, while subqueries calculate or filter data. Both are useful — you just choose the one that matches what you're trying to accomplish.
Example 1: Customer Payments
Suppose you want to list every customer along with the total amount they’ve paid. There are two common ways to get this result: using a JOIN or using a subquery. Both work — but they work in different ways, and one is generally the better choice for this specific problem.
Using a JOIN
SELECT c.customer_id, c.first_name, c.last_name, SUM(p.amount) AS total_paid
FROM customer c
JOIN payment p ON p.customer_id = c.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
order by c.customer_id;
A JOIN brings the customer and payment tables together so PostgreSQL can see all payment rows that belong to each customer. Once the rows are combined, you can aggregate them with SUM() and GROUP BY.
- You get access to any other columns from payment table if you need them.
- This pattern scales well as your queries get more complex.
- It's the most common “reporting style” query in real-world SQL work.
SELECT c.customer_id, c.first_name, c.last_name,
(SELECT SUM(amount)
FROM payment p
WHERE p.customer_id = c.customer_id) AS total_paid
FROM customer c
order by c.customer_id;
A subquery runs a small query for each customer row, calculating that customer’s total payments individually. Instead of combining tables first, you ask:
“For this customer, what is their total payment amount?”
- You only get one computed value from payment table — the total.
- The outer query stays simple and readable.
- This approach is great when you don't need any additional columns from payment table.
While both queries return the same result, the JOIN version is generally the better choice for this specific problem. It follows the traditional pattern used in reporting queries, it’s more flexible if you decide to add more payment-related details later, and it typically performs well when proper indexes are in place.
- Use the JOIN when you're aggregating across tables or might need additional columns later.
- Use the subquery when you only need a single value and want to keep the outer query simple.
In practice, PostgreSQL’s optimizer may rewrite the subquery internally as a JOIN anyway. But for clarity and maintainability — especially in reporting-style queries like this — the JOIN approach is the more commonly preferred method.
Example 2: Customers Who Have Rented a Movie
In this example, the goal is simple: find which customers have ever rented a movie. This is a common pattern in SQL — checking whether a row “exists” in another table. As before, you can solve this using either a JOIN or a subquery, and each works for slightly different reasons.
Using a JOIN
SELECT DISTINCT c.customer_id, c.first_name, c.last_name
FROM customer c
JOIN rental r ON r.customer_id = c.customer_id
order by c.customer_id;
A JOIN matches each customer to their rental records. If a customer has rented at least once, that customer will appear in the result. Because customers can have multiple rentals, we use DISTINCT to remove duplicates.
- You can easily pull additional columns from rental if needed.
- The logic is easy to understand: join the tables and return matching rows.
- JOINs are efficient when the join key (customer_id) is indexed.
SELECT customer_id, first_name, last_name
FROM customer
WHERE customer_id IN (
SELECT DISTINCT customer_id FROM rental
)
order by customer_id;
Here, instead of joining tables, we ask PostgreSQL:
“Is this customer’s ID found in the rental table?”
The subquery produces a list of customer IDs who have rented something, and the outer query returns only customers whose ID appears in that list.
- Great for simple existence checks.
- Keeps the outer query clean and focused.
- You don’t need DISTINCT — PostgreSQL handles membership efficiently.
In this case, the subquery version is often the cleaner and more intention-revealing choice. We’re not trying to pull columns from rental table. We’re just asking whether a customer appears in the rental table.
- Use the subquery when your goal is simply to check existence or membership.
- Use the JOIN if you plan to include columns from the rental table, such as rental dates or number of rentals.
In practice, PostgreSQL may optimize both approaches similarly. But from a readability standpoint, the subquery clearly communicates the intent: “Return customers who exist in the rental table.” That makes it the more natural option for this specific task.
Conclusion
JOINs and subqueries often overlap, but they serve different purposes. Use JOINs when you need to combine tables and return multiple columns, and reach for a subquery when you’re calculating a value or checking whether data exists. Understanding this distinction leads to cleaner, faster, and more maintainable SQL. With practice, selecting the right tool for the job becomes effortless!