Inner join
Combine two tables into one, keeping only the rows that match on both sides.
Overview
An inner join combines two tables into one, keeping only the rows that find a match on both sides. You reach for it whenever the answer needs columns that live in separate tables. Here are two: customers lists who they are, and orders records what they bought. Each order stores the id of the customer who placed it.
| id | name |
|---|---|
| 1 | Ana |
| 2 | Ben |
| 3 | Cara |
| id | customer_id | total |
|---|---|---|
| 101 | 1 | 40 |
| 102 | 1 | 25 |
| 103 | 2 | 60 |
Ana placed two orders. Ben placed one. Cara placed none. To list each order next to its customer, you follow the id from one table to the other. Watch what that does with Cara.
How it works
Name the two tables, then match a column on one to a column on the other. Each order’s customer_idlooks up a customer’s id, and the two rows become one. The aliases o and c let you say which table each column comes from.
The word inner is the important one. An inner join keeps a row only when it finds a partner on the other side. An order with no customer, or a customer with no order, does not appear. That is why Cara, who placed nothing, is absent from the result.
select c.name, o.id, o.total
from orders o
join customers c on o.customer_id = c.id;| id | customer_id | total |
|---|---|---|
| 101 | 1 | 40 |
| 102 | 1 | 25 |
| 103 | 2 | 60 |
| id | name |
|---|---|
| 1 | Ana |
| 2 | Ben |
| 3 | Cara |
Patterns
The same move, following a key from one table to another, covers most of what you will write.
The match is one-to-many here: Ana has two orders, so she appears twice in the result, once per order. That is correct, and it is also the trap. If you then sum a customer column, you count it once per order, not once per customer. Pitfalls covers this.
Joins compose. To bring in a third table, add another join with its own condition. Each join follows one more key.
select c.name, o.id, r.name as region
from orders o
join customers c on o.customer_id = c.id
join regions r on c.region_id = r.id;Once the rows are together, group them. To total each customer’s spend, join orders to customers and sum by customer: group by c.name with sum(o.total). The join brings the rows together; the group-by rolls them up.
Trade-offs
An inner join answers “show me the rows that go together.” Sometimes the rows that do not go together are the answer: every customer, including the ones who never ordered. An inner join cannot give you Cara, because she has no match. You will meet outer joins next; they keep the unmatched rows and fill the gaps with null. Use an inner join when you want only the matches, and a left join when a missing match is itself the point.
Pitfalls
Joining a customer to their orders repeats the customer once per order. Ana’s two orders make two Ana rows. Sum or count a customer column after that and you inflate it, because you are counting orders, not customers. When you aggregate across a join, be sure of what one row now means.
- A missing condition is a cross join. Drop the on clause and every order pairs with every customer. Three and three become nine, and the numbers look plausible.
- The drop is silent. An inner join omits unmatched rows without a word. A customer with no order, an order with a bad key: gone, and the result still looks correct.
- Match on the right key. Join on the column that actually names the relationship. Joining on the wrong column returns rows, just the wrong ones.
Performance
A join matches rows across tables, so index the columns you join on. With an index on customers.id, the database finds each order’s customer directly instead of scanning the whole table for it. On small tables it makes no visible difference; on large ones it is the difference between fast and unusable.
Practice
Recap
- An inner join combines two tables, keeping only the rows that match on both sides.
- Match the key that names the relationship, then take columns from each side.
- A one-to-many match repeats the “one” side; watch for fan-out when you aggregate.
- Unmatched rows are dropped silently. When you want them, use a left join.
- Index the join key.