Self Join
JoinsConcept
A self join is when a table is joined to itself. This is accomplished by listing the same table twice in the FROM clause with different aliases. Self joins are essential for comparing rows within the same table — for example, finding employees and their managers when both are stored in an employees table.
The most common use case is hierarchical data. An employees table often has a manager_id column that references another row in the same table. By self-joining employees AS e with employees AS m on e.manager_id = m.employee_id, you can display each employee alongside their manager's name.
Self joins can also compare rows for analytical purposes: finding products in the same category, customers in the same city, or orders placed on the same date. The key is that the two aliases represent different roles of the same underlying table.
Syntax
-- Basic self join with aliases
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
LEFT JOIN employees m ON m.employee_id = e.manager_id;
-- Find pairs within the same group
SELECT
a.name AS product_a,
b.name AS product_b,
a.category
FROM products a
JOIN products b
ON a.category = b.category
AND a.product_id < b.product_id;Practical Example
Using the ecommerce schema (customers, orders, order_items, products)
-- Customers in the same city (find potential referral pairs)
SELECT
a.first_name || ' ' || a.last_name AS customer_a,
b.first_name || ' ' || b.last_name AS customer_b,
a.city
FROM customers a
JOIN customers b
ON a.city = b.city
AND a.customer_id < b.customer_id
ORDER BY a.city;
-- Products cheaper than others in the same category
SELECT
cheap.name AS budget_product,
cheap.price AS budget_price,
pricey.name AS premium_product,
pricey.price AS premium_price,
cheap.category
FROM products cheap
JOIN products pricey
ON cheap.category = pricey.category
AND cheap.price < pricey.price
ORDER BY cheap.category, cheap.price;Common Pitfalls & Tips
- 1Always use table aliases to avoid ambiguity. Without aliases, the database cannot tell which instance of the table you are referencing.
- 2Self joins can generate duplicate pairs (A-B and B-A). Use an inequality condition like a.id < b.id to get each pair only once.
- 3Self joins on large tables can be expensive because you are effectively doubling the data to scan. Index the join columns.