Skip to main content
Back to Documentation
DocsAggregationsGROUP BY

GROUP BY

Aggregations

Concept

GROUP BY divides rows into groups that share the same values in one or more columns. Once grouped, each group is collapsed into a single summary row by applying aggregate functions like COUNT, SUM, AVG, MIN, or MAX. This is how you answer questions like "total revenue per category" or "average order value per customer".

Every non-aggregated column in your SELECT list must appear in the GROUP BY clause. This rule ensures that each output row maps to exactly one group. Violating it produces an error in standard SQL (though MySQL historically allowed it, leading to nondeterministic results).

GROUP BY is evaluated after WHERE but before HAVING and SELECT. This means WHERE filters individual rows before grouping, and HAVING filters the resulting groups. Understanding this order of operations is key to writing correct aggregation queries.

Syntax

-- Group by a single column
SELECT category, COUNT(*) AS cnt
FROM products
GROUP BY category;

-- Group by multiple columns
SELECT
  category,
  status,
  COUNT(*) AS cnt
FROM products
GROUP BY category, status;

-- Group by expression
SELECT
  DATE_TRUNC('month', order_date) AS month,
  SUM(total_amount) AS revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date);

Practical Example

Using the ecommerce schema (customers, orders, order_items, products)

-- Revenue breakdown by product category
SELECT
  p.category,
  COUNT(DISTINCT o.order_id) AS orders,
  SUM(oi.quantity)            AS units_sold,
  SUM(oi.unit_price * oi.quantity) AS revenue
FROM order_items oi
JOIN products p ON p.product_id = oi.product_id
JOIN orders o ON o.order_id = oi.order_id
GROUP BY p.category
ORDER BY revenue DESC;

-- Average order value by customer state
SELECT
  c.state,
  COUNT(o.order_id)   AS order_count,
  AVG(o.total_amount) AS avg_order_value
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.state
ORDER BY avg_order_value DESC;

Common Pitfalls & Tips

  • 1Every non-aggregated column in SELECT must be in GROUP BY. Forgetting one column causes an error (or worse, undefined behavior in MySQL's permissive mode).
  • 2GROUP BY happens before SELECT aliases are available. In standard SQL you cannot write GROUP BY alias_name — repeat the expression or use a subquery.
  • 3Grouping by high-cardinality columns (e.g., a unique ID) gives you one row per value, defeating the purpose of aggregation.
Practice GROUP BY queries