PostgreSQL: Group By ROLLUP

In PostgreSQL, the ROLLUP clause is a subclause of the GROUP BY clause that is shorthand for defining multiple grouping sets. Unlike the CUBE clause, the ROLLUP does not generate all the possible grouping sets based on a specified list of columns, but just makes a subset of those.

The ROLLUP defines a hierarchy between the list of columns defined and generates all grouping sets based on that hierarchy. This is commonly used for the analysis of hierarchical data; e.g., total salary by department, gender, and company-wide total.

Syntax:
SELECT
    <column1>,
    <column2>
FROM <table_name>
GROUP BY
    ROLLUP(<column1>,<column2>...)
[ORDER BY <column_list<];

For example, the CUBE(c1,c2,c3) will generate eight possible combinations of grouping sets as shown below.

Example:
(c1,c2,c3)
(c1,c2)
(c1,c3)
(c2,c3)
(c1)
(c2)
(c3)
()

The ROLLUP(c1,c2,c3) in GROUP BY will consider column hierarchy as c1 -> c2 -> c3 and will generate four hierarchical grouping sets, as shown below.

Example: ROLLUP Grouping Sets
(c1,c2,c3)
(c1,c2)
(c1)
()

Let’s understand how ROLLUP works using the following employee table.

The following query uses the ROLLUP clause to find the total salary of employees, the total salary of employees by dept_id, and the total salary of employees (subtotal) by dept_id and gender.

Example: GROUP BY ROLLUP
SELECT dept_id, gender, SUM(salary) 
FROM employee 
GROUP BY
	ROLLUP(dept_id,gender)
ORDER BY dept_id, gender;

As you can see above, the ROLLUP clause takes hierarchy from dept_id -> gender and show grouping set for (dept_id, gender), (dept_id), and ( ).

If you change the order of columns gender and dept_id in the select statement and ROLLUP, the result will be different.

Example: ROLLUP
SELECT gender, dept_id, SUM(salary) 
FROM employee 
GROUP BY
	ROLLUP(gender,dept_id)
ORDER BY gender, dept_id;

As you can see above, now the driving column in the hierarchy is gender. Hence it shows the result of all grouping sets (gender, dept_id), (gender), and ( ).

It is also possible to do a partial rollup to reduce the number of subtotals generated.

Example: Partial ROLLUP
SELECT dept_id, gender, SUM(salary) 
FROM employee 
GROUP BY
	dept_id,
	ROLLUP(gender)
ORDER BY dept_id, gender;

The above query does partial rollup of grouping sets (dept_id, gender) and (dept_id).