PostgreSQL: Group By CUBE
In PostgreSQL, the CUBE clause is a subclause of the GROUP BY clause. It generates multiple grouping sets that include all possible combinations of columns.
With the GROUPING SETS, you can define a set of columns that you want to group in a query. The CUBE clause performs the same operation as GROUPING SETS for all possible combinations of the specified columns.
SELECT <column1>, <column2> FROM <table_name> GROUP BY CUBE(<column_list>) [ORDER BY <column_list<];
The CUBE subclause is a short way to generate grouping sets for all the possible combinations of the specified columns. For example, the following CUBE and GROUPING SETS are equal.
-- the followings are equal CUBE(c1, c2, c3) GROUPING SETS ( (c1, c2, c3), (c1,c2), (c2,c3), (c1,c3), (c1), (c2), (c3), ( ) )
Let's understand CUBE better with some examples using the following employee table.
First, let's see the result of the GROUPING SETS clause. The following query returns
sum(salary) group by multiple grouping sets.
SELECT dept_id, gender, SUM(salary) FROM employee GROUP BY GROUPING SETS ( (dept_id, gender), (dept_id), (gender), () );
The following displays the result of the above query in pgAdmin.
In the above result, first, it returns the sum of salaries of all employees (1st row), then it returns the sum of salaries grouped by
gender (rows 2,3,4,5). Then, returns the sum of salaries grouped by
dept_id (rows 6 & 7), and finally the sum of salaries grouped by
gender (rows 8 & 9).
Now, the same result can be achieved using CUBE in the shortest way, as shown below.
SELECT dept_id, gender, SUM(salary) FROM employee GROUP BY CUBE(dept_id, gender);
The following is the result of the above query in pgAdmin.
PostgreSQL allows performing partial CUBE operation also to reduce the number of aggregate calculated. For example, if you don’t want the sum of salaries grouped by
gender then use it with the GROUP BY clause but not in the CUBE() clause, as shown below.
SELECT dept_id, gender, SUM(salary) FROM employee GROUP BY dept_id, CUBE(gender);
In the above result, it doest not calculate salaries for each
dept_id column as it is not included in the CUBE().