SQL - GROUP BY Clause

The GROUP BY clause is used to get the summary data based on one or more groups. The groups can be formed on one or more columns. For example, the GROUP BY query will be used to count the number of employees in each department, or to get the department wise total salaries.

You must use the aggregate functions such as COUNT(), MAX(), MIN(), SUM(), AVG(), etc., in the SELECT query. The result of the GROUP BY clause returns a single row for each value of the GROUP BY column.

Syntax:

SELECT column1, column2,...columnN FROM table_name
[WHERE]
[GROUP BY column1, column2...columnN]
[HAVING]
[ORDER BY]

The SELECT clause can include columns that are used with the GROUP BY clause. So, to include other columns in the SELECT clause, use the aggregate functions like COUNT(), MAX(), MIN(), SUM(), AVG() with those columns.

GROUP BY Characteristics:
  • The GROUP BY clause is used to form the groups of records.
  • The GROUP BY clause must come after the WHERE clause if present and before the HAVING clause.
  • The GROUP BY clause can include one or more columns to form one or more groups based on that columns.
  • Only the GROUP BY columns can be included in the SELECT clause. To use other columns in the SELECT clause, use the aggregate functions with them.

For the demo purpose, we will use the following Employee and Department tables in all examples.

Employee Table
EmpId FirstName LastName Email Salary DeptId
1 'John' 'King' '[email protected]' 33000 1
2 'James' 'Bond' 1
3 'Neena' 'Kochhar' '[email protected]' 17000 2
4 'Lex' 'De Haan' '[email protected]' 15000 1
5 'Amit' 'Patel' 18000 1
6 'Abdul' 'Kalam' '[email protected]' 25000 2
Department Table
DeptId Name
1 'Finance'
2 'HR'

Consider the following GROUP BY query.

SQL Script: GROUP BY
SELECT DeptId, COUNT(EmpId) as 'Number of Employees' 
FROM Employee
GROUP BY DeptId;

--following query will return same data as above
SELECT DeptId, COUNT(*) as 'No of Employees' 
FROM Employee
GROUP BY DeptId;

The above query includes the GROUP BY DeptId clause, so you can include only DeptId in the SELECT clause. You need to use aggregate functions to include other columns in the SELECT clause, so COUNT(EmpId) is included because we want to count the number of employees in the same DeptId. The 'No of Employees' is an alias of the COUNT(EmpId) column. The query will display the following result.

DeptId No of Employees
1 4
2 2

The following query gets the department name instead of DeptId in the result.

SQL Script: GROUP BY
SELECT dept.Name as 'Department', count(emp.empid) as 'No of Employees'
FROM Employee emp, Department dept
WHERE emp.deptid = dept.DeptId
GROUP by dept.Name
Department No of Employees
Finance 4
HR 2

In the same way, the following query gets the department-wise total salaries.

SQL Script: GROUP BY
SELECT dept.Name, sum(emp.salary) as 'Total Salaries'
FROM Employee emp, Department dept
WHERE emp.deptid = dept.DeptId
GROUP by dept.Name
Department Total Salaries
Finance 66000
HR 42000

The following query would throw an error, because dept.Name is not included in the GROUP BY clause, or no aggregate function is used.

SQL Script: GROUP BY
SELECT dept.Name, sum(emp.salary) as 'Total Salaries'
FROM Employee emp, Department dept
WHERE emp.deptid = dept.DeptId
GROUP by dept.DeptId