The SQL GROUP BY statement appears in aggregate functions. It is used to collate the data you select from a query by a particular column. You can specify multiple columns which will be grouped using the GROUP BY statement.
When you’re working with aggregate functions in SQL, it is often necessary to group rows together by common column values.
For example, say you want to get a list of branch names for your business. Beside this information you want to see the number of employees who work for those branches. You would need to use an aggregate function and group by the branch name.
That’s where the SQL GROUP BY clause comes in. In this tutorial, we are going to discuss how to use the GROUP BY clause.
Aggregate Functions Refresher
Often—when you’re working with a database—you are not looking to see the actual data within the database. Instead, you may want information about the data. For example, you may want to know the number of unique products your business sells or the maximum score on a leaderboard.
There are a number of functions built-in to SQL that allow you to get this information. These are called aggregate functions.
For example, let’s say you wanted to find out how many employees are sales associates, you could use the COUNT function. The COUNT function counts and returns the number of rows that meet a specific set of criteria. Other aggregate functions include SUM, AVG, MIN, and MAX.
If you’re looking to learn more about aggregate functions, read our SQL aggregate functions guide.
SQL GROUP BY
The SQL GROUP BY clause collates rows. GROUP BY clauses are common in queries that use aggregate functions such as MIN and MAX. The GROUP BY statement tells SQL how to aggregate the information in any non-aggregate column you have queried.
The syntax for the GROUP BY statement is:
SELECT COUNT(column1_name), column2_name FROM table1_name GROUP BY column2_name;
We have used an aggregate function in our query and specified another column.
In any query this is the case, we need to use a GROUP BY statement. The GROUP BY statement tells SQL how to display the branch data even though it is outside the aggregate function. You need to group by the table that is not in the aggregate function.
The GROUP BY clause is only used in SQL SELECT statements.
Let’s take a look at an example of the GROUP BY clause in SQL.
GROUP BY SQL Example
Let’s say we want to find the total number of employees with each title given to the workforce. In other words, we want to know how many sales associates we have, how many marketing directors we have, and so on.
We could use the following query to retrieve this information:
SELECT title, COUNT(title) FROM employees GROUP BY title;
The query returns multiple records:
title | count |
Senior Sales Associate | 1 |
Sales Associate | 4 |
Vice President of Sales | 1 |
Marketing Director | 1 |
(4 rows)
Our GROUP BY query has returned a list of unique titles held by employees. We can see the number of employees who hold that title next to each title.
When Should You Use a GROUP BY in SQL?
A GROUP BY clause is only necessary when you are looking to get more information than what is returned from the aggregate function. We discussed this a bit earlier.
If you are looking to get the number of customers you have, you only need to run a regular query. Here’s an example of a query that would return this information:
SELECT COUNT(name) FROM customers;
Our query groups the result and returns:
count |
7 |
(1 row)
If you want to know how many customers are on each of your loyalty plans, you would need to use a GROUP BY statement. Here is an example of a query that would get a list of loyalty plans and the number of customers on each plan:
SELECT loyalty_plan, COUNT(loyalty_plan) FROM customers GROUP BY loyalty_plan;
Our query collects data. Then, our query returns:
loyalty_plan | count |
Gold | 1 |
None | 3 |
Silver | 1 |
Bronze | 2 |
(4 rows)
SQL Group By Multiple Columns
If we wanted to, we could perform a GROUP BY on multiple columns. For example, say we wanted to get a list of how many employees we had at each branch with certain titles. We could retrieve this data using the following query:
"Career Karma entered my life when I needed it most and quickly helped me match with a bootcamp. Two months after graduating, I found my dream job that aligned with my values and goals in life!"
Venus, Software Engineer at Rockbot
SELECT branch, title, COUNT(title) FROM employees GROUP BY branch, title;
Our query result set shows:
branch | title | count |
Stamford | Sales Associate | 1 |
Albany | Vice President of Sales | 1 |
San Francisco | Sales Associate | 1 |
San Francisco | Senior Sales Associate | 1 |
Albany | Marketing Director | 1 |
Boston | Sales Associate | 2 |
(6 rows)
Our query creates a list of the title held by each employee. We can see the number of people who hold that title. Our data is grouped by the branch for which each employee works, and their titles.
Conclusion
The SQL GROUP BY clause is necessary in any statement where an aggregate function is used and an additional table is queried. You should group by the column not mentioned int he aggregate function.
Are you looking for a challenge? Write a statement that finds out how many employees work at each branch.
The table is called “employees” and the column in which branch names are stored is “branch”. Go back to the tutorial above and see if your query makes sense based on what we have discussed.
We have written a complete guide on how to learn SQL. This guide is suitable for beginners and those who are already on their way to mastering SQL. Check it out on our How to Learn SQL page.
About us: Career Karma is a platform designed to help job seekers find, research, and connect with job training programs to advance their careers. Learn about the CK publication.