parent
f0e4b647ae
commit
188efca147
1 changed files with 284 additions and 0 deletions
@ -0,0 +1,284 @@ |
||||
--- |
||||
title: Basic Aggregation |
||||
description: Learn about fundamental SQL aggregate functions like COUNT, SUM, AVG, MIN, and MAX |
||||
order: 110 |
||||
type: lesson-challenge |
||||
setup: | |
||||
```sql |
||||
CREATE TABLE sale ( |
||||
id INTEGER PRIMARY KEY, |
||||
title TEXT, |
||||
genre TEXT, |
||||
price DECIMAL(10, 2), |
||||
quantity INTEGER, |
||||
sale_date DATE |
||||
); |
||||
|
||||
INSERT INTO sale (id, title, genre, price, quantity, sale_date) |
||||
VALUES |
||||
(1, 'The Great Gatsby', 'Fiction', 12.99, 5, '2024-01-15'), |
||||
(2, 'SQL Basics', 'Technical', 29.99, 10, '2024-01-15'), |
||||
(3, 'Pride and Prejudice', 'Fiction', 9.99, 3, '2024-02-16'), |
||||
(4, 'Data Science 101', 'Technical', 34.99, 4, '2024-02-16'), |
||||
(5, 'The Great Gatsby', 'Fiction', 12.99, 2, '2024-03-17'), |
||||
(6, 'Pride and Prejudice', 'Fiction', 9.99, 1, '2024-02-17'), |
||||
(7, 'SQL Basics', 'Technical', 29.99, 8, '2024-02-18'), |
||||
(8, 'Data Science 101', 'Technical', 34.99, NULL, '2024-02-18'); |
||||
``` |
||||
--- |
||||
|
||||
In our previous lesson, we learned what aggregation is and why it's useful. Now, let's look at some common aggregate functions. |
||||
|
||||
We will use the following `sale` table for our examples: |
||||
|
||||
| id | title | genre | price | quantity | sale_date | |
||||
| --- | ------------------- | --------- | ----- | -------- | ---------- | |
||||
| 1 | The Great Gatsby | Fiction | 12.99 | 5 | 2024-01-15 | |
||||
| 2 | SQL Basics | Technical | 29.99 | 10 | 2024-01-15 | |
||||
| 3 | Pride and Prejudice | Fiction | 9.99 | 3 | 2024-02-16 | |
||||
| 4 | Data Science 101 | Technical | 34.99 | 4 | 2024-01-16 | |
||||
| 5 | The Great Gatsby | Fiction | 12.99 | 2 | 2024-03-17 | |
||||
| 6 | Pride and Prejudice | Fiction | 9.99 | 1 | 2024-02-17 | |
||||
| 7 | SQL Basics | Technical | 29.99 | 8 | 2024-02-18 | |
||||
| 8 | Data Science 101 | Technical | 34.99 | NULL | 2024-02-18 | |
||||
|
||||
## The COUNT Function |
||||
|
||||
The `COUNT` function is used to count the number of rows in a result set. It's one of the most frequently used aggregate functions. |
||||
|
||||
There are several ways to use `COUNT`. Let's look at different examples to understand the differences. |
||||
|
||||
### Counting all rows |
||||
|
||||
`COUNT(*)` is used to count all rows in the table. For example, our query to count the total number of sales in the `sale` table will be: |
||||
|
||||
```sql |
||||
SELECT COUNT(*) |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be: |
||||
|
||||
| COUNT (\*) | |
||||
| ---------- | |
||||
| 8 | |
||||
|
||||
Notice how the result is `8`, which is the total number of rows in the `sale` table. Let's add an alias to make it more readable: |
||||
|
||||
```sql |
||||
SELECT COUNT(*) as total_sales |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be: |
||||
|
||||
| total_sales | |
||||
| ----------- | |
||||
| 8 | |
||||
|
||||
### Filtering and Counting |
||||
|
||||
We can also filter the rows before counting them. For example, we can count the number of sales for a specific genre: |
||||
|
||||
```sql |
||||
SELECT COUNT(*) as total_sales |
||||
FROM sale |
||||
WHERE genre = 'Fiction'; |
||||
``` |
||||
|
||||
The output from this query will be `4` since there are 4 rows with the genre `Fiction`. |
||||
|
||||
| total_sales | |
||||
| ----------- | |
||||
| 4 | |
||||
|
||||
> In the next lesson, we will learn about `GROUP BY` which we can use to count the number of sales for each genre. |
||||
|
||||
### Counting specific column (excludes NULL values) |
||||
|
||||
`COUNT(column)` is used to count the number of rows where the specified column is not `NULL`. For example, our query to count the number of sales with a non-NULL quantity will be: |
||||
|
||||
```sql |
||||
SELECT COUNT(quantity) as total_sales_with_quantity |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `7` since one of the rows has a `NULL` value for quantity. |
||||
|
||||
| total_sales_with_quantity | |
||||
| ------------------------- | |
||||
| 7 | |
||||
|
||||
### Count DISTINCT values |
||||
|
||||
`COUNT(DISTINCT column)` is used to count the number of unique values in a column. For example, our query to count the number of unique titles in the `sale` table will be: |
||||
|
||||
```sql |
||||
SELECT COUNT(DISTINCT title) as total_unique_titles |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `5` since there are 5 unique titles in the `sale` table. |
||||
|
||||
| total_unique_titles | |
||||
| ------------------- | |
||||
| 5 | |
||||
|
||||
Just like `COUNT(column)`, the `COUNT(DISTINCT column)` function also ignores `NULL` values. |
||||
|
||||
## The SUM Function |
||||
|
||||
The `SUM` function adds up numeric values in a column e.g. our query to calculate the total number of books sold will be: |
||||
|
||||
```sql |
||||
SELECT SUM(quantity) as total_books |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `33` since the total quantity of books sold is 33. |
||||
|
||||
| total_books | |
||||
| ----------- | |
||||
| 33 | |
||||
|
||||
We can also have expressions in the `SUM` function. For example, we can calculate the total revenue by multiplying the price and quantity for each sale and summing up the results: |
||||
|
||||
```sql |
||||
SELECT SUM(price * quantity) as total_revenue |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `810.67` since the total revenue is $810.67. |
||||
|
||||
| total_revenue | |
||||
| ------------- | |
||||
| 810.67 | |
||||
|
||||
## The AVG Function |
||||
|
||||
The `AVG` function calculates the average value of a numeric column. For example, our query to calculate the average price of books will be: |
||||
|
||||
```sql |
||||
SELECT AVG(price) as avg_price |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `21.99` since the average price of books is $21.99. |
||||
|
||||
| avg_price | |
||||
| --------- | |
||||
| 21.99 | |
||||
|
||||
We can also calculate the average quantity of books sold by using the `AVG` function. Our query to calculate the average quantity of books sold will be: |
||||
|
||||
```sql |
||||
SELECT AVG(quantity) as avg_quantity |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `4.71` since the average quantity of books sold is 4.71. |
||||
|
||||
| avg_quantity | |
||||
| ------------ | |
||||
| 4.71 | |
||||
|
||||
> `AVG` (like most aggregate functions) ignores `NULL` values. If you want to treat `NULL` values as 0, you need to use `COALESCE`: |
||||
> |
||||
> ```sql |
||||
> SELECT AVG(COALESCE(quantity, 0)) as avg_quantity |
||||
> FROM sale; |
||||
> ``` |
||||
> |
||||
> The output from this will be `4.12` instead of `4.71`. |
||||
|
||||
## The MIN and MAX Functions |
||||
|
||||
The `MIN` and `MAX` functions find the smallest and largest values in a column. For example, our query to find the cheapest and most expensive books will be: |
||||
|
||||
```sql |
||||
SELECT |
||||
MIN(price) as lowest_price, |
||||
MAX(price) as highest_price |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `9.99` and `34.99` since the lowest price is $9.99 and the highest price is $34.99. |
||||
|
||||
| lowest_price | highest_price | |
||||
| ------------ | ------------- | |
||||
| 9.99 | 34.99 | |
||||
|
||||
These functions work with dates too: |
||||
|
||||
```sql |
||||
SELECT |
||||
MIN(sale_date) as first_sale, |
||||
MAX(sale_date) as last_sale |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be `2024-01-15` and `2024-01-18` since the first sale was on `2024-01-15` and the last sale was on `2024-01-18`. |
||||
|
||||
| first_sale | last_sale | |
||||
| ---------- | ---------- | |
||||
| 2024-01-15 | 2024-01-18 | |
||||
|
||||
Filtering works with the `MIN` and `MAX` functions too. For example, our query to find cheapest and most expensive books sold in the `Fiction` genre will be: |
||||
|
||||
```sql |
||||
SELECT |
||||
MIN(price) as lowest_price, |
||||
MAX(price) as highest_price |
||||
FROM sale |
||||
WHERE genre = 'Fiction'; |
||||
``` |
||||
|
||||
The output from this query will be `9.99` and `12.99` since the lowest price is $9.99 and the highest price is $12.99. |
||||
|
||||
| lowest_price | highest_price | |
||||
| ------------ | ------------- | |
||||
| 9.99 | 12.99 | |
||||
|
||||
## Combining Aggregate Functions |
||||
|
||||
You can use multiple aggregate functions in a single query. |
||||
|
||||
For example, our query to calculate the total number of sales, total number of books sold, average price, minimum price, and maximum price in `February 2024` will be: |
||||
|
||||
```sql |
||||
SELECT |
||||
COUNT(*) as total_sales, |
||||
SUM(quantity) as total_books_sold, |
||||
AVG(price) as avg_price, |
||||
MIN(price) as min_price, |
||||
MAX(price) as max_price |
||||
FROM sale |
||||
WHERE sale_date BETWEEN '2024-02-01' AND '2024-02-29'; |
||||
``` |
||||
|
||||
The output from this query will be: |
||||
|
||||
| total_sales | total_books_sold | avg_price | min_price | max_price | |
||||
| ----------- | ---------------- | --------- | --------- | --------- | |
||||
| 5 | 16 | 23.99 | 9.99 | 34.99 | |
||||
|
||||
The query to calculate the same for all the sales will be: |
||||
|
||||
```sql |
||||
SELECT |
||||
COUNT(*) as total_sales, |
||||
SUM(quantity) as total_books_sold, |
||||
AVG(price) as avg_price, |
||||
MIN(price) as min_price, |
||||
MAX(price) as max_price |
||||
FROM sale; |
||||
``` |
||||
|
||||
The output from this query will be: |
||||
|
||||
| total_sales | total_books_sold | avg_price | min_price | max_price | |
||||
| ----------- | ---------------- | --------- | --------- | --------- | |
||||
| 8 | 33 | 21.99 | 9.99 | 34.99 | |
||||
|
||||
|
||||
In the next lesson, we'll learn about the `GROUP BY` clause and how to use it with aggregate functions to analyze data at a more granular level. |
Loading…
Reference in new issue