Add challenges for aggregate

4 weeks ago · 2794993ef3
parent cc8879c675
commit 2794993ef3
11 changed files with 1376 additions and 0 deletions
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-book-stats.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-book-stats.md
@@ -0,0 +1,122 @@
 ---
 title: Author Book Stats
 description: Practice basic aggregation with author and book data
 order: 93
 type: challenge
 setup: |
  ```sql
  CREATE TABLE author (
      id INT PRIMARY KEY,
      name VARCHAR(255),
      country VARCHAR(100)
  );
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      author_id INT,
      price DECIMAL(10,2),
      pages INT
  );
  INSERT INTO author (id, name, country) VALUES
      (1, 'Jane Smith', 'USA'),
      (2, 'John Brown', 'UK'),
      (3, 'Maria Garcia', 'Spain'),
      (4, 'David Wilson', 'USA');
  INSERT INTO book (id, title, author_id, price, pages) VALUES
      (1, 'Database Design', 1, 29.99, 300),
      (2, 'SQL Basics', 1, 24.99, 250),
      (3, 'Python Programming', 2, 34.99, 400),
      (4, 'Web Development', 2, 39.99, 350),
      (5, 'Data Science', 2, 44.99, 450),
      (6, 'Machine Learning', 3, 49.99, 500),
      (7, 'AI Fundamentals', 3, 54.99, 550);
  ```
 ---
 The bookstore wants to understand how many books each author has written and their average book price. They need a summary of author statistics to help with inventory planning.
 Given the following data in table `author`:
 | id  | name         | country |
 | --- | ------------ | ------- |
 | 1   | Jane Smith   | USA     |
 | 2   | John Brown   | UK      |
 | 3   | Maria Garcia | Spain   |
 | 4   | David Wilson | USA     |
 And the following data in table `book`:
 | id  | title             | author_id | price | pages |
 | --- | ----------------- | --------- | ----- | ----- |
 | 1   | Database Design   | 1         | 29.99 | 300   |
 | 2   | SQL Basics        | 1         | 24.99 | 250   |
 | 3   | Python Programming| 2         | 34.99 | 400   |
 | 4   | Web Development   | 2         | 39.99 | 350   |
 | 5   | Data Science      | 2         | 44.99 | 450   |
 | 6   | Machine Learning  | 3         | 49.99 | 500   |
 | 7   | AI Fundamentals   | 3         | 54.99 | 550   |
 Write a query that shows for each author:
 - Author name
 - Number of books written
 - Average book price
 - Total pages written
 Only include authors who have written books, and order the results by number of books in descending order.
 ## Expected Output
 | author_name   | book_count | avg_price | total_pages |
 | ------------- | ---------- | --------- | ----------- |
 | John Brown    | 3          | 39.99     | 1200        |
 | Maria Garcia  | 2          | 52.49     | 1050        |
 | Jane Smith    | 2          | 27.49     | 550         |
 ## Solution
 ```sql
 SELECT 
    a.name as author_name,
    COUNT(*) as book_count,
    AVG(b.price) as avg_price,
    SUM(b.pages) as total_pages
 FROM author a
 INNER JOIN book b ON a.id = b.author_id
 GROUP BY a.name
 ORDER BY book_count DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 First, we join the author and book tables:
 ```sql
 FROM author a
 INNER JOIN book b ON a.id = b.author_id
 ```
 We then calculate various aggregates for each author:
 ```sql
 COUNT(*) -- Counts the number of books
 AVG(b.price) -- Calculates average book price
 SUM(b.pages) -- Sums up total pages
 ```
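 We group the results by author name so that each author produces one row (names are unique in this data; grouping by `a.id, a.name` would be safer if two authors could share a name):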
 ```sql
 GROUP BY a.name
 ```
 Finally, we order by the book count:
 ```sql
 ORDER BY book_count DESC
 ```
 This query helps the bookstore understand:
 - John Brown has written the most books (3)
 - Maria Garcia's books have the highest average price ($52.49)
 - John Brown has written the most pages (1,200) 
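 The `INNER JOIN` is what drops David Wilson, who has no rows in `book`. As a hedged sketch of the alternative (not part of the expected answer), a `LEFT JOIN` keeps every author, and `COUNT(b.id)` replaces `COUNT(*)` so that an author without books is counted as 0 rather than 1:
 ```sql
 SELECT
    a.name as author_name,
    COUNT(b.id) as book_count,   -- counts only matched book rows; NULLs are skipped
    AVG(b.price) as avg_price,   -- NULL when the author has no books
    SUM(b.pages) as total_pages  -- NULL when the author has no books
 FROM author a
 LEFT JOIN book b ON a.id = b.author_id
 GROUP BY a.id, a.name
 ORDER BY book_count DESC;
 ```
 On the sample data this adds a fourth row for David Wilson with a `book_count` of 0.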
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-tier-analysis.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-tier-analysis.md
@@ -0,0 +1,131 @@
 ---
 title: Author Tier Analysis
 description: Practice using CASE WHEN with aggregate functions
 order: 92
 type: challenge
 setup: |
  ```sql
  CREATE TABLE author (
      id INT PRIMARY KEY,
      name VARCHAR(255)
  );
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      author_id INT,
      price DECIMAL(10,2)
  );
  INSERT INTO author (id, name) VALUES
      (1, 'John Smith'),
      (2, 'Emma Wilson'),
      (3, 'Michael Brown'),
      (4, 'Sarah Davis'),
      (5, 'James Miller');
  INSERT INTO book (id, title, author_id, price) VALUES
      (1, 'SQL Basics', 1, 29.99),
      (2, 'Advanced SQL', 1, 39.99),
      (3, 'Database Design', 1, 44.99),
      (4, 'Web Development', 2, 34.99),
      (5, 'JavaScript Guide', 2, 29.99),
      (6, 'Python Programming', 3, 24.99),
      (7, 'Data Analysis', 4, 49.99);
  ```
 ---
 The bookstore wants to categorize their authors based on how many books they've published. They want to label authors as:
 - `Prolific Author` if they have written 3 or more books
 - `Established Author` if they have written 2 books
 - `New Author` if they have written 1 book
 Given the following data in table `author`:
 | id  | name          |
 | --- | ------------- |
 | 1   | John Smith    |
 | 2   | Emma Wilson   |
 | 3   | Michael Brown |
 | 4   | Sarah Davis   |
 | 5   | James Miller  |
 And the following data in table `book`:
 | id  | title             | author_id | price |
 | --- | ----------------- | --------- | ----- |
 | 1   | SQL Basics        | 1         | 29.99 |
 | 2   | Advanced SQL      | 1         | 39.99 |
 | 3   | Database Design   | 1         | 44.99 |
 | 4   | Web Development   | 2         | 34.99 |
 | 5   | JavaScript Guide  | 2         | 29.99 |
 | 6   | Python Programming| 3         | 24.99 |
 | 7   | Data Analysis     | 4         | 49.99 |
 Write a query that shows:
 - Author name
 - Number of books written
 - Author tier (based on the categories above)
 Only include authors who have published at least one book, and order the results by number of books in descending order.
 ## Expected Output
 | author_name   | book_count | author_tier      |
 | ------------- | ---------- | ---------------- |
 | John Smith    | 3          | Prolific Author  |
 | Emma Wilson   | 2          | Established Author|
 | Michael Brown | 1          | New Author       |
 | Sarah Davis   | 1          | New Author       |
 ## Solution
 ```sql
 SELECT 
    a.name as author_name,
    COUNT(*) as book_count,
    CASE 
        WHEN COUNT(*) >= 3 THEN 'Prolific Author'
        WHEN COUNT(*) = 2 THEN 'Established Author'
        ELSE 'New Author'
    END as author_tier
 FROM author a
 INNER JOIN book b ON a.id = b.author_id
 GROUP BY a.name
 ORDER BY book_count DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 We join the author and book tables:
 ```sql
 FROM author a
 INNER JOIN book b ON a.id = b.author_id
 ```
 We count books for each author:
 ```sql
 COUNT(*) as book_count
 ```
 We use CASE WHEN to categorize authors:
 ```sql
 CASE 
    WHEN COUNT(*) >= 3 THEN 'Prolific Author'
    WHEN COUNT(*) = 2 THEN 'Established Author'
    ELSE 'New Author'
 END as author_tier
 ```
 We group by author name and order by book count:
 ```sql
 GROUP BY a.name
 ORDER BY book_count DESC
 ```
 This query helps the bookstore:
 - Identify their most productive authors
 - Categorize authors based on their output
 - See the exact number of books by each author 
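 The solution repeats `COUNT(*)` inside the `CASE` because a column alias such as `book_count` is generally not visible elsewhere in the same `SELECT` list. One possible alternative, sketched here with a hypothetical derived-table name `author_counts`, is to aggregate first and label the rows in an outer query:
 ```sql
 SELECT
    author_name,
    book_count,
    CASE
        WHEN book_count >= 3 THEN 'Prolific Author'
        WHEN book_count = 2 THEN 'Established Author'
        ELSE 'New Author'
    END as author_tier
 FROM (
    -- Inner query: one row per author with the book count
    SELECT
        a.name as author_name,
        COUNT(*) as book_count
    FROM author a
    INNER JOIN book b ON a.id = b.author_id
    GROUP BY a.id, a.name
 ) as author_counts
 ORDER BY book_count DESC;
 ```
 Both forms should return the same four rows; the derived table simply keeps the counting and the labelling in separate steps.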
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-performance.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-performance.md
@@ -0,0 +1,160 @@
 ---
 title: Book Performance
 description: Practice using aggregate functions with temporal data and joins
 order: 120
 type: challenge
 setup: |
  ```sql
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      category VARCHAR(100),
      release_date DATE
  );
  CREATE TABLE sale (
      id INT PRIMARY KEY,
      book_id INT,
      sale_timestamp TIMESTAMP,
      quantity INT,
      unit_price DECIMAL(10,2)
  );
  INSERT INTO book (id, title, category, release_date) VALUES
      (1, 'The Great Gatsby', 'Fiction', '2024-01-01'),
      (2, 'SQL Mastery', 'Technical', '2024-01-15'),
      (3, 'Data Science 101', 'Technical', '2024-02-01'),
      (4, 'Pride and Prejudice', 'Fiction', '2024-02-15');
  INSERT INTO sale (id, book_id, sale_timestamp, quantity, unit_price) VALUES
      (1, 1, '2024-02-01 10:30:00', 2, 19.99),
      (2, 1, '2024-02-15 14:20:00', 1, 19.99),
      (3, 2, '2024-02-01 11:15:00', 3, 29.99),
      (4, 2, '2024-02-20 16:45:00', 2, 29.99),
      (5, 3, '2024-02-05 09:30:00', 1, 24.99),
      (6, 3, '2024-02-25 13:20:00', 2, 24.99),
      (7, 4, '2024-02-15 15:45:00', 1, 14.99),
      (8, 4, '2024-02-28 10:10:00', 3, 14.99);
  ```
 ---
 The bookstore manager wants to analyze book performance for February 2024. They need a report showing how each book performed during that month, including details about when the book was released.
 Given the following data in table `book`:
 | id  | title               | category  | release_date |
 | --- | ------------------- | --------- | ------------ |
 | 1   | The Great Gatsby    | Fiction   | 2024-01-01   |
 | 2   | SQL Mastery         | Technical | 2024-01-15   |
 | 3   | Data Science 101    | Technical | 2024-02-01   |
 | 4   | Pride and Prejudice | Fiction   | 2024-02-15   |
 And the following data in table `sale`:
 | id  | book_id | sale_timestamp      | quantity | unit_price |
 | --- | ------- | ------------------- | -------- | ---------- |
 | 1   | 1       | 2024-02-01 10:30:00 | 2        | 19.99      |
 | 2   | 1       | 2024-02-15 14:20:00 | 1        | 19.99      |
 | 3   | 2       | 2024-02-01 11:15:00 | 3        | 29.99      |
 | 4   | 2       | 2024-02-20 16:45:00 | 2        | 29.99      |
 | 5   | 3       | 2024-02-05 09:30:00 | 1        | 24.99      |
 | 6   | 3       | 2024-02-25 13:20:00 | 2        | 24.99      |
 | 7   | 4       | 2024-02-15 15:45:00 | 1        | 14.99      |
 | 8   | 4       | 2024-02-28 10:10:00 | 3        | 14.99      |
 Write a query that shows:
 - Book title
 - Category
 - Whether the book was released in February (show as 'New Release' or 'Existing')
 - Total quantity sold
 - Total revenue
 - Number of sales transactions
 Only include books that had sales in February 2024, and order the results by total revenue in descending order.
 ## Expected Output
 | title               | category  | release_status | total_quantity | total_revenue | sale_count |
 | ------------------- | --------- | -------------- | -------------- | ------------- | ---------- |
 | SQL Mastery         | Technical | Existing       | 5              | 149.95        | 2          |
 | Data Science 101    | Technical | New Release    | 3              | 74.97         | 2          |
 | The Great Gatsby    | Fiction   | Existing       | 3              | 59.97         | 2          |
 | Pride and Prejudice | Fiction   | New Release    | 4              | 59.96         | 2          |
 ## Solution
 ```sql
 SELECT
    b.title,
    b.category,
    CASE
        WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release'
        ELSE 'Existing'
    END as release_status,
    SUM(s.quantity) as total_quantity,
    SUM(s.quantity * s.unit_price) as total_revenue,
    COUNT(*) as sale_count
 FROM book b
 INNER JOIN sale s ON b.id = s.book_id
 WHERE
    EXTRACT(MONTH FROM s.sale_timestamp) = 2
    AND EXTRACT(YEAR FROM s.sale_timestamp) = 2024
 GROUP BY
    b.title,
    b.category,
    CASE
        WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release'
        ELSE 'Existing'
    END
 ORDER BY total_revenue DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 First, we join the tables to get book information with sales:
 ```sql
 FROM book b
 INNER JOIN sale s ON b.id = s.book_id
 ```
 We filter for February 2024 sales:
 ```sql
 WHERE
    EXTRACT(MONTH FROM s.sale_timestamp) = 2
    AND EXTRACT(YEAR FROM s.sale_timestamp) = 2024
 ```
 We use a CASE expression to identify new releases:
 ```sql
 CASE
    WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release'
    ELSE 'Existing'
 END as release_status
 ```
 We calculate aggregates for each book:
 ```sql
 SUM(s.quantity) as total_quantity,
 SUM(s.quantity * s.unit_price) as total_revenue,
 COUNT(*) as sale_count
 ```
 Finally, we group by the necessary columns and order by revenue:
 ```sql
 GROUP BY
    b.title,
    b.category,
    CASE
        WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release'
        ELSE 'Existing'
    END
 ORDER BY total_revenue DESC
 ```
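 Note that `EXTRACT(MONTH FROM b.release_date) = 2` matches February of any year, which is harmless here only because every book was released in 2024. A more defensive sketch, assuming a dialect that accepts `DATE '...'` and `TIMESTAMP '...'` literals (PostgreSQL and MySQL do), uses half-open date ranges for both the release check and the sales filter:
 ```sql
 SELECT
    b.title,
    b.category,
    CASE
        WHEN b.release_date >= DATE '2024-02-01'
         AND b.release_date < DATE '2024-03-01' THEN 'New Release'
        ELSE 'Existing'
    END as release_status,
    SUM(s.quantity) as total_quantity,
    SUM(s.quantity * s.unit_price) as total_revenue,
    COUNT(*) as sale_count
 FROM book b
 INNER JOIN sale s ON b.id = s.book_id
 WHERE
    -- Half-open range: from Feb 1 inclusive up to, but not including, Mar 1
    s.sale_timestamp >= TIMESTAMP '2024-02-01 00:00:00'
    AND s.sale_timestamp < TIMESTAMP '2024-03-01 00:00:00'
 GROUP BY
    b.id,
    b.title,
    b.category,
    b.release_date
 ORDER BY total_revenue DESC;
 ```
 On the sample data this should return the same four rows as the solution. Grouping by `b.release_date` instead of repeating the `CASE` expression is a design choice that keeps the `GROUP BY` shorter.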
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-sales-summary.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-sales-summary.md
@@ -0,0 +1,91 @@
 ---
 title: Book Sales Summary
 description: Practice using basic aggregate functions to analyze sales data
 order: 90
 type: challenge
 setup: |
  ```sql
  CREATE TABLE book_sale (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      quantity INT,
      price DECIMAL(10,2),
      sale_date DATE
  );
  INSERT INTO book_sale (id, title, quantity, price, sale_date)
  VALUES 
      (1, 'The Great Gatsby', 2, 19.99, '2024-01-15'),
      (2, 'Pride and Prejudice', 1, 14.99, '2024-01-15'),
      (3, '1984', 3, 12.99, '2024-01-16'),
      (4, 'The Hobbit', 2, 24.99, '2024-01-16'),
      (5, 'To Kill a Mockingbird', 1, 16.99, '2024-01-17');
  ```
 ---
 The bookstore owner wants a quick summary of their book sales. They need to know:
 - How many sales transactions they've had
 - The total number of books sold
 - The average price of books sold
 - The total revenue from all sales
 Given the following data in table `book_sale`:
 | id  | title                  | quantity | price | sale_date  |
 | --- | ---------------------- | -------- | ----- | ---------- |
 | 1   | The Great Gatsby      | 2        | 19.99 | 2024-01-15 |
 | 2   | Pride and Prejudice   | 1        | 14.99 | 2024-01-15 |
 | 3   | 1984                  | 3        | 12.99 | 2024-01-16 |
 | 4   | The Hobbit            | 2        | 24.99 | 2024-01-16 |
 | 5   | To Kill a Mockingbird | 1        | 16.99 | 2024-01-17 |
 Write a query that shows:
 - Total number of sales transactions
 - Total quantity of books sold
 - Average price per book
 - Total revenue (quantity * price)
 ## Expected Output
 | total_transactions | total_books | avg_price | total_revenue |
 | ----------------- | ----------- | --------- | ------------- |
 | 5                 | 9           | 17.99     | 160.91        |
 ## Solution
 ```sql
 SELECT 
    COUNT(*) as total_transactions,
    SUM(quantity) as total_books,
    AVG(price) as avg_price,
    SUM(quantity * price) as total_revenue
 FROM book_sale;
 ```
 ### Explanation
 Let's break down how this query works:
 We use different aggregate functions to calculate each metric:
 ```sql
 COUNT(*) -- Counts the total number of rows (sales transactions)
 ```
 ```sql
 SUM(quantity) -- Adds up all quantities to get total books sold
 ```
 ```sql
 AVG(price) -- Averages the price column over the 5 sale rows (not weighted by quantity)
 ```
 ```sql
 SUM(quantity * price) -- Multiplies quantity by price for each sale and adds them up
 ```
 This simple query gives the bookstore owner a quick overview of their sales performance, showing:
 - They've had 5 sales transactions
 - Sold a total of 9 books
 - The average book price is $17.99
 - Generated total revenue of $160.91
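 `AVG(price)` treats every sale row equally, regardless of how many copies were sold in that row. If the owner instead wants the average price per copy sold, a quantity-weighted average is one option. The sketch below computes both side by side; the column aliases are illustrative only:
 ```sql
 SELECT
    -- Unweighted: each sale row counts once
    AVG(price) as avg_price_per_sale,
    -- Weighted: each row counts once per copy sold
    SUM(quantity * price) / SUM(quantity) as avg_price_per_copy
 FROM book_sale;
 ```
 On the sample data the per-sale average is 17.99, while the per-copy average comes out at roughly 17.88, because the cheapest title (1984) sold the most copies.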
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/category-insights.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/category-insights.md
@@ -0,0 +1,123 @@
 ---
 title: Category Insights
 description: Practice basic aggregation with book categories
 order: 91
 type: challenge
 setup: |
  ```sql
  CREATE TABLE category (
      id INT PRIMARY KEY,
      name VARCHAR(100),
      display_section VARCHAR(50)
  );
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      category_id INT,
      price DECIMAL(10,2),
      in_stock INT
  );
  INSERT INTO category (id, name, display_section) VALUES
      (1, 'Fiction', 'Main Floor'),
      (2, 'Science Fiction', 'Main Floor'),
      (3, 'Technical', 'Second Floor'),
      (4, 'History', 'Main Floor');
  INSERT INTO book (id, title, category_id, price, in_stock) VALUES
      (1, 'The Last Hope', 1, 19.99, 5),
      (2, 'Stars Beyond', 2, 15.99, 3),
      (3, 'Python Basics', 3, 29.99, 10),
      (4, 'Ancient Rome', 4, 24.99, 4),
      (5, 'Summer Days', 1, 14.99, 2),
      (6, 'Space Wars', 2, 16.99, 3),
      (7, 'JavaScript 101', 3, 27.99, 8);
  ```
 ---
 The bookstore manager wants a simple overview of their book categories to help with inventory management.
 Given the following data in table `category`:
 | id  | name            | display_section |
 | --- | --------------- | -------------- |
 | 1   | Fiction         | Main Floor     |
 | 2   | Science Fiction | Main Floor     |
 | 3   | Technical       | Second Floor   |
 | 4   | History         | Main Floor     |
 And the following data in table `book`:
 | id  | title          | category_id | price | in_stock |
 | --- | -------------- | ----------- | ----- | -------- |
 | 1   | The Last Hope  | 1           | 19.99 | 5        |
 | 2   | Stars Beyond   | 2           | 15.99 | 3        |
 | 3   | Python Basics  | 3           | 29.99 | 10       |
 | 4   | Ancient Rome   | 4           | 24.99 | 4        |
 | 5   | Summer Days    | 1           | 14.99 | 2        |
 | 6   | Space Wars     | 2           | 16.99 | 3        |
 | 7   | JavaScript 101 | 3           | 27.99 | 8        |
 Write a query that shows for each category:
 - Category name
 - Number of books
 - Total books in stock
 - Average book price
 Order the results by the number of books in descending order.
 ## Expected Output
 | category_name   | book_count | total_stock | avg_price |
 | -------------- | ---------- | ----------- | --------- |
 | Technical      | 2          | 18          | 28.99     |
 | Science Fiction| 2          | 6           | 16.49     |
 | Fiction        | 2          | 7           | 17.49     |
 | History        | 1          | 4           | 24.99     |
 ## Solution
 ```sql
 SELECT 
    c.name as category_name,
    COUNT(*) as book_count,
    SUM(b.in_stock) as total_stock,
    AVG(b.price) as avg_price
 FROM category c
 INNER JOIN book b ON c.id = b.category_id
 GROUP BY c.name
 ORDER BY book_count DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 We join the category and book tables:
 ```sql
 FROM category c
 INNER JOIN book b ON c.id = b.category_id
 ```
 We calculate the aggregates for each category:
 ```sql
 COUNT(*) -- Counts number of books
 SUM(b.in_stock) -- Adds up all books in stock
 AVG(b.price) -- Calculates average price
 ```
 We group by category name:
 ```sql
 GROUP BY c.name
 ```
 Finally, we order by the book count:
 ```sql
 ORDER BY book_count DESC
 ```
 This query helps the manager understand:
 - Which categories have the most titles
 - Total inventory by category
 - Average price point for each category 
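 Three categories tie at two books each, and `ORDER BY book_count DESC` alone does not say how ties are ordered, so the database may return them in any order. A minimal sketch that makes the order deterministic by adding the category name as a secondary sort key (an addition for illustration, not a requirement of the challenge):
 ```sql
 SELECT
    c.name as category_name,
    COUNT(*) as book_count,
    SUM(b.in_stock) as total_stock,
    AVG(b.price) as avg_price
 FROM category c
 INNER JOIN book b ON c.id = b.category_id
 GROUP BY c.id, c.name
 -- Ties on book_count are broken alphabetically by name
 ORDER BY book_count DESC, category_name ASC;
 ```
 With this tie-breaker the rows come back as Fiction, Science Fiction, Technical, History, which differs from the row order shown in the expected output even though the values are the same.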
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/daily-sales-report.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/daily-sales-report.md
@@ -0,0 +1,96 @@
 ---
 title: Daily Sales Report
 description: Practice using GROUP BY with aggregate functions
 order: 95
 type: challenge
 setup: |
  ```sql
  CREATE TABLE daily_sale (
      id INT PRIMARY KEY,
      sale_date DATE,
      book_title VARCHAR(255),
      quantity INT,
      price DECIMAL(10,2)
  );
  INSERT INTO daily_sale (id, sale_date, book_title, quantity, price)
  VALUES 
      (1, '2024-01-15', 'The Great Gatsby', 2, 19.99),
      (2, '2024-01-15', 'Pride and Prejudice', 1, 14.99),
      (3, '2024-01-15', '1984', 3, 12.99),
      (4, '2024-01-16', 'The Hobbit', 2, 24.99),
      (5, '2024-01-16', 'The Great Gatsby', 1, 19.99),
      (6, '2024-01-17', 'Pride and Prejudice', 2, 14.99),
      (7, '2024-01-17', '1984', 1, 12.99),
      (8, '2024-01-17', 'The Hobbit', 3, 24.99);
  ```
 ---
 The bookstore manager wants to see how their sales are performing each day. They need a daily report showing:
 - The number of transactions per day
 - Total books sold per day
 - Total revenue per day
 Given the following data in table `daily_sale`:
 | id  | sale_date  | book_title         | quantity | price |
 | --- | ---------- | ------------------ | -------- | ----- |
 | 1   | 2024-01-15 | The Great Gatsby   | 2        | 19.99 |
 | 2   | 2024-01-15 | Pride and Prejudice| 1        | 14.99 |
 | 3   | 2024-01-15 | 1984               | 3        | 12.99 |
 | 4   | 2024-01-16 | The Hobbit         | 2        | 24.99 |
 | 5   | 2024-01-16 | The Great Gatsby   | 1        | 19.99 |
 | 6   | 2024-01-17 | Pride and Prejudice| 2        | 14.99 |
 | 7   | 2024-01-17 | 1984               | 1        | 12.99 |
 | 8   | 2024-01-17 | The Hobbit         | 3        | 24.99 |
 Write a query that shows the daily sales metrics:
 - Date
 - Number of transactions that day
 - Total books sold that day
 - Total revenue for that day
 Order the results by date.
 ## Expected Output
 | sale_date  | transactions | books_sold | daily_revenue |
 | ---------- | ------------ | ---------- | ------------- |
 | 2024-01-15 | 3            | 6          | 93.94         |
 | 2024-01-16 | 2            | 3          | 69.97         |
 | 2024-01-17 | 3            | 6          | 117.94        |
 ## Solution
 ```sql
 SELECT 
    sale_date,
    COUNT(*) as transactions,
    SUM(quantity) as books_sold,
    SUM(quantity * price) as daily_revenue
 FROM daily_sale
 GROUP BY sale_date
 ORDER BY sale_date;
 ```
 ### Explanation
 Let's break down how this query works:
 First, we specify the columns we want to see:
 ```sql
 sale_date, -- The date we're grouping by
 COUNT(*) as transactions, -- Count of sales for each date
 SUM(quantity) as books_sold, -- Total books sold each date
 SUM(quantity * price) as daily_revenue -- Total revenue each date
 ```
 We group the results by date to get daily totals:
 ```sql
 GROUP BY sale_date
 ```
 Finally, we order the results by date:
 ```sql
 ORDER BY sale_date
 ```
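 To see what `SUM(quantity * price)` adds up, it can help to look at the per-row product before any grouping. The sketch below lists the line totals for a single day; the `line_total` alias is illustrative:
 ```sql
 SELECT
    sale_date,
    book_title,
    quantity,
    price,
    quantity * price as line_total -- evaluated per row, before any grouping
 FROM daily_sale
 WHERE sale_date = '2024-01-16'
 ORDER BY id;
 ```
 For 2024-01-16 the two line totals are 49.98 (The Hobbit) and 19.99 (The Great Gatsby), which sum to that day's `daily_revenue` of 69.97.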
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/employee-performance.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/employee-performance.md
@@ -0,0 +1,125 @@
 ---
 title: Employee Performance
 description: Practice using self joins with aggregate functions
 order: 110
 type: challenge
 setup: |
  ```sql
  CREATE TABLE employee (
      id INT PRIMARY KEY,
      name VARCHAR(255),
      manager_id INT,
      hire_date DATE
  );
  CREATE TABLE sale (
      id INT PRIMARY KEY,
      employee_id INT,
      sale_date DATE,
      amount DECIMAL(10,2)
  );
  INSERT INTO employee (id, name, manager_id, hire_date) VALUES
      (1, 'Sarah Johnson', NULL, '2023-01-15'),    -- Store Manager
      (2, 'Mike Wilson', 1, '2023-02-01'),         -- Reports to Sarah
      (3, 'Emily Brown', 1, '2023-02-15'),         -- Reports to Sarah
      (4, 'Tom Davis', 2, '2023-03-01'),           -- Reports to Mike
      (5, 'Lisa Miller', 2, '2023-03-15'),         -- Reports to Mike
      (6, 'James Wilson', 3, '2023-04-01');        -- Reports to Emily
  INSERT INTO sale (id, employee_id, sale_date, amount) VALUES
      (1, 2, '2024-02-01', 150.00),
      (2, 2, '2024-02-02', 200.00),
      (3, 3, '2024-02-01', 300.00),
      (4, 4, '2024-02-02', 250.00),
      (5, 4, '2024-02-03', 175.00),
      (6, 5, '2024-02-01', 225.00),
      (7, 5, '2024-02-02', 125.00),
      (8, 6, '2024-02-03', 350.00);
  ```
 ---
 The bookstore wants to analyze the sales performance of their employees and their teams. They need a report showing how each manager's team is performing.
 Given the following data in table `employee`:
 | id  | name           | manager_id | hire_date  |
 | --- | -------------- | ---------- | ---------- |
 | 1   | Sarah Johnson  | NULL       | 2023-01-15 |
 | 2   | Mike Wilson    | 1          | 2023-02-01 |
 | 3   | Emily Brown    | 1          | 2023-02-15 |
 | 4   | Tom Davis      | 2          | 2023-03-01 |
 | 5   | Lisa Miller    | 2          | 2023-03-15 |
 | 6   | James Wilson   | 3          | 2023-04-01 |
 And the following data in table `sale`:
 | id  | employee_id | sale_date  | amount |
 | --- | ----------- | ---------- | ------ |
 | 1   | 2           | 2024-02-01 | 150.00 |
 | 2   | 2           | 2024-02-02 | 200.00 |
 | 3   | 3           | 2024-02-01 | 300.00 |
 | 4   | 4           | 2024-02-02 | 250.00 |
 | 5   | 4           | 2024-02-03 | 175.00 |
 | 6   | 5           | 2024-02-01 | 225.00 |
 | 7   | 5           | 2024-02-02 | 125.00 |
 | 8   | 6           | 2024-02-03 | 350.00 |
 Write a query that shows for each manager:
 - Manager name
 - Number of employees they manage
 - Total sales by their team (sales made by the manager's direct reports; the manager's own sales are not counted)
 - Average sale amount per team member
 Only include managers who have at least one employee reporting to them, and order the results by total team sales in descending order.
 ## Expected Output
 | manager_name  | team_size | total_team_sales | avg_sale_per_member |
 | ------------- | --------- | ---------------- | ------------------- |
 | Mike Wilson   | 2         | 775.00           | 387.50              |
 | Sarah Johnson | 2         | 650.00           | 325.00              |
 | Emily Brown   | 1         | 350.00           | 350.00              |
 ## Solution
 ```sql
 SELECT 
    m.name as manager_name,
    COUNT(DISTINCT e.id) as team_size,
    SUM(s.amount) as total_team_sales,
    SUM(s.amount) / COUNT(DISTINCT e.id) as avg_sale_per_member
 FROM employee m
 INNER JOIN employee e ON e.manager_id = m.id
 INNER JOIN sale s ON s.employee_id = e.id
 GROUP BY m.id, m.name
 ORDER BY total_team_sales DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 First, we use a self-join to connect managers with their employees:
 ```sql
 FROM employee m
 INNER JOIN employee e ON e.manager_id = m.id
 ```
 Then we join with the sales table to get sales data:
 ```sql
 INNER JOIN sale s ON s.employee_id = e.id
 ```
 We calculate various metrics for each manager:
 ```sql
 COUNT(DISTINCT e.id) -- Counts number of team members
 SUM(s.amount) -- Sums up all sales by the team
 SUM(s.amount) / COUNT(DISTINCT e.id) -- Calculates average sales per team member
 ```
 We group by manager and order by total sales:
 ```sql
 GROUP BY m.id, m.name
 ORDER BY total_team_sales DESC
 ```
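 The self-join only reaches one level down, so each manager's team consists of direct reports. If the report should instead cover everyone below a manager in the hierarchy, a recursive common table expression is one way to do it. This is a sketch assuming a dialect with `WITH RECURSIVE` (PostgreSQL, MySQL 8+, SQLite); the CTE name `team` and its columns are illustrative:
 ```sql
 WITH RECURSIVE team AS (
    -- Anchor: pair every employee with their direct manager
    SELECT e.manager_id as manager_id, e.id as member_id
    FROM employee e
    WHERE e.manager_id IS NOT NULL
    UNION ALL
    -- Step: also credit the member to the manager's own manager
    SELECT m.manager_id, t.member_id
    FROM team t
    INNER JOIN employee m ON m.id = t.manager_id
    WHERE m.manager_id IS NOT NULL
 )
 SELECT
    mgr.name as manager_name,
    COUNT(DISTINCT t.member_id) as team_size,
    SUM(s.amount) as total_team_sales,
    SUM(s.amount) / COUNT(DISTINCT t.member_id) as avg_sale_per_member
 FROM team t
 INNER JOIN employee mgr ON mgr.id = t.manager_id
 INNER JOIN sale s ON s.employee_id = t.member_id
 GROUP BY mgr.id, mgr.name
 ORDER BY total_team_sales DESC;
 ```
 On the sample data this variant credits Sarah Johnson with all five other employees and a team total of 1775.00, while Mike Wilson and Emily Brown keep the same figures as in the self-join version because their reports have no reports of their own.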
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/high-value-publishers.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/high-value-publishers.md
@@ -0,0 +1,132 @@
 ---
 title: High Value Publishers
 description: Practice using HAVING to filter aggregated data
 order: 98
 type: challenge
 setup: |
  ```sql
  CREATE TABLE publisher (
      id INT PRIMARY KEY,
      name VARCHAR(255),
      country VARCHAR(100)
  );
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      publisher_id INT,
      price DECIMAL(10,2),
      stock_level INT
  );
  INSERT INTO publisher (id, name, country) VALUES
      (1, 'Tech Books Inc', 'USA'),
      (2, 'Global Publishing', 'UK'),
      (3, 'Education Press', 'Canada'),
      (4, 'Digital Media Ltd', 'Australia'),
      (5, 'Research Books', 'Germany');
  INSERT INTO book (id, title, publisher_id, price, stock_level) VALUES
      (1, 'Database Design', 1, 29.99, 25),
      (2, 'Advanced SQL', 1, 39.99, 15),
      (3, 'Web Development', 1, 34.99, 20),
      (4, 'Python Basics', 2, 24.99, 30),
      (5, 'Cloud Computing', 2, 49.99, 5),
      (6, 'Data Science', 3, 54.99, 20),
      (7, 'Machine Learning', 3, 59.99, 12),
      (8, 'Cybersecurity', 4, 45.99, 8),
      (9, 'Network Basics', 5, 19.99, 10);
  ```
 ---
 The bookstore wants to identify their high-value publishers based on specific criteria. They want to find publishers who:
 - Have published more than 1 book
 - Have an average book price above $35
 - Have more than 30 total books in stock
 Given the following data in table `publisher`:
 | id  | name              | country   |
 | --- | ----------------- | --------- |
 | 1   | Tech Books Inc    | USA       |
 | 2   | Global Publishing | UK        |
 | 3   | Education Press   | Canada    |
 | 4   | Digital Media Ltd | Australia |
 | 5   | Research Books    | Germany   |
 And the following data in table `book`:
 | id  | title            | publisher_id | price | stock_level |
 | --- | ---------------- | ------------ | ----- | ----------- |
 | 1   | Database Design  | 1            | 29.99 | 25          |
 | 2   | Advanced SQL     | 1            | 39.99 | 15          |
 | 3   | Web Development  | 1            | 34.99 | 20          |
 | 4   | Python Basics    | 2            | 24.99 | 30          |
 | 5   | Cloud Computing  | 2            | 49.99 | 5           |
 | 6   | Data Science     | 3            | 54.99 | 20          |
 | 7   | Machine Learning | 3            | 59.99 | 12          |
 | 8   | Cybersecurity    | 4            | 45.99 | 8           |
 | 9   | Network Basics   | 5            | 19.99 | 10          |
 Write a query that shows:
 - Publisher name
 - Number of books
 - Average book price
 - Total books in stock
 Only include publishers meeting ALL the criteria above, and order by average price descending.
 ## Expected Output
 | publisher_name    | book_count | avg_price | total_stock |
 | ----------------- | ---------- | --------- | ----------- |
 | Education Press   | 2          | 57.49     | 32          |
 | Global Publishing | 2          | 37.49     | 35          |
 ## Solution
 ```sql
 SELECT 
    p.name as publisher_name,
    COUNT(*) as book_count,
    AVG(b.price) as avg_price,
    SUM(b.stock_level) as total_stock
 FROM publisher p
 INNER JOIN book b ON p.id = b.publisher_id
 GROUP BY p.id, p.name
 HAVING 
    COUNT(*) > 1 
    AND AVG(b.price) > 35 
    AND SUM(b.stock_level) > 30
 ORDER BY avg_price DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 First, we join the tables:
 ```sql
 FROM publisher p
 INNER JOIN book b ON p.id = b.publisher_id
 ```
 We calculate the required metrics:
 ```sql
 COUNT(*) -- Number of books
 AVG(b.price) -- Average price
 SUM(b.stock_level) -- Total stock
 ```
 The key part is using HAVING to filter the groups:
 ```sql
 HAVING 
    COUNT(*) > 1 -- More than 1 book
    AND AVG(b.price) > 35 -- Average price above $35
    AND SUM(b.stock_level) > 30 -- More than 30 total books in stock
 ```
 This query helps the bookstore identify:
 - Publishers with multiple books
 - Publishers with premium pricing
 - Publishers with significant inventory 
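 `HAVING` filters groups after aggregation, whereas `WHERE` filters individual rows before they are grouped. The sketch below combines the two to show the difference; the `stock_level >= 10` condition is a made-up example and not part of the challenge:
 ```sql
 SELECT
    p.name as publisher_name,
    COUNT(*) as book_count
 FROM publisher p
 INNER JOIN book b ON p.id = b.publisher_id
 WHERE b.stock_level >= 10  -- row filter: applied before grouping
 GROUP BY p.id, p.name
 HAVING COUNT(*) > 1        -- group filter: applied after aggregation
 ORDER BY book_count DESC;
 ```
 On the sample data this returns Tech Books Inc (3 books) and Education Press (2 books). Global Publishing drops out because Cloud Computing, with a stock level of 5, is removed by `WHERE` before the count is taken, leaving that publisher with a single book.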
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/premium-authors.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/premium-authors.md
@@ -0,0 +1,127 @@
 ---
 title: Premium Authors
 description: Practice using HAVING to find high-value authors
 order: 99
 type: challenge
 setup: |
  ```sql
  CREATE TABLE author (
      id INT PRIMARY KEY,
      name VARCHAR(255),
      country VARCHAR(100)
  );
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      author_id INT,
      price DECIMAL(10,2),
      rating DECIMAL(2,1)
  );
  INSERT INTO author (id, name, country) VALUES
      (1, 'John Smith', 'USA'),
      (2, 'Emma Wilson', 'UK'),
      (3, 'David Chen', 'China'),
      (4, 'Maria Garcia', 'Spain'),
      (5, 'James Brown', 'USA');
  INSERT INTO book (id, title, author_id, price, rating) VALUES
      (1, 'SQL Mastery', 1, 45.99, 4.5),
      (2, 'Database Design', 1, 49.99, 4.8),
      (3, 'Python Basics', 2, 29.99, 4.2),
      (4, 'Web Development', 2, 34.99, 4.0),
      (5, 'Data Science', 2, 39.99, 4.6),
      (6, 'Machine Learning', 3, 54.99, 4.7),
      (7, 'AI Fundamentals', 3, 59.99, 4.9),
      (8, 'Cloud Computing', 4, 44.99, 4.3),
      (9, 'Basic Programming', 5, 24.99, 3.8);
  ```
 ---
 The bookstore wants to identify its premium authors: those who consistently produce high-value, well-rated books. It wants to find authors who:
 - Have written at least 2 books
 - Have an average book price above $40
 - Have an average rating above 4.5
 Given the following data in table `author`:
 | id  | name         | country |
 | --- | ------------ | ------- |
 | 1   | John Smith   | USA     |
 | 2   | Emma Wilson  | UK      |
 | 3   | David Chen   | China   |
 | 4   | Maria Garcia | Spain   |
 | 5   | James Brown  | USA     |
 And the following data in table `book`:
 | id  | title             | author_id | price | rating |
 | --- | ----------------- | --------- | ----- | ------ |
 | 1   | SQL Mastery       | 1         | 45.99 | 4.5    |
 | 2   | Database Design   | 1         | 49.99 | 4.8    |
 | 3   | Python Basics     | 2         | 29.99 | 4.2    |
 | 4   | Web Development   | 2         | 34.99 | 4.0    |
 | 5   | Data Science      | 2         | 39.99 | 4.6    |
 | 6   | Machine Learning  | 3         | 54.99 | 4.7    |
 | 7   | AI Fundamentals   | 3         | 59.99 | 4.9    |
 | 8   | Cloud Computing   | 4         | 44.99 | 4.3    |
 | 9   | Basic Programming | 5         | 24.99 | 3.8    |
 Write a query that shows:
 - Author name
 - Number of books written
 - Average book price
 - Average book rating
 Only include authors meeting ALL the criteria above, and order by average rating descending.
 ## Expected Output
 | author_name | book_count | avg_price | avg_rating |
 | ----------- | ---------- | --------- | ---------- |
 | David Chen  | 2          | 57.49     | 4.80       |
 | John Smith  | 2          | 47.99     | 4.65       |
 ## Solution
 ```sql
 SELECT 
    a.name as author_name,
    COUNT(*) as book_count,
    AVG(b.price) as avg_price,
    AVG(b.rating) as avg_rating
 FROM author a
 INNER JOIN book b ON a.id = b.author_id
 GROUP BY a.id, a.name
 HAVING 
    COUNT(*) >= 2 
    AND AVG(b.price) > 40 
    AND AVG(b.rating) > 4.5
 ORDER BY avg_rating DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 We join the author and book tables:
 ```sql
 FROM author a
 INNER JOIN book b ON a.id = b.author_id
 ```
 After grouping the joined rows with `GROUP BY a.id, a.name`, we calculate the metrics for each author:
 ```sql
 COUNT(*) -- Number of books written
 AVG(b.price) -- Average book price
 AVG(b.rating) -- Average book rating
 ```
 The key part is using `HAVING` to filter for premium authors. All three conditions are evaluated per author, after the aggregates are computed:
 ```sql
 HAVING 
    COUNT(*) >= 2 -- At least 2 books
    AND AVG(b.price) > 40 -- Average price above $40
    AND AVG(b.rating) > 4.5 -- Average rating above 4.5
 ```
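 The expected output shows averages with two decimal places, but the number of decimals `AVG` returns depends on the database. As a hedged sketch, wrapping the averages in `ROUND` is one way to get closer to that display; exact formatting (for example `4.8` versus `4.80`) may still vary by engine:
 ```sql
 -- Sketch: same query as the solution, with averages rounded for display.
 SELECT
    a.name AS author_name,
    COUNT(*) AS book_count,
    ROUND(AVG(b.price), 2) AS avg_price,
    ROUND(AVG(b.rating), 2) AS avg_rating
 FROM author a
 INNER JOIN book b ON a.id = b.author_id
 GROUP BY a.id, a.name
 HAVING
    COUNT(*) >= 2
    AND AVG(b.price) > 40      -- filter on the unrounded average
    AND AVG(b.rating) > 4.5
 ORDER BY avg_rating DESC;
 ```
 The `HAVING` conditions deliberately use the unrounded averages, so rounding only affects what is displayed, not which authors qualify.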
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/publisher-stats.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/publisher-stats.md
@ -0,0 +1,124 @@
 ---
 title: Publisher Stats
 description: Practice basic aggregation with publisher data
 order: 96
 type: challenge
 setup: |
  ```sql
  CREATE TABLE publisher (
      id INT PRIMARY KEY,
      name VARCHAR(255),
      country VARCHAR(100)
  );
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      publisher_id INT,
      pages INT,
      stock_level INT,
      price DECIMAL(10,2)
  );
  INSERT INTO publisher (id, name, country) VALUES
      (1, 'Tech Books Inc', 'USA'),
      (2, 'Global Publishing', 'UK'),
      (3, 'Education Press', 'Canada'),
      (4, 'Digital Media Ltd', 'Australia');
  INSERT INTO book (id, title, publisher_id, pages, stock_level, price) VALUES
      (1, 'Database Fundamentals', 1, 300, 25, 29.99),
      (2, 'Advanced SQL', 1, 400, 15, 39.99),
      (3, 'Web Development', 1, 350, 0, 34.99),
      (4, 'Python Mastery', 2, 450, 30, 44.99),
      (5, 'Cloud Computing', 2, 375, 5, 49.99),
      (6, 'Data Science', 3, 425, 20, 54.99),
      (7, 'Machine Learning', 3, 500, 12, 59.99),
      (8, 'Cybersecurity', 4, 350, 8, 45.99);
  ```
 ---
 The bookstore wants to analyze their publishers' performance in terms of book variety, pricing, and stock levels.
 Given the following data in table `publisher`:
 | id  | name              | country   |
 | --- | ----------------- | --------- |
 | 1   | Tech Books Inc    | USA       |
 | 2   | Global Publishing | UK        |
 | 3   | Education Press   | Canada    |
 | 4   | Digital Media Ltd | Australia |
 And the following data in table `book`:
 | id  | title                 | publisher_id | pages | stock_level | price |
 | --- | --------------------- | ------------ | ----- | ----------- | ----- |
 | 1   | Database Fundamentals | 1            | 300   | 25          | 29.99 |
 | 2   | Advanced SQL          | 1            | 400   | 15          | 39.99 |
 | 3   | Web Development       | 1            | 350   | 0           | 34.99 |
 | 4   | Python Mastery        | 2            | 450   | 30          | 44.99 |
 | 5   | Cloud Computing       | 2            | 375   | 5           | 49.99 |
 | 6   | Data Science          | 3            | 425   | 20          | 54.99 |
 | 7   | Machine Learning      | 3            | 500   | 12          | 59.99 |
 | 8   | Cybersecurity         | 4            | 350   | 8           | 45.99 |
 Write a query that shows for each publisher:
 - Publisher name
 - Number of books published
 - Total books in stock
 - Average book price
 - Number of out-of-stock books (`stock_level = 0`)
 Only include publishers who have published at least one book, and order the results by number of books in descending order.
 ## Expected Output
 | publisher_name    | book_count | total_stock | avg_price | out_of_stock |
 | ----------------- | ---------- | ----------- | --------- | ------------ |
 | Tech Books Inc    | 3          | 40          | 34.99     | 1            |
 | Global Publishing | 2          | 35          | 47.49     | 0            |
 | Education Press   | 2          | 32          | 57.49     | 0            |
 | Digital Media Ltd | 1          | 8           | 45.99     | 0            |
 ## Solution
 ```sql
 SELECT 
    p.name as publisher_name,
    COUNT(*) as book_count,
    SUM(b.stock_level) as total_stock,
    AVG(b.price) as avg_price,
    COUNT(CASE WHEN b.stock_level = 0 THEN 1 END) as out_of_stock
 FROM publisher p
 INNER JOIN book b ON p.id = b.publisher_id
 GROUP BY p.id, p.name
 ORDER BY book_count DESC;
 ```
 ### Explanation
 Let's break down how this query works:
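 We join the publisher and book tables. Because this is an `INNER JOIN`, publishers without any books are dropped, which satisfies the "at least one book" requirement: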
 ```sql
 FROM publisher p
 INNER JOIN book b ON p.id = b.publisher_id
 ```
 We calculate various metrics for each publisher:
 ```sql
 COUNT(*) -- Counts number of books
 SUM(b.stock_level) -- Adds up all books in stock
 AVG(b.price) -- Calculates average price
 COUNT(CASE WHEN b.stock_level = 0 THEN 1 END) -- Counts out-of-stock books
 ```
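 The `COUNT(CASE ...)` expression works because a `CASE` without an `ELSE` branch returns `NULL` for non-matching rows, and `COUNT` ignores `NULL` values. As an illustrative sketch, the same number can be computed with `SUM` and an explicit `ELSE 0`; the `out_of_stock_sum` alias is hypothetical and only there for comparison:
 ```sql
 -- Sketch: two equivalent ways to count out-of-stock books per publisher.
 SELECT
    p.name AS publisher_name,
    COUNT(CASE WHEN b.stock_level = 0 THEN 1 END) AS out_of_stock,           -- NULLs are not counted
    SUM(CASE WHEN b.stock_level = 0 THEN 1 ELSE 0 END) AS out_of_stock_sum   -- adds 1 per match, 0 otherwise
 FROM publisher p
 INNER JOIN book b ON p.id = b.publisher_id
 GROUP BY p.id, p.name;
 ```
 With the lesson's data, both columns should agree for every publisher.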
 We group by publisher:
 ```sql
 GROUP BY p.id, p.name
 ```
 Finally, we order by the number of books:
 ```sql
 ORDER BY book_count DESC
 ```
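 Note that Global Publishing and Education Press both have two books, so `ORDER BY book_count DESC` alone does not guarantee which of them comes first. A possible sketch, assuming total stock is an acceptable tie-breaker (the challenge does not specify one), adds a second sort key:
 ```sql
 -- Sketch: same query as the solution, with a tie-breaker on total stock.
 SELECT
    p.name AS publisher_name,
    COUNT(*) AS book_count,
    SUM(b.stock_level) AS total_stock,
    AVG(b.price) AS avg_price,
    COUNT(CASE WHEN b.stock_level = 0 THEN 1 END) AS out_of_stock
 FROM publisher p
 INNER JOIN book b ON p.id = b.publisher_id
 GROUP BY p.id, p.name
 ORDER BY book_count DESC, total_stock DESC;  -- second key is an assumption
 ```
 With this ordering, Global Publishing (35 in stock) sorts ahead of Education Press (32), which matches the row order shown in the expected output.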
--- a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/sales-analysis.md
+++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/sales-analysis.md
@ -0,0 +1,145 @@
 ---
 title: Sales Analysis
 description: Practice using aggregate functions to analyze sales data
 order: 100
 type: challenge
 setup: |
  ```sql
  CREATE TABLE sale (
      id INT PRIMARY KEY,
      sale_date DATE,
      customer_id INT,
      book_id INT,
      quantity INT,
      unit_price DECIMAL(10,2)
  );
  CREATE TABLE book (
      id INT PRIMARY KEY,
      title VARCHAR(255),
      category VARCHAR(100)
  );
  INSERT INTO book (id, title, category)
  VALUES 
      (1, 'The Great Gatsby', 'Fiction'),
      (2, 'Data Science Basics', 'Technical'),
      (3, 'History of Time', 'Non-Fiction'),
      (4, 'Programming 101', 'Technical'),
      (5, 'Pride and Prejudice', 'Fiction');
  INSERT INTO sale (id, sale_date, customer_id, book_id, quantity, unit_price)
  VALUES 
      (1, '2024-01-15', 1, 1, 2, 15.99),
      (2, '2024-01-15', 2, 2, 1, 29.99),
      (3, '2024-01-15', 3, 3, 3, 19.99),
      (4, '2024-01-16', 4, 4, 1, 24.99),
      (5, '2024-01-16', 5, 5, 2, 12.99),
      (6, '2024-01-16', 1, 2, 1, 29.99),
      (7, '2024-01-17', 2, 3, 2, 19.99),
      (8, '2024-01-17', 3, 4, 1, 24.99),
      (9, '2024-01-17', 4, 5, 3, 12.99),
      (10, '2024-01-18', 5, 1, 1, 15.99);
  ```
 ---
 The bookstore manager wants to analyze their sales data to understand sales patterns and make inventory decisions. They need a comprehensive report that shows sales metrics by book category.
 Given the following data in table `book`:
 | id  | title               | category    |
 | --- | ------------------- | ----------- |
 | 1   | The Great Gatsby    | Fiction     |
 | 2   | Data Science Basics | Technical   |
 | 3   | History of Time     | Non-Fiction |
 | 4   | Programming 101     | Technical   |
 | 5   | Pride and Prejudice | Fiction     |
 And the following data in table `sale`:
 | id  | sale_date  | customer_id | book_id | quantity | unit_price |
 | --- | ---------- | ----------- | ------- | -------- | ---------- |
 | 1   | 2024-01-15 | 1           | 1       | 2        | 15.99      |
 | 2   | 2024-01-15 | 2           | 2       | 1        | 29.99      |
 | 3   | 2024-01-15 | 3           | 3       | 3        | 19.99      |
 | 4   | 2024-01-16 | 4           | 4       | 1        | 24.99      |
 | 5   | 2024-01-16 | 5           | 5       | 2        | 12.99      |
 | 6   | 2024-01-16 | 1           | 2       | 1        | 29.99      |
 | 7   | 2024-01-17 | 2           | 3       | 2        | 19.99      |
 | 8   | 2024-01-17 | 3           | 4       | 1        | 24.99      |
 | 9   | 2024-01-17 | 4           | 5       | 3        | 12.99      |
 | 10  | 2024-01-18 | 5           | 1       | 1        | 15.99      |
 Write a query to generate a sales report showing the following metrics for each book category:
 - Total number of sales (count of transactions)
 - Total quantity of books sold
 - Total revenue (quantity * unit_price)
 - Average unit price per transaction
 - Maximum quantity in a single transaction
 Only include categories that have generated more than $50 in total revenue, and order the results by total revenue in descending order.
 ## Expected Output
 | category    | total_sales | total_quantity | total_revenue | avg_price | max_quantity |
 | ----------- | ----------- | -------------- | ------------- | --------- | ------------ |
 | Fiction     | 4           | 8              | 112.92        | 14.49     | 3            |
 | Technical   | 4           | 4              | 109.96        | 27.49     | 1            |
 | Non-Fiction | 2           | 5              | 99.95         | 19.99     | 3            |
 ## Solution
 ```sql
 SELECT 
    b.category,
    COUNT(*) as total_sales,
    SUM(s.quantity) as total_quantity,
    SUM(s.quantity * s.unit_price) as total_revenue,
    AVG(s.unit_price) as avg_price,
    MAX(s.quantity) as max_quantity
 FROM sale s
 INNER JOIN book b ON s.book_id = b.id
 GROUP BY b.category
 HAVING SUM(s.quantity * s.unit_price) > 50
 ORDER BY total_revenue DESC;
 ```
 ### Explanation
 Let's break down how this query works:
 First, we join the `sale` and `book` tables to get category information for each sale:
 ```sql
 FROM sale s
 INNER JOIN book b ON s.book_id = b.id
 ```
 We then group the results by category to calculate aggregates for each category:
 ```sql
 GROUP BY b.category
 ```
 We use multiple aggregate functions to calculate different metrics:
 ```sql
 COUNT(*) -- Counts number of sales transactions
 SUM(s.quantity) -- Sums up total books sold
 SUM(s.quantity * s.unit_price) -- Calculates total revenue
 AVG(s.unit_price) -- Calculates average price
 MAX(s.quantity) -- Finds maximum quantity in a single sale
 ```
 We keep only categories whose total revenue exceeds $50 using `HAVING` (with this data, all three categories pass):
 ```sql
 HAVING SUM(s.quantity * s.unit_price) > 50
 ```
 Finally, we order the results by total revenue in descending order:
 ```sql
 ORDER BY total_revenue DESC
 ```
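 One subtlety: `AVG(s.unit_price)` averages the unit price across transactions and does not weight by quantity. As a hedged sketch, a quantity-weighted average price per copy sold can be computed by dividing revenue by quantity; the `avg_price_per_copy` alias is hypothetical and not part of the challenge:
 ```sql
 -- Sketch: unweighted vs quantity-weighted average price per category.
 SELECT
    b.category,
    AVG(s.unit_price) AS avg_price,                                         -- one value per transaction
    SUM(s.quantity * s.unit_price) / SUM(s.quantity) AS avg_price_per_copy  -- weighted by copies sold
 FROM sale s
 INNER JOIN book b ON s.book_id = b.id
 GROUP BY b.category;
 ```
 For Fiction the two differ: the unweighted average is 14.49, while 112.92 in revenue over 8 copies gives 14.115 per copy. For Technical and Non-Fiction the two values coincide with this data.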