diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-book-stats.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-book-stats.md new file mode 100644 index 000000000..e4dd65fb1 --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-book-stats.md @@ -0,0 +1,122 @@ +--- +title: Author Book Stats +description: Practice basic aggregation with author and book data +order: 93 +type: challenge +setup: | + ```sql + CREATE TABLE author ( + id INT PRIMARY KEY, + name VARCHAR(255), + country VARCHAR(100) + ); + + CREATE TABLE book ( + id INT PRIMARY KEY, + title VARCHAR(255), + author_id INT, + price DECIMAL(10,2), + pages INT + ); + + INSERT INTO author (id, name, country) VALUES + (1, 'Jane Smith', 'USA'), + (2, 'John Brown', 'UK'), + (3, 'Maria Garcia', 'Spain'), + (4, 'David Wilson', 'USA'); + + INSERT INTO book (id, title, author_id, price, pages) VALUES + (1, 'Database Design', 1, 29.99, 300), + (2, 'SQL Basics', 1, 24.99, 250), + (3, 'Python Programming', 2, 34.99, 400), + (4, 'Web Development', 2, 39.99, 350), + (5, 'Data Science', 2, 44.99, 450), + (6, 'Machine Learning', 3, 49.99, 500), + (7, 'AI Fundamentals', 3, 54.99, 550); + ``` +--- + +The bookstore wants to understand how many books each author has written and their average book price. They need a summary of author statistics to help with inventory planning. 
+ +Given the following data in table `author`: + +| id | name | country | +| --- | ------------ | ------- | +| 1 | Jane Smith | USA | +| 2 | John Brown | UK | +| 3 | Maria Garcia | Spain | +| 4 | David Wilson | USA | + +And the following data in table `book`: + +| id | title | author_id | price | pages | +| --- | ----------------- | --------- | ----- | ----- | +| 1 | Database Design | 1 | 29.99 | 300 | +| 2 | SQL Basics | 1 | 24.99 | 250 | +| 3 | Python Programming| 2 | 34.99 | 400 | +| 4 | Web Development | 2 | 39.99 | 350 | +| 5 | Data Science | 2 | 44.99 | 450 | +| 6 | Machine Learning | 3 | 49.99 | 500 | +| 7 | AI Fundamentals | 3 | 54.99 | 550 | + +Write a query that shows for each author: +- Author name +- Number of books written +- Average book price +- Total pages written + +Only include authors who have written books, and order the results by number of books in descending order. + +## Expected Output + +| author_name | book_count | avg_price | total_pages | +| ------------- | ---------- | --------- | ----------- | +| John Brown | 3 | 39.99 | 1200 | +| Maria Garcia | 2 | 52.49 | 1050 | +| Jane Smith | 2 | 27.49 | 550 | + +## Solution + +```sql +SELECT + a.name as author_name, + COUNT(*) as book_count, + AVG(b.price) as avg_price, + SUM(b.pages) as total_pages +FROM author a +INNER JOIN book b ON a.id = b.author_id +GROUP BY a.name +ORDER BY book_count DESC; +``` + +### Explanation + +Let's break down how this query works: + +First, we join the author and book tables: +```sql +FROM author a +INNER JOIN book b ON a.id = b.author_id +``` + +We then calculate various aggregates for each author: +```sql +COUNT(*) -- Counts the number of books +AVG(b.price) -- Calculates average book price +SUM(b.pages) -- Sums up total pages +``` + +We group the results by author name: +```sql +GROUP BY a.name +``` + +Finally, we order by the book count: +```sql +ORDER BY book_count DESC +``` + +This query helps the bookstore understand: +- John Brown has written the most books 
(3) +- Maria Garcia's books have the highest average price ($52.49) +- John Brown has written the most pages (1,200) \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-tier-analysis.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-tier-analysis.md new file mode 100644 index 000000000..e00e53289 --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/author-tier-analysis.md @@ -0,0 +1,131 @@ +--- +title: Author Tier Analysis +description: Practice using CASE WHEN with aggregate functions +order: 92 +type: challenge +setup: | + ```sql + CREATE TABLE author ( + id INT PRIMARY KEY, + name VARCHAR(255) + ); + + CREATE TABLE book ( + id INT PRIMARY KEY, + title VARCHAR(255), + author_id INT, + price DECIMAL(10,2) + ); + + INSERT INTO author (id, name) VALUES + (1, 'John Smith'), + (2, 'Emma Wilson'), + (3, 'Michael Brown'), + (4, 'Sarah Davis'), + (5, 'James Miller'); + + INSERT INTO book (id, title, author_id, price) VALUES + (1, 'SQL Basics', 1, 29.99), + (2, 'Advanced SQL', 1, 39.99), + (3, 'Database Design', 1, 44.99), + (4, 'Web Development', 2, 34.99), + (5, 'JavaScript Guide', 2, 29.99), + (6, 'Python Programming', 3, 24.99), + (7, 'Data Analysis', 4, 49.99); + ``` +--- + +The bookstore wants to categorize their authors based on how many books they've published. 
They want to label authors as: +- `Prolific Author` if they have written 3 or more books +- `Established Author` if they have written 2 books +- `New Author` if they have written 1 book + +Given the following data in table `author`: + +| id | name | +| --- | ------------- | +| 1 | John Smith | +| 2 | Emma Wilson | +| 3 | Michael Brown | +| 4 | Sarah Davis | +| 5 | James Miller | + +And the following data in table `book`: + +| id | title | author_id | price | +| --- | ----------------- | --------- | ----- | +| 1 | SQL Basics | 1 | 29.99 | +| 2 | Advanced SQL | 1 | 39.99 | +| 3 | Database Design | 1 | 44.99 | +| 4 | Web Development | 2 | 34.99 | +| 5 | JavaScript Guide | 2 | 29.99 | +| 6 | Python Programming| 3 | 24.99 | +| 7 | Data Analysis | 4 | 49.99 | + +Write a query that shows: +- Author name +- Number of books written +- Author tier (based on the categories above) + +Only include authors who have published at least one book, and order the results by number of books in descending order. 
+ +## Expected Output + +| author_name | book_count | author_tier | +| ------------- | ---------- | ---------------- | +| John Smith | 3 | Prolific Author | +| Emma Wilson | 2 | Established Author| +| Michael Brown | 1 | New Author | +| Sarah Davis | 1 | New Author | + +## Solution + +```sql +SELECT + a.name as author_name, + COUNT(*) as book_count, + CASE + WHEN COUNT(*) >= 3 THEN 'Prolific Author' + WHEN COUNT(*) = 2 THEN 'Established Author' + ELSE 'New Author' + END as author_tier +FROM author a +INNER JOIN book b ON a.id = b.author_id +GROUP BY a.name +ORDER BY book_count DESC; +``` + +### Explanation + +Let's break down how this query works: + +We join the author and book tables: +```sql +FROM author a +INNER JOIN book b ON a.id = b.author_id +``` + +We count books for each author: +```sql +COUNT(*) as book_count +``` + +We use CASE WHEN to categorize authors: +```sql +CASE + WHEN COUNT(*) >= 3 THEN 'Prolific Author' + WHEN COUNT(*) = 2 THEN 'Established Author' + ELSE 'New Author' +END as author_tier +``` + +We group by author name and order by book count: +```sql +GROUP BY a.name +ORDER BY book_count DESC +``` + +This query helps the bookstore: +- Identify their most productive authors +- Categorize authors based on their output +- See the exact number of books by each author \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-performance.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-performance.md new file mode 100644 index 000000000..5c8b22fcc --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-performance.md @@ -0,0 +1,160 @@ +--- +title: Book Performance +description: Practice using aggregate functions with temporal data and joins +order: 120 +type: challenge +setup: | + ```sql + CREATE TABLE book ( + id INT PRIMARY KEY, + title VARCHAR(255), + category VARCHAR(100), + release_date DATE + ); + + CREATE TABLE sale ( + id INT 
PRIMARY KEY, + book_id INT, + sale_timestamp TIMESTAMP, + quantity INT, + unit_price DECIMAL(10,2) + ); + + INSERT INTO book (id, title, category, release_date) VALUES + (1, 'The Great Gatsby', 'Fiction', '2024-01-01'), + (2, 'SQL Mastery', 'Technical', '2024-01-15'), + (3, 'Data Science 101', 'Technical', '2024-02-01'), + (4, 'Pride and Prejudice', 'Fiction', '2024-02-15'); + + INSERT INTO sale (id, book_id, sale_timestamp, quantity, unit_price) VALUES + (1, 1, '2024-02-01 10:30:00', 2, 19.99), + (2, 1, '2024-02-15 14:20:00', 1, 19.99), + (3, 2, '2024-02-01 11:15:00', 3, 29.99), + (4, 2, '2024-02-20 16:45:00', 2, 29.99), + (5, 3, '2024-02-05 09:30:00', 1, 24.99), + (6, 3, '2024-02-25 13:20:00', 2, 24.99), + (7, 4, '2024-02-15 15:45:00', 1, 14.99), + (8, 4, '2024-02-28 10:10:00', 3, 14.99); + ``` +--- + +The bookstore manager wants to analyze book performance for February 2024. They need a report showing how each book performed during that month, including details about when the book was released. 
+ +Given the following data in table `book`: + +| id | title | category | release_date | +| --- | ------------------- | --------- | ------------ | +| 1 | The Great Gatsby | Fiction | 2024-01-01 | +| 2 | SQL Mastery | Technical | 2024-01-15 | +| 3 | Data Science 101 | Technical | 2024-02-01 | +| 4 | Pride and Prejudice | Fiction | 2024-02-15 | + +And the following data in table `sale`: + +| id | book_id | sale_timestamp | quantity | unit_price | +| --- | ------- | ------------------- | -------- | ---------- | +| 1 | 1 | 2024-02-01 10:30:00 | 2 | 19.99 | +| 2 | 1 | 2024-02-15 14:20:00 | 1 | 19.99 | +| 3 | 2 | 2024-02-01 11:15:00 | 3 | 29.99 | +| 4 | 2 | 2024-02-20 16:45:00 | 2 | 29.99 | +| 5 | 3 | 2024-02-05 09:30:00 | 1 | 24.99 | +| 6 | 3 | 2024-02-25 13:20:00 | 2 | 24.99 | +| 7 | 4 | 2024-02-15 15:45:00 | 1 | 14.99 | +| 8 | 4 | 2024-02-28 10:10:00 | 3 | 14.99 | + +Write a query that shows: + +- Book title +- Category +- Whether the book was released in February (show as 'New Release' or 'Existing') +- Total quantity sold +- Total revenue +- Number of sales transactions + +Only include books that had sales in February 2024, and order the results by total revenue in descending order. 
+ +## Expected Output + +| title | category | release_status | total_quantity | total_revenue | sale_count | +| ------------------- | --------- | -------------- | -------------- | ------------- | ---------- | +| SQL Mastery | Technical | Existing | 5 | 149.95 | 2 | +| Data Science 101 | Technical | New Release | 3 | 74.97 | 2 | +| The Great Gatsby | Fiction | Existing | 3 | 59.97 | 2 | +| Pride and Prejudice | Fiction | New Release | 4 | 59.96 | 2 | + +## Solution + +```sql +SELECT + b.title, + b.category, + CASE + WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release' + ELSE 'Existing' + END as release_status, + SUM(s.quantity) as total_quantity, + SUM(s.quantity * s.unit_price) as total_revenue, + COUNT(*) as sale_count +FROM book b +INNER JOIN sale s ON b.id = s.book_id +WHERE + EXTRACT(MONTH FROM s.sale_timestamp) = 2 + AND EXTRACT(YEAR FROM s.sale_timestamp) = 2024 +GROUP BY + b.title, + b.category, + CASE + WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release' + ELSE 'Existing' + END +ORDER BY total_revenue DESC; +``` + +### Explanation + +Let's break down how this query works: + +First, we join the tables to get book information with sales: + +```sql +FROM book b +INNER JOIN sale s ON b.id = s.book_id +``` + +We filter for February 2024 sales: + +```sql +WHERE + EXTRACT(MONTH FROM s.sale_timestamp) = 2 + AND EXTRACT(YEAR FROM s.sale_timestamp) = 2024 +``` + +We use a CASE statement to identify new releases: + +```sql +CASE + WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release' + ELSE 'Existing' +END as release_status +``` + +We calculate aggregates for each book: + +```sql +SUM(s.quantity) as total_quantity, +SUM(s.quantity * s.unit_price) as total_revenue, +COUNT(*) as sale_count +``` + +Finally, we group by the necessary columns and order by revenue: + +```sql +GROUP BY + b.title, + b.category, + CASE + WHEN EXTRACT(MONTH FROM b.release_date) = 2 THEN 'New Release' + ELSE 'Existing' + END +ORDER BY total_revenue DESC +``` diff 
--git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-sales-summary.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-sales-summary.md new file mode 100644 index 000000000..97dd308f4 --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/book-sales-summary.md @@ -0,0 +1,91 @@ +--- +title: Book Sales Summary +description: Practice using basic aggregate functions to analyze sales data +order: 90 +type: challenge +setup: | + ```sql + CREATE TABLE book_sale ( + id INT PRIMARY KEY, + title VARCHAR(255), + quantity INT, + price DECIMAL(10,2), + sale_date DATE + ); + + INSERT INTO book_sale (id, title, quantity, price, sale_date) + VALUES + (1, 'The Great Gatsby', 2, 19.99, '2024-01-15'), + (2, 'Pride and Prejudice', 1, 14.99, '2024-01-15'), + (3, '1984', 3, 12.99, '2024-01-16'), + (4, 'The Hobbit', 2, 24.99, '2024-01-16'), + (5, 'To Kill a Mockingbird', 1, 16.99, '2024-01-17'); + ``` +--- + +The bookstore owner wants a quick summary of their book sales. 
They need to know: +- How many sales transactions they've had +- The total number of books sold +- The average price of books sold +- The total revenue from all sales + +Given the following data in table `book_sale`: + +| id | title | quantity | price | sale_date | +| --- | ---------------------- | -------- | ----- | ---------- | +| 1 | The Great Gatsby | 2 | 19.99 | 2024-01-15 | +| 2 | Pride and Prejudice | 1 | 14.99 | 2024-01-15 | +| 3 | 1984 | 3 | 12.99 | 2024-01-16 | +| 4 | The Hobbit | 2 | 24.99 | 2024-01-16 | +| 5 | To Kill a Mockingbird | 1 | 16.99 | 2024-01-17 | + +Write a query that shows: +- Total number of sales transactions +- Total quantity of books sold +- Average price per book +- Total revenue (quantity * price) + +## Expected Output + +| total_transactions | total_books | avg_price | total_revenue | +| ----------------- | ----------- | --------- | ------------- | +| 5 | 9 | 17.99 | 160.91 | + +## Solution + +```sql +SELECT + COUNT(*) as total_transactions, + SUM(quantity) as total_books, + AVG(price) as avg_price, + SUM(quantity * price) as total_revenue +FROM book_sale; +``` + +### Explanation + +Let's break down how this query works: + +We use different aggregate functions to calculate each metric: + +```sql +COUNT(*) -- Counts the total number of rows (sales transactions) +``` + +```sql +SUM(quantity) -- Adds up all quantities to get total books sold +``` + +```sql +AVG(price) -- Calculates the average price of books +``` + +```sql +SUM(quantity * price) -- Multiplies quantity by price for each sale and adds them up +``` + +This simple query gives the bookstore owner a quick overview of their sales performance, showing: +- They've had 5 sales transactions +- Sold a total of 9 books +- The average book price is $17.99 +- Generated total revenue of $160.91 \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/category-insights.md 
b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/category-insights.md new file mode 100644 index 000000000..551487f82 --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/category-insights.md @@ -0,0 +1,123 @@ +--- +title: Category Insights +description: Practice basic aggregation with book categories +order: 91 +type: challenge +setup: | + ```sql + CREATE TABLE category ( + id INT PRIMARY KEY, + name VARCHAR(100), + display_section VARCHAR(50) + ); + + CREATE TABLE book ( + id INT PRIMARY KEY, + title VARCHAR(255), + category_id INT, + price DECIMAL(10,2), + in_stock INT + ); + + INSERT INTO category (id, name, display_section) VALUES + (1, 'Fiction', 'Main Floor'), + (2, 'Science Fiction', 'Main Floor'), + (3, 'Technical', 'Second Floor'), + (4, 'History', 'Main Floor'); + + INSERT INTO book (id, title, category_id, price, in_stock) VALUES + (1, 'The Last Hope', 1, 19.99, 5), + (2, 'Stars Beyond', 2, 15.99, 3), + (3, 'Python Basics', 3, 29.99, 10), + (4, 'Ancient Rome', 4, 24.99, 4), + (5, 'Summer Days', 1, 14.99, 2), + (6, 'Space Wars', 2, 16.99, 3), + (7, 'JavaScript 101', 3, 27.99, 8); + ``` +--- + +The bookstore manager wants a simple overview of their book categories to help with inventory management. 
+ +Given the following data in table `category`: + +| id | name | display_section | +| --- | --------------- | -------------- | +| 1 | Fiction | Main Floor | +| 2 | Science Fiction | Main Floor | +| 3 | Technical | Second Floor | +| 4 | History | Main Floor | + +And the following data in table `book`: + +| id | title | category_id | price | in_stock | +| --- | -------------- | ----------- | ----- | -------- | +| 1 | The Last Hope | 1 | 19.99 | 5 | +| 2 | Stars Beyond | 2 | 15.99 | 3 | +| 3 | Python Basics | 3 | 29.99 | 10 | +| 4 | Ancient Rome | 4 | 24.99 | 4 | +| 5 | Summer Days | 1 | 14.99 | 2 | +| 6 | Space Wars | 2 | 16.99 | 3 | +| 7 | JavaScript 101 | 3 | 27.99 | 8 | + +Write a query that shows for each category: +- Category name +- Number of books +- Total books in stock +- Average book price + +Order the results by the number of books in descending order. + +## Expected Output + +| category_name | book_count | total_stock | avg_price | +| -------------- | ---------- | ----------- | --------- | +| Technical | 2 | 18 | 28.99 | +| Science Fiction| 2 | 6 | 16.49 | +| Fiction | 2 | 7 | 17.49 | +| History | 1 | 4 | 24.99 | + +## Solution + +```sql +SELECT + c.name as category_name, + COUNT(*) as book_count, + SUM(b.in_stock) as total_stock, + AVG(b.price) as avg_price +FROM category c +INNER JOIN book b ON c.id = b.category_id +GROUP BY c.name +ORDER BY book_count DESC; +``` + +### Explanation + +Let's break down how this query works: + +We join the category and book tables: +```sql +FROM category c +INNER JOIN book b ON c.id = b.category_id +``` + +We calculate the aggregates for each category: +```sql +COUNT(*) -- Counts number of books +SUM(b.in_stock) -- Adds up all books in stock +AVG(b.price) -- Calculates average price +``` + +We group by category name: +```sql +GROUP BY c.name +``` + +Finally, we order by the book count: +```sql +ORDER BY book_count DESC +``` + +This query helps the manager understand: +- Which categories have the most titles +- Total 
inventory by category +- Average price point for each category \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/daily-sales-report.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/daily-sales-report.md new file mode 100644 index 000000000..e8e215faa --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/daily-sales-report.md @@ -0,0 +1,96 @@ +--- +title: Daily Sales Report +description: Practice using GROUP BY with aggregate functions +order: 95 +type: challenge +setup: | + ```sql + CREATE TABLE daily_sale ( + id INT PRIMARY KEY, + sale_date DATE, + book_title VARCHAR(255), + quantity INT, + price DECIMAL(10,2) + ); + + INSERT INTO daily_sale (id, sale_date, book_title, quantity, price) + VALUES + (1, '2024-01-15', 'The Great Gatsby', 2, 19.99), + (2, '2024-01-15', 'Pride and Prejudice', 1, 14.99), + (3, '2024-01-15', '1984', 3, 12.99), + (4, '2024-01-16', 'The Hobbit', 2, 24.99), + (5, '2024-01-16', 'The Great Gatsby', 1, 19.99), + (6, '2024-01-17', 'Pride and Prejudice', 2, 14.99), + (7, '2024-01-17', '1984', 1, 12.99), + (8, '2024-01-17', 'The Hobbit', 3, 24.99); + ``` +--- + +The bookstore manager wants to see how their sales are performing each day. 
They need a daily report showing: +- The number of transactions per day +- Total books sold per day +- Total revenue per day + +Given the following data in table `daily_sale`: + +| id | sale_date | book_title | quantity | price | +| --- | ---------- | ------------------ | -------- | ----- | +| 1 | 2024-01-15 | The Great Gatsby | 2 | 19.99 | +| 2 | 2024-01-15 | Pride and Prejudice| 1 | 14.99 | +| 3 | 2024-01-15 | 1984 | 3 | 12.99 | +| 4 | 2024-01-16 | The Hobbit | 2 | 24.99 | +| 5 | 2024-01-16 | The Great Gatsby | 1 | 19.99 | +| 6 | 2024-01-17 | Pride and Prejudice| 2 | 14.99 | +| 7 | 2024-01-17 | 1984 | 1 | 12.99 | +| 8 | 2024-01-17 | The Hobbit | 3 | 24.99 | + +Write a query that shows the daily sales metrics: +- Date +- Number of transactions that day +- Total books sold that day +- Total revenue for that day + +Order the results by date. + +## Expected Output + +| sale_date | transactions | books_sold | daily_revenue | +| ---------- | ------------ | ---------- | ------------- | +| 2024-01-15 | 3 | 6 | 93.94 | +| 2024-01-16 | 2 | 3 | 69.97 | +| 2024-01-17 | 3 | 6 | 117.94 | + +## Solution + +```sql +SELECT + sale_date, + COUNT(*) as transactions, + SUM(quantity) as books_sold, + SUM(quantity * price) as daily_revenue +FROM daily_sale +GROUP BY sale_date +ORDER BY sale_date; +``` + +### Explanation + +Let's break down how this query works: + +First, we specify the columns we want to see: +```sql +sale_date, -- The date we're grouping by +COUNT(*) as transactions, -- Count of sales for each date +SUM(quantity) as books_sold, -- Total books sold each date +SUM(quantity * price) as daily_revenue -- Total revenue each date +``` + +We group the results by date to get daily totals: +```sql +GROUP BY sale_date +``` + +Finally, we order the results by date: +```sql +ORDER BY sale_date +``` \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/employee-performance.md 
b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/employee-performance.md new file mode 100644 index 000000000..4c818942c --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/employee-performance.md @@ -0,0 +1,125 @@ +--- +title: Employee Performance +description: Practice using self joins with aggregate functions +order: 110 +type: challenge +setup: | + ```sql + CREATE TABLE employee ( + id INT PRIMARY KEY, + name VARCHAR(255), + manager_id INT, + hire_date DATE + ); + + CREATE TABLE sale ( + id INT PRIMARY KEY, + employee_id INT, + sale_date DATE, + amount DECIMAL(10,2) + ); + + INSERT INTO employee (id, name, manager_id, hire_date) VALUES + (1, 'Sarah Johnson', NULL, '2023-01-15'), -- Store Manager + (2, 'Mike Wilson', 1, '2023-02-01'), -- Reports to Sarah + (3, 'Emily Brown', 1, '2023-02-15'), -- Reports to Sarah + (4, 'Tom Davis', 2, '2023-03-01'), -- Reports to Mike + (5, 'Lisa Miller', 2, '2023-03-15'), -- Reports to Mike + (6, 'James Wilson', 3, '2023-04-01'); -- Reports to Emily + + INSERT INTO sale (id, employee_id, sale_date, amount) VALUES + (1, 2, '2024-02-01', 150.00), + (2, 2, '2024-02-02', 200.00), + (3, 3, '2024-02-01', 300.00), + (4, 4, '2024-02-02', 250.00), + (5, 4, '2024-02-03', 175.00), + (6, 5, '2024-02-01', 225.00), + (7, 5, '2024-02-02', 125.00), + (8, 6, '2024-02-03', 350.00); + ``` +--- + +The bookstore wants to analyze the sales performance of their employees and their teams. They need a report showing how each manager's team is performing. 
+ +Given the following data in table `employee`: + +| id | name | manager_id | hire_date | +| --- | -------------- | ---------- | ---------- | +| 1 | Sarah Johnson | NULL | 2023-01-15 | +| 2 | Mike Wilson | 1 | 2023-02-01 | +| 3 | Emily Brown | 1 | 2023-02-15 | +| 4 | Tom Davis | 2 | 2023-03-01 | +| 5 | Lisa Miller | 2 | 2023-03-15 | +| 6 | James Wilson | 3 | 2023-04-01 | + +And the following data in table `sale`: + +| id | employee_id | sale_date | amount | +| --- | ----------- | ---------- | ------ | +| 1 | 2 | 2024-02-01 | 150.00 | +| 2 | 2 | 2024-02-02 | 200.00 | +| 3 | 3 | 2024-02-01 | 300.00 | +| 4 | 4 | 2024-02-02 | 250.00 | +| 5 | 4 | 2024-02-03 | 175.00 | +| 6 | 5 | 2024-02-01 | 225.00 | +| 7 | 5 | 2024-02-02 | 125.00 | +| 8 | 6 | 2024-02-03 | 350.00 | + +Write a query that shows for each manager: +- Manager name +- Number of employees who report directly to them +- Total sales made by those direct reports (the manager's own sales are not counted) +- Average sales total per team member (total team sales divided by team size) + +Only include managers who have at least one employee reporting to them, and order the results by total team sales in descending order. 
+ +## Expected Output + +| manager_name | team_size | total_team_sales | avg_sale_per_member | +| -------------- | --------- | ---------------- | ------------------ | +| Mike Wilson | 2 | 775.00 | 387.50 | +| Sarah Johnson | 2 | 650.00 | 325.00 | +| Emily Brown | 1 | 350.00 | 350.00 | + +## Solution + +```sql +SELECT + m.name as manager_name, + COUNT(DISTINCT e.id) as team_size, + SUM(s.amount) as total_team_sales, + SUM(s.amount) / COUNT(DISTINCT e.id) as avg_sale_per_member +FROM employee m +INNER JOIN employee e ON e.manager_id = m.id +INNER JOIN sale s ON s.employee_id = e.id +GROUP BY m.id, m.name +ORDER BY total_team_sales DESC; +``` + +### Explanation + +Let's break down how this query works: + +First, we use a self-join to connect managers with their employees: +```sql +FROM employee m +INNER JOIN employee e ON e.manager_id = m.id +``` + +Then we join with the sales table to get sales data: +```sql +INNER JOIN sale s ON s.employee_id = e.id +``` + +We calculate various metrics for each manager: +```sql +COUNT(DISTINCT e.id) -- Counts number of team members +SUM(s.amount) -- Sums up all sales by the team +SUM(s.amount) / COUNT(DISTINCT e.id) -- Calculates average sales per team member +``` + +We group by manager and order by total sales: +```sql +GROUP BY m.id, m.name +ORDER BY total_team_sales DESC +``` \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/high-value-publishers.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/high-value-publishers.md new file mode 100644 index 000000000..7dadbd023 --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/high-value-publishers.md @@ -0,0 +1,132 @@ +--- +title: High Value Publishers +description: Practice using HAVING to filter aggregated data +order: 98 +type: challenge +setup: | + ```sql + CREATE TABLE publisher ( + id INT PRIMARY KEY, + name VARCHAR(255), + country VARCHAR(100) + ); + + CREATE TABLE book 
( + id INT PRIMARY KEY, + title VARCHAR(255), + publisher_id INT, + price DECIMAL(10,2), + stock_level INT + ); + + INSERT INTO publisher (id, name, country) VALUES + (1, 'Tech Books Inc', 'USA'), + (2, 'Global Publishing', 'UK'), + (3, 'Education Press', 'Canada'), + (4, 'Digital Media Ltd', 'Australia'), + (5, 'Research Books', 'Germany'); + + INSERT INTO book (id, title, publisher_id, price, stock_level) VALUES + (1, 'Database Design', 1, 29.99, 25), + (2, 'Advanced SQL', 1, 39.99, 15), + (3, 'Web Development', 1, 34.99, 20), + (4, 'Python Basics', 2, 24.99, 30), + (5, 'Cloud Computing', 2, 49.99, 5), + (6, 'Data Science', 3, 54.99, 20), + (7, 'Machine Learning', 3, 59.99, 12), + (8, 'Cybersecurity', 4, 45.99, 8), + (9, 'Network Basics', 5, 19.99, 10); + ``` +--- + +The bookstore wants to identify their high-value publishers based on specific criteria. They want to find publishers who: +- Have published more than 1 book +- Have an average book price above $35 +- Have more than 30 total books in stock + +Given the following data in table `publisher`: + +| id | name | country | +| --- | ----------------- | --------- | +| 1 | Tech Books Inc | USA | +| 2 | Global Publishing | UK | +| 3 | Education Press | Canada | +| 4 | Digital Media Ltd | Australia | +| 5 | Research Books | Germany | + +And the following data in table `book`: + +| id | title | publisher_id | price | stock_level | +| --- | ---------------- | ------------ | ----- | ----------- | +| 1 | Database Design | 1 | 29.99 | 25 | +| 2 | Advanced SQL | 1 | 39.99 | 15 | +| 3 | Web Development | 1 | 34.99 | 20 | +| 4 | Python Basics | 2 | 24.99 | 30 | +| 5 | Cloud Computing | 2 | 49.99 | 5 | +| 6 | Data Science | 3 | 54.99 | 20 | +| 7 | Machine Learning | 3 | 59.99 | 12 | +| 8 | Cybersecurity | 4 | 45.99 | 8 | +| 9 | Network Basics | 5 | 19.99 | 10 | + +Write a query that shows: +- Publisher name +- Number of books +- Average book price +- Total books in stock + +Only include publishers meeting ALL the criteria 
above, and order by average price descending. + +## Expected Output + +| publisher_name | book_count | avg_price | total_stock | +| --------------- | ---------- | --------- | ----------- | +| Education Press | 2 | 57.49 | 32 | +| Global Publishing | 2 | 37.49 | 35 | + +## Solution + +```sql +SELECT + p.name as publisher_name, + COUNT(*) as book_count, + AVG(b.price) as avg_price, + SUM(b.stock_level) as total_stock +FROM publisher p +INNER JOIN book b ON p.id = b.publisher_id +GROUP BY p.id, p.name +HAVING + COUNT(*) > 1 + AND AVG(b.price) > 35 + AND SUM(b.stock_level) > 30 +ORDER BY avg_price DESC; +``` + +### Explanation + +Let's break down how this query works: + +First, we join the tables: +```sql +FROM publisher p +INNER JOIN book b ON p.id = b.publisher_id +``` + +We calculate the required metrics: +```sql +COUNT(*) -- Number of books +AVG(b.price) -- Average price +SUM(b.stock_level) -- Total stock +``` + +The key part is using HAVING to filter the groups: +```sql +HAVING + COUNT(*) > 1 -- More than 1 book + AND AVG(b.price) > 35 -- Average price above $35 + AND SUM(b.stock_level) > 30 -- More than 30 total books in stock +``` + +This query helps the bookstore identify: +- Publishers with multiple books +- Publishers with premium pricing +- Publishers with significant inventory \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/premium-authors.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/premium-authors.md new file mode 100644 index 000000000..71b4ef4ea --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/premium-authors.md @@ -0,0 +1,127 @@ +--- +title: Premium Authors +description: Practice using HAVING to find high-value authors +order: 99 +type: challenge +setup: | + ```sql + CREATE TABLE author ( + id INT PRIMARY KEY, + name VARCHAR(255), + country VARCHAR(100) + ); + + CREATE TABLE book ( + id INT PRIMARY KEY, + title VARCHAR(255), + 
author_id INT, + price DECIMAL(10,2), + rating DECIMAL(2,1) + ); + + INSERT INTO author (id, name, country) VALUES + (1, 'John Smith', 'USA'), + (2, 'Emma Wilson', 'UK'), + (3, 'David Chen', 'China'), + (4, 'Maria Garcia', 'Spain'), + (5, 'James Brown', 'USA'); + + INSERT INTO book (id, title, author_id, price, rating) VALUES + (1, 'SQL Mastery', 1, 45.99, 4.5), + (2, 'Database Design', 1, 49.99, 4.8), + (3, 'Python Basics', 2, 29.99, 4.2), + (4, 'Web Development', 2, 34.99, 4.0), + (5, 'Data Science', 2, 39.99, 4.6), + (6, 'Machine Learning', 3, 54.99, 4.7), + (7, 'AI Fundamentals', 3, 59.99, 4.9), + (8, 'Cloud Computing', 4, 44.99, 4.3), + (9, 'Basic Programming', 5, 24.99, 3.8); + ``` +--- + +The bookstore wants to identify their premium authors - those who consistently produce high-value, well-rated books. They want to find authors who: +- Have written at least 2 books +- Have an average book price above $40 +- Have an average rating above 4.5 + +Given the following data in table `author`: + +| id | name | country | +| --- | ------------ | ------- | +| 1 | John Smith | USA | +| 2 | Emma Wilson | UK | +| 3 | David Chen | China | +| 4 | Maria Garcia | Spain | +| 5 | James Brown | USA | + +And the following data in table `book`: + +| id | title | author_id | price | rating | +| --- | ----------------- | --------- | ----- | ------ | +| 1 | SQL Mastery | 1 | 45.99 | 4.5 | +| 2 | Database Design | 1 | 49.99 | 4.8 | +| 3 | Python Basics | 2 | 29.99 | 4.2 | +| 4 | Web Development | 2 | 34.99 | 4.0 | +| 5 | Data Science | 2 | 39.99 | 4.6 | +| 6 | Machine Learning | 3 | 54.99 | 4.7 | +| 7 | AI Fundamentals | 3 | 59.99 | 4.9 | +| 8 | Cloud Computing | 4 | 44.99 | 4.3 | +| 9 | Basic Programming | 5 | 24.99 | 3.8 | + +Write a query that shows: +- Author name +- Number of books written +- Average book price +- Average book rating + +Only include authors meeting ALL the criteria above, and order by average rating descending. 
+ +## Expected Output + +| author_name | book_count | avg_price | avg_rating | +| ----------- | ---------- | --------- | ---------- | +| David Chen | 2 | 57.49 | 4.80 | +| John Smith | 2 | 47.99 | 4.65 | + +## Solution + +```sql +SELECT + a.name as author_name, + COUNT(*) as book_count, + AVG(b.price) as avg_price, + AVG(b.rating) as avg_rating +FROM author a +INNER JOIN book b ON a.id = b.author_id +GROUP BY a.id, a.name +HAVING + COUNT(*) >= 2 + AND AVG(b.price) > 40 + AND AVG(b.rating) > 4.5 +ORDER BY avg_rating DESC; +``` + +### Explanation + +Let's break down how this query works: + +We join the author and book tables: +```sql +FROM author a +INNER JOIN book b ON a.id = b.author_id +``` + +We calculate the metrics for each author: +```sql +COUNT(*) -- Number of books written +AVG(b.price) -- Average book price +AVG(b.rating) -- Average book rating +``` + +The key part is using HAVING to filter for premium authors: +```sql +HAVING + COUNT(*) >= 2 -- At least 2 books + AND AVG(b.price) > 40 -- Average price above $40 + AND AVG(b.rating) > 4.5 -- Average rating above 4.5 +``` \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/publisher-stats.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/publisher-stats.md new file mode 100644 index 000000000..cb5621730 --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/publisher-stats.md @@ -0,0 +1,124 @@ +--- +title: Publisher Stats +description: Practice basic aggregation with publisher data +order: 96 +type: challenge +setup: | + ```sql + CREATE TABLE publisher ( + id INT PRIMARY KEY, + name VARCHAR(255), + country VARCHAR(100) + ); + + CREATE TABLE book ( + id INT PRIMARY KEY, + title VARCHAR(255), + publisher_id INT, + pages INT, + stock_level INT, + price DECIMAL(10,2) + ); + + INSERT INTO publisher (id, name, country) VALUES + (1, 'Tech Books Inc', 'USA'), + (2, 'Global Publishing', 'UK'), + (3, 'Education 
Press', 'Canada'), + (4, 'Digital Media Ltd', 'Australia'); + + INSERT INTO book (id, title, publisher_id, pages, stock_level, price) VALUES + (1, 'Database Fundamentals', 1, 300, 25, 29.99), + (2, 'Advanced SQL', 1, 400, 15, 39.99), + (3, 'Web Development', 1, 350, 0, 34.99), + (4, 'Python Mastery', 2, 450, 30, 44.99), + (5, 'Cloud Computing', 2, 375, 5, 49.99), + (6, 'Data Science', 3, 425, 20, 54.99), + (7, 'Machine Learning', 3, 500, 12, 59.99), + (8, 'Cybersecurity', 4, 350, 8, 45.99); + ``` +--- + +The bookstore wants to analyze their publishers' performance in terms of book variety, pricing, and stock levels. + +Given the following data in table `publisher`: + +| id | name | country | +| --- | ----------------- | --------- | +| 1 | Tech Books Inc | USA | +| 2 | Global Publishing | UK | +| 3 | Education Press | Canada | +| 4 | Digital Media Ltd | Australia | + +And the following data in table `book`: + +| id | title | publisher_id | pages | stock_level | price | +| --- | -------------------- | ------------ | ----- | ----------- | ----- | +| 1 | Database Fundamentals| 1 | 300 | 25 | 29.99 | +| 2 | Advanced SQL | 1 | 400 | 15 | 39.99 | +| 3 | Web Development | 1 | 350 | 0 | 34.99 | +| 4 | Python Mastery | 2 | 450 | 30 | 44.99 | +| 5 | Cloud Computing | 2 | 375 | 5 | 49.99 | +| 6 | Data Science | 3 | 425 | 20 | 54.99 | +| 7 | Machine Learning | 3 | 500 | 12 | 59.99 | +| 8 | Cybersecurity | 4 | 350 | 8 | 45.99 | + +Write a query that shows for each publisher: +- Publisher name +- Number of books published +- Total books in stock +- Average book price +- Number of out-of-stock books (`stock_level = 0`) + +Only include publishers who have published at least one book, and order the results by number of books in descending order. 
+ +## Expected Output + +| publisher_name | book_count | total_stock | avg_price | out_of_stock | +| ---------------- | ---------- | ----------- | --------- | ------------ | +| Tech Books Inc | 3 | 40 | 34.99 | 1 | +| Global Publishing| 2 | 35 | 47.49 | 0 | +| Education Press | 2 | 32 | 57.49 | 0 | +| Digital Media Ltd| 1 | 8 | 45.99 | 0 | + +## Solution + +```sql +SELECT + p.name as publisher_name, + COUNT(*) as book_count, + SUM(b.stock_level) as total_stock, + AVG(b.price) as avg_price, + COUNT(CASE WHEN b.stock_level = 0 THEN 1 END) as out_of_stock +FROM publisher p +INNER JOIN book b ON p.id = b.publisher_id +GROUP BY p.id, p.name +ORDER BY book_count DESC; +``` + +### Explanation + +Let's break down how this query works: + +We join the publisher and book tables: +```sql +FROM publisher p +INNER JOIN book b ON p.id = b.publisher_id +``` + +We calculate various metrics for each publisher: +```sql +COUNT(*) -- Counts number of books +SUM(b.stock_level) -- Adds up all books in stock +AVG(b.price) -- Calculates average price +COUNT(CASE WHEN b.stock_level = 0 THEN 1 END) -- Counts out-of-stock books +``` + +We group by publisher: +```sql +GROUP BY p.id, p.name +``` + +Finally, we order by the number of books: +```sql +ORDER BY book_count DESC +``` \ No newline at end of file diff --git a/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/sales-analysis.md b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/sales-analysis.md new file mode 100644 index 000000000..e252a55c6 --- /dev/null +++ b/src/data/courses/sql-mastery/chapters/aggregate-functions/lessons/sales-analysis.md @@ -0,0 +1,145 @@ +--- +title: Sales Analysis +description: Practice using aggregate functions to analyze sales data +order: 100 +type: challenge +setup: | + ```sql + CREATE TABLE sale ( + id INT PRIMARY KEY, + sale_date DATE, + customer_id INT, + book_id INT, + quantity INT, + unit_price DECIMAL(10,2) + ); + + CREATE TABLE book ( + id INT PRIMARY KEY, + title 
VARCHAR(255), + category VARCHAR(100) + ); + + INSERT INTO book (id, title, category) + VALUES + (1, 'The Great Gatsby', 'Fiction'), + (2, 'Data Science Basics', 'Technical'), + (3, 'History of Time', 'Non-Fiction'), + (4, 'Programming 101', 'Technical'), + (5, 'Pride and Prejudice', 'Fiction'); + + INSERT INTO sale (id, sale_date, customer_id, book_id, quantity, unit_price) + VALUES + (1, '2024-01-15', 1, 1, 2, 15.99), + (2, '2024-01-15', 2, 2, 1, 29.99), + (3, '2024-01-15', 3, 3, 3, 19.99), + (4, '2024-01-16', 4, 4, 1, 24.99), + (5, '2024-01-16', 5, 5, 2, 12.99), + (6, '2024-01-16', 1, 2, 1, 29.99), + (7, '2024-01-17', 2, 3, 2, 19.99), + (8, '2024-01-17', 3, 4, 1, 24.99), + (9, '2024-01-17', 4, 5, 3, 12.99), + (10, '2024-01-18', 5, 1, 1, 15.99); + ``` +--- + +The bookstore manager wants to analyze their sales data to understand sales patterns and make inventory decisions. They need a comprehensive report that shows sales metrics by book category. + +Given the following data in table `book`: + +| id | title | category | +| --- | ------------------- | ----------- | +| 1 | The Great Gatsby | Fiction | +| 2 | Data Science Basics | Technical | +| 3 | History of Time | Non-Fiction | +| 4 | Programming 101 | Technical | +| 5 | Pride and Prejudice | Fiction | + +And the following data in table `sale`: + +| id | sale_date | customer_id | book_id | quantity | unit_price | +| --- | ---------- | ----------- | ------- | -------- | ---------- | +| 1 | 2024-01-15 | 1 | 1 | 2 | 15.99 | +| 2 | 2024-01-15 | 2 | 2 | 1 | 29.99 | +| 3 | 2024-01-15 | 3 | 3 | 3 | 19.99 | +| 4 | 2024-01-16 | 4 | 4 | 1 | 24.99 | +| 5 | 2024-01-16 | 5 | 5 | 2 | 12.99 | +| 6 | 2024-01-16 | 1 | 2 | 1 | 29.99 | +| 7 | 2024-01-17 | 2 | 3 | 2 | 19.99 | +| 8 | 2024-01-17 | 3 | 4 | 1 | 24.99 | +| 9 | 2024-01-17 | 4 | 5 | 3 | 12.99 | +| 10 | 2024-01-18 | 5 | 1 | 1 | 15.99 | + +Write a query to generate a sales report showing the following metrics for each book category: + +- Total number of sales (count of 
transactions) +- Total quantity of books sold +- Total revenue (quantity * unit_price) +- Average unit price across transactions +- Maximum quantity in a single transaction + +Only include categories that have generated more than $50 in total revenue, and order the results by total revenue in descending order. + +## Expected Output + +| category | total_sales | total_quantity | total_revenue | avg_price | max_quantity | +| ----------- | ----------- | -------------- | ------------- | --------- | ------------ | +| Fiction | 4 | 8 | 112.92 | 14.49 | 3 | +| Technical | 4 | 4 | 109.96 | 27.49 | 1 | +| Non-Fiction | 2 | 5 | 99.95 | 19.99 | 3 | + +## Solution + +```sql +SELECT + b.category, + COUNT(*) as total_sales, + SUM(s.quantity) as total_quantity, + SUM(s.quantity * s.unit_price) as total_revenue, + AVG(s.unit_price) as avg_price, + MAX(s.quantity) as max_quantity +FROM sale s +INNER JOIN book b ON s.book_id = b.id +GROUP BY b.category +HAVING SUM(s.quantity * s.unit_price) > 50 +ORDER BY total_revenue DESC; +``` + +### Explanation + +Let's break down how this query works: + +First, we join the `sale` and `book` tables to get category information for each sale: + +```sql +FROM sale s +INNER JOIN book b ON s.book_id = b.id +``` + +We then group the results by category to calculate aggregates for each category: + +```sql +GROUP BY b.category +``` + +We use multiple aggregate functions to calculate different metrics: + +```sql +COUNT(*) -- Counts number of sales transactions +SUM(s.quantity) -- Sums up total books sold +SUM(s.quantity * s.unit_price) -- Calculates total revenue +AVG(s.unit_price) -- Averages the unit price across transactions +MAX(s.quantity) -- Finds maximum quantity in a single sale +``` + +We filter out categories with low revenue using HAVING: + +```sql +HAVING SUM(s.quantity * s.unit_price) > 50 +``` + +Finally, we order the results by total revenue in descending order: + +```sql +ORDER BY total_revenue DESC +```
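
One way to sanity-check the report is to run the solution against the setup data in an in-memory database. The sketch below is only an illustration: the lesson does not name a database engine, so SQLite (via Python's standard `sqlite3` module) is an assumption, and the `ROUND(..., 2)` calls and the `run_report` helper are additions for this check, not part of the lesson's solution.

```python
import sqlite3

# Setup data copied from the lesson. SQLite is an assumption made for this check.
SETUP = """
CREATE TABLE sale (
    id INT PRIMARY KEY, sale_date DATE, customer_id INT,
    book_id INT, quantity INT, unit_price DECIMAL(10,2)
);
CREATE TABLE book (
    id INT PRIMARY KEY, title VARCHAR(255), category VARCHAR(100)
);
INSERT INTO book (id, title, category) VALUES
    (1, 'The Great Gatsby', 'Fiction'),
    (2, 'Data Science Basics', 'Technical'),
    (3, 'History of Time', 'Non-Fiction'),
    (4, 'Programming 101', 'Technical'),
    (5, 'Pride and Prejudice', 'Fiction');
INSERT INTO sale (id, sale_date, customer_id, book_id, quantity, unit_price) VALUES
    (1, '2024-01-15', 1, 1, 2, 15.99),
    (2, '2024-01-15', 2, 2, 1, 29.99),
    (3, '2024-01-15', 3, 3, 3, 19.99),
    (4, '2024-01-16', 4, 4, 1, 24.99),
    (5, '2024-01-16', 5, 5, 2, 12.99),
    (6, '2024-01-16', 1, 2, 1, 29.99),
    (7, '2024-01-17', 2, 3, 2, 19.99),
    (8, '2024-01-17', 3, 4, 1, 24.99),
    (9, '2024-01-17', 4, 5, 3, 12.99),
    (10, '2024-01-18', 5, 1, 1, 15.99);
"""

# The lesson's solution query. ROUND is added here (not in the original)
# because SQLite stores these prices as floating-point values.
REPORT = """
SELECT
    b.category,
    COUNT(*) AS total_sales,
    SUM(s.quantity) AS total_quantity,
    ROUND(SUM(s.quantity * s.unit_price), 2) AS total_revenue,
    ROUND(AVG(s.unit_price), 2) AS avg_price,
    MAX(s.quantity) AS max_quantity
FROM sale s
INNER JOIN book b ON s.book_id = b.id
GROUP BY b.category
HAVING SUM(s.quantity * s.unit_price) > 50
ORDER BY total_revenue DESC;
"""


def run_report():
    """Hypothetical helper: load the setup data and return the report rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(REPORT).fetchall()
    finally:
        conn.close()


rows = run_report()
for row in rows:
    print(row)
```

Each printed tuple follows the column order of the `SELECT` list. Working the setup data by hand, the four Fiction sales total 31.98 + 25.98 + 38.97 + 15.99 = 112.92, the four Technical sales total 109.96, and the two Non-Fiction sales total 99.95, so the categories should come out in that order.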