From 8cbc10c358d272e6e58dcdafea50c34d8291c567 Mon Sep 17 00:00:00 2001 From: Kamran Ahmed Date: Wed, 1 Jan 2025 17:04:32 +0000 Subject: [PATCH] Add foreign key constraint lesson --- .../lessons/check-constraints.md | 4 +- .../lessons/foreign-key-constraint.md | 474 +++++++++++++----- 2 files changed, 342 insertions(+), 136 deletions(-) diff --git a/src/data/courses/sql-mastery/chapters/defining-tables/lessons/check-constraints.md b/src/data/courses/sql-mastery/chapters/defining-tables/lessons/check-constraints.md index 2565ac70c..5a6645598 100644 --- a/src/data/courses/sql-mastery/chapters/defining-tables/lessons/check-constraints.md +++ b/src/data/courses/sql-mastery/chapters/defining-tables/lessons/check-constraints.md @@ -92,8 +92,8 @@ CREATE TABLE book_reviews ( rating INTEGER, review_status TEXT, - -- Named CHECK constraints - -- [CONSTRAINT name] CHECK (condition) + -- Named CHECK constraint Syntax + -- CONSTRAINT [CONSTRAINT name] CHECK (condition) CONSTRAINT valid_rating CHECK (rating >= 1 AND rating <= 5), CONSTRAINT valid_status CHECK ( review_status IN ('pending', 'approved', 'rejected') diff --git a/src/data/courses/sql-mastery/chapters/multi-table-queries/lessons/foreign-key-constraint.md b/src/data/courses/sql-mastery/chapters/multi-table-queries/lessons/foreign-key-constraint.md index 72bf6ab91..dfad3d6b1 100644 --- a/src/data/courses/sql-mastery/chapters/multi-table-queries/lessons/foreign-key-constraint.md +++ b/src/data/courses/sql-mastery/chapters/multi-table-queries/lessons/foreign-key-constraint.md @@ -3,43 +3,129 @@ title: Foreign Key Constraint description: Learn how foreign keys are used to enforce relationships between tables order: 130 type: lesson-challenge +setup: | + ```sql + CREATE TABLE author ( + id INT PRIMARY KEY, + name VARCHAR(255) + ); + + CREATE TABLE author_biography ( + id INT PRIMARY KEY, + author_id INT, + biography TEXT + ); + ``` --- -In our previous lessons, we learned about relationships between tables and how they are 
achieved using primary and foreign keys. +So far in this course, we have learned about primary keys, relationships and their types, and queries involving multiple tables. I have intentionally avoided foreign key constraints until now because I wanted to help you understand relational data and relationships better before diving into foreign key constraints. -Now, let's explore how to enforce these relationships using foreign key constraints. +In this lesson, we will learn about foreign key constraints and see how they can help us enforce relationships between tables. -## What is a Foreign Key Constraint? +## What is a Foreign Key? -A foreign key constraint is a rule that ensures referential integrity between two tables. It prevents actions that would destroy relationships between tables or create invalid references. +If we take the same bookstore example from the previous lesson, imagine we have two tables, `author` and `author_biography`. -### What is referential integrity? +![](https://assets.roadmap.sh/guest/author-biography-8evoz.png) -Referential integrity is the idea that data +Notice how the `author_id` column in the `author_biography` table helps us link a biography to an author. `author_id` in this case is a **foreign key**. -Let's use our bookstore example to understand foreign key constraints: +> Foreign keys are columns of a table that reference the primary key of another table. + +## Foreign Key Constraint + +Let me explain this with an example. Imagine we defined the tables `author` and `author_biography` as follows: ```sql -CREATE TABLE books ( - id INTEGER PRIMARY KEY, - title VARCHAR(255) NOT NULL, - price DECIMAL(10, 2) +CREATE TABLE author ( + id INTEGER PRIMARY KEY, + name VARCHAR(255) NOT NULL +); + +CREATE TABLE author_biography ( + id INTEGER PRIMARY KEY, + author_id INTEGER, + biography TEXT ); +``` + +Notice that there are no foreign key constraints defined on the `author_biography` table. 
This means that there is nothing stopping us from inserting a biography without an author i.e. + +```sql +INSERT INTO author_biography (id, author_id, biography) +VALUES (1, 12, 'Biography of a non-existent author'); +``` + +Also, we can easily delete an author without deleting the biography. + +```sql +-- Delete the author without deleting the biography +DELETE FROM author WHERE id = 1; +``` + +This means that without constraints, we can easily create invalid references in our database when using foreign keys. Foreign key constraints can help us prevent this. + +A **foreign key constraint** is a special type of constraint that ensures referential integrity between two tables i.e. it prevents actions that would break existing references or create invalid ones between tables. -CREATE TABLE sales ( +> ### What is referential integrity? +> +> Referential integrity is the property of a database that ensures the correctness of data across tables. It helps us make sure that information is not removed from one table if it is required elsewhere in a linked database. + +## Implementing Foreign Key Constraints + +Let's add foreign key constraints to our `author` and `author_biography` tables. + +```sql +CREATE TABLE author ( id INTEGER PRIMARY KEY, - book_id INTEGER, - quantity INTEGER, - sale_date DATE, - FOREIGN KEY (book_id) REFERENCES books(id) + name VARCHAR(255) NOT NULL +); + +CREATE TABLE author_biography ( + id INTEGER PRIMARY KEY, + author_id INTEGER REFERENCES author(id), + biography TEXT ); ``` -In this example: +Notice how we added `REFERENCES author(id)` to the definition of the `author_id` column in the `author_biography` table. 
This is a foreign key constraint and will ensure that: + +- We can't delete an author if there is a biography referencing it +- We can't insert a biography without an author + +Let's try to insert a biography without an author: + +```sql +-- ERROR: insert or update on table "author_biography" violates foreign key constraint "author_biography_author_id_fkey" +INSERT INTO author_biography (id, author_id, biography) +VALUES (1, 12, 'Biography of a non-existent author'); +``` + +Now if we insert the biography after inserting the author: + +```sql +INSERT INTO author (id, name) +VALUES (1, 'John Doe'); + +-- OK! +INSERT INTO author_biography (id, author_id, biography) +VALUES (1, 1, 'Biography of John Doe'); +``` + +Similarly, if we try to delete an author, it will fail if there is a biography referencing it: + +```sql +-- ERROR: delete or update on table "author" violates foreign key constraint "author_biography_author_id_fkey" on table "author_biography" +DELETE FROM author WHERE id = 1; +``` + +But if we delete the biography first, it will work just fine: -- The `book_id` column in the `sales` table is a foreign key -- It references the `id` column in the `books` table -- The foreign key constraint ensures that every `book_id` in `sales` must exist in the `books` table +```sql +-- OK! 
+DELETE FROM author_biography WHERE id = 1; +DELETE FROM author WHERE id = 1; +``` ### Alternative Syntax @@ -47,194 +133,314 @@ Just like other constraints, you can also define foreign keys using different sy ```sql -- Using CONSTRAINT keyword to name the foreign key -CREATE TABLE sales ( +CREATE TABLE author_biography ( id INTEGER PRIMARY KEY, - book_id INTEGER, - quantity INTEGER, - sale_date DATE, - CONSTRAINT fk_book - FOREIGN KEY (book_id) - REFERENCES books(id) + author_id INTEGER, + biography TEXT, + + -- Define the foreign key constraint + CONSTRAINT fk_author + FOREIGN KEY (author_id) + REFERENCES author(id) ); --- Inline syntax -CREATE TABLE sales ( +-- Creating unnamed foreign key +CREATE TABLE author_biography ( id INTEGER PRIMARY KEY, - book_id INTEGER REFERENCES books(id), - quantity INTEGER, - sale_date DATE + author_id INTEGER, + biography TEXT, + + -- Define the foreign key constraint + FOREIGN KEY (author_id) REFERENCES author(id) ); ``` -## How Foreign Keys Maintain Data Integrity +## Foreign Key Constraints in Action + +Let's revisit the examples above and see all the operations that foreign keys prevent in order to maintain data integrity. -Foreign keys prevent several types of invalid operations: +### Inserting Invalid References -### 1. Inserting Invalid References +We can't insert a biography linking to a non-existent author: ```sql --- This will fail if book_id=999 doesn't exist in books table -INSERT INTO sales (book_id, quantity, sale_date) -VALUES (999, 1, '2024-03-20'); +-- ERROR: insert or update on table "author_biography" violates foreign key constraint "author_biography_author_id_fkey" +INSERT INTO author_biography (id, author_id, biography) +VALUES (12, 999, 'Biography of a non-existent author'); ``` -### 2. 
Deleting Referenced Records +### Deleting Referenced Records By default, you cannot delete a record from the parent table if it's referenced in the child table: ```sql --- This will fail if any sales reference this book -DELETE FROM books WHERE id = 1; +-- This will fail if any author_biography references this author +DELETE FROM author WHERE id = 1; ``` -### 3. Updating Referenced Keys +### Updating Referenced Keys -Similarly, you cannot update a primary key if it's referenced by other tables: +Similarly, you cannot update the primary key of a parent table if it's referenced by other tables: ```sql --- This will fail if any sales reference this book -UPDATE books SET id = 999 WHERE id = 1; +-- This will fail if any author_biography references this author +UPDATE author SET id = 999 WHERE id = 1; ``` ## ON DELETE and ON UPDATE Clauses -You can specify what happens when a referenced record is deleted or updated using `ON DELETE` and `ON UPDATE` clauses: +In the examples above, we saw how foreign key constraints prevent the deletion and updating of referenced parent records. But we can configure our foreign key constraints to handle these operations automatically i.e. -```sql -CREATE TABLE sales ( - id INTEGER PRIMARY KEY, - book_id INTEGER, - quantity INTEGER, - sale_date DATE, - FOREIGN KEY (book_id) - REFERENCES books(id) - ON DELETE CASCADE - ON UPDATE CASCADE -); -``` +- When a parent record is deleted, the child records are automatically deleted e.g. delete all the `author_biography` records referencing an `author` when the `author` is deleted. +- When the primary key of a parent record is updated, the foreign key in the child record is also updated e.g. update the `author_id` in the `author_biography` table when the `id` in the `author` table is updated. 
+ +We can configure these operations by attaching `ON DELETE` and `ON UPDATE` clauses to our foreign key constraints with one of the following options: -Available options include: +| Option | Description | +| ------------- | -------------------------------------------------- | +| `NO ACTION` | Prevent the deletion/update (this is the default) | +| `RESTRICT` | Similar to `NO ACTION` with subtle differences | +| `CASCADE` | Automatically delete/update related records | +| `SET NULL` | Set the foreign key to NULL | +| `SET DEFAULT` | Set the foreign key to its default value | -| Option | Description | -| ----------- | --------------------------------------------------- | -| CASCADE | Automatically delete/update related records | -| SET NULL | Set the foreign key to NULL | -| SET DEFAULT | Set the foreign key to its default value | -| RESTRICT | Prevent the deletion/update (this is the default) | -| NO ACTION | Similar to RESTRICT, but checked at different times | +Please note that each of these options can be used with both the `ON DELETE` and `ON UPDATE` clauses. -### Example with Multiple Options +Let's look at some examples to see this in action. + +### CASCADE + +`CASCADE` is one of the most commonly used options. It automatically deletes or updates related records when the parent record is deleted or updated. 
+ +Let's recreate our `author` and `author_biography` tables with the `CASCADE` option and some sample data: ```sql -CREATE TABLE books ( +-- parent table +CREATE TABLE author ( + id INTEGER PRIMARY KEY, + name VARCHAR(255) NOT NULL +); + +-- child table +CREATE TABLE author_biography ( id INTEGER PRIMARY KEY, author_id INTEGER, - title VARCHAR(255), + biography TEXT, FOREIGN KEY (author_id) - REFERENCES authors(id) - ON DELETE SET NULL -- If author is deleted, set author_id to NULL - ON UPDATE CASCADE -- If author_id changes, update it here too + REFERENCES author(id) + ON DELETE CASCADE -- If author is deleted, delete the biography + ON UPDATE CASCADE -- If id changes in author table, update it here too ); + +-- Set up the tables with some data +INSERT INTO author (id, name) +VALUES (1, 'John Doe'), + (2, 'Jane Doe'); + +INSERT INTO author_biography (id, author_id, biography) +VALUES (1, 1, 'Biography of John Doe'), + (2, 2, 'Biography of Jane Doe'); ``` -## Composite Foreign Keys +This will result in the following data in the `author` and `author_biography` tables respectively: -Just like primary keys, foreign keys can also consist of multiple columns: +| id | name | +| --- | -------- | +| 1 | John Doe | +| 2 | Jane Doe | + +| id | author_id | biography | +| --- | --------- | --------------------- | +| 1 | 1 | Biography of John Doe | +| 2 | 2 | Biography of Jane Doe | + +#### Deleting the Author + +Now, let's delete the author and see how it affects the `author_biography` table: ```sql -CREATE TABLE book_editions ( - book_id INTEGER, - edition_number INTEGER, - publisher_id INTEGER, - publication_year INTEGER, - PRIMARY KEY (book_id, edition_number), - FOREIGN KEY (book_id, edition_number) - REFERENCES books(id, edition) -); +-- This will automatically delete the biography of John Doe +DELETE FROM author WHERE id = 1; ``` -## Best Practices +Now if we check the `author_biography` table, we will see that the biography of John Doe has been deleted automatically: -1. 
**Always Name Your Constraints**: This makes error messages more meaningful and maintenance easier: +| id | author_id | biography | +| --- | --------- | --------------------- | +| 2 | 2 | Biography of Jane Doe | - ```sql - CONSTRAINT fk_book_author - FOREIGN KEY (author_id) - REFERENCES authors(id) - ``` +#### Updating the Author -2. **Consider Indexing**: Foreign key columns are often used in JOIN operations, so consider adding indexes: +Similarly, if we update the `id` of the author, it will automatically update the `author_id` in the `author_biography` table. For example, let's update the id of the remaining author from 2 to 999: - ```sql - CREATE INDEX idx_book_author ON books(author_id); - ``` - -3. **Choose ON DELETE/UPDATE Actions Carefully**: +```sql +UPDATE author SET id = 999 WHERE id = 2; +``` - - Use `CASCADE` when child records cannot exist without parent - - Use `SET NULL` when child records can exist independently - - Use `RESTRICT` when you want to prevent accidental deletions +Now if we check the `author_biography` table, we will see that the `author_id` has been updated to 999: -4. **Maintain Data Consistency**: Always insert parent records before child records and remove child records before parent records. +| id | author_id | biography | +| --- | --------- | --------------------- | +| 2 | 999 | Biography of Jane Doe | -## Types of Joins Using Foreign Keys +### SET DEFAULT -When working with foreign key relationships, you can use different types of JOIN operations to combine data from related tables: +`SET DEFAULT` sets the foreign key to its default value when the parent record is deleted (or, with `ON UPDATE`, when its key is updated). -### INNER JOIN +> `SET DEFAULT` can be used with the `ON UPDATE` clause as well; the example below uses it with `ON DELETE` only. -Returns only the matching records from both tables: +Let's see a simple example to understand this better. 
```sql -SELECT books.title, sales.quantity -FROM books -INNER JOIN sales ON books.id = sales.book_id; +CREATE TABLE parent ( + id INTEGER PRIMARY KEY, + name VARCHAR(255) NOT NULL +); + +CREATE TABLE child ( + id INTEGER PRIMARY KEY, + parent_id INTEGER DEFAULT 999, + + FOREIGN KEY (parent_id) + REFERENCES parent(id) + ON DELETE SET DEFAULT -- If parent is deleted, set to 999 + ON UPDATE CASCADE -- If parent id is updated, update it here too +); + +-- insert some data +INSERT INTO parent (id, name) +VALUES (1, 'John Doe'), + (2, 'Jane Doe'), + (999, 'Unknown'); -- we will use this as a default value + +INSERT INTO child (id, parent_id) +VALUES (1, 1), + (2, 2); ``` -### LEFT JOIN (LEFT OUTER JOIN) +If you look closely, we have inserted a parent record with id `999` and name `Unknown`. We will use `999` as the default value for the `parent_id` column when the parent record is deleted. -Returns all records from the left table and matching records from the right table: +Now if we delete the parent record with id `1`, the `child` record with `parent_id` `1` will be updated to `999`: ```sql -SELECT books.title, COALESCE(sales.quantity, 0) as quantity -FROM books -LEFT JOIN sales ON books.id = sales.book_id; +DELETE FROM parent WHERE id = 1; ``` -### RIGHT JOIN (RIGHT OUTER JOIN) +Now if we check the `child` table, we will see that the `parent_id` has been updated to `999`: + +| id | parent_id | +| --- | --------- | +| 2 | 2 | +| 1 | 999 | + +An important thing to note here is that the `parent` record with id `999` must exist before we can delete the `parent` record with id `1`; otherwise the delete fails because the default value itself would violate the foreign key constraint. + +### SET NULL -Returns all records from the right table and matching records from the left table: +`SET NULL` sets the foreign key to `NULL` when the parent record is deleted or updated. 
+ +Taking the same example as above: ```sql -SELECT authors.name, books.title -FROM books -RIGHT JOIN authors ON books.author_id = authors.id; -``` +CREATE TABLE parent ( + id INTEGER PRIMARY KEY, + name VARCHAR(255) NOT NULL +); + +CREATE TABLE child ( + id INTEGER PRIMARY KEY, + parent_id INTEGER, + FOREIGN KEY (parent_id) + REFERENCES parent(id) + ON DELETE SET NULL -- If parent is deleted, set to NULL + ON UPDATE SET NULL -- If parent id is updated, set to NULL +); + +-- insert some data +INSERT INTO parent (id, name) +VALUES (1, 'John Doe'), + (2, 'Jane Doe'); -### FULL JOIN (FULL OUTER JOIN) +INSERT INTO child (id, parent_id) +VALUES (1, 1), + (2, 2); +``` -Returns all records from both tables, matching where possible: +Let's delete the parent record with id `1`: ```sql -SELECT books.title, reviews.rating -FROM books -FULL OUTER JOIN reviews ON books.id = reviews.book_id; +DELETE FROM parent WHERE id = 1; ``` -### Cross Join +Now if we check the `child` table, we will see that the `parent_id` has been set to `NULL`: + +| id | parent_id | +| --- | --------- | +| 2 | 2 | +| 1 | `NULL` | -Returns the Cartesian product of both tables (every possible combination): +Similarly, if we update the `id` of the parent record with id `2` to `999`, the `child` record with `parent_id` `2` will be updated to `NULL`: ```sql -SELECT books.title, categories.name -FROM books -CROSS JOIN categories; +UPDATE parent SET id = 999 WHERE id = 2; ``` -> 💡 **Note**: The type of JOIN you choose depends on your specific needs: -> - Use INNER JOIN when you only want matching records -> - Use LEFT/RIGHT JOIN when you need all records from one table -> - Use FULL JOIN when you need all records from both tables -> - Use CROSS JOIN rarely, typically for generating combinations +Now if we check the `child` table, we will see that the `parent_id` has been set to `NULL`: + +| id | parent_id | +| --- | --------- | +| 2 | `NULL` | +| 1 | `NULL` | + +This is useful when you want to allow the child 
records to exist independently of the parent record. + +### RESTRICT vs NO ACTION + +`NO ACTION` and `RESTRICT` both prevent the deletion or update of the parent record if it has related records in the child table. The difference between the two is the time at which they are checked. + +For single queries, i.e. the ones we have been learning so far, there is no difference between `NO ACTION` and `RESTRICT`. However, when using transactions: + +- `RESTRICT` is always checked immediately when the query is performed. The query fails right away if it would violate the constraint. +- `NO ACTION` allows the check to be deferred until the end of the transaction (if the constraint is declared as deferrable) i.e. if the data gets fixed during the transaction, there won't be an error. + +We haven't covered transactions in this course yet. Let's quickly go over what they are to help you understand this difference better. + +> ### What is a transaction? +> +> A transaction is a sequence of operations performed as a single logical unit of work. This is particularly useful when you want to ensure that a series of operations are successful before committing them to the database. +> +> For example, if you are transferring money from one account to another, you want to ensure that the money is deducted from one account and added to the other together. If any of the operations fail, the entire transaction is rolled back. +> +> We will learn more about transactions in future lessons. + +### Choosing the Right Option + +The choice of option depends on your use case. 
The table below should give you an idea of when to use what: + +| Option | When to Use | +| ------------- | ---------------------------------------------- | +| `CASCADE` | When child records cannot exist without a parent | +| `SET NULL` | When child records can exist independently | +| `RESTRICT` | When you want to prevent accidental deletions | +| `NO ACTION` | When you want to prevent accidental deletions | +| `SET DEFAULT` | When the foreign key has a default value | + +--- + +## Composite Foreign Keys + +Just like primary keys, foreign keys can also consist of multiple columns, as long as the referenced columns together form a primary key or unique constraint in the parent table: + +```sql +CREATE TABLE book_editions ( + book_id INTEGER, + edition_number INTEGER, + publisher_id INTEGER, + publication_year INTEGER, + + PRIMARY KEY (book_id, edition_number), + FOREIGN KEY (book_id, edition_number) + REFERENCES books(id, edition) +); +``` -In the next lesson, we'll learn how to query data across multiple tables using these relationships.