Foreign Keys and Referential Integrity
A foreign key relationship allows you to declare that an index in one table is related to an index in another, and to place constraints on what may be done to the table containing the foreign key. The database enforces the rules of this relationship to maintain referential integrity. For example, the score table in the sampdb sample database contains a student_id column, which we use to relate score records to students in the student table. When we created these tables in Chapter 1, we did not set up any explicit relationship between them. Were we to do so, we would declare score.student_id to be a foreign key for the student.student_id column. That prevents a record from being entered into the score table unless it contains a student_id value that exists in the student table. (In other words, the foreign key prevents entry of scores for non-existent students.) We could also set up a constraint such that if a student is deleted from the student table, all corresponding records for the student in the score table are deleted automatically as well. This is called a cascaded delete because the effect of the delete cascades from one table to another.
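As a rough sketch of what such a declaration might look like, the following statements define simplified versions of the two tables. The column lists here are assumptions made for illustration and may not match the sampdb definitions from Chapter 1; the FOREIGN KEY syntax itself is explained later in this section.

```sql
-- Hypothetical, simplified versions of the sampdb tables; the real
-- definitions from Chapter 1 may differ.
CREATE TABLE student
(
    student_id INT UNSIGNED NOT NULL,
    name       VARCHAR(20) NOT NULL,
    PRIMARY KEY (student_id)
) TYPE = INNODB;

CREATE TABLE score
(
    student_id INT UNSIGNED NOT NULL,
    event_id   INT UNSIGNED NOT NULL,
    score      INT NOT NULL,
    PRIMARY KEY (event_id, student_id),
    INDEX (student_id),               -- index that begins with the foreign key column
    FOREIGN KEY (student_id) REFERENCES student (student_id)
        ON DELETE CASCADE             -- deleting a student also deletes that student's scores
) TYPE = INNODB;
```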
Foreign keys help maintain the consistency of your data, and they provide a certain measure of convenience. Without foreign keys, you are responsible for keeping track of inter-table dependencies and maintaining their consistency from within your applications. In many cases, doing this isn't really that much work. It amounts to little more than adding a few extra DELETE statements to make sure that when you delete a record from one table, you also delete the corresponding records in any related tables. But if your tables have particularly complex relationships, you may not want to be responsible for implementing these dependencies in your applications. Besides, if the database engine will perform consistency checks for you, why not let it?
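For comparison, here is a minimal sketch of the manual approach, assuming the score and student tables are related only by convention through student_id. The value 7 is an arbitrary example, and the final query is one possible way to look for score records that have lost their student.

```sql
-- Without a foreign key, remove the dependent rows yourself.
-- Delete the child rows first, then the parent row.
DELETE FROM score WHERE student_id = 7;
DELETE FROM student WHERE student_id = 7;

-- Consistency check: list scores that refer to a missing student.
SELECT score.*
FROM score LEFT JOIN student ON score.student_id = student.student_id
WHERE student.student_id IS NULL;
```

If the application forgets the first DELETE, nothing stops the orphaned score records from remaining in the table, which is the kind of mistake a foreign key is meant to rule out.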
Foreign key support in MySQL is provided by the InnoDB table handler. This section describes how to set up InnoDB tables to define foreign keys, and how foreign keys affect the way you use tables. But first, it's necessary to define some terms:
- The parent is the table that contains the original key values.
- The child is the related table that refers to key values in the parent.
- Parent table key values are used to associate the two tables. Specifically, the index in the child table refers to the index in the parent. Its values must match those in the parent or else be set to NULL to indicate that there is no associated parent table record. The index in the child table is known as the foreign key: the key that is foreign (external) to the parent table but contains values that point to the parent. A foreign key relationship can be set up to disallow NULL values, in which case all foreign key values must match a value in the parent table.
InnoDB enforces these rules to guarantee that the foreign key relationship stays intact with no mismatches. This is called referential integrity.
The syntax for declaring a foreign key in a child table is as follows, with optional parts shown in square brackets:
FOREIGN KEY [index_name] (index_columns)
REFERENCES tbl_name (index_columns)
[ON DELETE action]
[ON UPDATE action]
[MATCH FULL | MATCH PARTIAL]
Note that although all parts of this syntax are parsed, InnoDB does not implement the semantics for all the clauses. The ON UPDATE and MATCH clauses are not supported and are ignored if you specify them.
The parts of the definition that InnoDB pays attention to are:
- FOREIGN KEY indicates the columns that make up the index in the child table that must match index values in the parent table. index_name, if given, is ignored.
- REFERENCES names the parent table and the index columns in that table that correspond to the foreign key in the child table. The index_columns part of the REFERENCES clause must have the same number of columns as the index_columns that follows the FOREIGN KEY keywords.
- ON DELETE allows you to specify what happens to the child table when parent table records are deleted. The possible actions are as follows:
  - ON DELETE CASCADE causes matching child records to be deleted when the corresponding parent record is deleted. In essence, the effect of the delete is cascaded from the parent to the child. This allows you to perform multiple-table deletions by deleting rows only from the parent table and letting InnoDB take care of deleting rows from the child table.
  - ON DELETE SET NULL causes index columns in matching child records to be set to NULL when the parent record is deleted. If you use this option, all the child table columns named in the foreign key definition must be declared to allow NULL values. (One implication of using this action is that you cannot declare the foreign key to be a PRIMARY KEY because primary keys do not allow NULL values.)
To define a foreign key, adhere to the following guidelines:
- The child table must have an index where the foreign key columns are listed as its first columns. The parent table must also have an index in which the columns in the REFERENCES clause are listed as its first columns. (In other words, the columns in the key must be indexed in the tables on both ends of the foreign key relationship.) You must specify these indexes explicitly in the parent and child tables. InnoDB will not create them for you.
- Corresponding columns in the parent and child indexes must have compatible types. For example, you cannot match an INT column with a CHAR column. Corresponding character columns must be the same length. Corresponding integer columns must have the same size and must both be signed or both UNSIGNED.
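The guidelines above can be illustrated with a multiple-column key. The following is only a sketch: the orders and order_item tables and their columns are hypothetical names invented for this illustration and are not part of sampdb.

```sql
-- Hypothetical tables illustrating the index and type guidelines.
CREATE TABLE orders
(
    region_id INT UNSIGNED NOT NULL,
    order_id  INT UNSIGNED NOT NULL,
    PRIMARY KEY (region_id, order_id)   -- parent index on the referenced columns
) TYPE = INNODB;

CREATE TABLE order_item
(
    item_id   INT UNSIGNED NOT NULL,
    region_id INT UNSIGNED NOT NULL,    -- same size and signedness as the parent column
    order_id  INT UNSIGNED NOT NULL,
    PRIMARY KEY (item_id),
    INDEX (region_id, order_id),        -- explicit child index; foreign key columns come first
    FOREIGN KEY (region_id, order_id) REFERENCES orders (region_id, order_id)
        ON DELETE CASCADE
) TYPE = INNODB;
```

Both sides name the same number of columns in the same order, each side has an index that begins with those columns, and every corresponding pair of columns is INT UNSIGNED.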
Let's see an example of how all this works. Begin by creating tables named parent and child, such that the child table contains a foreign key that references the par_id column in the parent table:
CREATE TABLE parent
(
par_id INT NOT NULL,
PRIMARY KEY (par_id)
) TYPE = INNODB;
CREATE TABLE child
(
par_id INT NOT NULL,
child_id INT NOT NULL,
PRIMARY KEY (par_id, child_id),
FOREIGN KEY (par_id) REFERENCES parent (par_id) ON DELETE CASCADE
) TYPE = INNODB;
The foreign key in this case uses ON DELETE CASCADE to specify that when a record is deleted from the parent table, child records with a matching par_id value should be removed automatically as well.
Now insert a few records into the parent table and add some records that have related key values to the child table:
mysql> INSERT INTO parent (par_id) VALUES(1),(2),(3);
mysql> INSERT INTO child (par_id,child_id) VALUES(1,1),(1,2);
mysql> INSERT INTO child (par_id,child_id) VALUES(2,1),(2,2),(2,3);
mysql> INSERT INTO child (par_id,child_id) VALUES(3,1);
These statements result in the following table contents, where each par_id value in the child table matches a par_id value in the parent table:
mysql> SELECT * FROM parent;
+--------+
| par_id |
+--------+
| 1 |
| 2 |
| 3 |
+--------+
mysql> SELECT * FROM child;
+--------+----------+
| par_id | child_id |
+--------+----------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
+--------+----------+
To verify that InnoDB enforces the key relationship for insertion, try adding a record to the child table that has a par_id value not found in the parent table:
mysql> INSERT INTO child (par_id,child_id) VALUES(4,1);
ERROR 1216: Cannot add a child row: a foreign key constraint fails
Now see what happens when you delete a parent record:
mysql> DELETE FROM parent WHERE par_id = 1;
MySQL deletes the record from the parent table:
mysql> SELECT * FROM parent;
+--------+
| par_id |
+--------+
| 2 |
| 3 |
+--------+
In addition, it cascades the effect of the DELETE statement to the child table:
mysql> SELECT * FROM child;
+--------+----------+
| par_id | child_id |
+--------+----------+
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
+--------+----------+
The preceding example shows how to arrange to have deletion of a parent record cause deletion of any corresponding child records. Another possibility is to let the child records remain in the table but have their foreign key columns set to NULL. To do this, it's necessary to make three changes to the definition of the child table:
- Use ON DELETE SET NULL rather than ON DELETE CASCADE. This tells InnoDB to set the foreign key column (par_id) to NULL instead of deleting the records.
- The original definition of child declares par_id as NOT NULL. That won't work with ON DELETE SET NULL, of course, so the column must be declared NULL instead.
- The original definition of child also declares par_id to be part of a PRIMARY KEY. However, a PRIMARY KEY cannot contain NULL values. Therefore, changing par_id to allow NULL also requires that the PRIMARY KEY be changed to a UNIQUE index. InnoDB UNIQUE indexes enforce uniqueness except for NULL values, which may occur multiple times in the index.
To see the effect of these changes, recreate the parent table using the original definition and load the same initial records into it. Then create the child table using the new definition shown here:
CREATE TABLE child
(
par_id INT NULL,
child_id INT NOT NULL,
UNIQUE (par_id, child_id),
FOREIGN KEY (par_id) REFERENCES parent (par_id) ON DELETE SET NULL
) TYPE = INNODB;
With respect to inserting new records, the child table behaves the same. That is, it allows insertion of records with par_id values found in the parent table, but disallows entry of values that aren't listed there:
mysql> INSERT INTO child (par_id,child_id) VALUES(1,1),(1,2);
mysql> INSERT INTO child (par_id,child_id) VALUES(2,1),(2,2),(2,3);
mysql> INSERT INTO child (par_id,child_id) VALUES(3,1);
mysql> INSERT INTO child (par_id,child_id) VALUES(4,1);
ERROR 1216: Cannot add a child row: a foreign key constraint fails
A difference in behavior occurs when you delete a parent record. Try removing a parent record and then check the contents of the child table to see what happens:
mysql> DELETE FROM parent WHERE par_id = 1;
mysql> SELECT * FROM child;
+--------+----------+
| par_id | child_id |
+--------+----------+
| NULL | 1 |
| NULL | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
+--------+----------+
In this case, the child records that had 1 in the par_id column are not deleted. Instead, the par_id column is set to NULL, as specified by the ON DELETE SET NULL constraint.
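To see why the UNIQUE index matters here, you might continue the example by deleting a second parent record. This is a sketch that assumes the parent and child tables are in the state just shown; no output is reproduced because it was not captured from a running server.

```sql
-- Continues the example above: remove a second parent record.
DELETE FROM parent WHERE par_id = 2;

-- List the orphaned child records. Several of them should now share
-- the same (NULL, child_id) pair, which the UNIQUE index permits
-- because NULL values are exempt from the uniqueness check.
SELECT * FROM child WHERE par_id IS NULL;
```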
Foreign key capabilities did not all appear at the same time, as shown in the following table. The initial foreign key support prevents insertion or deletion of child records that violate key constraints. The other features were added later.
| Feature | Version |
|---|---|
| Basic foreign key support | 3.23.44 |
| ON DELETE CASCADE | 3.23.50 |
| ON DELETE SET NULL | 3.23.50 |
You can infer from the table that for the most complete foreign key feature support, it's best to use a version of MySQL at least as recent as 3.23.50 if at all possible. Another reason to use more recent versions is that the following problems were not rectified until MySQL 3.23.50:
- It is dangerous to use ALTER TABLE or CREATE INDEX to modify an InnoDB table that participates in foreign key relationships in either a parent or child role. The statement removes the foreign key constraints.
- SHOW CREATE TABLE does not show foreign key definitions. This also applies to mysqldump, which makes it problematic to properly restore tables that include foreign keys from backup files.
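Before relying on any of these features, it can help to confirm which server version you are connected to and how it reports a table definition. A minimal sketch, assuming the child table from the earlier examples exists:

```sql
-- Report the server version string.
SELECT VERSION();

-- Display the table definition. Per the list above, servers older than
-- 3.23.50 omit the foreign key clauses from this output.
SHOW CREATE TABLE child;
```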