When you’re storing data in a database, you need to have some way to identify each record that you are storing. Otherwise, it would be impossible to retrieve exactly the right record if it shared the same values as other records. That’s where primary keys are useful.
In database lingo, a primary key is a unique identifier. Each record in a table is assigned a primary key. This means that there will always be at least one difference between each record, so you can retrieve a record individually.
In this guide, we’re going to discuss what primary keys are, how they work, and why they are used in a database. We’ll walk through an example to illustrate a primary key in action.
What is a Primary Key?
A primary key uniquely identifies a record in a table.
Primary keys are usually integer values, and increment by one for each record in the table. That means the first record will have the primary key 1, the second will have the primary key 2, and so on. This number is guaranteed to be unique; no two records can share it.
Often, primary keys are automatically generated by a database management system, such as
Microsoft Access or
PostgreSQL. This ensures that you don’t have to do any leg-work; the keys are already generated for you.
How Do Primary Keys Work?
Imagine that you are creating a database with records on a list of employees employed by a business. These records are stored in a table called
In this table, you store: the name of each employee, their email address, and the department to which they are assigned.
Alongside each record, you decide to store a primary key. This key must be unique to every record. In this case, it may be appropriate to assign a primary key equal to an employee’s ID number. This is because no two employees are likely to have the same ID.
Our table, with a primary key, may look like this:
In this database we have two entries for John Lucas. It just so happens that there are two employees on the sales team called John Lucas. While we can technically identify them by their emails, it’s more appropriate to use their employee IDs. This is because it’s easier to query a database based on a unique set of ID numbers than it is to find out an employee’s email then search for it.
What is a Foreign Key?
A foreign key connects records between two tables in a database.
You may want to reference a record from one table in another table.
For instance, we may want to create a departments table which stores a list of all the departments that operate in the business. We could then update our
EMPLOYEES table to use those IDs. This would allow us to store the department with which every employee is related without having to manually write out each department’s name.
Consider this example:
In this example, we’ve used both primary and foreign keys. The primary keys in our tables are in the ID column. In the employees table, ID is the unique employee ID that has been assigned to them by the business. In the departments table, ID is a number that increments from 1.
Instead of writing in
Marketing as the departments in the
Employees table, we’ve created a connection between the employees table and the departments table. This means that we don’t have to write out full-form department names: we just need to specify the ID of the department with which an employee is associated.
Department is our foreign key. The value stored in the foreign key column must always be a valid record in the departments table, which is the parent table associated with Departments.
How Should You Choose a Primary Key?
The number one rule when choosing a primary key is that it should be relevant. Make sure that the primary key you choose cannot be a duplicate in any way.
We chose the employee ID example from above because the hypothetical employer assigns each employee a unique ID. If these IDs were reassigned to new employees after one left, it would no longer be a good primary key.
Primary keys should be short and simple. Having a short and simple primary key makes it easy to identify records in the database when you are writing queries.
That’s not all! You should also try to use numbers whenever possible. Numbers are easier to work with when you’re writing queries and can be processed faster than character data types.
What is Referential Integrity?
Referential integrity describes a set of rules that ensures the relationship between two or more tables remains accurate even when changes are made to the database.
In our last example, we stored data in two tables: employees and departments. We created a link between these tables using the foreign key Department. This foreign key identifies the department with which an employee is associated.
When you create a foreign key, referential integrity rules kick in. It’s a long and complicated phrase, but it’s important to know when you are working with primary keys.
There are a few rules that you need to know about.
The first is that with referential integrity, if you modify a value in one table, all the records in the table where that primary key is a foreign key will be changed. Let’s say that we change the value of the department with the ID 1 to
Management. This means that whenever we reference that record in our employees table, its value will become management.
Referential integrity also protects you from adding in incorrect details into a table. Consider our last example. If you try to assign an employee to the department with the ID 10, an error will be raised. That’s because there is no department with that ID.
Primary keys are used to uniquely identify records in a table. Primary keys can be referenced in other tables to create a connection between them. This is accomplished by assigning a foreign key within a particular table.
Now you’re ready to start working with primary keys like an expert!