Normalization In SQL

What Is Normalization?

The process of removing data redundancy and improving data integrity in a table is known as normalization. It is a multi-step process that converts data from relational tables into the tabular format and removes duplicate data.

In 1970 Edgar F. Codd defined the First Normal Form.

Let’s understand the types of Normal forms with the help of examples.

1st Normal Form (1NF)

If a table’s atomicity is 1, it is said to be in its First Normal Form.

Atomicity means a  single cell cannot hold multiple values.  It can only have one attribute with a single value.

The First normal form not allowed the multi-valued attribute, composite attribute, and their combinations.

Let’s understand with the help of an example.

The PhoneNumber column in the table contains two values. As a result, it does not follow the First Normal Form.

By applying the First Normal Form, you achieve atomicity, and also every column has unique values as shown below the result table.

Learn about Candidate Key and Super Key before moving on to the Second Normal Form.

Candidate Key

A candidate key is a collection of one or more columns that can be used to uniquely identify a record in a table, and YOU can use each candidate key as a Primary Key.

Super Key

The Super Key is a collection of more than one key that can uniquely identify a record in a table, and the Primary Key is a subset of the Super Key.

2nd Normal Form (2NF)

The first requirement for the table to be in Second Normal Form is that it be in First Normal Form. There should be no partial dependencies in the table.

What is Partial Dependency

If the proper subset of the candidate key determines non-prime attributes (attributes that are not part of any candidate key), it is called partial dependency.

Let’s Understand the  2nd NF with the help of an example.

This table has a composite primary key consisting of Employee ID and Department ID. Office Location is a non-critical attribute. In this case, Office Location is solely determined by Department ID, which is only a component of the primary key. As a result, this table fails to satisfy the second Normal Form.

To convert this table to Second Normal Form, we must divide it into two parts. This yields the following tables.

As you can see, we have removed the initial partial functional dependency. Now, in the table, the column Office Location is completely dependent on the table’s primary key, which is Department ID.

3rd Normal Form (3NF)

The first requirement for the table to be in the Third Normal Form is that it be in the Second Normal Form. The table has to be in 2NF before proceeding to 3NF.
The second condition is that there should be no transitive dependency for non-prime attributes, which means that non-prime attributes (those that are not part of the candidate key) should not be dependent on other non-prime attributes in a table.

Transitive dependency is a functional dependency in which X → Z (X determines Z) indirectly, through X → Y and Y → Z(where Y → X does not exist).

Let’s Understand this with the help of an example.

Student ID determines Subject ID in the above table, and Subject ID determines Subject.

As a result, Student ID determines Subject through Subject ID. This implies a transitive functional dependency, and the structure does not satisfy the third normal form.

To achieve the third normal form, we must divide the table as shown below:

As shown in the preceding tables, all non-key attributes are now fully functional and reliant solely on the primary key. Columns Student Name, Subject ID, and Address in the first table are only dependent on Student ID. The subject is only dependent on the Subject ID in the second table.

Boyce Codd Normal Form (BCNF)

BCNF is also known as 3.5 NF.

It is the improved version of 3NF and was created by Raymond F. Boyce and Edgar F. Codd to address certain types of anomalies that 3NF did not address.

The table must satisfy the 3rd Normal Form before proceeding to BCNF.

Every Right-Hand Side (RHS) attribute of the functional dependencies should be dependent on the table’s super key.

In BCNF, if every functional dependency A → B, then A must be the table’s Super Key.

Let’s Understand this with the help of an example.

The following conditions apply to the subject table:

  • Each student can enroll in multiple subjects.
  • Multiple professors can teach a particular subject.
  • It assigns a professor for each subject to the student.

Except for BCNF, all of the normal forms are satisfied in this table.?

As you can see, the primary key consists of Student ID and Subject, indicating that the Subject column is a primary attribute. But there is one more requirement, Professor Subject.

Furthermore, while Subject is a prime attribute, Professor is a non-prime attribute that is not permitted by BCNF.

To satisfy the BCNF, we will divide the table into two sections. One table will contain the already existing column Student ID as well as the newly created column Professor ID.

The second table contains the columns Professor ID, subject, and professor, which satisfy the BCNF.

We have satisfied the Boyce Codd Normal Form.

Submit a Comment

Your email address will not be published.

Subscribe

Select Categories