What Is Database Normalization And Denormalization

In this article, I am going to explain the database and different normalization forms and denormalization.

A database is a software system that is used to store information about an organization or an enterprise or user's activities, or any other business activities, for the purposes of using that later.
 

RDBMS

 
RDBMS stands for Relational Database Management System. Relational database management system is based on a relational model. In RDBMS database, data stores in the form of tables. Tables contain rows and columns and maintain relations to other tables. Most used RDBMSs are SQL Server, Oracle, MySQL, etc. In RDBMS, each table has a unique name. Table has columns with unique names and collections of rows which contain the data related to the same entity.
 
For example, the Employee table has stored employee’s data, like EmployeeName, Address, Designation, Salary etc.
 
EmpId
EmployeeName
 Designation
Salary
 Address City  ProjectAssigned
 Emp001
Arvind singh Baghel
 Consultant
1000
 NDA road house no. 1  Pune  ERP
 Emp002
John
 Developer
500
 Fos street lane no. 2 house no. 5  Paris  Billing System
 Emp003
Vicky Singh
 Manager
1200
 Rajeev nagar lane 2 hose 10  Delhi  ERP, Billing System
 Emp004 Rambo  Consultant 1000  s2 street house 4  Redmond  Billing System
 

Database Normalization

 
Database normalization is the process which helps the database designer to design a relational database with minimum redundant data. Normalization is the process of data organization in the form of tables. The normalization process includes dividing data into tables and making relationships among them.
 
Normalization process follows many steps. We generally call them normalization forms and they come one after another but mostly database gets normalized till 3rd normal form.
 
 
1st Normal Form
 
In the 1st normal form data is divided into tables and each row should have unique data or table assign a primary key to uniquely identify the row. Table is called in 1st normal form if, Repeating Groups of columns Eliminated into the separate tables of related attributes. Each row should be unique in the table, or table has a primary key.
 
Let’s take example of above employee table, for 1st normal form, add new column in the table as ProjectId and set EmpId and ProjectId as primary key with combination of two column to uniquely identify row. And repeating columns and multi-value columns can be separate in another table. Project column has multi valued data and if we separate it to another column then it will be Project1, Project2 which represent repeating groups, so we must separate the project data to another table.
 
EmpId
EmpName
Designation
Salary
Address
City
PostalCode
Emp001
Arvind singh Baghel
Consultant
1000
Warje NDA road house 1
Pune
411058
Emp002
John
Developer
500
Avenue du Colonel Henri
Paris
75014
Emp003
Vinod Singh
Manager
1200
Rajeev nagar lane 2
Delhi
110086
Emp004
Rambo
Consultant
1000
NE 97th st
Redmond
98052
 
ProjectDetail table: EmpId and ProjectId can set composite primary key.
 
EmpId
ProjectId
Project
Emp001
E1
ERP
Emp002
BS1
Billing System
Emp003
E1
ERP
Emp003
BS1
Billing System
Emp004
BNS1
Banking System
 
2nd Normal Form
 
For the 2nd normal form table must satisfy all the condition of the 1st normal form and redundant data should be eliminated to the separate table. Table is called in 2nd normal form if, table meet all the criteria of 1st normal form and subset of the data or redundant data is separated into another table and created relationship between these tables using the foreign key.
 
Let’s continue with above table example, The EmployeeProjectDetail table has redundant data and multicolumn primary key. For the 2nd normal form table should not have redundant data and should avoid multi column primary key. Create another table as Project with two columns ProjectName and ProjectId and set ProjectId as primary key.
 
Project table
 
ProjectId
Project
E1
ERP
BS1
Billing System
BNS1
Banking System
 
To make relation of Employee table with Project table create another table with two columns EmployeeId and Project code. EmpId is foreign key reference to Employee table EmpId primary key and ProjectId is foreign key reference to Project table primary key.
 
EmployeeProjectDetails
 
EmpId
ProjectId
Emp001
E1
Emp002
BS1
Emp003
BS1
Emp003
E1
Emp004
BNS1
 
3rd Normal Form
 
For the 3rd normal form table must satisfy all the requirement of the 2nd normal form and Columns which are not directly dependent on the primary key is separated to another table. There should not be transitive functional dependency. Transitive dependency should be removed and separated to another table. If a non-key column depends on another non-key column, that is called a transitive dependency.
Let’s continue with Employee table example, Transitive dependency can be found in the employee table. City column does not directly depend on the EmpId, if postal code change then city also change. It means city depend on postal code. These columns can be separate to another table. Create New table with two columns postalcode named it CityDetails and set postalcode as primary key.
 
CityDetails
 
PostalCode
City
411058
Pune
75014
Paris
110086
Delhi
98052
Redmond
 
Employee table
 
Postal code is foreign key in the Employee table to join with CityDetails table.
 
EmpId
EmpName
Designation
Salary
Address
PostalCode
Emp001
Arvind singh Baghel
Consultant
1000
Warje NDA road house 1
411058
Emp002
John
Developer
500
Avenue du Colonel Henri
75014
Emp003
Vinod Singh
Manager
1200
Rajeev nagar lane 2
110086
Emp004
Rambo
Consultant
1000
NE 97th st
98052
 
Transitive dependency has been removed, but if city change then the address also changes, so create separate table for address with AddressId, Address, Postalcode and EmpId column. Postal code is foreign key in the Address table to join with CityDetails table and EmpId is foreign key to join Employee table.
 
AddressDetails
 
AddressId
Address
PostalCode
EmpId
A001
Warje NDA road house 1
411058
Emp001
A002
Avenue du Colonel Henri
75014
Emp002
A003
Rajeev nagar lane 2
110086
Emp003
A004
NE 97th st
98052
Emp004
 
Update Employee table remove Address and postalCode from the table.
 
Employee table
 
EmpId
EmpName
Designation
Salary
Emp001
Arvind singh Baghel
Consultant
1000
Emp002
John
Developer
500
Emp003
Vinod Singh
Manager
1200
Emp004
Rambo
Consultant
1000
 
The database can normalize more in 4th Normal Form also called Boyce-Code Normal Form, 5th Normal Form, 6th normal form, but these are rarely used.
 

Denormalization

 
Denormalization is the opposite process of normalization. Denormalization is used on the normalized database to increase the performance by combining tables to reduce the joins so that the data can be fetched without joins, which increases the performance of data fetching.