Normalization is a database design technique for reducing data redundancy and improving data integrity. This guide covers the first three normal forms: First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF), with sample SQL code to illustrate each one.
What is Normalization?
Normalization is the process of organizing data in a relational database to minimize redundancy and remove undesirable dependencies between columns, which are the source of insert, update, and delete anomalies. It involves breaking a large table into smaller, related tables and linking them through keys.
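As a rough illustration of that idea, the sketch below splits a hypothetical flat table of staff and departments into two related tables and then joins them back together. All table and column names here (StaffFlat, Departments, Staff, and so on) are invented for the example and are not part of this guide's later samples.

```sql
-- Flat design: the department name is repeated on every staff row
CREATE TABLE StaffFlat (
    StaffID INT PRIMARY KEY,
    StaffName NVARCHAR(255),
    DepartmentName NVARCHAR(255)
);

-- Normalized design: each department is stored once and referenced by key
CREATE TABLE Departments (
    DepartmentID INT PRIMARY KEY,
    DepartmentName NVARCHAR(255) NOT NULL
);

CREATE TABLE Staff (
    StaffID INT PRIMARY KEY,
    StaffName NVARCHAR(255) NOT NULL,
    DepartmentID INT NOT NULL,
    FOREIGN KEY (DepartmentID) REFERENCES Departments (DepartmentID)
);

-- A join reassembles the flat view whenever it is needed
SELECT s.StaffID, s.StaffName, d.DepartmentName
FROM Staff AS s
JOIN Departments AS d ON d.DepartmentID = s.DepartmentID;
```

With this layout, renaming a department means updating one row in Departments instead of every matching row in the flat table.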
First Normal Form (1NF)
1NF requires that a table meets the following criteria:
- Each column in a table must have a unique name.
- Each column must hold atomic (indivisible) values.
- Each entry in a column should be of the same data type.
- Each row must be unique, which is typically enforced with a primary key.
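Here's a sample SQL table that violates 1NF when the Skills column stores several skills in a single value:
-- Table not in 1NF: Skills holds a list such as 'SQL, Python, Excel'
CREATE TABLE Employee (
    EmployeeID INT,
    EmployeeName NVARCHAR(255),
    Skills NVARCHAR(255) -- several values in one column, so not atomic
);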
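One way to bring this design into 1NF is to move the skills into their own table, with one row per employee and skill. The sketch below is illustrative only: the EmployeeSkill table, its SkillName column, and the sample rows are assumptions, not part of the original schema. It also assumes the non-1NF Employee table above has not been created in the same database, or has been dropped first.

```sql
-- Sketch of a 1NF design (EmployeeSkill and SkillName are assumed names)
CREATE TABLE Employee (
    EmployeeID INT PRIMARY KEY,
    EmployeeName NVARCHAR(255) NOT NULL
);

-- One row per (employee, skill) pair, so each column holds a single value
CREATE TABLE EmployeeSkill (
    EmployeeID INT NOT NULL,
    SkillName NVARCHAR(100) NOT NULL,
    PRIMARY KEY (EmployeeID, SkillName),
    FOREIGN KEY (EmployeeID) REFERENCES Employee (EmployeeID)
);

-- Hypothetical sample data: an employee with two skills takes two rows
INSERT INTO Employee (EmployeeID, EmployeeName) VALUES (1, N'Ana');
INSERT INTO EmployeeSkill (EmployeeID, SkillName)
VALUES (1, N'SQL'), (1, N'Python');
```

Because each skill is now its own row, finding everyone with a given skill becomes a simple equality filter on SkillName instead of a string search inside a list.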
Second Normal Form (2NF)
2NF builds upon 1NF and removes partial dependencies, which can arise when a table has a composite primary key. A table is in 2NF if it:
- Is in 1NF.
- Has a primary key that uniquely identifies each row.
- Has non-key columns that all depend on the entire primary key and not on only part of it.
Here's an example of a table in 2NF:
-- Table in 2NF: Quantity depends on the full key (OrderID, ProductID)
CREATE TABLE Orders (
    OrderID INT,
    ProductID INT,
    Quantity INT,
    PRIMARY KEY (OrderID, ProductID)
);
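To see what a partial dependency looks like, consider a hypothetical variant of this table that also stores a product name. The names OrderItemsUnnormalized, Products, OrderItems, and ProductName are assumptions made for illustration. ProductName depends only on ProductID, which is just part of the composite key, so the first table below is not in 2NF; moving the column into its own table is one way to fix that.

```sql
-- Not in 2NF: ProductName depends on ProductID alone (a partial dependency)
CREATE TABLE OrderItemsUnnormalized (
    OrderID INT NOT NULL,
    ProductID INT NOT NULL,
    ProductName NVARCHAR(255),
    Quantity INT,
    PRIMARY KEY (OrderID, ProductID)
);

-- 2NF fix: product attributes live in a table keyed by ProductID alone
CREATE TABLE Products (
    ProductID INT PRIMARY KEY,
    ProductName NVARCHAR(255) NOT NULL
);

-- Only columns that depend on the whole key remain here
CREATE TABLE OrderItems (
    OrderID INT NOT NULL,
    ProductID INT NOT NULL,
    Quantity INT NOT NULL,
    PRIMARY KEY (OrderID, ProductID),
    FOREIGN KEY (ProductID) REFERENCES Products (ProductID)
);
```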
Third Normal Form (3NF)
3NF builds upon 2NF and further reduces data redundancy. A table is in 3NF if it:
- Is in 2NF.
- Has non-key columns that depend only on the primary key and not on other non-key columns (no transitive dependencies).
Here's an example of a table in 3NF:
-- Table in 3NF: CustomerName and City each depend directly on CustomerID
CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    CustomerName NVARCHAR(255),
    City NVARCHAR(255)
);
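A transitive dependency is easier to spot with a counterexample. Suppose, hypothetically, that the table also stored a postal code and that each postal code maps to exactly one city. That mapping is a simplifying assumption, and the PostalCode column and the CustomersUnnormalized, PostalCodes, and CustomersNormalized tables are invented for this sketch. City would then depend on PostalCode rather than directly on CustomerID.

```sql
-- Not in 3NF under the stated assumption: CustomerID -> PostalCode -> City
CREATE TABLE CustomersUnnormalized (
    CustomerID INT PRIMARY KEY,
    CustomerName NVARCHAR(255),
    PostalCode NVARCHAR(20),
    City NVARCHAR(255)
);

-- 3NF fix: City moves to a table keyed by PostalCode
CREATE TABLE PostalCodes (
    PostalCode NVARCHAR(20) PRIMARY KEY,
    City NVARCHAR(255) NOT NULL
);

-- The customer row keeps only the postal code and looks the city up by key
CREATE TABLE CustomersNormalized (
    CustomerID INT PRIMARY KEY,
    CustomerName NVARCHAR(255),
    PostalCode NVARCHAR(20),
    FOREIGN KEY (PostalCode) REFERENCES PostalCodes (PostalCode)
);
```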
What's Next?
You've learned the basics of normalization in SQL Server, including 1NF, 2NF, and 3NF. To become proficient, you can explore higher normal forms such as Boyce-Codd Normal Form (BCNF), database design tools, and denormalization for specific use cases.
Normalization is a crucial concept in database design that helps maintain data integrity and efficiency.