Deleting Duplicate Records in SQL Server
In SQL Server, deleting duplicate records can be a challenging task, especially when dealing with large datasets. Here are a few methods to delete duplicate records, along with examples.
Method 1: Using ROW_NUMBER() Function
This method uses the ROW_NUMBER() function to assign a unique number to each row within a partition of the result set. We can then delete the duplicate records based on the row number.
WITH duplicates AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY column1, column2, ... ORDER BY column1) AS row_num FROM your_table ) DELETE FROM duplicates WHERE row_num > 1;
Example:
Let's say we have a table called Customers
with duplicate records based on the Name
and Email
columns.
CustomerID | Name | |
---|---|---|
1 | John Smith | john.smith@example.com |
2 | John Smith | john.smith@example.com |
3 | Jane Doe | jane.doe@example.com |
4 | Jane Doe | jane.doe@example.com |
WITH duplicates AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY Name, Email ORDER BY CustomerID) AS row_num FROM Customers ) DELETE FROM duplicates WHERE row_num > 1;
After executing this query, the duplicate records will be deleted, leaving only one record for each unique combination of Name
and Email
.
Method 2: Using Self-Join
This method uses a self-join to identify duplicate records and then delete them.
DELETE a FROM your_table a INNER JOIN ( SELECT column1, column2, ... FROM your_table GROUP BY column1, column2, ... HAVING COUNT(*) > 1 ) b ON a.column1 = b.column1 AND a.column2 = b.column2 AND ... WHERE a.column1 < b.column1 OR (a.column1 = b.column1 AND a.column2 < b.column2);
Example:
Using the same Customers
table as before, we can delete duplicate records using a self-join.
DELETE a FROM Customers a INNER JOIN ( SELECT Name, Email FROM Customers GROUP BY Name, Email HAVING COUNT(*) > 1 ) b ON a.Name = b.Name AND a.Email = b.Email WHERE a.CustomerID < b.CustomerID;
After executing this query, the duplicate records will be deleted, leaving only one record for each unique combination of Name
and Email
.