SQL Server Queries - Delete duplicate rows

Deleting Duplicate Rows in SQL Server

In SQL Server, deleting duplicate rows can be a challenging task, especially when dealing with large datasets. Here are a few methods to delete duplicate rows, along with examples.

Method 1: Using ROW_NUMBER() Function

This method uses the ROW_NUMBER() function to assign a unique number to each row within a partition of the result set. We can then delete the duplicate rows based on the row number.

WITH duplicates AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY column1, column2,... ORDER BY column1) AS row_num FROM your_table ) DELETE FROM duplicates WHERE row_num > 1;

Example:

Let's say we have a table called Orders with duplicate rows.

OrderID	CustomerID	OrderDate
1	1	2022-01-01
2	1	2022-01-01
3	2	2022-01-05
4	2	2022-01-05

WITH duplicates AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY CustomerID, OrderDate ORDER BY OrderID) AS row_num FROM Orders ) DELETE FROM duplicates WHERE row_num > 1;

After executing this query, the duplicate rows will be deleted, leaving only one row for each unique combination of CustomerID and OrderDate.

Method 2: Using Common Table Expression (CTE)

This method uses a Common Table Expression (CTE) to identify duplicate rows and then delete them.

WITH duplicates AS ( SELECT column1, column2,..., COUNT(*) OVER (PARTITION BY column1, column2,...) AS count FROM your_table ) DELETE FROM duplicates WHERE count > 1;

Example:

Using the same Orders table as before, we can delete duplicate rows using a CTE.

WITH duplicates AS ( SELECT CustomerID, OrderDate, COUNT(*) OVER (PARTITION BY CustomerID, OrderDate) AS count FROM Orders ) DELETE FROM duplicates WHERE count > 1;

After executing this query, the duplicate rows will be deleted, leaving only one row for each unique combination of CustomerID and OrderDate.

Method 3: Using Self-Join

This method uses a self-join to identify duplicate rows and then delete them.

DELETE a FROM your_table a INNER JOIN ( SELECT column1, column2,... FROM your_table GROUP BY column1, column2,... HAVING COUNT(*) > 1 ) b ON a.column1 = b.column1 AND a.column2 = b.column2 AND... WHERE a.column1 < b.column1 OR (a.column1 = b.column1 AND a.column2 < b.column2);

Example:

Using the same Orders table as before, we can delete duplicate rows using a self-join.

DELETE a FROM Orders a INNER JOIN ( SELECT CustomerID, OrderDate FROM Orders GROUP BY CustomerID, OrderDate HAVING COUNT(*) > 1 ) b ON a.CustomerID = b.CustomerID AND a.OrderDate = b.OrderDate WHERE a.OrderID < b.OrderID;

After executing this query, the duplicate rows will be deleted, leaving only one row for each unique combination of CustomerID

SQL Server Queries - Delete duplicate rows

Deleting Duplicate Rows in SQL Server

Method 1: Using ROW_NUMBER() Function

Example:

Method 2: Using Common Table Expression (CTE)

Example:

Method 3: Using Self-Join

Example:

You may also like

SQL Server Queries - Select N percent of random numbers

SQL Server Queries - Check Language

SQL Server Queries - Get Current Month Records

SQL Server Queries - Reset identity seed after deleting records in SQL Server

SQL Server Queries - Get Name of Day

SQL Server Queries - CASE Expression

Popular Post

SQL Server Queries - Delete duplicate rows

Deleting Duplicate Rows in SQL Server

Method 1: Using ROW_NUMBER() Function

Example:

Method 2: Using Common Table Expression (CTE)

Example:

Method 3: Using Self-Join

Example:

You may also like

SQL Server Queries - Select N percent of random numbers

SQL Server Queries - Check Language

SQL Server Queries - Get Current Month Records

SQL Server Queries - Reset identity seed after deleting records in SQL Server

SQL Server Queries - Get Name of Day

SQL Server Queries - CASE Expression

Categories

Popular Post