Deleting Duplicate Records Using CTE

Introduction

In this article, we will understand how to delete duplicate rows and records from a table using Common Table Expression (CTE) in SQL Server.

Let’s begin.

Firstly, we need to create a sample table and see how CTE works on that.

Now, let’s create an Employee table and see how it works.

CREATE TABLE DBO.Employees
(
  Employee_ID INT NOT NULL PRIMARY KEY,
  First_Name VARCHAR (50) NOT NULL,
  Last_Name VARCHAR (50) NOT NULL,
  Manager_ID INT NULL
)
GO

INSERT INTO Employees VALUES (1, 'Ray', 'Menon', NULL)
INSERT INTO Employees VALUES (2, 'Reshu', 'Dutt', 4)
INSERT INTO Employees VALUES (3, 'Ansh', 'Raj', 1)
INSERT INTO Employees VALUES (4, 'Boby', 'Jain', 2)
INSERT INTO Employees VALUES (5, 'Gargi', 'Jain', 2)
INSERT INTO Employees VALUES (6, 'Aman', 'Mittal', 3)
INSERT INTO Employees VALUES (7, 'Ansh', 'Raj', 1)
INSERT INTO Employees VALUES (8, 'Dia', 'Mehra', 5)
INSERT INTO Employees VALUES (9, 'Reshu', 'Dutt', 4)
INSERT INTO Employees VALUES (10, 'Ansh', 'Raj', 1)
GO

Now let’s query the table

SELECT * FROM DBO.Employees;

SELECT

In the above screenshot, we can see the Employees table is created and can see many duplicate records highlighted in blue and red.

Now we will create CTE to delete duplicate records from the Employees table.

WITH CTE AS
(
	SELECT ROW_NUMBER() OVER (PARTITION BY First_Name, Last_Name, Manager_ID ORDER BY Employee_ID) RowNumber
	FROM DBO.Employees
)
DELETE FROM CTE WHERE RowNumber > 1

CTE

We can see this in the above result.

  • Inside the CTE, we used the ROW_NUMBER () function, which assigns unique RowNumber in sequential order and sorts the result based on column Employee_ID.
  • PARTITION BY clause divides the rows based on columns First_Name, Last_Name, and Manager_ID and assigns the Rownumber to each row within its partition, and deletes duplicate records by putting a condition on CTE if RowNumber is more than one.

Now will see on below result set if all the duplicate records are deleted from a table.

Delete from table

As you can see above, duplicate records are gone, and we can see all unique records.

This was the easy and efficient way to clear duplicates.

Summary

As you can see, CTE is a very useful feature in SQL Server. It simplifies the code and deletes duplicate records in one go. I hope you find the article useful. Please share your feedback in the comments section.


Similar Articles