Delete Duplicate Rows In SQL Server From A Table

Introduction

 
In this article, I am going to explain how to delete duplicate rows/records in SQL server using common table expression (CTE). This is one of the most common questions asked in a SQL interview.
 
Here we will be using SQL Server 2017 or you can use SQL Server 2008 or above.
 
Read my previous Joins in SQL Server 2017 part of this article using the below links,
Prerequisites
 
SQL Server 2017 or you can use SQL server 2008 or above version.
 
Now, first we will create a Database and a table.
 

Creating a Database and a Table

 
Step 1: Create a Database
 
Open your SQL Server and use the following script to create the “chittadb” Database.
 
Create database chittadb
 
Now, select the script query then press F5 or click on Execute button to execute the above script.
 
You should see a message, “Command(s) completed successfully.” This means your new database has been created.
 
Step 2: Create a table
 
Open your SQL Server and use the following script to create table “tbl_Mcastudents”.
  1. create table tbl_Mcastudents   
  2. (  
  3.    Id int primary key not null identity(1,1),  
  4.    Name nvarchar(50),  
  5.    Location nvarchar(30),  
  6.    Gender varchar(10)  
  7. )  
Execute the above query to create “tbl_Mcastudents “.
 
You should see a message, “Command(s) completed successfully.”
 
Now, data has been  inserted into the table.
  1. Insert into tbl_Mcastudents values ('Chitta''Chennai''Male')  
  2. Insert into tbl_Mcastudents values ('Chitta''Chennai''Male')  
  3. Insert into tbl_Mcastudents values ('Chitta''Chennai''Male')  
  4. Insert into tbl_Mcastudents values ('Rani''Puri''Female')  
  5. Insert into tbl_Mcastudents values ('Rani''Puri''Female')  
  6. Insert into tbl_Mcastudents values ('Mitu''BBSR''Male')  
  7. Insert into tbl_Mcastudents values ('Mitu''BBSR''Male')  
  8. Insert into tbl_Mcastudents values ('Mitu''BBSR''Male')   
Execute the above query, you should see a message, “Command(s) completed successfully.”
 
Now retrieve all data from “tbl_Mcastudents” table.
  1. select * from tbl_Mcastudents   
Output
 
Delete Duplicate Rows In SQL Server From A Table
 
There are many duplicate rows (10, 11, 12), (13, 14), and (15, 16, 17) for the tbl_Mcastudents that have the same Name, Location, and Gender.
 

Delete duplicate rows/records in SQL server using common table expression (CTE)

 
To delete the duplicate rows from the table in SQL Server, we follow these steps,
  • Find duplicate rows using GROUP BY clause or ROW_NUMBER()
  • Use DELETE statement to remove the duplicate rows.
Query
  1. WITH cte AS  
  2. SELECT Id, Name, Location, Gender, ROW_NUMBER() OVER (PARTITION BY Name, Location, Gender  
  3. ORDER BY Name, Location, Gender) row_num FROM tbl_Mcastudents  
  4. )  
  5. DELETE FROM cte WHERE row_num > 1;  
In above query,
  • First, the CTE uses the ROW_NUMBER() function to find the duplicate rows specified by values in the Name, Location, and Gender
  • Then, the DELETEstatement deletes all the duplicate rows but keeps only one occurrence of each duplicate group.
To execute the above query you should see a message,
 
(5 rows affected) 
 
The above message indicates that the duplicate rows have been removed from the table.
 
Now retrieve all data from “tbl_Mcastudents” table after the duplicate rows have been deleted.
  1. select * from tbl_Mcastudents  
Output
 
Delete Duplicate Rows In SQL Server From A Table
 

Conclusion

 
In this article, we have learned how to delete duplicate rows from a table in SQL Server.