Remove Duplicate Rows In SQL Server Using CTE

In this article, you will learn how to remove duplicate rows in SQL Server using CTE. Database contains duplicate rows which can be filtered out using certain columns. This can happen for many reasons. When we want to show this data in our application, we need to manage these kinds of duplicated rows. So here, I am showing a simple way to remove duplicate rows.

What is CTE?

CTE stands for Common Table Expressions.

We define CTEs by adding a "WITH" clause directly before our SELECT, INSERT, UPDATE, DELETE, or MERGE statement. The WITH clause can include one or more CTEs, as shown in the following syntax.

  1. [WITH <common_table_expression> [,...]]    
  2.    
  3. <common_table_expression>::=  
  4. cte_name [(column_name [,...])]  
  5. AS (cte_query)  

This can be represented like this…

SQL Server

So now, we have come to our main point, i.e., how to remove duplicated rows. Suppose, we have a table called ChecklistVersion which holds ChecklistID and ChecklistVersionID. The table may contain records like this.

ChecklistIdChecklistVersionID
111
212
313
111
111
313

Now, we want to select only one distinct pair. Let's see how we can do that.

Take a look at the SQL code below.

  1. With DataCte  as  
  2. (select *, RANK( )   
  3. over(partition By ChecklistId, ChecklistVersionId order by ChecklistId) as rnk from ChecklitVersion )  
  4. select * from DataCte   

In the above SQL, DataCte is the CTE expression which acts as a temporary View. In the query definition, we are using RANK function and partitioning the table rows with ChecklistID and ChecklistVersionID to assign an occurrence number to each pair. The query will return a result as below.

ChecklistIDChecklistVersionIDrnk
1111
1112
1113
2121
3131
3132

Now, instead of the final SELECT query, we can delete the rows from our temporary result set which has rnk > 1.

  1. With DataCte  as  
  2. (select *, RANK( )   
  3. over(partition By ChecklistId, ChecklistVersionId order by ChecklistId) as rnk from ChecklitVersion )  
  4. select * from DataCte where rnk =1  

Result of the above query.

ChecklistIDChecklistVersionIDrnk
1111
2121
3131

This will just select only one occurrence of each ChecklistID and ChecklistVersionID pair from the ChecklistVersion table.