Reader Level:
Article

Remove Duplicate Records in SQL Server 2005

By Srividhya Sridhar on Jul 01, 2010
In this article you will learn how to Remove duplicate records in SQL Server 2005.

SQL Server tables should never contain duplicate rows, if there is a key constraints. Table with no key constraints allow the duplicates. It happens in many ways such as import data from other resources.

Nothing to worry, write a query and make your database without duplicates.

First we need a table and some data to explain the process. Use the following script to create a table and insert a record.

CREATE TABLE Dup_Students (ID INT, FirstName varchar(25), Department Char(2))

CREATE TABLE Dup_Students (ID INT, FirstName varchar(25), Department Char(2))
 
INSERT INTO Dup_Students VALUES(1, 'Jack', 'IT')
INSERT INTO Dup_Students VALUES(2, 'Alice', 'ME')
INSERT INTO Dup_Students VALUES(3, 'James', 'EE')
INSERT INTO Dup_Students VALUES(4, 'Nickle', 'CE')
INSERT INTO Dup_Students VALUES(5, 'George', 'IT')
--
INSERT INTO Dup_Students VALUES(1, 'Jack', 'IT')
INSERT INTO Dup_Students VALUES(2, 'Alice', 'ME')
INSERT INTO Dup_Students VALUES(3, 'James', 'EE')
INSERT INTO Dup_Students VALUES(4, 'Nickle', 'CE')

--
INSERT INTO Dup_Students VALUES(1, 'Jack', 'IT')
INSERT INTO Dup_Students VALUES(2, 'Alice', 'ME')

Now you can view the number of duplicate records in each row.

SELECT ID, FirstName, Department, Count(*) as DuplicateCount
From Dup_Students
group by ID,FirstName, Department

Before deleting you can view the records with row number for the duplicates. This result set is going to act as a temp table for the delete process.

SELECT ID, FirstName, Department,
ROW_NUMBER() OVER(PARTITION BY ID, FirstName ORDER BY ID) AS DuplicateRowCount
FROM Dup_Students

We are going to delete all the duplicates using CTE (Common table expression) and ROW_NUMBER(), which is a new in SQL server 2005.

WITH Dup_Students_CTE (ID, FirstName, Department, DuplicateRowCount)
AS
(
SELECT ID, FirstName, Department,
ROW_NUMBER() OVER(PARTITION BY ID, FirstName ORDER BY ID) AS DuplicateRowCount
FROM Dup_Students
)
DELETE FROM Dup_Students_CTE
WHERE DuplicateRowCount > 1
GO

COMMENT USING