SQL Server tables should never contain duplicate rows, if there is a key
constraints. Table with no key constraints allow the duplicates. It happens in
many ways such as import data from other resources.
Nothing to worry, write a query and make your database without duplicates.
First we need a table and some data to explain the process. Use the following
script to create a table and insert a record.
CREATE TABLE Dup_Students (ID INT, FirstName varchar(25), Department Char(2))
CREATE TABLE Dup_Students (ID
INT, FirstName
varchar(25),
Department Char(2))
INSERT INTO Dup_Students
VALUES(1, 'Jack', 'IT')
INSERT INTO Dup_Students
VALUES(2, 'Alice', 'ME')
INSERT INTO Dup_Students
VALUES(3, 'James', 'EE')
INSERT INTO Dup_Students
VALUES(4, 'Nickle', 'CE')
INSERT INTO Dup_Students
VALUES(5, 'George', 'IT')
--
INSERT INTO Dup_Students
VALUES(1, 'Jack', 'IT')
INSERT INTO Dup_Students
VALUES(2, 'Alice', 'ME')
INSERT INTO Dup_Students
VALUES(3, 'James', 'EE')
INSERT INTO Dup_Students
VALUES(4, 'Nickle', 'CE')
--
INSERT INTO Dup_Students
VALUES(1, 'Jack', 'IT')
INSERT INTO Dup_Students
VALUES(2, 'Alice', 'ME')
Now you can view the number of duplicate records in each row.
SELECT
ID, FirstName,
Department, Count(*) as DuplicateCount
From
Dup_Students
group by ID,FirstName,
Department
Before deleting you can view the records with row number for the duplicates.
This result set is going to act as a temp table for the delete process.
SELECT
ID, FirstName,
Department,
ROW_NUMBER() OVER(PARTITION BY ID, FirstName
ORDER BY ID) AS DuplicateRowCount
FROM
Dup_Students
We are going to delete all the duplicates using CTE (Common table expression)
and ROW_NUMBER(), which is a new in SQL server 2005.
WITH
Dup_Students_CTE (ID,
FirstName, Department,
DuplicateRowCount)
AS
(
SELECT
ID, FirstName,
Department,
ROW_NUMBER() OVER(PARTITION BY ID, FirstName
ORDER BY ID) AS DuplicateRowCount
FROM
Dup_Students
)
DELETE FROM Dup_Students_CTE
WHERE
DuplicateRowCount > 1
GO