How to Remove Duplicate Records in SQL?
Abhishek Yadav
Select an image from your device to upload
Solution:
WITH cte AS (SELECT *,ROW_NUMBER() OVER (PARTITION BY StudentName, Subject, MarksORDER BY Id) AS rnFROM StudentExams ) DELETE FROM cte WHERE rn > 1;
Explanation:
PARTITION BY groups similar rows.
PARTITION BY
ROW_NUMBER() assigns a sequence number to each row within the group.
ROW_NUMBER()
Rows with rn > 1 are duplicates and are removed.
rn > 1
To remove duplicate records in SQL, you can use a query with functions like ROW_NUMBER() to identify duplicate rows based on specific columns, then delete the extra records while keeping one unique entry. This is commonly done using a Common Table Expression (CTE) that assigns a row number to each record within a group of duplicates, allowing you to safely remove the unwanted rows without losing valid data.
Removing duplicate records in SQL depends on how you define a duplicate and which database you’re using (SQL Server, MySQL, Oracle, PostgreSQL). Below are the most common and practical approaches, explained step-by-step.
1️⃣ Identify Duplicate Records (Best Practice – First Step)
Employee(Id, Name, Email)
SELECT Email, COUNT(*) AS Total FROM Employee GROUP BY Email HAVING COUNT(*) > 1;
✅ Works in SQL Server, MySQL 8+, PostgreSQL, Oracle
WITH CTE AS (SELECT *,ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Id DESC) AS rnFROM Employee ) DELETE FROM CTE WHERE rn > 1;
✔ Keeps one record per Email✔ Deletes all extra duplicates
DELETE e1 FROM Employee e1 JOIN Employee e2 ON e1.Email = e2.Email AND e1.Id > e2.Id;
✔ Keeps record with lowest Id
DISTINCT
⚠ Use when table has no primary key
CREATE TABLE Employee_New ASSELECT DISTINCT *FROM Employee;DROP TABLE Employee;ALTER TABLE Employee_New RENAME TO Employee;
GROUP BY
DELETE FROM Employee WHERE Id NOT IN (SELECT MIN(Id)FROM EmployeeGROUP BY Email );
✔ Keeps one row per Email ❌ Slower for large tables
ALTER TABLE Employee ADD CONSTRAINT UQ_Employee_Email UNIQUE (Email);
Now duplicates can’t be inserted again 🚀
SELECT *INTO #TempEmployee FROM (SELECT *,ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Id) rnFROM Employee ) t WHERE rn = 1;TRUNCATE TABLE Employee;INSERT INTO Employee SELECT * FROM #TempEmployee;
UNIQUE constraint