Use Of Union, Intersect, Distinct And Except In LINQ

Introduction

In this article you will learn how to use Union, Intersect, Except, and Distinct in LINQ, and how they differ from one another. Each operator is illustrated with examples.

Prerequisites
  • Visual Studio
  • Basics of LINQ and C# 
Article flow 
  • Use of Union
  • Use of Intersect
  • Use of Except
  • Use of Distinct
  • Difference between Union, Intersect, Distinct, and Except
Use of Union

Union is a LINQ extension method that merges two collections. The merged result holds only the distinct elements of both collections: a value that occurs in both, or more than once in either, is returned once.

First, a simple example with two integer lists:

    List<int> collectionOne = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    List<int> collectionTwo = new List<int>() { 1, 2, 3, 11, 12, 13, 14, 15 };
    var unionResult = collectionOne.Union(collectionTwo);
    foreach (var item in unionResult)
    {
        Console.Write(item + " ");
    }

Now run the application. The output is:

    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Detailed Description

In the result above every element appears once, because Union removed the values 1, 2, and 3 that occur in both lists. That is the purpose of Union: it returns the distinct elements of both collections. Next, a more complex example. The Employee class below is used to build two lists of employees:

    public class Employee
    {
        public int ID { get; set; }
        public string Name { get; set; }
        public string Designation { get; set; }
        public double Salary { get; set; }
    }

    List<Employee> employeeCollection1 = new List<Employee>()
    {
        new Employee() { ID = 1, Name = "Gnanavel Sekar", Designation = "Software Engineer", Salary = 150000 },
        new Employee() { ID = 2, Name = "Subash S", Designation = "Software Engineer", Salary = 170000 },
        new Employee() { ID = 3, Name = "Robert A", Designation = "Application Developer", Salary = 180000 },
        new Employee() { ID = 4, Name = "Ammaiyappan", Designation = "Software Developer", Salary = 120000 }
    };

    List<Employee> employeeCollection2 = new List<Employee>()
    {
        new Employee() { ID = 1, Name = "Gnanavel Sekar", Designation = "Software Engineer", Salary = 150000 },
        new Employee() { ID = 3, Name = "Robert A", Designation = "Application Developer", Salary = 180000 },
        new Employee() { ID = 4, Name = "Ammaiyappan", Designation = "Software Developer", Salary = 120000 },
        new Employee() { ID = 5, Name = "Ramar A", Designation = "Tech Lead", Salary = 200000 }
    };

    var unionComplexResult = employeeCollection1.Union(employeeCollection2);
    Console.WriteLine(System.Environment.NewLine + "Complex Union");
    foreach (var item in unionComplexResult)
    {
        Console.WriteLine("ID :" + item.ID + " Name :" + item.Name + " Designation :" + item.Designation + " Salary :" + item.Salary);
    }

Now run your application to see the result

 

The result lists all eight entries, so employees 1, 3, and 4 appear twice. Union did not remove these duplicates because Employee is a class that does not override Equals and GetHashCode. The default comparer therefore treats two Employee objects as equal only when they are the same instance, whatever their field values are. To compare by field values, pass an implementation of the IEqualityComparer interface to Union.
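
The default comparison behind those duplicates can be observed in isolation. The following is a minimal sketch, assuming a cut-down Employee class with only ID and Name; the ReferenceEqualityDemo class and its method name are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down version of the article's Employee class (assumption: two fields are enough here).
public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class ReferenceEqualityDemo
{
    // Counts the elements Union keeps when no comparer is supplied.
    public static int UnionCountWithoutComparer()
    {
        var first = new List<Employee> { new Employee { ID = 1, Name = "Gnanavel Sekar" } };
        var second = new List<Employee> { new Employee { ID = 1, Name = "Gnanavel Sekar" } };
        return first.Union(second).Count();
    }

    public static void Main()
    {
        var a = new Employee { ID = 1, Name = "Gnanavel Sekar" };
        var b = new Employee { ID = 1, Name = "Gnanavel Sekar" };

        Console.WriteLine(a.Equals(b)); // False: same values, but two different objects
        Console.WriteLine(a.Equals(a)); // True: same instance
        Console.WriteLine(UnionCountWithoutComparer()); // 2: both objects are kept
    }
}
```

Because Equals falls back to reference comparison here, Union sees two different elements even though every field matches.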

Union with IEqualityComparer

An IEqualityComparer implementation specifies how two elements are compared. The interface has two methods, Equals and GetHashCode, and both must be implemented. The EmployeeComparer class below does so:

    public class EmployeeComparer : IEqualityComparer<Employee>
    {
        // Two employees are equal when the ID matches and the Name matches, ignoring case.
        public bool Equals(Employee emp1, Employee emp2)
        {
            return emp1.ID == emp2.ID
                && string.Equals(emp1.Name, emp2.Name, StringComparison.OrdinalIgnoreCase);
        }

        // Equal employees must return the same hash code, so the hash uses the ID only.
        public int GetHashCode(Employee obj)
        {
            return obj.ID.GetHashCode();
        }
    }

The class above treats two employees as equal when their ID and Name match; the name comparison ignores case.

Now pass an instance of EmployeeComparer as the second argument of the Union extension method:

 
    var unionComplexResultWithIEquality = employeeCollection1.Union(employeeCollection2, new EmployeeComparer());
    foreach (var item in unionComplexResultWithIEquality)
    {
        Console.WriteLine("ID :" + item.ID + " Name :" + item.Name + " Designation :" + item.Designation + " Salary :" + item.Salary);
    }

Now see the result with IEqualityComparer: each employee (IDs 1 to 5) appears exactly once.

To summarize, the Union operator in LINQ merges two collections and returns their distinct elements. For collections of a custom class such as Employee, we need to supply an IEqualityComparer so that elements are compared by their values rather than by reference.
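
One way to see what "merge and keep distinct elements" means is to compare Union with Concat, the LINQ method that merges without removing anything. The sketch below uses shortened integer lists; the UnionVersusConcat class and its member names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionVersusConcat
{
    static readonly List<int> One = new List<int> { 1, 2, 3, 4, 5 };
    static readonly List<int> Two = new List<int> { 1, 2, 3, 11, 12 };

    // Concat keeps every element, duplicates included.
    public static List<int> Concatenated() => One.Concat(Two).ToList();

    // Union drops duplicates while merging.
    public static List<int> Merged() => One.Union(Two).ToList();

    // Concat followed by Distinct yields the same elements as Union.
    public static List<int> ConcatThenDistinct() => One.Concat(Two).Distinct().ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Concatenated()));       // 1 2 3 4 5 1 2 3 11 12
        Console.WriteLine(string.Join(" ", Merged()));             // 1 2 3 4 5 11 12
        Console.WriteLine(string.Join(" ", ConcatThenDistinct())); // 1 2 3 4 5 11 12
    }
}
```

If duplicates should be preserved, Concat is the operator to reach for; Union is the choice when each value should appear once.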

Use of Intersect

Intersect is an extension method that takes two collections and returns the elements common to both. First, a simple example with two integer collections:

    List<int> collectionOne = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    List<int> collectionTwo = new List<int>() { 1, 2, 3, 11, 12, 13, 14, 15 };
    Console.WriteLine(System.Environment.NewLine + "Intersect");
    var IntersectResult = collectionOne.Intersect(collectionTwo);
    foreach (var item in IntersectResult)
    {
        Console.WriteLine(item);
    }

Result

 

The result contains only 1, 2, and 3, the values common to both collections. Now let's try Intersect on collections of a custom type, reusing the Employee class and the employeeCollection1 and employeeCollection2 lists defined in the Union section:

    var intersectComplexResult = employeeCollection1.Intersect(employeeCollection2);
    Console.WriteLine(System.Environment.NewLine + "Complex Intersect");
    foreach (var item in intersectComplexResult)
    {
        Console.WriteLine("ID :" + item.ID + " Name :" + item.Name + " Designation :" + item.Designation + " Salary :" + item.Salary);
    }

Now see the result.

Intersect returns nothing. Why? As with Union, the default comparer checks whether two Employee objects are the same instance, and the two lists share no instance, so no element counts as common. Simple values such as integers are compared by value, which is why the first example worked. We can overcome this by using an IEqualityComparer that specifies the fields to compare.

Intersect with IEqualityComparer 

    public class EmployeeComparer : IEqualityComparer<Employee>
    {
        // Two employees are equal when the ID matches and the Name matches, ignoring case.
        public bool Equals(Employee emp1, Employee emp2)
        {
            return emp1.ID == emp2.ID
                && string.Equals(emp1.Name, emp2.Name, StringComparison.OrdinalIgnoreCase);
        }

        // Equal employees must return the same hash code, so the hash uses the ID only.
        public int GetHashCode(Employee obj)
        {
            return obj.ID.GetHashCode();
        }
    }

The EmployeeComparer class above implements the IEqualityComparer interface and compares employees by ID and Name. Pass an instance of it to Intersect:

    var intersectEqualityComplexResult = employeeCollection1.Intersect(employeeCollection2, new EmployeeComparer());
    Console.WriteLine(System.Environment.NewLine + "Complex Intersect with Equality");
    foreach (var item in intersectEqualityComplexResult)
    {
        Console.WriteLine("ID :" + item.ID + " Name :" + item.Name + " Designation :" + item.Designation + " Salary :" + item.Salary);
    }

Result

 

With the comparer, Intersect returns the employees present in both collections (IDs 1, 3, and 4). To summarize, Intersect returns the elements common to both collections, and collections of a custom class need an IEqualityComparer so that elements are compared by value.
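
When a comparer treats differently written values as equal, it helps to know which copy Intersect hands back. The sketch below suggests that the returned element comes from the first collection. It uses the built-in StringComparer.OrdinalIgnoreCase on plain strings instead of EmployeeComparer to stay short; the IntersectSourceDemo class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectSourceDemo
{
    // Returns the names common to both lists, ignoring case.
    public static List<string> CommonNames()
    {
        var first = new List<string> { "ROBERT A", "Subash S" };
        var second = new List<string> { "robert a", "Ramar A" };
        return first.Intersect(second, StringComparer.OrdinalIgnoreCase).ToList();
    }

    public static void Main()
    {
        // The match is reported with the spelling used in the first list.
        Console.WriteLine(string.Join(", ", CommonNames())); // ROBERT A
    }
}
```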

Use of Except 

Except is an extension method that returns the elements of the first collection that do not exist in the second collection. Like Union and Intersect, it takes two collections. Let's see an example:

    List<int> collectionOne = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    List<int> collectionTwo = new List<int>() { 1, 2, 3, 11, 12, 13, 14, 15 };
    Console.WriteLine(System.Environment.NewLine + "Except");
    var exceptResult = collectionOne.Except(collectionTwo);
    foreach (var item in exceptResult)
    {
        Console.WriteLine(item + " ");
    }

Result

 

The result contains 4 through 10, the values of the first collection that do not exist in the second. Now let's apply Except to collections of a custom type, reusing the employeeCollection1 and employeeCollection2 lists defined in the Union section:

    var ExceptComplexResult = employeeCollection1.Except(employeeCollection2);
    Console.WriteLine(System.Environment.NewLine + "Complex Except");
    foreach (var item in ExceptComplexResult)
    {
        Console.WriteLine("ID :" + item.ID + " Name :" + item.Name + " Designation :" + item.Designation + " Salary :" + item.Salary);
    }

Result

 

The result is wrong: all four employees of the first collection are returned, although employees 1, 3, and 4 also exist in the second. Once again the default comparer compares Employee objects by reference, so it finds no match between the lists. To overcome this we need an IEqualityComparer. The EmployeeComparer class below implements the interface and compares the relevant fields.

Except with IEqualityComparer

    public class EmployeeComparer : IEqualityComparer<Employee>
    {
        // Two employees are equal when the ID matches and the Name matches, ignoring case.
        public bool Equals(Employee emp1, Employee emp2)
        {
            return emp1.ID == emp2.ID
                && string.Equals(emp1.Name, emp2.Name, StringComparison.OrdinalIgnoreCase);
        }

        // Equal employees must return the same hash code, so the hash uses the ID only.
        public int GetHashCode(Employee obj)
        {
            return obj.ID.GetHashCode();
        }
    }

Pass an instance of EmployeeComparer to Except as follows:

    // Except with IEqualityComparer
    var ExceptComplexResultWithIEquality = employeeCollection1.Except(employeeCollection2, new EmployeeComparer());
    foreach (var item in ExceptComplexResultWithIEquality)
    {
        Console.WriteLine("ID :" + item.ID + " Name :" + item.Name + " Designation :" + item.Designation + " Salary :" + item.Salary);
    }

Result

 

Now the result is correct: only employee 2 (Subash S) is returned, the one element of the first collection that doesn't exist in the second collection.
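
Because Except always starts from the first collection, swapping the two collections changes the result. The sketch below shows both directions on the integer lists from the simple example; the ExceptDirectionDemo class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDirectionDemo
{
    static readonly List<int> One = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    static readonly List<int> Two = new List<int> { 1, 2, 3, 11, 12, 13, 14, 15 };

    // Elements of One that are not in Two.
    public static List<int> OneExceptTwo() => One.Except(Two).ToList();

    // Elements of Two that are not in One.
    public static List<int> TwoExceptOne() => Two.Except(One).ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", OneExceptTwo())); // 4 5 6 7 8 9 10
        Console.WriteLine(string.Join(" ", TwoExceptOne())); // 11 12 13 14 15
    }
}
```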

Use of Distinct 

Distinct returns the unique elements of a single collection. To demonstrate, here are two lists, one of strings and one of integers:

    List<string> nameList = new List<string>() { "Gnanavel Sekar", "Subash S", "Robert", "Gnanavel Sekar", "Subash S" };
    List<int> idList = new List<int>() { 1, 2, 3, 2, 4, 4, 3, 5 };
    var stringCollection = nameList.Distinct();
    foreach (var str in stringCollection)
    {
        Console.WriteLine(str);
    }
    var integerCollection = idList.Distinct();
    foreach (var i in integerCollection)
    {
        Console.WriteLine(i);
    }

Now run your application

 

The output is Gnanavel Sekar, Subash S, and Robert for the string list, followed by 1, 2, 3, 4, and 5 for the integer list. To summarize, Distinct returns the unique values of a collection.
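
Distinct also has an overload that accepts an IEqualityComparer, so a list of a custom class can be de-duplicated in the same way as with Union, Intersect, and Except. The following is a minimal sketch, assuming a cut-down Employee class and a comparer equivalent to the article's EmployeeComparer (with null checks added); the DistinctDemo class and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down version of the article's Employee class.
public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Same idea as the article's EmployeeComparer: equal when ID and Name match.
public class EmployeeComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.ID == y.ID
            && string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase);
    }

    // Hash on ID only, so employees that compare equal share a hash code.
    public int GetHashCode(Employee obj) => obj.ID.GetHashCode();
}

public static class DistinctDemo
{
    // Returns the IDs that remain after removing duplicate employees.
    public static List<int> DistinctIds()
    {
        var employees = new List<Employee>
        {
            new Employee { ID = 1, Name = "Gnanavel Sekar" },
            new Employee { ID = 2, Name = "Subash S" },
            new Employee { ID = 1, Name = "GNANAVEL SEKAR" }, // duplicate of the first entry
        };
        return employees.Distinct(new EmployeeComparer()).Select(e => e.ID).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", DistinctIds())); // 1, 2
    }
}
```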

Difference between Union, Intersect, Except, and Distinct

Method    | What it returns                                                       | Collections required
Union     | The distinct elements of both collections                             | Two
Intersect | The elements common to both collections                               | Two
Except    | The elements of the first collection that don't exist in the second   | Two
Distinct  | The distinct elements of the collection                               | One
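
The four operators can be compared side by side on the same input. The sketch below uses two small integer lists that contain repeated values; the SetOperatorsDemo class and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetOperatorsDemo
{
    static readonly List<int> A = new List<int> { 1, 2, 2, 3 };
    static readonly List<int> B = new List<int> { 3, 4, 4 };

    public static List<int> UnionResult() => A.Union(B).ToList();
    public static List<int> IntersectResult() => A.Intersect(B).ToList();
    public static List<int> ExceptResult() => A.Except(B).ToList();
    public static List<int> DistinctResult() => A.Distinct().ToList();

    public static void Main()
    {
        Console.WriteLine("Union:     " + string.Join(" ", UnionResult()));     // 1 2 3 4
        Console.WriteLine("Intersect: " + string.Join(" ", IntersectResult())); // 3
        Console.WriteLine("Except:    " + string.Join(" ", ExceptResult()));    // 1 2
        Console.WriteLine("Distinct:  " + string.Join(" ", DistinctResult()));  // 1 2 3
    }
}
```

Note that the repeated 2 in the first list appears only once in the Except result: Union, Intersect, and Except all return each value once, just as Distinct does.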

Summary

In this article you learned how to use Union, Intersect, Except, and Distinct, and how they differ. I hope it was helpful. Your comments and feedback are always welcome.

