Story Of Equality In .NET - Part Six

Background

This article continues the series on how equality works in .NET. The purpose is to give developers a clearer understanding of how .NET handles equality for different types.

What we learned so far

The following are the key points that we learned from the previous parts:

  • C# does not syntactically distinguish between value equality and reference equality, so it can sometimes be difficult to predict what the equality operator will do in a particular situation.
  • There are often several legitimate ways of comparing values. .NET addresses this by letting a type specify its preferred, natural way of comparing for equality, and by providing a mechanism for writing equality comparers, which let you plug in a different definition of equality where needed.
  • Testing floating-point values for equality is not recommended, because rounding errors can make the result unreliable.
  • There is an inherent conflict between implementing equality, type safety, and good object-oriented practices.
  • .NET provides several equality implementations out of the box. A few methods are defined by the .NET Framework on the Object class and are therefore available for all types.
  • By default, the virtual Object.Equals method performs reference equality for reference types and value equality for value types. For value types it uses reflection, which is a performance overhead. Any type can override Object.Equals to change how it checks for equality; for example, String, Delegate, and Tuple do this to provide value equality even though they are reference types.
  • The Object class also provides a static Equals method, which can be used when one or both of the arguments may be null. Other than that, it behaves identically to the virtual Object.Equals method.
  • There is also a static ReferenceEquals method, which provides a guaranteed way to check for reference equality.
  • The IEquatable<T> interface can be implemented on a type to provide a strongly typed Equals method, which also avoids boxing for value types. It is implemented for the primitive numeric types, but unfortunately Microsoft has not been very proactive in implementing it for other value types in the FCL (Framework Class Library).
  • For primitive value types, the == operator gives the same result as calling Object.Equals, but the underlying mechanism in IL (Intermediate Language) is different. The Equals implementation provided for the primitive type is not called; instead, the IL instruction ceq is emitted, which compares the two values currently loaded on the stack, using CPU registers.
  • For reference types, the == operator and the Object.Equals method work differently behind the scenes, which can be verified by inspecting the generated IL. The == operator again uses the ceq instruction, which compares the memory addresses.
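
The points above can be sketched in a few lines of code. This is only a minimal illustration, not part of the earlier posts: the class name EqualityRecap and the helper FreshCopy are hypothetical names introduced here, and FreshCopy builds a second string instance from a character array so that we have two distinct objects holding the same text.

```csharp
using System;

public static class EqualityRecap
{
    // Hypothetical helper: returns a new string instance with the same characters.
    public static string FreshCopy(string s) => new string(s.ToCharArray());

    public static void Main()
    {
        string a = "hello";
        string b = FreshCopy(a);
        string none = null;

        // Virtual Object.Equals: String overrides it to compare values.
        Console.WriteLine(a.Equals((object)b));          // True

        // Static Object.Equals: safe when either argument may be null.
        Console.WriteLine(object.Equals(none, a));       // False
        Console.WriteLine(object.Equals(none, null));    // True

        // ReferenceEquals: always compares identity, never values.
        Console.WriteLine(object.ReferenceEquals(a, b)); // False

        // Int32 implements IEquatable<int>, so this binds to Equals(int) without boxing.
        int x = 5;
        Console.WriteLine(x.Equals(5));                  // True

        // Rounding error makes floating-point equality unreliable.
        double sum = 0.1 + 0.2;
        Console.WriteLine(sum == 0.3);                   // False
    }
}
```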

Equality Operator for String

In this post, we will look at the String type and how equality works for it. You might be aware that for strings, the equality operator compares the values and not the references, as we saw in the first post of this series. String overrides Equals to behave in this manner, and in this post we will find out what makes the == operator do the same.

We will investigate how the == operator and the Object.Equals method behave when checking strings for equality.

Consider the piece of code given below:

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            string s1 = "Ehsan Sajjad";
            // String.Copy creates a new string instance with the same value.
            // It is marked obsolete in newer .NET versions.
            string s2 = String.Copy(s1);

            Console.WriteLine(ReferenceEquals(s1, s2)); // reference equality
            Console.WriteLine(s1 == s2);                // equality operator
            Console.WriteLine(s1.Equals(s2));           // Equals method
            Console.ReadKey();
        }
    }

The code above is very similar to what we have looked at before, but this time the variables are of type String. We create a string and hold its reference in the variable s1; on the next line, we create a copy of that string and hold its reference in another variable, s2.

We first check both variables for reference equality, that is, whether they point to the same memory location. In the next two lines, we check the output of the equality operator and the Equals method.

Now, we will build the project and run it to see what it outputs on the console. The following is the output printed on the console:

    False
    True
    True

You can see that ReferenceEquals returned false, which means that the two strings are different instances, but the equality operator and the Equals method both returned true. So it is clear that for strings, the equality operator tests the values for equality and not the references, exactly as Object.Equals does.

Behind the Scenes of Equality Operator for String

Let’s see how the equality operator does that by examining the IL code generated for this example. To do this, open the Visual Studio command prompt: go to Start Menu >> All Programs >> Microsoft Visual Studio >> Visual Studio Tools >> Developer Command Prompt.



Type ildasm at the command prompt. This launches ildasm, the tool used to look at the IL code contained in an assembly. It is installed automatically with Visual Studio, so you don’t need to install anything separately.



Click the File menu and then the Open menu item, which brings up a window for browsing to the executable that we want to disassemble.



Now, navigate to the location of the exe file and open it.



This brings up the contents of the assembly in a hierarchical form. As there are multiple classes in the assembly, all of them are listed.

The code that we want to explore is in the Main method of the Program class, so navigate to the Main method and double-click it to bring up its IL code.



The IL code for Main looks as shown below:



IL for Equals Method

If we look at the IL generated for s1.Equals(s2), there are no surprises: it calls the Equals method. This time, however, it calls the overload that implements IEquatable<string>, which takes a string as its argument, rather than the Equals(object) override.
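
The choice between the two Equals overloads can also be observed from C#. The following is a minimal sketch, assuming a hypothetical class name StringEqualsOverloads; it uses new string(char[]) instead of String.Copy to obtain a separate instance with the same text.

```csharp
using System;

public static class StringEqualsOverloads
{
    public static void Main()
    {
        string s1 = "Ehsan Sajjad";
        string s2 = new string(s1.ToCharArray()); // separate instance, same text

        // Argument typed as string: the compiler binds to String.Equals(string),
        // the method that implements IEquatable<string>.
        Console.WriteLine(s1.Equals(s2));          // True

        // Argument typed as object: the compiler binds to the Equals(object) override.
        Console.WriteLine(s1.Equals((object)s2));  // True

        // String declares that it implements IEquatable<string>.
        Console.WriteLine(typeof(IEquatable<string>).IsAssignableFrom(typeof(string))); // True
    }
}
```

Both calls return the same result; the difference is only in which method the compiler selects, based on the compile-time type of the argument.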



IL for == operator for String

Now, let’s examine the IL generated for the string equality check that uses the equality operator. We can see that no ceq instruction is emitted. In the previous posts, we saw that ceq is executed when we check value types and reference types for equality using the == operator, but for strings we instead have a call to a new method named op_Equality(string, string), which takes two string arguments. We have not seen this kind of method before, so what is it actually?

The answer is that it is the overload of the C# equality operator (==). In C#, when we define a type, we have the option to overload the equality operator for that type. For example, take the Person class seen in the previous examples. If we overload the == operator for it, it looks as given below:

    public class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }

        // Compiled to a static method named op_Equality.
        public static bool operator ==(Person p1, Person p2)
        {
            // Use ReferenceEquals for the null checks: writing p1 == null here
            // would call this operator again and recurse until the stack overflows.
            if (ReferenceEquals(p1, null) && ReferenceEquals(p2, null)) return true;
            if (ReferenceEquals(p1, null) || ReferenceEquals(p2, null)) return false;
            return p1.Id == p2.Id;
        }

        // C# requires != to be overloaded whenever == is overloaded.
        public static bool operator !=(Person p1, Person p2)
        {
            return !(p1 == p2);
        }
    }

The code given above is pretty simple. We have declared an operator overload, which must be a static method, but the thing to notice here is that the name of the method is operator ==. The similarity between declaring an operator overload and declaring a static method is not a coincidence: the compiler actually compiles it as a static method because, as discussed before, IL (Intermediate Language) has no concept of operators. It only understands things such as fields and methods, so an operator overload can only exist as a method, which is what we observed in the IL code above. The compiler turns the operator overload into a special static method called op_Equality().

The operator handles null first: if only one of the passed instances is null, they are not equal, and if both are null, the references are obviously equal, so it returns true. Otherwise, it compares the Id property of both instances: if the Ids are equal, the instances are equal; else they are not.
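
To see the overload in action, here is a small self-contained sketch. The Person class repeats the idea described above, with the null checks done through ReferenceEquals; the class PersonDemo and the sample values are assumptions made purely for illustration. The reflection lookup at the end is one way to confirm that the operator exists as a static method named op_Equality.

```csharp
using System;
using System.Reflection;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }

    public static bool operator ==(Person p1, Person p2)
    {
        // ReferenceEquals avoids calling this operator recursively.
        if (ReferenceEquals(p1, null) && ReferenceEquals(p2, null)) return true;
        if (ReferenceEquals(p1, null) || ReferenceEquals(p2, null)) return false;
        return p1.Id == p2.Id;
    }

    public static bool operator !=(Person p1, Person p2) => !(p1 == p2);
}

public static class PersonDemo
{
    public static void Main()
    {
        var a = new Person { Id = 1, Name = "Ehsan" };
        var b = new Person { Id = 1, Name = "Ehsan Sajjad" };
        Person none = null;

        Console.WriteLine(a == b);                        // True: same Id
        Console.WriteLine(object.ReferenceEquals(a, b));  // False: different instances
        Console.WriteLine(a == none);                     // False: only one side is null
        Console.WriteLine(none == null);                  // True: both sides are null

        // Equals is not overridden here, so it still compares references.
        Console.WriteLine(a.Equals(b));                   // False

        // The operator is visible to reflection as a static method.
        MethodInfo op = typeof(Person).GetMethod("op_Equality");
        Console.WriteLine(op != null && op.IsStatic);     // True
    }
}
```

Note that a == b and a.Equals(b) disagree in this sketch, and the compiler warns that Person overloads == without overriding Equals and GetHashCode. That mismatch is exactly the kind of inconsistency a finished type should avoid.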

This way, we can define our own implementation for our custom types, according to the business requirements. As we discussed earlier, the equality of two objects depends entirely on the business logic of the application, so two objects that look equal under one application's rules might not be equal under another's.

This makes it clear that Microsoft has provided an == operator overload for the String class. We can even see it by peeking at the definition of the String class in Visual Studio, using Go to Definition, which looks like this:



We can see that there are two operator overloads: one for the equality operator and one for the inequality operator, which works exactly the same way but negates the output of the equality operator.
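
The same two overloads can be looked up through reflection, and their effect compared with what happens when no overload applies. This is a minimal sketch rather than anything taken from the String source; the class StringOperators and its Find helper are hypothetical names. Because an operator overload is a static method, the compiler chooses it from the compile-time types of the operands, which is why the last comparison behaves differently.

```csharp
using System;
using System.Reflection;

public static class StringOperators
{
    // Hypothetical helper: finds the public static method on String
    // that an operator overload compiles to.
    public static MethodInfo Find(string name) =>
        typeof(string).GetMethod(name, new[] { typeof(string), typeof(string) });

    public static void Main()
    {
        Console.WriteLine(Find("op_Equality") != null);   // True
        Console.WriteLine(Find("op_Inequality") != null); // True

        string s1 = "Ehsan Sajjad";
        string s2 = new string(s1.ToCharArray()); // separate instance, same text

        // Both operands typed as string: the compiler emits a call to op_Equality.
        Console.WriteLine(s1 == s2);  // True
        Console.WriteLine(s1 != s2);  // False

        // Both operands typed as object: the String overload is not considered,
        // so the compiler falls back to a reference comparison.
        object o1 = s1, o2 = s2;
        Console.WriteLine(o1 == o2);  // False
    }
}
```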

Summary


  • We now have enough understanding of what the C# equality operator does in the case of reference types. The following are the things to keep in mind:

    • If there is an overload of the equality operator for the type being compared, the operator is compiled into a call to that overload, which is a static method.
    • If there is no overload of the operator for the reference type, the equality operator compares the memory addresses, using the ceq instruction.

  • One thing to note is that, for String, Microsoft made sure that the == operator overload and the Object.Equals override always give the same result, even though they are in fact different methods. This is important to keep in mind when we start implementing our own Equals override: we should take care of the equality operator as well, otherwise our type will give different results from the Equals override and the equality operator, which would be problematic for the consumers of the type. We will see in another post how to override the Equals method properly.

  • If we change how equality works for a type, we need to provide both an Equals override and an == operator overload, so that the two give the same result. Otherwise, it would be confusing for other developers who use our type.
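
As a preview of that last point, the following is one possible sketch of a type whose Equals override and == overload agree. It is not the approach of the upcoming post, only an illustration under assumptions: the type name ConsistentPerson is hypothetical, equality is based on Id alone, and the operator delegates to Equals so that there is a single place where the comparison logic lives.

```csharp
using System;

public sealed class ConsistentPerson : IEquatable<ConsistentPerson>
{
    public int Id { get; set; }
    public string Name { get; set; }

    // Strongly typed equality: the single source of truth for the comparison.
    public bool Equals(ConsistentPerson other) =>
        !ReferenceEquals(other, null) && Id == other.Id;

    // Object.Equals override forwards to the typed version.
    public override bool Equals(object obj) => Equals(obj as ConsistentPerson);

    // Types that are equal must produce the same hash code.
    public override int GetHashCode() => Id.GetHashCode();

    // The operators delegate to Equals, so both routes always agree.
    public static bool operator ==(ConsistentPerson p1, ConsistentPerson p2) =>
        ReferenceEquals(p1, null) ? ReferenceEquals(p2, null) : p1.Equals(p2);

    public static bool operator !=(ConsistentPerson p1, ConsistentPerson p2) => !(p1 == p2);
}

public static class ConsistencyDemo
{
    public static void Main()
    {
        var a = new ConsistentPerson { Id = 7, Name = "Ehsan" };
        var b = new ConsistentPerson { Id = 7, Name = "Ehsan Sajjad" };

        Console.WriteLine(a == b);               // True
        Console.WriteLine(a.Equals(b));          // True
        Console.WriteLine(a.Equals((object)b));  // True
        Console.WriteLine(a != b);               // False
    }
}
```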