String: A Reference Type That Behaves Like a Value Type in C#

The integer type int is a value type, but string is a reference type. Why?

A string is a reference type even though it has most of the characteristics of a value type, such as being immutable and having == overloaded to compare the text rather than checking whether both operands reference the same object.
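The two characteristics mentioned above can be observed directly. The following is a minimal sketch; the class name StringEqualityDemo and the helper BuildAtRuntime are made up for illustration, and the second string is built at run time so that it is a separate object from the literal.

```csharp
using System;

public static class StringEqualityDemo
{
    // Hypothetical helper: builds "hello" at run time, so the result is a new object.
    public static string BuildAtRuntime() => new string(new[] { 'h', 'e', 'l', 'l', 'o' });

    public static void Main()
    {
        string literal = "hello";
        string built = BuildAtRuntime();

        // == is overloaded on string to compare the characters.
        Console.WriteLine(literal == built);                        // True
        // ReferenceEquals asks whether both variables point at the same object.
        Console.WriteLine(object.ReferenceEquals(literal, built));  // False

        // Immutability: ToUpperInvariant returns a new string; the original is unchanged.
        string upper = literal.ToUpperInvariant();
        Console.WriteLine(literal);  // hello
        Console.WriteLine(upper);    // HELLO
    }
}
```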

Why isn't a string just a value type, then?

Let's take a deep dive into it.

First of all, let's understand what value types and reference types are, and the difference between them, at a glance.

Types can be classified by whether a variable of that type stores its own data or a pointer to the data. If the variable stores its own data, it is a value type. If it holds a pointer to data elsewhere in memory (that is, it stores the data's address instead of the data itself), it is a reference type.
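The difference shows up most clearly on assignment. Here is a small sketch, with the types PointValue and PointRef invented purely for illustration: copying a struct copies the data, while copying a class variable copies only the reference.

```csharp
using System;

public struct PointValue { public int X; }  // value type: the variable holds the data itself
public class PointRef { public int X; }     // reference type: the variable holds a reference

public static class CopySemanticsDemo
{
    public static int CopyStructThenMutate()
    {
        var a = new PointValue { X = 1 };
        var b = a;    // copies the data
        b.X = 99;     // changes only the copy
        return a.X;   // 1
    }

    public static int CopyClassThenMutate()
    {
        var a = new PointRef { X = 1 };
        var b = a;    // copies the reference; both variables point at one object
        b.X = 99;     // changes the shared object
        return a.X;   // 99
    }

    public static void Main()
    {
        Console.WriteLine(CopyStructThenMutate());  // 1
        Console.WriteLine(CopyClassThenMutate());   // 99
    }
}
```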

In the Microsoft implementation of C# on the desktop CLR, value types are stored on the stack when the value is a local variable or temporary that is not a closed-over local variable of a lambda or anonymous method, the method body is not an iterator block, and the jitter chooses not to enregister the value.

As a common simplification, then, local value types live on the stack, whereas reference-type objects live on the heap.

Strings aren't value types because they can be huge and need to be stored on the heap. In Microsoft's CLR implementation, local value types are typically stored on the stack, and stack-allocating strings would break all sorts of things: the stack is typically only 1 MB, each string would have to be boxed whenever it is treated as an object (incurring a copy penalty), strings could not be interned, and memory usage would balloon.
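Interning relies on strings being heap objects that can be shared by reference. The sketch below uses the real string.Intern method; the class name InternDemo and the helper Build are assumptions made for this example only.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: every call allocates a fresh string with the same text.
    public static string Build() => new string(new[] { 'I', 'n', 'd', 'i', 'a' });

    public static void Main()
    {
        string first = Build();
        string second = Build();

        // Same text, but two separate objects on the heap.
        Console.WriteLine(object.ReferenceEquals(first, second));  // False

        // string.Intern returns the single pooled instance for a given text,
        // so both calls hand back the same reference.
        string pooledFirst = string.Intern(first);
        string pooledSecond = string.Intern(second);
        Console.WriteLine(object.ReferenceEquals(pooledFirst, pooledSecond));  // True
    }
}
```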

But one could argue that we could still define a string as a struct that internally points to a char[] array, creating a type that behaves like a reference type but is technically a value type.

With a bit of reasoning, however, we can find good reasons not to use such an implementation. In other words, it is better to make string a class than a struct.

The first reason we can think of is preventing boxing.
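To make the idea concrete, here is a rough sketch of what such a struct could look like. The type TextStruct and its members are entirely hypothetical and are not how System.String is implemented. Copying the struct copies only its single reference field, so both copies share one character array.

```csharp
using System;

// Hypothetical value type that wraps a reference to a char[].
public struct TextStruct
{
    private readonly char[] _chars;

    public TextStruct(string text) { _chars = text.ToCharArray(); }

    public int Length => _chars.Length;

    public override string ToString() => new string(_chars);

    // True when both structs point at the very same underlying array.
    public bool SharesStorageWith(TextStruct other) =>
        object.ReferenceEquals(_chars, other._chars);
}

public static class TextStructDemo
{
    public static void Main()
    {
        var original = new TextStruct("India");
        var copy = original;  // copies the struct, i.e. just the array reference

        Console.WriteLine(copy.ToString());                  // India
        Console.WriteLine(original.SharesStorageWith(copy)); // True
    }
}
```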

If a string were a value type, then every time you passed it to a method expecting an object it would need to be boxed. That would create a new object occupying memory on the heap and cause pointless garbage collection pressure. Since strings are basically everywhere, having them boxed all the time would be a big performance problem.
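Boxing is easy to observe with an existing value type such as int. In this sketch (the class BoxingDemo and the method Describe are made-up names), passing an int where an object is expected creates a box that holds a copy of the value, so later changes to the variable do not affect it.

```csharp
using System;

public static class BoxingDemo
{
    // Hypothetical method expecting an object; a value-type argument is boxed at the call site.
    public static string Describe(object value) => value.GetType().Name + ":" + value;

    public static int BoxedCopyIsIndependent()
    {
        int n = 1;
        object boxed = n;   // allocates a box on the heap holding a copy of n
        n = 2;              // does not touch the box
        return (int)boxed;  // unboxing yields the original value, 1
    }

    public static void Main()
    {
        Console.WriteLine(Describe(42));              // Int32:42
        Console.WriteLine(BoxedCopyIsIndependent());  // 1
    }
}
```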

Apart from that, a string could override Equals regardless of whether it is a reference type or a value type.

But if a string were a value type, then the result of Object.ReferenceEquals("India", "India") would always be False, because both arguments would be boxed separately and two separately boxed arguments can never have the same reference.

So even though one could define a value type that acts much like a reference type by having it consist of a single reference-type field, it would still not behave exactly the same.

A reference type can be null, but a value type cannot hold null unless it is a nullable type.
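The same effect can be seen today by comparing a string variable with an int variable. This is only an illustration with invented names (ReferenceEqualsDemo and its methods): the string variable is passed as the same reference twice, whereas the int is boxed once per argument.

```csharp
using System;

public static class ReferenceEqualsDemo
{
    public static bool SameStringVariable()
    {
        string s = "India";
        // Both arguments are the same reference, so no new objects are involved.
        return object.ReferenceEquals(s, s);  // True
    }

    public static bool SameIntVariable()
    {
        int n = 5;
        // Each argument is boxed separately, producing two different objects.
        return object.ReferenceEquals(n, n);  // False
    }

    public static void Main()
    {
        Console.WriteLine(SameStringVariable());  // True
        Console.WriteLine(SameIntVariable());     // False
    }
}
```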
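A brief sketch of that rule, using the invented class name NullDemo: a string variable can hold null directly, an int needs the nullable form int? to do so, and a plain int falls back to its default of 0.

```csharp
using System;

public static class NullDemo
{
    public static bool StringCanBeNull()
    {
        string s = null;  // allowed: string is a reference type
        return s == null;
    }

    public static bool NullableIntStartsEmpty()
    {
        int? maybe = null;      // int? is shorthand for Nullable<int>
        return !maybe.HasValue;
    }

    // A plain int can never be null; its default value is 0.
    public static int DefaultInt() => default(int);

    public static void Main()
    {
        Console.WriteLine(StringCanBeNull());         // True
        Console.WriteLine(NullableIntStartsEmpty());  // True
        Console.WriteLine(DefaultInt());              // 0
    }
}
```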

One question may arise here. Can we store a reference to a value-type variable?

The answer is that we cannot store a reference to a variable in a field or an array, because the CLR requires that a reference to a variable be one of the following:
  1. A formal parameter
  2. A local
  3. The return value of a method

In the Microsoft implementation, C# originally supported only the first of these. Ref locals and ref returns were added in C# 7.0.
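With a C# 7.0 or later compiler, all three forms in the list can be written in C#. The following is a minimal sketch under that assumption; the class RefDemo and the methods Increment and First are hypothetical names chosen for this example.

```csharp
using System;

public static class RefDemo
{
    // 1. A formal parameter that is a reference to the caller's variable.
    public static void Increment(ref int value) => value++;

    // 3. A method whose return value is a reference to an array element (C# 7.0+).
    public static ref int First(int[] items) => ref items[0];

    public static void Main()
    {
        int n = 1;
        Increment(ref n);
        Console.WriteLine(n);  // 2

        int[] numbers = { 10, 20 };
        // 2. A local that is a reference to another variable (C# 7.0+).
        ref int first = ref First(numbers);
        first = 99;            // writes through to numbers[0]
        Console.WriteLine(numbers[0]);  // 99
    }
}
```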

