C# - String Object Impact On Performance

In this article, I am going to talk about string, one of the most common types, which stores a sequence of characters and which every programmer uses in day-to-day coding. A string is immutable: it is read-only once created, so every time we concatenate or otherwise modify it, the CLR creates a new object, which means new memory is allocated.

String is a reference type, so its memory is allocated on the managed heap. Operations such as extracting part of a string, copying string data, converting to upper case, and concatenating strings often lead to excess memory allocation because of this immutable nature. So when we aim for high-performance code, meaning less execution time and less memory, string manipulation and the allocations behind it are worth considering.
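To make the immutability point concrete, here is a minimal sketch. The class and method names (ImmutabilityDemo, Append) are made up for illustration and are not part of any library. It appends a suffix and reports whether the result is still the same object as the original.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: appends a suffix and reports whether the result
    // is the same object as the original string.
    public static (string Original, string Result, bool SameObject) Append(string original, string suffix)
    {
        string result = original;
        result += suffix; // with a non-empty suffix this allocates a new string; 'original' is untouched
        return (original, result, ReferenceEquals(original, result));
    }
}
```

Calling ImmutabilityDemo.Append("Code", "*") leaves the original as "Code", returns "Code*" as the result, and reports that the two are different objects.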

I will be talking about 4 different ways to work with strings. All of them get the job done, but we should always choose an option based on the scenario and best fit. The fastest option is not automatically the right one: used in the wrong place, it can cause problems in the product later or hurt performance instead of helping it.

Let's take a scenario: I have the string "CodeWithSharad" and I want to mask every character after "Code" with "*", so the final output should be "Code**********".

1. Basic Approach

I will use Substring to take "Code" into one variable and then append "*" until the result is as long as the input, which gives "Code**********" as our output.

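public string MaskLastTenCharater_BasicApproch()
{
    // pwd is the input string ("CodeWithSharad" in this scenario), declared outside this method.
    var charToRetain = pwd.Substring(0, 4);
    var remaningStringLength = pwd.Length - 4;

    for (var i = 0; i < remaningStringLength; i++)
    {
        // Each += creates a brand-new string object.
        charToRetain += "*";
    }

    return charToRetain;
}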

Look at the line charToRetain += "*": because strings are immutable, every iteration allocates a new string on the heap, so the loop creates 10 intermediate strings for our input. This is our scope for improvement. Let's see the memory diagnosis result for this method.

We can see that it takes 283.5 ns and allocates 472 bytes. You may think that is very little, and it is, but this is a single small operation. In a real product we do string manipulation in many places, and it may involve much more complex logic.
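If you want a rough feel for the allocations without a full benchmark run, the sketch below is one way to do it. This is not the tool that produced the numbers above. AllocationProbe and its methods are hypothetical names of mine, and the approach assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread (for example .NET Core 3.0 or later). The exact byte counts depend on the runtime and platform.

```csharp
using System;

public static class AllocationProbe
{
    // Hypothetical helper: bytes allocated on the current thread while 'action' runs.
    public static long MeasureAllocatedBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }

    // Same logic as the basic approach, but taking the input as a parameter.
    // Assumes the input has at least 4 characters.
    public static string MaskWithConcatenation(string input)
    {
        var result = input.Substring(0, 4);
        for (var i = 4; i < input.Length; i++)
        {
            result += "*"; // a new string on every iteration
        }
        return result;
    }
}
```

For example, AllocationProbe.MeasureAllocatedBytes(() => AllocationProbe.MaskWithConcatenation("CodeWithSharad")) returns the bytes allocated by one call. Treat the value as an estimate only, since the measurement has overhead of its own.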

Let's optimize the above code.

2. Using StringBuilder

This approach is one of the efficient ways to do string manipulation, and it works well in most scenarios. StringBuilder keeps a mutable internal buffer and grows it as needed to accommodate the changes, so it does not allocate a new string on every append.

public string MaskLastTenCharater_StringBuilder_Approch()
{
    var charToRetain = pwd.Substring(0, 4);
    var remaningStringLength = pwd.Length - 4;
    var stringBuilder = new StringBuilder(charToRetain);

    for (var i = 0; i < remaningStringLength; i++)
    {
        stringBuilder.Append("*");
    }

    return stringBuilder.ToString();
}

In the above code, the appends write into the builder's buffer, and the final string is allocated only when we call ToString(). Let's see the memory diagnosis result for this method.

If we compare this approach with the first one, it takes less than half the time to execute and allocates less than half the memory. So it is a more optimized way to do string manipulation. That is not all; we still have two more ways to go.
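As a possible variation on the StringBuilder approach, the sketch below pre-sizes the builder to the final length and uses the Append overloads that take a range of a string and a repeat count, so there is no loop and no separate Substring call. The class name, method name, and keep parameter are my own, the input is assumed to have at least keep characters, and this variant was not part of the benchmark above.

```csharp
using System.Text;

public static class StringBuilderMasking
{
    // Hypothetical variant; assumes input.Length >= keep.
    public static string Mask(string input, int keep = 4)
    {
        // Pre-size the buffer to the final length so it does not have to grow.
        var builder = new StringBuilder(input.Length);
        builder.Append(input, 0, keep);           // copy the retained prefix straight from the input
        builder.Append('*', input.Length - keep); // repeat the mask character for the rest
        return builder.ToString();
    }
}
```

StringBuilderMasking.Mask("CodeWithSharad") gives "Code**********", the same result as the loop version.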

3. Using String Constructor

String has a constructor that takes a character and a count and repeats the character that many times.

public string MaskLastTenCharater_Constructor_Approch()
{
    var charToRetain = pwd.Substring(0, 4);
    var remaningStringLength = pwd.Length - 4;
    // Builds the whole run of '*' in a single allocation instead of one per character.
    var hashedChars = new string('*', remaningStringLength);
    return charToRetain + hashedChars;
}

So we can see that it is more efficient than the above two approaches in terms of execution time and memory allocation. It is still not the best; we have one more option that is even more optimized.
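One thing the snippets so far do not handle is an input shorter than four characters: Substring(0, 4) throws an ArgumentOutOfRangeException in that case. Below is a minimal sketch of the constructor approach with guards added. The names are hypothetical, and returning a short input unchanged is my assumption about the desired behavior, not something the scenario defines.

```csharp
using System;

public static class ConstructorMasking
{
    // Hypothetical guarded variant of the constructor approach.
    public static string Mask(string input, int keep = 4)
    {
        if (input is null) throw new ArgumentNullException(nameof(input));

        // Assumption: when there is nothing beyond the retained prefix, return the input as-is.
        if (input.Length <= keep) return input;

        return input.Substring(0, keep) + new string('*', input.Length - keep);
    }
}
```

ConstructorMasking.Mask("CodeWithSharad") gives "Code**********", while ConstructorMasking.Mask("Cod") returns "Cod" instead of throwing.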

4. Using String Create Method

First things first: this approach is the most efficient of the four, but it is not meant for everyday development. It suits specific use cases, so before you implement it, please make sure it fits your requirement.

The Create method seemingly lets us break the immutability rule of string. Sounds crazy, right? But this is its main power: it allocates the string and lets us write into that memory before the string is handed back to us. I will explain it in a bit; let us see the code first.

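public string MaskLastTenCharater_CreateMethod_Approch()
{
    return string.Create(pwd.Length, pwd, (span, value) =>
    {
        value.AsSpan().CopyTo(span);   // copy the whole input into the new string's buffer
        span[4..].Fill('*');           // overwrite everything after the first 4 chars
    });
}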

Length: The length of the final string. We need to know the desired length up front.

TState: The generic state needed to construct the string; here it is the input string pwd.

Delegate: This delegate operates directly on the heap memory allocated for the new string to fill in the final string data. It does not allocate any additional memory; it writes into the memory that already belongs to that string's characters.

Line 5 in the above block (value.AsSpan().CopyTo(span)) copies the input value into the Span, which gives us access to each char of the new string. Line 6 (span[4..].Fill('*')) then keeps the first four chars and fills the rest with '*', so we get "Code**********".
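The three parameters described above can also be used with more than one piece of state. The sketch below is a hypothetical generalization: it passes the input and the number of characters to keep as a tuple, so the lambda does not need to capture anything from the enclosing method. CreateMasking, Mask, and keep are names I made up, and returning short inputs unchanged is an assumption.

```csharp
using System;

public static class CreateMasking
{
    // Hypothetical generalized variant of the string.Create approach.
    public static string Mask(string input, int keep = 4)
    {
        if (input is null) throw new ArgumentNullException(nameof(input));

        // Assumption: nothing to mask when the input is not longer than the retained prefix.
        if (input.Length <= keep) return input;

        // Length: final size; state: (input, keep); delegate: fills the new string's buffer.
        return string.Create(input.Length, (input, keep), (span, state) =>
        {
            state.input.AsSpan(0, state.keep).CopyTo(span); // copy only the retained prefix
            span.Slice(state.keep).Fill('*');               // fill the remainder with the mask
        });
    }
}
```

CreateMasking.Mask("CodeWithSharad") gives "Code**********". Compared with the original snippet, it copies only the first four chars rather than the whole input before filling.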

As we can see, this approach is more efficient than all the others. Execution time is 32.02 ns, which is less than the others, and the memory allocation is only 56 bytes.

Conclusion

Let's run all the benchmarks in one go and compare; the execution times may differ slightly from the individual runs above.

It is clear that the Create method approach is about 10 times more efficient than the basic approach in execution time as well as in memory allocation. But that does not mean we have to use it everywhere. It also has some cons and restrictions, so choose wisely.

HAPPY CODING.....