Deep Dive Into C# - Garbage Collection And Disposal - Part Two

Okay, we are back to collect "garbage" properly. If you haven't read the first part of this post, you might want to read it here. We are following norms and conventions from the first post here too.

Starting things from where we left off

This write up continues after the first one where we ended things with SafeHandle class and IDisposable interface. I do see how intriguing this is, of course. But before that, like an unexpected, miserable commercial break before the actual stuff comes on your TV, let's go to have a look at something called finalization.

What you create, you "might" need to destroy

If you know a little bit of C++, you'd know that there is a thing called a destructor and, as the name implies, it does the exact opposite of what a constructor does. C++ kind of needs it since it doesn't really have a proper garbage collection paradigm. But that also raises the question of whether C# has a destructor and, if it does, what it does and why we even have one, since we said managed resources would be collected automatically.

Let's jump into some code, shall we?

    class Cat
    {
        ~Cat()
        {
            // cleanup things
        }
    }

Wait a minute! Now, this is getting confusing. I have IDisposable::Dispose() and I have a destructor. Both look like they have the same responsibility. Not exactly. Before we confuse ourselves more, let's dig a bit more into the code. To be honest, C# doesn't really have a destructor. It's basically syntactic sugar, and the C# compiler translates this segment into the following:

    protected override void Finalize()
    {
        try
        {
            // Clean things here
        }
        finally
        {
            base.Finalize();
        }
    }

We can pick up two things here. First, the destructor essentially translates to an overridden Finalize() method. Second, it also calls the base class's Finalize() in the finally block. So, finalization essentially chains all the way up the inheritance hierarchy. Wait, we have no clue what Finalize() is.

Finalize, who art thou?

Finalize is somebody who is blessed to talk with the garbage collector. If you have already been questioning where the topics we discussed in the last post come in to help, this is the segment. Let's start getting the answers for all our confusions on the last segment. First, why is Finalize overridden and where does it get its signature? Finalize gets its signature from the Object class. Okay, understandable. What would be the default implementation then? Well, Object.Finalize() does nothing by default. You might ask why. A little secret to slip in here: remember the mark step of the garbage collector? When he finds that an unreachable instance has overridden Finalize(), he doesn't reclaim it right away. Overriding Finalize() is your way of telling the garbage collector that you need to do some more work before he can reclaim your memory. The garbage collector puts this guy in a finalization queue, which is essentially a list of objects whose finalization code must be run before GC can reclaim their memory.

So, you are possibly left with one more confusion now. You understood that the garbage collector uses this method to finalize things, doing the necessary cleanup. Then why do we need IDisposable when we can just override Finalize and be done with it, right?

Dispose, you confusing scum!

Turns out Dispose points out a pattern in code that you can use to release unmanaged resources. What I didn't tell you about the garbage collector before is that you don't really know when it will run; you practically have no clue. That also means, if you write an app that uses unmanaged resources like crazy, like reading 50-60 files at a time, you are holding a lot of scarce resources at the same time. And in the end you are waiting for GC to do his job, but that guy has no timetable. So, hoarding these resources in the meantime is not a good idea.

Since releasing unmanaged resources is the developer's duty, putting that in a Finalize method and waiting for GC to come and invoke it is a stupid way to go. Moreover, if an instance lands in the finalization queue, GC will only invoke Finalize this round and will do the actual memory cleanup in a later round. That means GC has to visit you twice to clear off your unused memory, which you kinda need now. And the fact that you might want to release the resources NOW is a pretty good reason in itself not to wait for GC to do your dirty work. And, when you actually shut down your application, Mr. GC visits you unconditionally and takes out EVERYONE he finds.

I hope the need for Dispose() is getting partially clear to you now. We need a method SomeMethod() that we can call to clean up the unmanaged resources. In case we fail to call it at some point, we will also call that same method inside Finalize(), so the garbage collector runs it anyhow. If I have not made a fool out of myself at this point, you have figured out that SomeMethod() is Dispose().

Okay, so we know what we are going to do. Now we need to act on it. We will implement the dispose pattern we have been talking about. The first thing we will do here is write code that reads a couple of lines from a file. We will do it the unmanaged way first, then slowly move it towards the managed way, and in the process we will see how we can use IDisposable there too.

Doing things the unsafe way

I stole some code from a very old MSDN doc which describes a FileReader class like the following,

    using System;
    using System.Runtime.InteropServices;

    public class FileReader
    {
        const uint GENERIC_READ = 0x80000000;
        const uint OPEN_EXISTING = 3;
        IntPtr handle;

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe IntPtr CreateFile(
            string FileName,            // file name
            uint DesiredAccess,         // access mode
            uint ShareMode,             // share mode
            uint SecurityAttributes,    // Security Attributes
            uint CreationDisposition,   // how to create
            uint FlagsAndAttributes,    // file attributes
            int hTemplateFile           // handle to template file
            );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool ReadFile(
            IntPtr hFile,               // handle to file
            void* pBuffer,              // data buffer
            int NumberOfBytesToRead,    // number of bytes to read
            int* pNumberOfBytesRead,    // number of bytes read
            int Overlapped              // overlapped buffer
            );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool CloseHandle(
            IntPtr hObject              // handle to object
            );

        public bool Open(string FileName)
        {
            // open the existing file for reading
            IntPtr result = CreateFile(
                FileName,
                GENERIC_READ,
                0,
                0,
                OPEN_EXISTING,
                0,
                0);

            // CreateFile signals failure with INVALID_HANDLE_VALUE (-1), not zero
            if (result == IntPtr.Zero || result == new IntPtr(-1))
                return false;

            handle = result;
            return true;
        }

        public unsafe int Read(byte[] buffer, int index, int count)
        {
            int n = 0;
            // pin the buffer so the GC cannot move it while ReadFile writes into it
            fixed (byte* p = buffer)
            {
                if (!ReadFile(handle, p + index, count, &n, 0))
                    return 0;
            }
            return n;
        }

        public bool Close()
        {
            // close file handle
            return CloseHandle(handle);
        }
    }

Please remember, you need to check the Allow Unsafe Code checkbox in your build properties before you start using this class. Let's have a quick run through the code pasted here. I don't intend to explain everything in detail because that is not the scope of this article. But we will build upon it, so we need to know a little bit. The DllImport attribute tells the runtime that the method's implementation lives in an external (thus, unmanaged) DLL, kernel32 in this case, and maps that native function onto a static method of your own managed class. That's also why we have used the extern keyword here. The implementations of these methods don't live in your code, and thus your garbage collector can't take responsibility for cleanup here. :) The next thing you would notice is the fixed statement. The fixed statement pins a managed object (here, the byte array) so that the GC doesn't move it while we hold a raw pointer to it. So, the buffer stays in one place for as long as ReadFile is writing into it. So, what are we waiting for? Let's read a file.

    // needs "using System;" and "using System.Text;" at the top of the file
    static int Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage : ReadFile <FileName>");
            return 1;
        }
        if (!System.IO.File.Exists(args[0]))
        {
            Console.WriteLine("File " + args[0] + " not found.");
            return 1;
        }
        byte[] buffer = new byte[128];
        FileReader fr = new FileReader();
        if (fr.Open(args[0]))
        {
            // Assume that an ASCII file is being read
            ASCIIEncoding Encoding = new ASCIIEncoding();
            int bytesRead;
            do
            {
                bytesRead = fr.Read(buffer, 0, buffer.Length);
                string content = Encoding.GetString(buffer, 0, bytesRead);
                Console.Write("{0}", content);
            }
            while (bytesRead > 0);
            fr.Close();
            return 0;
        }
        else
        {
            Console.WriteLine("Failed to open requested file");
            return 1;
        }
    }

So, this is essentially a very basic console app and looks somewhat okay. I have created a byte array of size 128 which I use as a buffer when I read. FileReader.Read() returns 0 when there is nothing more to read (and, in this simple version, also when the read fails). So don't get confused when you see this:

while (bytesRead > 0)

It's all nice and dandy to be honest. And it works too. Invoke the application (in this case the name here is TestFileReading.exe) like the following.

TestFileReading.exe somefile.txt

And it works like a charm. But notice that we only close the file at the very end, after we are done with it. What if something goes wrong in the middle, say a read blows up or I throw an exception myself? Then Close() is never reached and the file stays open until my process exits. And the GC will not take care of it because FileReader doesn't have a Finalize() method that would close the handle.
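
To see the problem without touching kernel32, here is a minimal sketch. FakeReader and LeakDemo are made-up names for illustration only, and the "handle" is just a counter, so this is an assumption-laden stand-in for the real FileReader, not a replacement for it.

```csharp
using System;

// Hypothetical stand-in for FileReader: counts "open handles" instead of
// calling into kernel32, so the sketch runs anywhere.
public class FakeReader
{
    public static int OpenHandles;   // how many fake handles are currently open

    public void Open()  { OpenHandles++; }
    public void Close() { OpenHandles--; }

    public void Read()
    {
        // Pretend something goes wrong in the middle of reading.
        throw new InvalidOperationException("read failed");
    }
}

public static class LeakDemo
{
    // Returns the number of handles still open after the failed read.
    public static int Run()
    {
        FakeReader.OpenHandles = 0;
        FakeReader fr = new FakeReader();
        try
        {
            fr.Open();
            fr.Read();    // throws
            fr.Close();   // never reached
        }
        catch (InvalidOperationException)
        {
            // The caller survives the exception, but nobody closed the handle.
        }
        return FakeReader.OpenHandles;
    }
}
```

Calling LeakDemo.Run() returns 1: the program keeps running after the exception, yet one handle is still counted as open because Close() was skipped.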

Making it safe

    public class FileReader : IDisposable
    {
        const uint GENERIC_READ = 0x80000000;
        const uint OPEN_EXISTING = 3;
        IntPtr handle = IntPtr.Zero;

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe IntPtr CreateFile(
            string FileName,            // file name
            uint DesiredAccess,         // access mode
            uint ShareMode,             // share mode
            uint SecurityAttributes,    // Security Attributes
            uint CreationDisposition,   // how to create
            uint FlagsAndAttributes,    // file attributes
            int hTemplateFile           // handle to template file
            );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool ReadFile(
            IntPtr hFile,               // handle to file
            void* pBuffer,              // data buffer
            int NumberOfBytesToRead,    // number of bytes to read
            int* pNumberOfBytesRead,    // number of bytes read
            int Overlapped              // overlapped buffer
            );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool CloseHandle(
            IntPtr hObject              // handle to object
            );

        public bool Open(string FileName)
        {
            // open the existing file for reading
            IntPtr result = CreateFile(
                FileName,
                GENERIC_READ,
                0,
                0,
                OPEN_EXISTING,
                0,
                0);

            // CreateFile signals failure with INVALID_HANDLE_VALUE (-1), not zero
            if (result == IntPtr.Zero || result == new IntPtr(-1))
                return false;

            handle = result;
            return true;
        }

        public unsafe int Read(byte[] buffer, int index, int count)
        {
            int n = 0;
            fixed (byte* p = buffer)
            {
                if (!ReadFile(handle, p + index, count, &n, 0))
                    return 0;
            }
            return n;
        }

        public bool Close()
        {
            // close file handle and forget it, so a later Dispose() has nothing left to close
            bool closed = CloseHandle(handle);
            handle = IntPtr.Zero;
            return closed;
        }

        public void Dispose()
        {
            if (handle != IntPtr.Zero)
                Close();
        }
    }

Now, on our way towards making things safe, we implemented IDisposable here. That exposed Dispose(), and the first thing we do in it is check whether the handle is IntPtr.Zero; if it's not, we invoke Close(). Dispose() is written this way because it should be callable at any time and it shouldn't throw any exception if it is invoked multiple times. But is it the solution we want? Look closely. We wanted to have a Finalize() implementation that will essentially do the same things if somehow Dispose() is not called. Right?

Enter the Dispose(bool) overload. We want the parameterless Dispose() to be used by only the external consumers. We would issue a second Dispose(bool) overload where the boolean parameter indicates whether the method call comes from a Dispose method or from the finalizer. It would be true if it is invoked from the parameterless Dispose() method.

With that in mind, our code would eventually be this,

    using System;
    using System.Runtime.InteropServices;
    using Microsoft.Win32.SafeHandles;

    public class FileReader : IDisposable
    {
        const uint GENERIC_READ = 0x80000000;
        const uint OPEN_EXISTING = 3;
        IntPtr handle = IntPtr.Zero;
        private bool isDisposed;

        // a managed, disposable member, here only to show the managed branch of Dispose(bool)
        SafeHandle safeHandle = new SafeFileHandle(IntPtr.Zero, true);

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe IntPtr CreateFile(
            string FileName,            // file name
            uint DesiredAccess,         // access mode
            uint ShareMode,             // share mode
            uint SecurityAttributes,    // Security Attributes
            uint CreationDisposition,   // how to create
            uint FlagsAndAttributes,    // file attributes
            int hTemplateFile           // handle to template file
            );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool ReadFile(
            IntPtr hFile,               // handle to file
            void* pBuffer,              // data buffer
            int NumberOfBytesToRead,    // number of bytes to read
            int* pNumberOfBytesRead,    // number of bytes read
            int Overlapped              // overlapped buffer
            );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool CloseHandle(
            IntPtr hObject              // handle to object
            );

        public bool Open(string FileName)
        {
            // open the existing file for reading
            IntPtr result = CreateFile(
                FileName,
                GENERIC_READ,
                0,
                0,
                OPEN_EXISTING,
                0,
                0);

            // CreateFile signals failure with INVALID_HANDLE_VALUE (-1), not zero
            if (result == IntPtr.Zero || result == new IntPtr(-1))
                return false;

            handle = result;
            return true;
        }

        public unsafe int Read(byte[] buffer, int index, int count)
        {
            int n = 0;
            fixed (byte* p = buffer)
            {
                if (!ReadFile(handle, p + index, count, &n, 0))
                    return 0;
            }
            return n;
        }

        public bool Close()
        {
            // close file handle and forget it, so a later Dispose() has nothing left to close
            bool closed = CloseHandle(handle);
            handle = IntPtr.Zero;
            return closed;
        }

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);   // explained under "Words of Experience" below
        }

        protected virtual void Dispose(bool isDisposing)
        {
            if (isDisposed)
                return;

            if (isDisposing)
            {
                // managed resources: only touched when called from Dispose()
                safeHandle.Dispose();
            }

            // unmanaged resources: released on both paths
            if (handle != IntPtr.Zero)
                Close();

            isDisposed = true;
        }
    }

Now, if you focus on the changes we made, the main one is the following method,

protected virtual void Dispose(bool isDisposing)

Now, this method implements what we discussed a moment earlier. You can invoke it multiple times without any issue. There are two prominent blocks here.

  • The conditional block is supposed to free managed resources (read: invoking the Dispose() methods of other IDisposable members/properties inside the class, if we have any).
  • The non-conditional block frees the unmanaged resources.

You might ask why the conditional block tries to dispose managed resources. The GC takes care of that anyway, right? Yes, you're right. Since the garbage collector is going to take care of the managed resources anyway, we only dispose them on demand, that is, when someone calls the parameterless Dispose(). When the call comes from the finalizer instead, we leave them alone: those other managed objects may already have been finalized themselves, so touching them at that point is not safe.
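
The two paths can be traced with a small sketch. TracedResource, its Log list and SimulateFinalizer() are hypothetical names invented for this illustration; SimulateFinalizer() merely stands in for the finalizer path so it can be triggered on demand, which a real class would not expose.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class used only to trace which cleanup block runs.
public class TracedResource : IDisposable
{
    public readonly List<string> Log = new List<string>();
    private bool isDisposed;

    public void Dispose()
    {
        Dispose(true);
    }

    // Stands in for the finalizer path so we can trigger it on demand.
    public void SimulateFinalizer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool isDisposing)
    {
        if (isDisposed)
            return;                 // second and later calls do nothing

        if (isDisposing)
            Log.Add("managed");     // safe only when called from Dispose()

        Log.Add("unmanaged");       // must happen on both paths
        isDisposed = true;
    }
}
```

Calling Dispose() twice on one instance leaves "managed" and "unmanaged" in its Log exactly once each, while calling SimulateFinalizer() on a fresh instance logs only "unmanaged".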

Are we forgetting something again? Remember, you have unmanaged resources and if somehow the Dispose() is not invoked you still have to make sure this is finalized by the garbage collector. Let's write up a one line destructor here.

    ~FileReader()
    {
        Dispose(false);
    }

It's pretty straightforward and it complies with everything we said before. Kudos! We are done with FileReader.

Words of Experience

Although we are safe already, we indeed forgot one thing. If we invoke Dispose() now, it will dispose both unmanaged and managed resources. But when the garbage collector comes to collect, he will see there is a destructor, ergo there is a Finalize() override here. So, he would still put this instance into the finalization queue. That kind of defeats our purpose, because we wanted to release memory as soon as possible. If the garbage collector has to come back again, that doesn't really make much sense. So, we would like to stop the garbage collector from invoking Finalize() if we know we have disposed the instance ourselves. And a single line modification to the Dispose() method allows you to do so.

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

We added the following statement to make sure if we have done disposing ourselves the garbage collector would not invoke Finalize() anymore.

GC.SuppressFinalize(this);
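
Here is a small sketch of that effect. Counted and SuppressDemo are made-up names, and forcing a collection with GC.Collect() is something I do here only to make the demo observable, not something you would normally do in application code.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical class that counts how often its finalizer actually runs.
public class Counted : IDisposable
{
    public static int FinalizerRuns;

    public void Dispose()
    {
        // Cleanup would go here; then tell the GC the finalizer is not needed.
        GC.SuppressFinalize(this);
    }

    ~Counted()
    {
        FinalizerRuns++;
    }
}

public static class SuppressDemo
{
    // Creating the object in a separate, non-inlined method makes sure no
    // local variable keeps it alive when we force a collection.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndDispose()
    {
        new Counted().Dispose();
    }

    public static int Run()
    {
        Counted.FinalizerRuns = 0;
        CreateAndDispose();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return Counted.FinalizerRuns;   // stays 0: the finalizer was suppressed
    }
}
```

SuppressDemo.Run() returns 0, because a disposed Counted instance is no longer a candidate for finalization and its memory can be reclaimed without the extra round.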

Now, please keep in mind that you shouldn't call GC.SuppressFinalize() in a derived class. The parameterless Dispose() of the base class already does that. The derived class only overrides Dispose(bool), follows the same pattern and calls base.Dispose(disposing) in the following way,

    class DerivedReader : FileReader
    {
        // Flag: Has Dispose already been called?
        bool disposed = false;

        // Protected implementation of Dispose pattern.
        protected override void Dispose(bool disposing)
        {
            if (disposed)
                return;

            if (disposing)
            {
                // Free any other managed objects here.
                //
            }

            // Free any unmanaged objects here.
            //
            disposed = true;

            // Call the base class implementation.
            base.Dispose(disposing);
        }

        ~DerivedReader()
        {
            Dispose(false);
        }
    }

It should be fairly clear now why we are doing it this way. We want disposal to go recursively to base class. So, when we dispose the derived class resources, the base class disposes its own resources too.
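
The order of that chain can be made visible with a tiny sketch. BaseResource and DerivedResource are hypothetical classes that only log their names; they hold no real resources.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base/derived pair that records the order of cleanup.
public class BaseResource : IDisposable
{
    public readonly List<string> Log = new List<string>();
    private bool isDisposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (isDisposed) return;
        Log.Add("base");
        isDisposed = true;
    }
}

public class DerivedResource : BaseResource
{
    private bool disposed;

    protected override void Dispose(bool disposing)
    {
        if (disposed) return;
        Log.Add("derived");
        disposed = true;
        base.Dispose(disposing);   // hand over to the base class last
    }
}
```

Disposing a DerivedResource, even twice, leaves "derived" followed by "base" in the Log: the derived class cleans up its own things first and then lets the base class do the same.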

To use or not to use a Finalizer

We are almost done, really, we are. This is the section where I'm supposed to tell you why you really shouldn't use unmanaged resources unless you have to. It's always a good idea not to write a finalizer if you really, really don't need it. Currently we need it because we are using a raw file handle and we need to close it manually. To stay destructor free and as managed as possible, we should always wrap our handles in a SafeHandle class and dispose the SafeHandle as a managed resource, thus eliminating the need for cleaning unmanaged resources ourselves and for the overridden Finalize(). You will find more about that here.
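
As a rough sketch of that idea, assuming a made-up resource: SafeBufferHandle and BufferOwner below are hypothetical names, and Marshal.AllocHGlobal stands in for CreateFile so the example is not tied to Windows. The point is only that the owning class needs neither a finalizer nor a Dispose(bool) overload once the raw handle lives inside a SafeHandle.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical SafeHandle that owns a block of unmanaged memory.
public sealed class SafeBufferHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public SafeBufferHandle(int size) : base(true)
    {
        SetHandle(Marshal.AllocHGlobal(size));
    }

    // Invoked by the runtime when the handle is released, either through
    // Dispose() or through SafeHandle's own finalizer if nobody disposed it.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

// The owning class has no finalizer and no unmanaged cleanup of its own.
public sealed class BufferOwner : IDisposable
{
    private readonly SafeBufferHandle buffer = new SafeBufferHandle(128);

    public bool IsReleased => buffer.IsClosed;

    public void Dispose()
    {
        buffer.Dispose();   // the SafeHandle is just another managed resource
    }
}
```

After new BufferOwner().Dispose() the IsReleased property reports true, and calling Dispose() a second time is harmless because SafeHandle already guards against a double release.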

"Using" it right

Before you figure out why I quoted the word using here, let's finally wrap our work up. We have made our FileReader class disposable and we would like to invoke Dispose() after we are done using it. We would opt for a try-finally block to do it and will dispose the resources in the finally block.

    FileReader fr = new FileReader();
    try
    {
        if (fr.Open(args[0]))
        {
            // Assume that an ASCII file is being read
            ASCIIEncoding Encoding = new ASCIIEncoding();
            int bytesRead;
            do
            {
                bytesRead = fr.Read(buffer, 0, buffer.Length);
                string content = Encoding.GetString(buffer, 0, bytesRead);
                Console.Write("{0}", content);
            }
            while (bytesRead > 0);
            return 0;
        }
        else
        {
            Console.WriteLine("Failed to open requested file");
            return 1;
        }
    }
    finally
    {
        // runs on success, on failure and on an exception alike
        if (fr != null)
            fr.Dispose();
    }

The only difference you see here is that we don't explicitly call Close() anymore, because that is already handled when we dispose the FileReader instance.

Good thing for you is that C# has essentially made things even easier than this. Remember the using statements we used in Part-1? A "Using statement" is basically a syntactic sugar placed on a try-finally block with a call to Dispose() in the finally block just like we wrote it here. Now, with that in mind, our code-block will change to:

    using (FileReader fr = new FileReader())
    {
        if (fr.Open(args[0]))
        {
            // Assume that an ASCII file is being read
            ASCIIEncoding Encoding = new ASCIIEncoding();
            int bytesRead;
            do
            {
                bytesRead = fr.Read(buffer, 0, buffer.Length);
                string content = Encoding.GetString(buffer, 0, bytesRead);
                Console.Write("{0}", content);
            }
            while (bytesRead > 0);
            return 0;
        }
        else
        {
            Console.WriteLine("Failed to open requested file");
            return 1;
        }
    }
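
If you want to convince yourself that a using block really disposes even when the body throws, a minimal sketch could look like this. Probe and UsingDemo are hypothetical names for illustration; Probe does nothing except remember that Dispose() was called.

```csharp
using System;

// Hypothetical disposable that only remembers whether Dispose() was called.
public class Probe : IDisposable
{
    public bool Disposed;
    public void Dispose() { Disposed = true; }
}

public static class UsingDemo
{
    // Returns true if Dispose() ran even though the body threw.
    public static bool Run()
    {
        Probe probe = new Probe();
        try
        {
            using (probe)
            {
                throw new InvalidOperationException("boom");
            }
        }
        catch (InvalidOperationException)
        {
            // By the time we get here the using block has already disposed 'probe'.
        }
        return probe.Disposed;
    }
}
```

UsingDemo.Run() returns true, which is exactly the behaviour of the hand-written try-finally version shown earlier.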

Now, you can go back to part I and try to understand the first bits of code we saw. I hope, it will make a bit more sense to you now. Hopefully, if there's a next part, I will talk about garbage collection algorithms.

You can find the code sample on GitHub here and, if you prefer to download it, you can do so too.