Implementing a Simple Garbage Collector in C#

Introduction

Garbage collection is a process in computer programming and memory management where the system automatically identifies and frees up memory that is no longer in use by the program. The primary purpose of garbage collection is to reclaim memory occupied by objects or data structures that are no longer reachable or referenced by the program, preventing memory leaks and improving the efficiency of memory utilization.

We will delve into the core principles and advantages of this paradigm and simply implement (in C#) the theoretical concepts that we address, gaining insights into the limitations of each concept. Ultimately, we aim to develop a comprehensive understanding of how garbage collection works in modern programming languages.

The authoritative textbook on this topic is The Garbage Collection Handbook published in 2023. This book encapsulates the wealth of knowledge amassed by researchers and developers in the field of automatic memory management over the past sixty years.

What Is garbage collection exactly?

As said in the introduction, garbage collection is a process in computer programming and memory management where the system automatically identifies and frees up memory that is no longer in use by the program.

Key Concepts of Garbage Collection

  • Garbage collection automates the process of deallocating memory, eliminating the need for manual memory management by programmers. This helps prevent common issues such as memory leaks and dangling pointers (see below).
  • Objects are considered reachable if they are directly or indirectly referenced by the program. Garbage collection identifies and marks the objects that are no longer reachable, making them eligible for removal.
  • The garbage collector identifies and releases memory occupied by unreferenced objects, returning it to the pool of available memory for future use.
  • Garbage collection can be triggered by various events, such as when the system is running low on memory or when a specific threshold of allocated memory is reached.

Garbage collection is universally adopted by nearly all modern programming languages:

  • Java utilizes the JVM's garbage collector, which includes various algorithms such as the generational garbage collector (G1 GC).
  • .NET (C#) implements a garbage collector that uses a generational approach.

Garbage collection contributes to the overall robustness and reliability of programs by automating memory management and reducing the likelihood of memory-related errors.

What is the problem with manual (explicit) memory management?

Manual memory management is not inherently problematic and is still employed in certain prominent programming languages such as C++. However, many are familiar with instances of critical bugs attributable to memory leaks. This is why, relatively early in the history of computer science (with the first technique dating back to the 1950s), methods for automatically managing memory were studied.

Manual memory management indeed introduces several challenges and potential issues:

  • Failure to deallocate memory properly can lead to memory leaks. Memory leaks occur when a program allocates memory but fails to release it, resulting in a gradual depletion of available memory over time.
  • Improper handling of pointers can result in dangling pointers, pointing to memory locations that have been deallocated. Accessing such pointers can lead to undefined behavior and application crashes.
  • Incorrect manipulation of memory, such as writing beyond the bounds of allocated memory, can cause memory corruption. Memory corruption can result in unexpected behavior, crashes, or security vulnerabilities.
  • Manual memory management increases code complexity, making programs more prone to bugs and harder to maintain. Developers need to carefully track and manage memory allocations and deallocations throughout the codebase.
  • Manual memory management can contribute to memory fragmentation, where free memory is scattered in small, non-contiguous blocks. This fragmentation can lead to inefficient memory utilization over time.
  • Mismatched allocation and deallocation operations can occur, leading to errors like double frees or memory leaks. These errors are challenging to identify and debug.
  • Explicit memory management can require additional code for allocation and deallocation, impacting program performance. Memory allocation and deallocation operations may become bottlenecks in performance-critical applications.
  • Manual memory management lacks the safety features that are present in automatic memory management systems. Developers need to handle memory-related operations carefully, making programs more error-prone.

Automatic memory management addresses these issues by automating the process of memory allocation and deallocation, reducing the burden on developers and minimizing the likelihood of memory-related errors.

A more comprehensive overview of the benefits of garbage collection and drawbacks of explicit memory management is provided in The Garbage Collector Handbook.

What are stacks, heaps, and more?

When discussing garbage collection, terms such as stack and heap are frequently mentioned to the extent that it may create the impression that they are inherent data structures embedded in the operating system and, thus, must be universally adopted without question. However, the reality is that heaps and stacks are just implementation details. In this section, our aim is to elucidate and clarify all these concepts.

In the beginning was the Word

A program is essentially a text comprising instructions that a computer needs to comprehend. The compiler's responsibility is to translate these high-level instructions into executable code.

Compiler

In contemporary programming languages, a virtual execution engine is often interposed between the source code and the operating system. Additionally, the source code is initially translated into an intermediate language to enhance portability and adaptability to different environments.

Garbage compiler

This execution engine encompasses the garbage collector and other techniques (like typechecking, exception management, etc.) to manage the program. The ultimate translation into target code is executed by the just-in-time (JIT) compiler.

Focus on the garbage collector

Certainly, the garbage collector stands out as one of the most crucial components of the execution engine. In this context, our focus will be on describing how memory is organized to facilitate effective management.

A program becomes valuable only when it generates a result, and inevitably, it needs to declare variables and allocate memory to store information at some point.

  • The execution engine can anticipate the space it requires in advance, such as when dealing with simple types like integers. Conversely, certain data types, including custom classes or built-in types like strings or arrays of objects, do not allow for such straightforward inference.
  • The compiler might only ascertain at runtime the specific location to store variables and reserve memory at that moment (dynamic allocation). If the compiler can determine the variable's lifetime, especially if it is local to a procedure, it becomes simpler to understand when the variable will no longer be necessary (i.e., when deallocation can occur). Conversely, data may persist beyond the call to a function or subroutine and endure indefinitely.

The disparity between a variable's lifetime and the space known in advance to be required necessitates the distinction between two memory areas. Although this differentiation is not obligatory, it has become universally accepted and implemented. A programming language that chooses not to adopt this abstraction may face the risk of fading into obsolescence, but it's essential to acknowledge that alternative schemes are conceivable. Moreover, it's crucial to note that this perspective is solely valid within the context of the execution engine. From the operating system's standpoint, everything eventually boils down to memory chunks allocated somewhere.

  • When the execution engine can anticipate the lifetime of a variable and infer the required space, it allocates it in a memory area internally referred to as the stack.
  • In other cases, where the lifetime cannot be definitively decided, or the required space is unclear, the execution engine allocates variables in memory areas referred to as the heap.

Disclaimer: This depiction provides only a basic overview of what stacks and heaps are in various programming languages, and purists may find it oversimplified. It is by no means an official definition, as there could be various implementations and interpretations of these abstractions. The key takeaway is the necessity to delineate two memory areas at the execution engine level to facilitate efficient memory management.

Heap and Stack

Enough theory, code, please!

One should envision a scenario where a program is initially written in source code and subsequently compiled into an intermediate representation, manifested as a sequential list of instructions. We will assume that the IL compiler has generated this list of instructions. An example of such a list is provided below.

thread1;CREATE_THREAD; 
thread1;PUSH_ON_STACK;Jubilant 
thread2;CREATE_THREAD; 
thread1;PUSH_ON_STACK;Radiant 
thread1;POP_FROM_STACK
...

Each instruction consists of three components. The first indicates which thread is affected by the instruction, the second describes the operation to be performed, and the third (optional) provides a value to complement the operation. In the provided example, the compiler is initially instructed to create a thread named thread1 and then push a variable named Jubilant onto the stack of this thread. The third instruction calls for the creation of another thread named thread2, and so forth. The fifth instruction attempts to pop the stack of thread1.

This list of instructions can be envisioned as an intermediate representation (IR), as illustrated in Compilers: Principles, Techniques, and Tools.

Our current task is to implement an execution engine (runtime) capable of handling these instructions, with a particular focus on memory allocation and deallocation.

Implementing the runtime

Without further delay, here is how the runtime is implemented in our code.

public class Runtime
{
    private List<Instruction> _instructions;
    private Collectors.Interfaces.GarbageCollector _collector;

    public Runtime(List<Instruction> instructions, string collectorStrategy)
    {
        _instructions = instructions;
        _collector = GetCollector(collectorStrategy);
    }

    #region Public Methods

    public void Run()
    {
        foreach (var instruction in _instructions)
        {
            _collector.ExecuteInstruction(instruction);
            _collector.PrintMemory();
        }
    }

    #endregion
}

The execution engine possesses the list of instructions to execute and a garbage collector (described below) to be utilized during memory deallocation. It also includes a Run method called by the main thread.

public class Program
{
    static void Main(string[] args)
    {
        var instructions = new List<Instruction>();            
        var contents = File.ReadAllLines(AppContext.BaseDirectory + "/instructions.txt");
        foreach (var line in contents)
        {
            var c = line.Split(';');
            var intruction = new Instruction(c[0], c[1], c[2]);
            instructions.Add(intruction);
        }
        
        var runtime = new Runtime(instructions, "MARK_AND_COMPACT");

        runtime.Run();
    }
}

The Runtime class also requires a memory collection strategy to apply. As we will explore, there are several possibilities for memory collection, and designing an effective garbage collector is indeed a subtle art.

Defining the GarbageCollector interface

The Runtime class requires a GarbageCollection instance to operate. This instance implements the GarbageCollection abstract class, whose methods are detailed below.

public abstract class GarbageCollector
{ 
    public abstract void ExecuteInstruction(Instruction instruction);

    public abstract int Allocate(string field);

    public abstract void Collect();

    public abstract void PrintMemory();        
}

In essence, the GarbageCollector class presented here serves as an orchestrator to execute the list of instructions and manage memory allocation and collection. In real-world implementations, these abstractions would typically be kept separate, but for simplicity's sake, the goal here is to streamline the representation.

The PrintMemory method is merely a helper function designed to visualize the content of the heap during debugging.

Defining a thread

A thread refers to the smallest unit of execution within a process. A process is an independent program that runs in its own memory space, and within a process, multiple threads can exist. Each thread shares the same resources, such as memory, with other threads in the same process.

  • Threads enable concurrent execution of tasks, allowing multiple operations to be performed simultaneously. They operate independently, with their own program counter, registers, and stack, but they share the same memory space. Threads within the same process can communicate and synchronize with each other.
  • Threads are widely used in modern software development to improve performance, responsiveness, and resource utilization in applications ranging from desktop software to server applications and distributed systems.
public class RuntimeThread
 {
     private Collectors.Interfaces.GarbageCollector _collector;

     private Stack<RuntimeStackItem> _stack;
     private Dictionary<int, RuntimeStackItem> _roots;
     private int _stackSize;
     private int _index;

     public string Name { get; set; }

     public RuntimeThread(Collectors.Interfaces.GarbageCollector collector, int stackSize, string name)
     {
         _collector = collector;

         _stack = new Stack<RuntimeStackItem>(stackSize);
         _roots = new Dictionary<int, RuntimeStackItem>();
         _stackSize = stackSize;
         _index = 0;

         Name = name;
     }

     public Dictionary<int, RuntimeStackItem> Roots => _roots;        

     public void PushInstruction(string value)
     {
         _index++;
         if (_index > _stackSize) throw new StackOverflowException();

         var address = _collector.Allocate(value); 
         var stackItem = new RuntimeStackItem(address);

         _stack.Push(stackItem);
         _roots.Add(address, stackItem);
     }

     public void PopInstruction()
     {
         _index--;

         var item = _stack.Pop();
         _roots.Remove(item.StartIndexInTheHeap);
     }
 }

A thread is characterized by a name and executes its own list of instructions. The crucial point is that threads share the same memory. This translates into the fact that in a program, there can be multiple threads and, consequently, multiple stacks, but there is generally one heap shared among all threads.

In the provided example, when a thread executes an instruction, it is placed on its stack, and any potential reference to the heap is retained in an internal dictionary. This mechanism ensures that each thread maintains its own stack while references to the shared heap are tracked to manage memory allocation and deallocation effectively.

Defining stacks and the heap

As mentioned earlier, stacks and heap are implementation details, so their code is deferred to the concrete classes of GarbageCollector. It could still be informative, nonetheless, to provide a brief description of them because, as we observed, these concepts are universally adopted.

Implementing the stack

In C#, there already exists a data structure for stacks. For simplicity, we will use it, especially since it is well-suited to our requirements.

We will push instances of a RuntimeStackItem class onto this stack. RuntimeStackItem simply contains a reference to an address in the heap.

public class RuntimeStackItem
{
    public int StartIndexInTheHeap { get; set; }

    public RuntimeStackItem(int startIndexInTheHeap)
    {
        StartIndexInTheHeap = startIndexInTheHeap;
    }
}

Implementing the heap

A heap must represent a memory area, and we will simply model each memory cell with a char class. With this abstraction, a heap is nothing more than a list of cells that can be allocated. It also includes additional plumbing code, such as pointers, to manage which cells are occupied or can be reclaimed.

public class RuntimeHeap
{
    public RuntimeHeap(int size)
    {
        Size = size;
        Cells = new RuntimeHeapCell[size];
        Pointers = new List<RuntimeHeapPointer>();

        for (var i = 0; i < size; i++)
        {
            Cells[i] = new RuntimeHeapCell();
        }
    }

    #region Properties

    public int Size { get; set; }
    public RuntimeHeapCell[] Cells { get; set; }
    public List<RuntimeHeapPointer> Pointers { get; set; }

    #endregion
}

Our task now is to define concrete classes of GarbageCollection to observe a garbage collector in action.

Implementing the mark-and-sweep algorithm

For this strategy and the subsequent ones, we will allocate a heap of 64 characters, and we will consider it impossible to claim additional space from the operating system. In a real-world scenario, the garbage collector typically has the ability to request new memory, but for illustrative purposes, we impose this limitation. It is important to note that this limitation neither hinders nor detracts from the overall understanding.

Implementing the ExecuteInstruction method

This method is nearly the same for all strategies. It processes one instruction at a time, pushing or popping variables on the stack.

public override void ExecuteInstruction(Instruction instruction)
{
    if (instruction.Operation == "CREATE_THREAD")
    {
        AddThread(instruction.Name);
        return;
    }

    var thread = _threads.FirstOrDefault(x => x.Name == instruction.Name);
    if (thread == null) return;

    if (instruction.Operation == "PUSH_ON_STACK")
    {
        thread.PushInstruction(instruction.Value);
    }
    if (instruction.Operation == "POP_FROM_STACK")
    {
        thread.PopInstruction();
    }
}

Implementing the Allocate method

If there is insufficient memory, the garbage collector attempts to reclaim memory and then retries the allocation process. If there is still not enough space, an exception is raised.

public override int Allocate(string field)
{
    var r = AllocateHeap(field);
    if (r == -1)
    {
        Collect();
        r = AllocateHeap(field);
    }

    if (r == -1) throw new InsufficientMemoryException();

    return r;
}

private int AllocateHeap(string field)
{
    var size = _heap.Size;
    var cells = field.ToArray();
    var requiredSize = cells.Length;

    // We need to find a contiguous space of requiredSize char.
    var begin = 0;
    while (begin < size)
    {
        var c = _heap.Cells[begin];
        if (c.Cell != '\0')
        {
            begin++;
            continue;
        }

        // Do not continue if there is not enough space.
        if ((size - begin) < requiredSize) return -1;

        // c = 0 => This cell is free.
        var subCells = _heap.Cells.Skip(begin).Take(requiredSize).ToArray();
        if (subCells.All(x => x.Cell == '\0'))
        {
            for (var i = begin; i < begin + requiredSize; i++)
            {
                _heap.Cells[i].Cell = cells[i - begin];
            }

            var pointer = new RuntimeHeapPointer(begin, requiredSize);
            _heap.Pointers.Add(pointer);

            return begin;
        }

        begin++;
    }

    return -1;
}

Important: The allocation process consistently seeks contiguous free space (line 23 of the source code above). For instance, if it has to allocate 8 cells, it must locate 8 free contiguous cells. This observation will prove crucial in the subsequent sections (see below).

What is the mark-and-sweep algorithm?

The mark-and-sweep algorithm consists of two main phases: marking and sweeping.

  • In the marking phase, the algorithm traverses through all the objects in the heap, starting from the roots (typically global variables or objects directly referenced by the program). It marks each visited object as "reachable" or "live" by setting a specific flag or attribute. Objects not marked during this traversal are considered unreachable and are potentially candidates for deallocation.

  • In the sweeping phase, the algorithm scans through the entire heap, identifying and deallocating memory occupied by unmarked (unreachable) objects. The memory occupied by marked (reachable) objects is retained. After sweeping, the heap is once again considered a contiguous block of free memory, ready for new allocations.

The mark-and-sweep algorithm is known for its simplicity and effectiveness in identifying and reclaiming memory that is no longer in use. However, it has some drawbacks, as we will see next.

To sum up, an object is considered dead (and therefore reclaimable) until it demonstrates otherwise by being reachable from somewhere in the program.

public override void Collect()
{
    MarkFromRoots();
    Sweep();
}

Implementing the marking phase

The mark-and-sweep algorithm must commence its traversal from roots. What constitutes these roots? Typically, they encompass global variables or references contained in the frame of each stack. In our simplified example, we will define roots as solely comprising the objects referenced in the stack.

private void MarkFromRoots()
{
    var pointers = _heap.Pointers;
    foreach (var root in _threads.SelectMany(x => x.Roots))
    {
        var startIndexInTheHeap = root.Value.StartIndexInTheHeap;

        var pointer = pointers.FirstOrDefault(x => x.StartCellIndex == startIndexInTheHeap);
        if (pointer == null) return;

        pointer.SetMarked(true);
    }
}

At the conclusion of the procedure, live objects have been marked, whereas dead ones have not.

Implementing the sweeping phase

The sweeping phase scans through the entire heap and reclaims memory from objects that are not marked.

private void Sweep()
{            
    var cells = _heap.Cells;
    var pointers = _heap.Pointers;
    foreach (var pointer in pointers.Where(x => !x.IsMarked))
    {
        var address = pointer.StartCellIndex;
        for (var i = address; i < address + pointer.AllocationSize; i++)
        {
            cells[i].Cell = '\0';
        }
    }

    var list = new List<RuntimeHeapPointer>();
    foreach (var pointer in pointers.Where(x => x.IsMarked))
    {
        pointer.SetMarked(false);
        list.Add(pointer);
    }

    pointers = list;
}

In this procedure, we don't forget to reinitialize marked pointers and set them to unmarked to ensure that the next collection works again in a clean slate.

Running the program

We will execute the program with two lists of instructions. The first one is identical to the one in the previous article. The second one will highlight a phenomenon specific to this algorithm known as fragmentation.

With the same set of instructions as before, we observe that the program runs until the end with no exception raised. The word "Cascade" was correctly allocated because the garbage collector succeeded in reclaiming sufficient memory.

mark-and-sweep-heapsnapshot

This algorithm ensures that memory is correctly reclaimed when needed. However, in the example below, we will observe an undesirable effect when we slightly modify our instructions and attempt to allocate memory for a long word (16 letters).

...
thread1;PUSH_ON_STACK;Enigmatic
thread2;POP_FROM_STACK;
thread1;PUSH_ON_STACK;Cascade
thread2;POP_FROM_STACK;
thread2;PUSH_ON_STACK;GarbageCollector <= 16-letter words

Curiously, an exception is raised even though there appears to be enough free memory in the heap.

Code

mark-and-sweep-heapsnapshot

There are 19 free cells in the heap, and we are attempting to allocate 16 cells, so why doesn't the garbage collector manage to perform the task?

In reality, the garbage collector attempts to locate 16 free CONTIGUOUS cells, but it seems that the heap does not possess them. Although it has more than 16 free cells, these are not contiguous, rendering the allocation impossible. This phenomenon is known as fragmentation and has some undesirable effects.

More generally, heap fragmentation refers to a situation where the free memory in a heap is scattered in non-contiguous blocks, making it challenging to allocate a large, contiguous chunk of memory, even if the total free space is sufficient. While the total amount of free memory may be substantial, the scattered nature of these blocks hinders the allocation of large, contiguous portions of memory.

In the context of the mark-and-sweep algorithm, the need for contiguous memory blocks for certain allocations can be impeded by the scattered arrangement of free memory cells in the heap. As a consequence, even if the total free space is adequate, the inability to find a sufficiently large, contiguous segment can lead to allocation failures.

Addressing heap fragmentation is, therefore, crucial for optimizing memory usage and improving the performance of a garbage collector. Here, we attempt to address this issue with the mark-and-compact algorithm.

What is the mark-and-compact algorithm?

The mark-and-compact algorithm addresses the issue of heap fragmentation by compacting memory, ensuring that free space is contiguous.

The algorithm consists of two main phases.

  • The marking phase is similar to the mark-and-sweep algorithm. It identifies and marks live objects in a heap, typically starting from root references and traversing through the object graph. Live objects are marked as reachable, while dead objects remain unmarked.
  • The compacting phase follows: the algorithm rearranges the live objects in memory to eliminate fragmentation. It compacts the live objects, moving them to one end of the heap, and updates references accordingly. The compacted memory region ensures that free space becomes contiguous, mitigating the impact of external fragmentation.

The algorithm is more complex to implement compared to the mark-and-sweep algorithm and introduces additional runtime overhead, as it involves moving live objects in memory. Despite its complexity, the mark-and-compact algorithm is valuable in scenarios where efficient memory usage and reduced fragmentation are critical considerations.

public override void Collect()
{
    MarkFromRoots();
    Compact();
}

Implementing the marking phase

The mark-and-compact algorithm must commence its traversal from roots. What constitutes these roots? Typically, they encompass global variables or references contained in the frame of each stack. In our simplified example, we will define roots as solely comprising the objects referenced in the stack.

private void MarkFromRoots()
{
    var pointers = _heap.Pointers;
    foreach (var root in _threads.SelectMany(x => x.Roots))
    {
        var startIndexInTheHeap = root.Value.StartIndexInTheHeap;

        var pointer = pointers.FirstOrDefault(x => x.StartCellIndex == startIndexInTheHeap);
        if (pointer == null) return;

        pointer.SetMarked(true);
    }
}

At the conclusion of the procedure, live objects have been marked, whereas dead ones have not.

Implementing the compacting phase

The compacting phase scans through the entire heap and compacts the live data by relocating objects and updating the pointer values of all live references to objects that have moved.

private void Compact()
{
    var cells = _heap.Cells;
    var pointers = _heap.Pointers;
    foreach (var pointer in pointers.Where(x => !x.IsMarked))
    {
        var address = pointer.StartCellIndex;
        for (var i = address; i < address + pointer.AllocationSize; i++)
        {
            cells[i].Cell = '\0';
        }
    }

    var list = new List<RuntimeHeapPointer>();
    foreach (var pointer in pointers.Where(x => x.IsMarked))
    {
        pointer.SetMarked(false);
        list.Add(pointer);
    }

    _heap.Pointers = pointers = list;

    var offset = 0;
    foreach (var pointer in pointers)
    {
        var begin = pointer.StartCellIndex;
        if (begin == 0)
        {
            offset = offset + pointer.AllocationSize;
            continue;
        }

        // Update roots in the stack
        // This code is perfectible.
        var thread = _threads.FirstOrDefault(x => x.Roots.Any(t => t.Key == begin));
        var root = thread.Roots.FirstOrDefault(x => x.Key == begin);
        root.Value.StartIndexInTheHeap = offset;
        thread.Roots[offset] = root.Value; thread.Roots.Remove(begin);

        // Update pointers in the heap
        pointer.StartCellIndex = offset;
        for (var i = begin; i < begin + pointer.AllocationSize; i++)
        {
            cells[offset + i - begin].Cell = cells[i].Cell;
            cells[i].Cell = '\0';
            
        }

        offset = offset + pointer.AllocationSize;
    }

    _currentPointerInTheHeap = offset;
}

The mark-and-compact algorithm introduces a trade-off between the overhead of updating live pointers and the benefit of faster allocation processes.

  • Since the compacting phase ensures contiguous free space, the allocation process becomes faster. Allocating memory simply involves adjusting a pointer to the next available free space, resulting in efficient memory utilization.
  • The process of compacting live objects involves updating references, which can introduce runtime overhead. During this phase, the algorithm needs to adjust pointers and references to reflect the new locations of live objects.

Running the program

Executing the program with the set of instructions that previously caused an exception in the mark-and-sweep algorithm reveals a noteworthy outcome: no exception is raised this time.

Running

We observe, however, that certain objects persist in the heap for an extended duration (for instance, Jubilant, in our example, has been in the heap since the program's inception) and appear to never undergo reclamation. Nevertheless, during each collection cycle, these objects are visited to determine their liveness. This observation underscores the effectiveness of dividing the heap into partitions, allowing each partition to be managed using a distinct algorithm. Among these strategies, employing generational collectors has proven highly effective by focusing the reclamation effort on the youngest objects.

What is heap partition?

A heap partition refers to a segmentation or division of the heap into distinct sections, each subject to separate memory management strategies or garbage collection algorithms. This approach is commonly employed to optimize memory management by tailoring the collection strategy to the characteristics and usage patterns of objects within each partition. Generational garbage collection, a notable example, often involves dividing the heap into two main generations: the young generation and the old generation. The young generation is typically collected more frequently due to the higher likelihood of short-lived objects, while the old generation undergoes collection less frequently. This partitioning allows for more targeted and efficient garbage collection, improving overall system performance.

This technique is based on the consideration that it's faster to compact the memory for a portion of the managed heap than for the entire managed heap.

To illustrate our point, we will implement a generational garbage collection where objects are partitioned by age. Objects that have just been allocated will be placed in the region of the heap named "generation0", while long-lived objects will be promoted to a region named "generation1". Other partitions are possible; for example, we could partition the heap by size: objects larger than a certain threshold are allocated into a separate Large Object Heap (LOH), whereas smaller objects are placed in a Small Object Heap.

Focus on the generational garbage collector

As stated in The Garbage Collection Handbook, the hypothesis that most objects die young appears to be widely valid, regardless of programming paradigm or implementation language. This provides the opportunity to partition the heap by age, as illustrated in the figure below.

If a collection becomes necessary, the entire heap will not be scanned indiscriminately. Instead, the focus will be on the portion of the heap most likely to contain a low rate of survivors (generation0 or objects in red in the picture above). This strategic approach enables us to avoid collecting the entire heap, thereby enhancing performance and minimizing pause times attributable to garbage collection.

Implementing the Allocate method

If there is insufficient memory, the garbage collector attempts to reclaim memory only for generation0 and then retries the allocation process. If there is still not enough space, a collection for generation1 is achieved. If there is still not enough space, an exception is raised.

public override int Allocate(string field)
{
    var r = AllocateHeap(field);
    if (r == -1)
    {
        Collect("gen0");
        r = AllocateHeap(field);

        if (r == -1)
        {
            Collect("gen1");
            r = AllocateHeap(field);
        }
    }

    if (r == -1) throw new InsufficientMemoryException();

    return r;
}
private int AllocateHeap(string field)
{
    var size = _heap.Size;
    var cells = field.ToArray();
    var requiredSize = cells.Length;

    var begin = _currentPointerInTheHeap;

    // Do not continue if there is not enough space.
    if ((size - begin) < requiredSize) return -1;

    for (var i = begin; i < begin + requiredSize; i++)
    {
        _heap.Cells[i].Cell = cells[i - begin];
    }

    var pointer = new RuntimeHeapPointer(begin, requiredSize);
    _heap.Pointers.Add(pointer);

    _currentPointerInTheHeap = _currentPointerInTheHeap + requiredSize;

    return begin;
}

Implementing the Collect method

The Collect method serves as the core of our algorithm, gathering memory from a specific portion of the heap based on the provided argument.

public override void Collect(string partition)
{
    if (partition == "gen0") CollectGen0();
    if (partition == "gen1") CollectGen1();
}
private void CollectGen0()
{
    MarkFromRoots("gen0");
    CompactGen0();
}

private void CollectGen1()
{
    MarkFromRoots("gen1");
    CompactGen1();
}

Here, we employ the mark-and-compact algorithm for both generations, but it is entirely feasible to implement a distinct one for each (for instance, using mark-and-sweep for generation1 and mark-and-compact for generation0).

The MarkFromRoots method closely resembles the one previously presented, making it unnecessary to reproduce it here. Our focus will be on highlighting the CompactGen1 method.

Heap

Merely marking and compacting objects in generation1 does not suffice for compacting the entire heap. It is imperative to additionally adjust pointers within generation0. This particularity of the method necessitates careful consideration.

private void CompactGen1()
{
    var cells = _heap.Cells;
    var pointers = _heap.Pointers.Where(x => x.Metadata == "gen1").ToList();
    foreach (var pointer in pointers.Where(x => !x.IsMarked))
    {
        var address = pointer.StartCellIndex;
        for (var i = address; i < address + pointer.AllocationSize; i++)
        {
            cells[i].Cell = '\0';
        }
    }

    var list = new List<RuntimeHeapPointer>();
    foreach (var pointer in pointers.Where(x => x.IsMarked))
    {
        pointer.SetMarked(false);
        list.Add(pointer);
    }

    var offset = 0;
    foreach (var pointer in list)
    {
        var begin = pointer.StartCellIndex;
        if (begin == 0)
        {
            offset = offset + pointer.AllocationSize;
            continue;
        }

        // Update roots in the stack
        // This code is perfectible.
        var thread = _threads.FirstOrDefault(x => x.Roots.Any(t => t.Key == begin));
        var root = thread.Roots.FirstOrDefault(x => x.Key == begin);
        root.Value.StartIndexInTheHeap = offset;
        thread.Roots[offset] = root.Value; thread.Roots.Remove(begin);

        // Update pointers in the heap
        pointer.StartCellIndex = offset;
        for (var i = begin; i < begin + pointer.AllocationSize; i++)
        {
            cells[offset + i - begin].Cell = cells[i].Cell;
            cells[i].Cell = '\0';

        }

        offset = offset + pointer.AllocationSize;
    }

    var listGen0 = _heap.Pointers.Where(x => x.Metadata == "gen0").ToList();
    var offsetGen0 = offset;
    foreach (var pointer in listGen0)
    {
        var begin = pointer.StartCellIndex;
        if (begin == 0)
        {
            offsetGen0 = offsetGen0 + pointer.AllocationSize;
            continue;
        }

        // Update roots in the stack
        // This code is perfectible.
        var thread = _threads.FirstOrDefault(x => x.Roots.Any(t => t.Key == begin));
        var root = thread.Roots.FirstOrDefault(x => x.Key == begin);
        root.Value.StartIndexInTheHeap = offsetGen0;
        thread.Roots[offsetGen0] = root.Value; thread.Roots.Remove(begin);

        // Update pointers in the heap
        pointer.StartCellIndex = offsetGen0;
        for (var i = begin; i < begin + pointer.AllocationSize; i++)
        {
            cells[offsetGen0 + i - begin].Cell = cells[i].Cell;
            cells[i].Cell = '\0';

        }

        offsetGen0 = offsetGen0 + pointer.AllocationSize;
    }

    _heap.Pointers = list.Concat(listGen0).ToList();
    _startAddressForGen0 = offsetGen0;
    _currentPointerInTheHeap = offsetGen0;
}

We can observe that the implementation of the Collect method is significantly more complex than its counterparts in other algorithms. This is the distinctive feature of heap partitioning: it offers improved performance but comes at the cost of increased coding complexity.

Running the programming

We employ a specific set of instructions to observe the garbage collector in action. Specifically, we initiate the collection of generation1.

thread1;CREATE_THREAD;
thread1;PUSH_ON_STACK;Jubilant
thread2;CREATE_THREAD;
thread1;PUSH_ON_STACK;Radiant
thread1;POP_FROM_STACK;
thread2;PUSH_ON_STACK;Harmony
thread1;PUSH_ON_STACK;Frenzy
thread1;PUSH_ON_STACK;Luminous
thread1;PUSH_ON_STACK;So
thread1;POP_FROM_STACK;
thread2;PUSH_ON_STACK;Serendipity
thread1;PUSH_ON_STACK;Enigmatic
thread2;POP_FROM_STACK;
thread1;PUSH_ON_STACK;Cascade
thread2;POP_FROM_STACK;
thread2;PUSH_ON_STACK;GarbageCollector
thread1;POP_FROM_STACK;
thread1;POP_FROM_STACK;
thread2;PUSH_ON_STACK;Three
thread1;POP_FROM_STACK;
thread1;POP_FROM_STACK;
thread2;PUSH_ON_STACK;GenerationalGarbageCollector

In fact, almost all the objects in thread1 are deallocated. As there were long-lived objects among them, they will be promoted to generation1 and will subsequently need to be reclaimed.

In this particular case, Harmony existed in a heap almost from the beginning of the program. It is reclaimed at the end of the process when we attempt to allocate a very long word (GenerationalGarbageCollection). Once again, the key point here is that during the collections initiated by the program, only a portion of the heap was scanned.

What about the .NET framework?

  • The .NET garbage collector uses a generational approach, dividing objects into three generations (0, 1, and 2) based on their lifetime. Younger objects (in generation 0) are collected more frequently, while older objects (in generations 1 and 2) are collected less frequently. It employs a mark-and-compact algorithm to identify and collect unreachable objects.
  • The heap is further divided into two main areas: the Small Object Heap (SOH) and the Large Object Heap (LOH). These divisions are made to optimize memory management for different types of objects.

Small Object Heap (SOH)

  • The Small Object Heap is designed to manage small and short-lived objects efficiently. Objects with a size less than or equal to 85,000 bytes are allocated in the SOH. The SOH is further divided into three generations (0, 1, and 2) based on the age of the objects.
  • The garbage collector collects Generation 0 more frequently than older generations, as many short-lived objects do not survive for long.

Large Object Heap (LOH)

  • The Large Object Heap is specifically designated for larger objects, typically those with a size greater than 85,000 bytes. Examples include large arrays or data structures.
  • Unlike the Small Object Heap, the Large Object Heap does not have a generational division. All objects in the LOH are treated as a single generation. So, because large objects tend to have longer lifetimes, garbage collections in the LOH are less frequent. When a collection occurs, it may involve pausing the application for a longer duration compared to collections in the SOH.

Objects in the SOH benefit from the generational approach, which optimizes the collection process for short-lived objects. The LOH is suitable for objects that are expected to have a longer lifespan or are inherently large. The division between the SOH and LOH helps to balance the efficiency of garbage collection for different types of objects and their usage patterns.

Final thoughts

In this article, our aim was to provide a comprehensible overview of what a garbage collector is and how it can be implemented progressively. In reality, designing a garbage collector is a nuanced art that requires a profound understanding of various domains within computer science. This post has presented only the fundamental guidelines employed in modern collectors, overlooking many subtleties. The intent was to be didactic and offer a broad understanding of the topic.

Do not hesitate to visit my website should you require further information.


Similar Articles