SIGN UP MEMBER LOGIN:    
ARTICLE

Enumerable, Enumerator, and Yielding a "Free" State Machine.

Posted by Matthew Cochran Articles | C# Language April 15, 2008
In this article we look at the power of the "yield" keyword in C#.
Reader Level:

Enumerator -- A State Machine.

A  .NET 1.0 enumerator is basically a state machine object that implements the following interface:

public interface IEnumerator
{
    object Current { get; }
    bool MoveNext();
    void Reset();
}

The current state of our object is accessed through the "Current" property.  If we want to get the next state in the machine we call "MoveNext()" and if the value was true, we have a "Current" state available.  If we want to start over from the beginning, we call "Reset()".

With .NET 2.0 and the arrival of generics, we have a more type-safe definition for a state machine which is basically the same thing that we had in .NET 1.0, but with a type-safe "Current" property and we have added the IDisposable interface.

public interface IEnumerator<T> : IDisposable, IEnumerator
{
    T Current { get; }
}

What do we mean by a state machine? 

It would be any object that would have multiple "states" that we want to view.  The following class is a state machine that produces 10 (and only 10) random integers.  If we try and get an 11th random integer, our state machine throws an exception.  Every call to "MoveNext()" generates a new number and increments the count of items received.

class RandomEnumerator : IEnumerator<Int32>
{
    public RandomEnumerator()
    {
        m_rand = new Random();
        m_Current = m_Count = 0;
    }

    private readonly Random
        m_rand;

    private Int32
        m_Current,
        m_Count;

    public int Current
    {
        get
        {
            if (m_Count > 10)
                throw new InvalidOperationException("Sorry... out of results");

            return m_Current;
        }
    }

    public void Dispose()
    {
        ; // nothing to do;
    }

    object System.Collections.IEnumerator.Current
    {
        get { return Current; }
    }

    public bool MoveNext()
    {
        if (m_Count < 10)
        {
            m_Current = m_rand.Next();
            m_Count++;
            return true;
        }
        else
            return false;
    }

    public void Reset()
    {
        m_Count = m_Current = 0;
    }
}

Now if we want to consume our random number generating state machine, we could do so with the following code.

IEnumerator<Int32> enumerator = new RandomEnumerator();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

Enumerable -- A State Machine Builder

Almost any collection is going to be able to build a state machine that will show us all the items in the collection.  (Such as  List<> Collection<>, Stack<>, Queue<>, etc)  This contract is expressed through the IEnumerable (in .NET 1.0 or a more type-safe IEnumerable<T>  in 2.0) which gives us a method to retrieve a state machine.

public interface IEnumerable
{
    IEnumerator GetEnumerator();
}

public interface IEnumerable<T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}

This is such a common pattern, that it is integrated with the C# language such that we can easily iterator over all members of an object that produces a state machine.

For example, let's make a RandomGroup

class RandomGroup : IEnumerable<Int32>
{

    #region IEnumerable<int> Members

    public IEnumerator<int> GetEnumerator()
    {
        return new RandomEnumerator();
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator(); // get the IEnumerable<int> Enumerator
    }

    #endregion
}

Now, we could iterate through our state machine as before:

RandomGroup rGroup = new RandomGroup();

IEnumerator<Int32> enumerator = rGroup.GetEnumerator();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

But we can use a shorter syntax using a "for/each" loop with our IEnumerable<> state engine factory and we'll have the exact same behavior:

foreach (Int32 i in new RandomGroup())
    Console.WriteLine(i);

The "Free" State Engine

One of the really cool things in .NET 2.0 is the ability to have the compiler build the state engine for us.  We can take all the following code:

class RandomEnumerator : IEnumerator<Int32>
{
    public RandomEnumerator()
    {
        m_rand = new Random();
        m_Current = m_Count = 0;
    }

    private readonly Random
        m_rand;

    private Int32
        m_Current,
        m_Count;

    public int Current
    {
        get
        {
            if (m_Count > 10)
                throw new InvalidOperationException("Sorry... out of results");

            return m_Current;
        }
    }

    public void Dispose()
    {
        ; // nothing to do;
    }

    object System.Collections.IEnumerator.Current
    {
        get { return Current; }
    }

    public bool MoveNext()
    {
        if (m_Count < 10)
        {
            m_Current = m_rand.Next();
            m_Count++;
            return true;
        }
        else
            return false;
    }

    public void Reset()
    {
        m_Count = m_Current = 0;
    }
}

And the method to build our state engine.

static IEnumerator<Int32> GetRandomEnumerator()
{
    return new RandomEnumerator();
}

And use the "yield" keyword to have the C# compiler build the state engine behind the scenes.  We just specify how it should behave in the builder method.  So now we have much less code and one less class which means easier maintenance.  The following builder has the compiler generate a state engine class just as before.

static IEnumerator<Int32> GetRandomEnumerator2()
{
    Random rand = new Random();

    for (Int32 i = 0; i < 10; i++)
        yield return rand.Next();
}

This is TONS easier to maintain and we can consume the state engine just as before:

IEnumerator<Int32> enumerator = GetRandomEnumerator2();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

Evey time we call "MoveNext()" the state of our machine is changed to whatever we yield return from our method.  When we run out of "yields" the "MoveNext()" method returns false.

We can have multiple "yield return" statements to build our state engine as in the following code:

static IEnumerator<Int32> GetFirstSixPrimesEnumerator()
{
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 5;
    yield return 7;
    yield return 11;
}

Which produces the following resultes when consumed:

IEnumerator<Int32> enumerator = GetFirstSixPrimesEnumerator();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

1
2
3
5
7
11

Custom Iterators.

We can also use the yield statement to produce custom IEnumerator<> state machine objects as well as IEnumerable<> iterator objects.  If we change our method to return IEnumerable<> instead of IEnumerator<> we get slightly different behavior.

static IEnumerable<Int32> GetFirstSixPrimesEnumerable()
{
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 5;
    yield return 7;
    yield return 11;
}

Now we have the compiler generate not only the state engine class but the state engine builder class as well.  So we can consume our new compiler-generated class with the following code:

foreach (Int32 i in GetFirstSixPrimesEnumerator())
    Console.WriteLine(i);

Which will give us:

1
2
3
5
7
11

Ok, so that's really cool, but how else can I use it?

Besides a simplified way to implement IEnumerable<> and IEnumerator<>, we can use the "yield" statement to generate sets of numbers such as a Fibinocci sequence:

static IEnumerable<Int32> GetFibonacci(Int32 max)
{
    Int32
        x = 0,
        y = 1,
        temp;

    while (y < max)
    {
        yield return y;
        temp = y;
        y = x + y;
        x = temp;
    }
}

Or if we have a tree type structure as follows:

class Node<T>
{
    private Node<T>
        m_left,
        m_right;

    private T
        m_value;

    public T Value
    {
      get { return m_value; }
      set { m_value = value; }
    }

    internal Node<T> Right
    {
        get { return m_right; }
        set { m_right = value; }
    }

    internal Node<T> Left
    {
        get { return m_left; }
        set { m_left = value; }
    }
}

We can use the yield statement to implement recursive traversal through an iterator:

class Tree<T>: IEnumerable<Node<T>>
{
    private Node<T> m_topNode;

    internal Node<T> TopNode
    {
        get { return m_topNode; }
        set { m_topNode = value; }
    }

    private IEnumerable<Node<T>> GetChildren(Node<T> node)
    {
        if (null != node.Left)
            foreach (Node<T> leftNode in GetChildren(node.Left))
                yield return leftNode;

        yield return node;

        if (null != node.Right)
            foreach (Node<T> rightNode in GetChildren(node.Right))
                yield return rightNode;
    }

    #region IEnumerable<Node<T>> Members

    public IEnumerator<Node<T>> GetEnumerator()
    {
        return GetChildren(m_topNode);
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    #endregion
}

And we can build other creative auto-generated state engine structures such as Jeffery Richter's AsyncEnumerator (which is pure genius, by the way: http://msdn2.microsoft.com/en-us/magazine/cc163323.aspx).  I was truly blown ayway by the elegant code presented during his session on threading at the Devscovery event in NYC where he demonstrated his AsyncEnumerator and how it uses a compiler built state engine to help manage the complexities of asynchronous programming. I found out he is going to be writing more on the AsyncEnumerator in an upcoming article which can't wait to read.

Stopping the Yield

One thing to be aware of is that we can stop yielding at any time by issuing a "yield break" statement.  The following code will return a random number of 'x' characters:

static IEnumerable<Char> GetRandomNumberOfCharacters()
{
    Random r = new Random();

    if (r.NextDouble() > .9)
        yield break;
    else
        yield return 'x';
}

Anyways... As you can see, the possiblities of simplifying our code using the "yield" keyword to build cusom iterators and state machines is vast and very powerful.

Until next time,
Happy coding

Login to add your contents and source code to this article
share this article :
post comment
 
Team Foundation Server Hosting
Become a Sponsor
PREMIUM SPONSORS
  • Finally – a virtual platform that delivers next-generation Windows Server 2008 Hyper-V virtualization technology from a managed hosting partner you can truly depend on. Visit www.maximumasp.com/max for a FREE 30 day trial. Hurry offer ends soon. Climb aboard the MaxV platform and take advantage of High Availability, Intelligent Monitoring, Recurrent Backups, and Scalability – with no hassle or hidden fees. As a managed hosting partner focused solely on Microsoft technologies since 2000, MaximumASP is uniquely qualified to provide the superior support that our business is built on. Unparalleled expertise with Microsoft technologies lead to working directly with Microsoft as first to offer IIS 7 and SQL 2008 betas in a hosted environment; partnering in the Go Live Program for Hyper-V; and product co-launches built on WS 2008 with Hyper-V technology.
    ceTE software specializes in components for dynamic PDF generation and manipulation. The DynamicPDF™ product line allows you to dynamically generate PDF documents, merge PDF documents and new content to existing PDF documents from within your applications.
Become a Sponsor