Enumerable, Enumerator, and Yielding a "Free" State Machine

In this article, we look at the power of the yield keyword in C#. Enumerable, Enumerator, and Yielding a Free State Machine

Enumerator -- A State Machine.
 
A  .NET 1.0 enumerator is basically a state machine object that implements the following interface:
  1. public interface IEnumerator  
  2. {  
  3.     object Current { get; }  
  4.     bool MoveNext();  
  5.     void Reset();  
  6. } 
The current state of our object is accessed through the "Current" property.  If we want to get the next state in the machine we call "MoveNext()" and if the value was true, we have a "Current" state available.  If we want to start over from the beginning, we call "Reset()".
With .NET 2.0 and the arrival of generics, we have a more type-safe definition for a state machine which is basically the same thing that we had in .NET 1.0, but with a type-safe "Current" property and we have added the IDisposable interface.
  1. public interface IEnumerator<T> : IDisposable, IEnumerator  
  2. {  
  3.     T Current { get; }  
  4. } 
What do we mean by a state machine? 
 
It would be any object that would have multiple "states" that we want to view.  The following class is a state machine that produces 10 (and only 10) random integers.  If we try and get an 11th random integer, our state machine throws an exception.  Every call to "MoveNext()" generates a new number and increments the count of items received.
  1. class RandomEnumerator : IEnumerator<Int32>  
  2. {  
  3.     public RandomEnumerator()  
  4.     {  
  5.         m_rand = new Random();  
  6.         m_Current = m_Count = 0;  
  7.     }  
  8.   
  9.     private readonly Random  
  10.         m_rand;  
  11.   
  12.     private Int32   
  13.         m_Current,  
  14.         m_Count;  
  15.   
  16.     public int Current  
  17.     {  
  18.         get   
  19.         {  
  20.             if (m_Count > 10)  
  21.                 throw new InvalidOperationException("Sorry... out of results");  
  22.   
  23.             return m_Current;   
  24.         }  
  25.     }  
  26.   
  27.     public void Dispose()  
  28.     {  
  29.         ; // nothing to do;  
  30.     }  
  31.   
  32.     object System.Collections.IEnumerator.Current  
  33.     {  
  34.         get { return Current; }  
  35.     }  
  36.   
  37.     public bool MoveNext()  
  38.     {  
  39.         if (m_Count < 10)  
  40.         {  
  41.             m_Current = m_rand.Next();  
  42.             m_Count++;  
  43.             return true;  
  44.         }  
  45.         else  
  46.             return false;  
  47.     }  
  48.   
  49.     public void Reset()  
  50.     {  
  51.         m_Count = m_Current = 0;  
  52.     }  
  53. } 
Now if we want to consume our random number generating state machine, we could do so with the following code.
  1. IEnumerator<Int32> enumerator = new RandomEnumerator();  
  2.   
  3. while (enumerator.MoveNext())  
  4.     Console.WriteLine(enumerator.Current); 
Enumerable -- A State Machine Builder
 
Almost any collection is going to be able to build a state machine that will show us all the items in the collection.  (Such as  List<> Collection<>, Stack<>, Queue<>, etc)  This contract is expressed through the IEnumerable (in .NET 1.0 or a more type-safe IEnumerable<T>  in 2.0) which gives us a method to retrieve a state machine.
  1. public interface IEnumerable  
  2. {  
  3.     IEnumerator GetEnumerator();  
  4. }  
  5.   
  6. public interface IEnumerable<T> : IEnumerable  
  7. {  
  8.     IEnumerator<T> GetEnumerator();  
  9. } 
This is such a common pattern, that it is integrated with the C# language such that we can easily iterator over all members of an object that produces a state machine.
 
For example, let's make a RandomGroup
  1. class RandomGroup : IEnumerable<Int32>  
  2. {  
  3.  
  4.     #region IEnumerable<int> Members  
  5.   
  6.     public IEnumerator<int> GetEnumerator()  
  7.     {  
  8.         return new RandomEnumerator();  
  9.     }  
  10.  
  11.     #endregion  
  12.  
  13.     #region IEnumerable Members  
  14.   
  15.     System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()  
  16.     {  
  17.         return GetEnumerator(); // get the IEnumerable<int> Enumerator  
  18.     }  
  19.  
  20.     #endregion  
  21. } 
Now, we could iterate through our state machine as before:
  1. RandomGroup rGroup = new RandomGroup();  
  2.   
  3. IEnumerator<Int32> enumerator = rGroup.GetEnumerator();  
  4.   
  5. while (enumerator.MoveNext())  
  6.     Console.WriteLine(enumerator.Current); 
But we can use a shorter syntax using a "for/each" loop with our IEnumerable<> state engine factory and we'll have the exact same behavior:
  1. foreach (Int32 i in new RandomGroup())  
  2.     Console.WriteLine(i); 
The "Free" State Engine
 
One of the really cool things in .NET 2.0 is the ability to have the compiler build the state engine for us.  We can take all the following code:
  1. class RandomEnumerator : IEnumerator<Int32>  
  2. {  
  3.     public RandomEnumerator()  
  4.     {  
  5.         m_rand = new Random();  
  6.         m_Current = m_Count = 0;  
  7.     }  
  8.   
  9.     private readonly Random  
  10.         m_rand;  
  11.   
  12.     private Int32   
  13.         m_Current,  
  14.         m_Count;  
  15.   
  16.     public int Current  
  17.     {  
  18.         get   
  19.         {  
  20.             if (m_Count > 10)  
  21.                 throw new InvalidOperationException("Sorry... out of results");  
  22.   
  23.             return m_Current;   
  24.         }  
  25.     }  
  26.   
  27.     public void Dispose()  
  28.     {  
  29.         ; // nothing to do;  
  30.     }  
  31.   
  32.     object System.Collections.IEnumerator.Current  
  33.     {  
  34.         get { return Current; }  
  35.     }  
  36.   
  37.     public bool MoveNext()  
  38.     {  
  39.         if (m_Count < 10)  
  40.         {  
  41.             m_Current = m_rand.Next();  
  42.             m_Count++;  
  43.             return true;  
  44.         }  
  45.         else  
  46.             return false;  
  47.     }  
  48.   
  49.     public void Reset()  
  50.     {  
  51.         m_Count = m_Current = 0;  
  52.     }  
  53. } 
And the method to build our state engine.
  1. static IEnumerator<Int32> GetRandomEnumerator()  
  2. {  
  3.     return new RandomEnumerator();  
  4. } 
And use the "yield" keyword to have the C# compiler build the state engine behind the scenes.  We just specify how it should behave in the builder method.  So now we have much less code and one less class which means easier maintenance.  The following builder has the compiler generate a state engine class just as before.
  1. static IEnumerator<Int32> GetRandomEnumerator2()  
  2. {  
  3.     Random rand = new Random();  
  4.   
  5.     for (Int32 i = 0; i < 10; i++)  
  6.         yield return rand.Next();  
  7. } 
This is TONS easier to maintain and we can consume the state engine just as before:
  1. IEnumerator<Int32> enumerator = GetRandomEnumerator2();  
  2.   
  3. while (enumerator.MoveNext())  
  4.     Console.WriteLine(enumerator.Current); 
Every time we call "MoveNext()" the state of our machine is changed to whatever we yield return from our method.  When we run out of "yields" the "MoveNext()" method returns false.
 
We can have multiple "yield return" statements to build our state engine as in the following code:
  1. static IEnumerator<Int32> GetFirstSixPrimesEnumerator()  
  2. {  
  3.     yield return 1;  
  4.     yield return 2;  
  5.     yield return 3;  
  6.     yield return 5;  
  7.     yield return 7;  
  8.     yield return 11;  
  9. } 
Which produces the following results when consumed:
  1. IEnumerator<Int32> enumerator = GetFirstSixPrimesEnumerator();  
  2.   
  3. while (enumerator.MoveNext())  
  4.     Console.WriteLine(enumerator.Current); 
1
2
3
5
7
11
 
Custom Iterators 
 
We can also use the yield statement to produce custom IEnumerator<> state machine objects as well as IEnumerable<> iterator objects.  If we change our method to return IEnumerable<> instead of IEnumerator<> we get slightly different behavior.
  1. static IEnumerable<Int32> GetFirstSixPrimesEnumerable()  
  2. {  
  3.     yield return 1;  
  4.     yield return 2;  
  5.     yield return 3;  
  6.     yield return 5;  
  7.     yield return 7;  
  8.     yield return 11;  
  9. } 
Now we have the compiler generate not only the state engine class but the state engine builder class as well.  So we can consume our new compiler-generated class with the following code:
  1. foreach (Int32 i in GetFirstSixPrimesEnumerator())  
  2.     Console.WriteLine(i); 
Which will give us:
 
1
2
3
5
7
11
 
Ok, so that's really cool, but how else can I use it?
 
Besides a simplified way to implement IEnumerable<> and IEnumerator<>, we can use the "yield" statement to generate sets of numbers such as a Fibinocci sequence:
  1. static IEnumerable<Int32> GetFibonacci(Int32 max)  
  2. {  
  3.     Int32  
  4.         x = 0,  
  5.         y = 1,  
  6.         temp;  
  7.   
  8.     while (y < max)  
  9.     {  
  10.         yield return y;  
  11.         temp = y;  
  12.         y = x + y;  
  13.         x = temp;  
  14.     }  
  15. } 
Or if we have a tree type structure as follows:
  1. class Node<T>  
  2. {  
  3.     private Node<T>   
  4.         m_left,  
  5.         m_right;  
  6.   
  7.     private T   
  8.         m_value;  
  9.   
  10.     public T Value  
  11.     {  
  12.       get { return m_value; }  
  13.       set { m_value = value; }  
  14.     }  
  15.   
  16.     internal Node<T> Right  
  17.     {  
  18.         get { return m_right; }  
  19.         set { m_right = value; }  
  20.     }  
  21.   
  22.     internal Node<T> Left  
  23.     {  
  24.         get { return m_left; }  
  25.         set { m_left = value; }  
  26.     }  
  27. } 
We can use the yield statement to implement recursive traversal through an iterator:
  1. class Tree<T>: IEnumerable<Node<T>>  
  2. {  
  3.     private Node<T> m_topNode;  
  4.   
  5.     internal Node<T> TopNode  
  6.     {  
  7.         get { return m_topNode; }  
  8.         set { m_topNode = value; }  
  9.     }  
  10.   
  11.     private IEnumerable<Node<T>> GetChildren(Node<T> node)  
  12.     {  
  13.         if (null != node.Left)  
  14.             foreach (Node<T> leftNode in GetChildren(node.Left))  
  15.                 yield return leftNode;  
  16.   
  17.         yield return node;  
  18.   
  19.         if (null != node.Right)  
  20.             foreach (Node<T> rightNode in GetChildren(node.Right))  
  21.                 yield return rightNode;  
  22.     }  
  23.  
  24.     #region IEnumerable<Node<T>> Members  
  25.   
  26.     public IEnumerator<Node<T>> GetEnumerator()  
  27.     {  
  28.         return GetChildren(m_topNode);  
  29.     }  
  30.  
  31.     #endregion  
  32.  
  33.     #region IEnumerable Members  
  34.   
  35.     System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()  
  36.     {  
  37.         return GetEnumerator();  
  38.     }  
  39.  
  40.     #endregion  
  41. } 
And we can build other creative auto-generated state engine structures such as Jeffery Richter's AsyncEnumerator (which is pure genius, by the way: https://channel9.msdn.com/Blogs/Charles/Jeffrey-Richter-and-his-AsyncEnumerator).  I was truly blown away by the elegant code presented during his session on threading at the Devscovery event in NYC where he demonstrated his AsyncEnumerator and how it uses a compiler built state engine to help manage the complexities of asynchronous programming. I found out he is going to be writing more on the AsyncEnumerator in an upcoming article which can't wait to read.
 
Stopping the Yield
 
One thing to be aware of is that we can stop yielding at any time by issuing a "yield break" statement.  The following code will return a random number of 'x' characters:
  1. static IEnumerable<Char> GetRandomNumberOfCharacters()  
  2. {  
  3.     Random r = new Random();  
  4.   
  5.     if (r.NextDouble() > .9)  
  6.         yield break;  
  7.     else  
  8.         yield return 'x';  
  9. } 
Anyways... As you can see, the possibilities of simplifying our code using the "yield" keyword to build custom iterators and state machines is vast and very powerful.
 
Until next time,
Happy coding