
Chapter 8: C# 4.0 Features

Posted by Addison Wesley Free Book | C# Language February 02, 2010
This chapter looks at the new features added in C# 4.0 that combine to improve code readability and extend your ability to use LINQ to Objects queries over dynamic data sources.

Using Dynamic Types in LINQ Queries

Initially you might be disappointed to learn that dynamic types don't support LINQ. LINQ to Objects relies on extension methods to carry out each query expression operation, and extension methods cannot be resolved at runtime because the necessary information is not in the compiled assembly. Extension methods are brought into scope by importing the namespace that contains the extension class with a using directive. That information is available at compile time for method resolution, but not at runtime, hence no LINQ support. However, this only means that the collection being queried cannot be typed as dynamic. You can still use dynamic types at the instance level (the types in the collections being queried), as you will see in the following example.
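The distinction between a dynamic collection and dynamic elements can be sketched as follows. This is a minimal illustration, not code from the chapter: the DynamicElementDemo class and its method names are hypothetical, and ExpandoObject is used only as a convenient stand-in for a dynamic element type.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;

// Hypothetical demo class; the names here are illustrative assumptions.
public static class DynamicElementDemo
{
    // Builds a small list whose elements are dynamic (ExpandoObject instances).
    public static List<dynamic> BuildPeople()
    {
        dynamic troy = new ExpandoObject();
        troy.LastName = "Magennis";
        troy.State = "TX";

        dynamic janet = new ExpandoObject();
        janet.LastName = "Doherty";
        janet.State = "WA";

        var people = new List<dynamic>();
        people.Add(troy);
        people.Add(janet);
        return people;
    }

    // The source is statically typed (IEnumerable<dynamic>), so the compiler
    // can bind where and select to the Enumerable extension methods. Only the
    // member accesses on each element are resolved at runtime.
    public static List<string> LastNamesInState(IEnumerable<dynamic> people, string state)
    {
        var q = from p in people
                where p.State == state
                select (string)p.LastName;
        return q.ToList();
    }

    // By contrast, a source that is itself typed as dynamic is rejected
    // by the compiler, so the following would not build:
    //   dynamic source = BuildPeople();
    //   var bad = from p in source select p;
}
```

Calling DynamicElementDemo.LastNamesInState(DynamicElementDemo.BuildPeople(), "WA") yields a list containing the single name Doherty.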

For this example we create a type that allows comma-delimited text files to be read and queried in an elegant way, which is often useful when importing data from another application. By "elegant" I mean not hard-coding any column name definitions into string literals in our importing code, but rather allowing direct access to fields as if they were traditional property accessors. This type of interface is often called a fluent interface. Given the following sample CSV file content, the intention is to allow coders to directly reference the data columns in each row by the header names defined in the first row, that is, FirstName, LastName, and State.

FirstName,LastName,State
Troy,Magennis,TX
Janet,Doherty,WA

The first row contains the column names for each row of the file, and this particular implementation expects this to always be the case. When writing LINQ queries against files of this format, referring to each row value in a column by the header name makes for easily comprehensible queries. Code similar to the following example is the goal. Ideally, it should compile without a specific backing class for every file layout that needs to be read (think of a dynamic anonymous type built from the header definition of the given input file):

var q = from line in new CsvParser(content)
        where line.State == "WA"
        select line.LastName;

Dynamic typing enables us to do just that and with remarkably little code. The tradeoff is that any property name access isn't tested for type safety or existence during compile time (the first time you will see an error is at runtime, hopefully not in production). To fulfill the requirement of not wanting a backing class for each specific file, the line type shown previously must be of type dynamic. This is necessary to avoid the compile-time error that would be otherwise reported.
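The runtime-only checking described above can be seen in a small sketch. It uses ExpandoObject rather than the chapter's own type, and the LateBindingDemo class is a hypothetical name introduced for illustration. A misspelled member name compiles cleanly and fails only when the statement executes.

```csharp
using System;
using System.Dynamic;
using Microsoft.CSharp.RuntimeBinder;

// Hypothetical demo class; not part of the chapter's sample code.
public static class LateBindingDemo
{
    // Returns a description of what happened when a misspelled member is read.
    public static string ReadMisspelledMember()
    {
        dynamic line = new ExpandoObject();
        line.State = "WA";

        try
        {
            // "Sate" is a typo. This compiles, because member lookup on a
            // dynamic value is deferred until the statement executes.
            object value = line.Sate;
            return "read " + value;
        }
        catch (RuntimeBinderException ex)
        {
            // The binder reports the missing member only at runtime.
            return "runtime failure: " + ex.GetType().Name;
        }
    }
}
```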

To create our new dynamic type, we need our type to implement IDynamicMetaObjectProvider, and Microsoft has supplied a starting point in the System.Dynamic.DynamicObject type. This type has virtual implementations of the required methods, so the implementer only has to override the specific methods needed for a given purpose. In this case, we need to override the TryGetMember method, which is called whenever code tries to read a property member on an instance of this type. We process each of these calls by finding the position of the passed-in property name among the headers read from the first line of the file, and then returning the text at that position in the current line of the CSV content.

Listing 8-4 shows the basic code for this dynamic type. All the error handling code has been removed for clarity, but the essential aspects of allowing dynamic lookup of individual CSV fields within a line as simple property access calls, for example line.State, remain. The property name is passed to the TryGetMember method in the binder argument, and can be retrieved by binder.Name.

Listing 8-4: Class to represent a dynamic type that will allow the LINQ code (or any other code) to parse a single comma-separated line and access data at runtime based on the names in the header row of the text file

using System.Collections.Generic;
using System.Dynamic;

public class CsvLine : DynamicObject
{
    string[] _lineContent;
    List<string> _headers;

    public CsvLine(string line, List<string> headers)
    {
        // split the line into its individual field values
        this._lineContent = line.Split(',');
        this._headers = headers;
    }

    // called whenever a property is read, for example line.State
    public override bool TryGetMember(
        GetMemberBinder binder,
        out object result)
    {
        result = null;
        // find the index position and get the value
        int index = _headers.IndexOf(binder.Name);
        if (index >= 0 && index < _lineContent.Length)
        {
            result = _lineContent[index];
            return true;
        }
        // unknown column name; the binder raises a runtime error
        return false;
    }
}

The plumbing required for parsing the first row lives in a second type, CsvParser, which is shown in Listing 8-5. It determines the column headers used for member access on each subsequent line, and it provides the IEnumerable implementation that furnishes each line (except the header line that contains the column names).

The constructor of the CsvParser type takes the CSV file content as a string and parses it into a string array of individual lines. The first row contains the column header names (this implementation assumes so; you might want to add a check for that in your own code). The header row is parsed into a List<string> so that the CsvLine type can later use the index position of each column name to find the matching value in the data line being read. The GetEnumerator method simply skips the first line and then constructs a CsvLine dynamic type for each line after that until all lines have been enumerated.

Listing 8-5: The IEnumerable class that reads the header line and returns each line in the content as an instance of our CsvLine dynamic type

using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class CsvParser : IEnumerable
{
    List<string> _headers;
    string[] _lines;

    public CsvParser(string csvContent)
    {
        _lines = csvContent.Split('\n');
        // grab the header row and remember positions
        if (_lines.Length > 0)
            _headers = _lines[0].Split(',').ToList();
    }

    public IEnumerator GetEnumerator()
    {
        // skip the header line, then wrap each data line
        foreach (var line in _lines.Skip(1))
            yield return new CsvLine(line, _headers);
    }
}

The resulting calling code that satisfies our goals can be seen in Listing 8-6. The important aspects of this example are the dynamic keyword in the from clause and the ability to access State, FirstName, and LastName directly, just like properties, on an instance of our CsvLine dynamic type. Even though there is no explicit backing type for those properties, they are mapped from the header row in the CSV file itself. This code requires C# 4.0 or later, and its output is all of the rows (in this case just one) that have a value of "WA" in the third column position:

Doherty, Janet (WA)

Listing 8-6: Sample LINQ query code that demonstrates how to use dynamic types in order to improve code readability and to avoid the need for strict backing classes

string content =
    "FirstName,LastName,State\n" +
    "Troy,Magennis,TX\n" +
    "Janet,Doherty,WA";

var q = from dynamic c in new CsvParser(content)
        where c.State == "WA"
        select c;

foreach (var c in q)
{
    Console.WriteLine("{0}, {1} ({2})",
        c.LastName,
        c.FirstName,
        c.State);
}

Note

The sample code available on the HookedOnLINQ.com website contains a copy of this code that also allows for index access in addition to direct column name access through properties. This sample would require much more error handling and additional features like delimiter support before it is production-worthy.
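One way index access might be layered onto a type like CsvLine is to also override DynamicObject.TryGetIndex. The following is only a sketch under that assumption and is not the code from the website; the IndexedCsvLine class name and the TryGetField helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical variant of CsvLine; the name IndexedCsvLine is illustrative only.
public class IndexedCsvLine : DynamicObject
{
    private readonly string[] _fields;
    private readonly List<string> _headers;

    public IndexedCsvLine(string line, List<string> headers)
    {
        _fields = line.Split(',');
        _headers = headers;
    }

    // Called for property-style access, for example line.State.
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        return TryGetField(_headers.IndexOf(binder.Name), out result);
    }

    // Called for indexer-style access, for example line[2] or line["State"].
    public override bool TryGetIndex(
        GetIndexBinder binder, object[] indexes, out object result)
    {
        result = null;
        if (indexes.Length != 1)
            return false;

        if (indexes[0] is int)
            return TryGetField((int)indexes[0], out result);
        if (indexes[0] is string)
            return TryGetField(_headers.IndexOf((string)indexes[0]), out result);
        return false;
    }

    // Shared lookup: succeeds only when the position exists in this line.
    private bool TryGetField(int index, out object result)
    {
        bool found = index >= 0 && index < _fields.Length;
        result = found ? _fields[index] : null;
        return found;
    }
}
```

With this sketch, a dynamic reference to a line built from "Janet,Doherty,WA" can be read as line.State, line[1], or line["FirstName"].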

As this example has shown, it is possible to mix dynamic types with LINQ. The key point to remember is that the actual element types can be dynamic, but not the collection being queried. In this case, we built a simple enumerator that reads the CSV file and returns an instance of our dynamic type for each line. Any CSV file can be coded against just as if a backing class containing its column names had been written by hand, as long as the first row contains legal column names (no spaces or special characters that C# can't resolve as a property name).
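The requirement for legal column names could be checked up front with a small helper such as the one sketched below. The CsvHeaderCheck class is hypothetical, and its rule is a deliberate simplification: it accepts only ASCII letters, digits, and underscores, whereas C# identifiers may also contain many Unicode letters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the chapter's sample code.
public static class CsvHeaderCheck
{
    // Simplified rule (an assumption): an ASCII letter or underscore first,
    // followed by ASCII letters, digits, or underscores only.
    private static readonly Regex SimpleIdentifier =
        new Regex(@"^[A-Za-z_][A-Za-z0-9_]*\z");

    // Returns the header names that could not be written as c.Name in a query.
    // Names are not trimmed, because the lookup in CsvLine is an exact match.
    public static List<string> FindUnusableHeaders(string headerRow)
    {
        return headerRow.Split(',')
                        .Where(h => !SimpleIdentifier.IsMatch(h))
                        .ToList();
    }
}
```

For the header row "FirstName,Last Name,2ndState", this helper reports Last Name and 2ndState as unusable, while the sample file's header row passes with no entries reported.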
