Multithreading Made Easy in .NET 2.0

Introduction

The code in this article demonstrates an alternative way to do multithreaded programming using .NET 2.0 and a freely available library called CSP.NET. The application presented is meant to introduce the programmer to the most basic elements of CSP.NET. It is a console application that lets the user enter a file search pattern and a word. The application then searches the files matching the pattern for that word; any matches are recorded and written to a text file. All the tasks run concurrently, meaning that user input, word search and storing of the results can proceed simultaneously. On a single-core machine this simultaneity is of course only simulated, but it becomes real when more processors are present.

Traditional concurrent programming with multiple threads

If no external libraries are used, the only way to write concurrent programs in .NET is through the Thread class or the ThreadPool. The Thread class offers many methods for synchronizing one thread with other threads, putting the current thread to sleep, resuming it from sleep, aborting it and so on.

In the real world different parts of a program need to communicate with each other, and in the world of threads that is typically done through shared data. Access to shared data has to be controlled carefully so that no two threads modify the data at the same time. To that end most modern languages offer a set of features to control access to shared data and to signal between threads. In C# these features include, among others, locks, mutexes, semaphores, wait handles and monitors.
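To make the traditional approach concrete, here is a minimal sketch that uses only the standard Thread class and the C# lock statement. The class name SharedCounterDemo and its members are made up for this illustration; the point is simply that every access to the shared counter has to be wrapped in a lock.

```csharp
using System;
using System.Threading;

// Illustrative only: several threads incrementing one shared counter.
public static class SharedCounterDemo
{
    private static readonly object gate = new object();
    private static int counter;

    // Each worker increments the shared counter; the lock makes the
    // read-modify-write step atomic with respect to the other workers.
    private static void Work()
    {
        for (int i = 0; i < 100000; i++)
        {
            lock (gate)
            {
                counter++;
            }
        }
    }

    // Starts the given number of threads, waits for all of them and
    // returns the final counter value.
    public static int Run(int threadCount)
    {
        counter = 0;
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(Work);
            threads[i].Start();
        }
        foreach (Thread t in threads)
            t.Join();
        return counter;
    }
}
```

Without the lock, two threads could read the same old value and both write back the same new value, so some increments would be lost. Forgetting a lock in just one place is enough to introduce such a bug.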

Just looking at the number of methods and classes designed to support thread programming it is evident that multithreaded programs can get very complex very easily. With complexity comes the risk of introducing errors, and as a matter of fact thread programming is often regarded as one of the most challenging areas in programming. That may have been all right when all computers only had one core and one processor and multithreaded programming was the realm of a select few. But with the advent of multicore processors, programs need to be multithreaded to exploit the power of modern computers.

Concurrent programming the CSP way

What if there were a way to write multithreaded programs where you did not have to work with threads or any of the supporting features mentioned above? What if all you had to do was write the different parts of the concurrent program as normal single-threaded programs communicating through a series of predefined channels?

That way of thinking about concurrency is one of the key elements in a model known as CSP (Communicating Sequential Processes). If you want to use CSP in .NET you have two choices.

The first is to read the specification and implement your own library containing the CSP constructs needed in your programs.
If that seems a bit too much, the second choice is a library for .NET 2.0 called CSP.NET, which can be downloaded free of charge and implements all the common CSP constructs. As this article is meant as a very basic introduction to multithreaded programming with CSP.NET, only the essential elements of CSP.NET are discussed; the more advanced features are the subject of another article.

In CSP.NET a multithreaded application is made of a number of sequential programs that execute concurrently and communicate through channels. These sequential programs are called processes. They are of course not separate programs but classes implementing the ICSProcess interface. This interface has one method, Run, that needs to be implemented. Think of the Run method as the body of a normal thread and put the code you want that thread to execute there.

In most programs communication between threads, and thus also between different classes implementing the ICSProcess interface, is required. In CSP.NET communication between ICSProcesses is done through a set of predefined channels. The channels are called One2OneChannel, Any2OneChannel, One2AnyChannel and Any2AnyChannel. They are used to send data from one ICSProcess to another. They are all generic, meaning that data of any type can be sent through a channel. It is important to note that CSP.NET channels are unidirectional, meaning that data can only travel in one direction. That means that an ICSProcess can either read from or write to a given channel but not both.

A One2OneChannel is used to communicate between two ICSProcesses - one process reads and the other one writes.
An Any2OneChannel is used when one of multiple ICSProcesses wants to write to the same ICSProcess.
A One2AnyChannel is used when one ICSProcess wants to write to one of multiple ICSProcesses.
An Any2AnyChannel is used when one of multiple ICSProcesses wants to write to one of multiple receiving processes.

The channels are not broadcast channels, meaning that even though you use a One2AnyChannel only the first process reading from the channel receives the data.
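The "not broadcast" behavior can be illustrated without CSP.NET at all. The following is a minimal sketch using a plain Queue and standard threads, not the library's channels; WorkSharingDemo and Distribute are names invented for this example. Several readers compete for the same items, and each item ends up with exactly one of them.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Illustrative only: competing readers on one queue, no broadcast.
public static class WorkSharingDemo
{
    // Hands out itemCount items to readerCount competing readers and
    // returns how many items each reader received.
    public static int[] Distribute(int itemCount, int readerCount)
    {
        Queue<int> queue = new Queue<int>();
        for (int i = 0; i < itemCount; i++)
            queue.Enqueue(i);

        int[] received = new int[readerCount];
        Thread[] readers = new Thread[readerCount];
        for (int r = 0; r < readerCount; r++)
        {
            int index = r; // private copy for the anonymous method
            readers[r] = new Thread(delegate()
            {
                while (true)
                {
                    lock (queue)
                    {
                        if (queue.Count == 0)
                            return;
                        queue.Dequeue(); // this item is now gone for every other reader
                    }
                    received[index]++;
                }
            });
            readers[r].Start();
        }
        foreach (Thread t in readers)
            t.Join();
        return received;
    }
}
```

How the items are split between the readers varies from run to run, but the counts always add up to the number of items written. That is the property that lets several identical worker processes share one stream of requests.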

By default all the channels are blocking, which means that an ICSProcess that tries to read from a channel will block until another process has written some data to the channel. Likewise an ICSProcess trying to write to a channel will block until another process wants to read the data from the channel.

This behavior is appropriate in many cases, but there are situations in which it would be better if the ICSProcess writing to the channel could continue execution even though no process is waiting to read from the channel. For such cases CSP.NET offers buffered channels that can hold a certain amount of data. While the buffer is not full, the writing process can continue execution even though no process is waiting to read the data. When the buffer is full, a buffered channel behaves like a normal channel. In the example application both buffered and unbuffered channels are demonstrated.
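The blocking rules of a buffered channel can be sketched with the standard Monitor class. This is not how CSP.NET implements its channels (I make no claim about that); BoundedBuffer is a hypothetical stand-in that only shows the behavior described above: writers block while the buffer is full, readers block while it is empty.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a buffered channel; not the CSP.NET implementation.
public class BoundedBuffer<T>
{
    private readonly Queue<T> items = new Queue<T>();
    private readonly int capacity;

    public BoundedBuffer(int capacity)
    {
        if (capacity < 1)
            throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
    }

    // Returns immediately while there is room; blocks only while the buffer is full.
    public void Write(T item)
    {
        lock (items)
        {
            while (items.Count == capacity)
                Monitor.Wait(items);
            items.Enqueue(item);
            Monitor.PulseAll(items); // wake readers waiting for data
        }
    }

    // Returns the oldest item; blocks only while the buffer is empty.
    public T Read()
    {
        lock (items)
        {
            while (items.Count == 0)
                Monitor.Wait(items);
            T item = items.Dequeue();
            Monitor.PulseAll(items); // wake writers waiting for room
            return item;
        }
    }
}
```

With a capacity of 2, two calls to Write return immediately even if nobody is reading; a third call would wait until a reader has taken an item.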

When a set of ICSProcesses has been defined and the appropriate channels have been set up and connected, all that remains is to start the processes so that they execute in parallel. In CSP.NET that is done through the Parallel class. The Parallel class takes an array of ICSProcesses, and when the Run method on the Parallel class is invoked all the ICSProcesses are executed concurrently. Only when all the processes have terminated does the Run method return.
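Conceptually this "start everything, then wait for everything" step looks like the sketch below, written with plain threads. It is only an illustration of the idea, not the CSP.NET source: ISketchProcess stands in for ICSProcess (assumed to consist of a single Run method, as described above), and ParallelSketch and CountingProcess are names invented here.

```csharp
using System;
using System.Threading;

// Stand-in for the ICSProcess interface (assumed shape: one Run method).
public interface ISketchProcess
{
    void Run();
}

// Illustrative only: runs all processes concurrently and waits for them.
public class ParallelSketch
{
    private readonly ISketchProcess[] processes;

    public ParallelSketch(ISketchProcess[] processes)
    {
        this.processes = processes;
    }

    // Starts one thread per process and returns only when all have finished.
    public void Run()
    {
        Thread[] threads = new Thread[processes.Length];
        for (int i = 0; i < processes.Length; i++)
        {
            threads[i] = new Thread(processes[i].Run);
            threads[i].Start();
        }
        foreach (Thread t in threads)
            t.Join();
    }
}

// Trivial process used to exercise the sketch.
public class CountingProcess : ISketchProcess
{
    public static int Finished;

    public void Run()
    {
        Interlocked.Increment(ref Finished);
    }
}
```

A consequence of this design is that if any process loops forever, as the processes in the example program do, Run never returns and the program has to be terminated from outside.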

In this section I have only scratched the surface of CSP and CSP.NET. Remember that beneath the surface CSP.NET still uses the threading primitives of .NET 2.0. With CSP.NET a programmer can write applications that exploit multiple processors and cores without explicitly having to deal with monitors, semaphores, locks and synchronization of threads.

Installing CSP.NET

Installation of CSP.NET is quite easy. Go to the website: www.cspdotnet.com. Click on "Downloads" and download the CSP.NET Library. Run the installer and that's it.

To use the library in your own projects, just add a reference to the Csp.dll just installed and remember to import the namespace into your source files.

Example program

[Image: wordfinder.gif]

To illustrate the elements of CSP.NET discussed above, a small example application is shown below. The application is so simple that all the code is included in the article. Despite its simplicity, the program is multithreaded and designed to scale with the number of CPU-cores in the system.

The program is a console based app that takes two inputs from the user. The first is a file pattern, possibly including wildcards and the second is a word to search for. When the user has entered a file pattern and a word the program searches all files matching the file pattern for the word entered by the user. All matches are written to a text file named "results.txt".

If the program were written as a normal single-threaded app, the user interface would block until all matching files had been searched, meaning that it would not respond to further input. The program would also ignore the number of CPU cores in the machine and perform the same even if you had just bought the latest monster machine with two processors each containing two cores.

The program is divided into three logical parts, each performing a specific task. These parts are naturally defined as classes implementing the ICSProcess interface and communicating through CSP.NET channels. The first part of the program implements the user interface. The second part does the actual work and searches through files for a specific word. The third part writes the results to a text file.

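// Framework namespaces used by the listings in this article:
// Console, Environment (System); Directory, StreamReader, StreamWriter
// (System.IO); StringBuilder (System.Text).
using System;
using System.IO;
using System.Text;

public struct SearchData
{
    public string filePattern;
    public string searchWord;

    public SearchData(string pattern, string word)
    {
        filePattern = pattern;
        searchWord = word;
    }
}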

The program uses a SearchData struct to hold the file pattern and the word entered by the user. SearchData is shown above and needs no further explanation. The code for the user interface class looks like this:

public class UI : ICSProcess
{
    IChannelOut<SearchData> searchDataChannel;

    public UI(IChannelOut<SearchData> searchDataChannel)
    {
        this.searchDataChannel = searchDataChannel;
    }

    public void Run()
    {
        while (true)
        {
            string filePattern, searchWord;
            Console.Write("Enter file search pattern: ");
            filePattern = Console.ReadLine();
            Console.Write("Enter word to search for: ");
            searchWord = Console.ReadLine();
            // Hand the request over to whichever WordFinder reads it first.
            searchDataChannel.Write(new SearchData(filePattern, searchWord));
        }
    }
}

As explained above, the class needs to implement the ICSProcess interface, which means that the Run method has to be implemented. The constructor of the UI class takes one parameter of the type IChannelOut.

IChannelOut is an interface implemented by all the channels in CSP.NET. It restricts the channel to the write functionality of a CSP.NET channel. There is also an IChannelIn interface that provides only the reading functionality. By using these interfaces you avoid accidentally using a channel as both a writer and a reader in the same process. The channel used in the UI process is a channel that only supports writing objects of the type SearchData.
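The idea of handing a process only one "end" of a channel is ordinary interface programming and can be sketched without the library. In the following illustration IWriteEnd, IReadEnd, SimpleQueueChannel and Producer are hypothetical names, and the channel is a deliberately simple single-threaded queue rather than a real CSP.NET channel.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical write-only and read-only views of a channel.
public interface IWriteEnd<T>
{
    void Write(T item);
}

public interface IReadEnd<T>
{
    T Read();
}

// A simple, single-threaded queue that offers both views.
public class SimpleQueueChannel<T> : IWriteEnd<T>, IReadEnd<T>
{
    private readonly Queue<T> items = new Queue<T>();

    public void Write(T item) { items.Enqueue(item); }

    public T Read() { return items.Dequeue(); }
}

// This class is given only the writing end of the channel.
public class Producer
{
    private readonly IWriteEnd<string> output;

    public Producer(IWriteEnd<string> output)
    {
        this.output = output;
    }

    public void Produce()
    {
        output.Write("hello");
        // output.Read(); // would not compile: IWriteEnd<string> has no Read method
    }
}
```

Because the restriction is expressed in the parameter type, a mistake such as reading from a channel you are supposed to write to is caught by the compiler rather than at run time.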

The Run method is very simple. It reads two strings from the command line. The first one is the file pattern and the second one is the word to search for. When the two strings have been entered by the user, they are written to the searchDataChannel as a SearchData object. As all the code in the Run method is enclosed in an infinite while loop, it starts over by asking the user for a new file pattern and word.

The ICSProcess that receives the SearchData from the UI process is shown next.

public class WordFinder : ICSProcess
{
    IChannelIn<SearchData> searchDataChannel;
    IChannelOut<string> fileWriterChannel;
    // Search only files in this directory.
    const string path = @"c:\testdir\";

    public WordFinder(IChannelIn<SearchData> searchDataChannel, IChannelOut<string> fileWriterChannel)
    {
        this.searchDataChannel = searchDataChannel;
        this.fileWriterChannel = fileWriterChannel;
    }

    public void Run()
    {
        while (true)
        {
            // Wait for the next search request from the UI process.
            SearchData sd = searchDataChannel.Read();
            string[] files = Directory.GetFiles(path, sd.filePattern);
            // One report per request, covering all matching files, so the
            // StringBuilder is declared outside the file loop.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < files.Length; i++)
            {
                using (StreamReader sr = new StreamReader(files[i]))
                {
                    string line;
                    int linecount = 0;
                    sb.AppendLine("Searching " + files[i] + " for searchword: " + sd.searchWord);
                    while ((line = sr.ReadLine()) != null)
                    {
                        linecount++; // line numbers are reported starting at 1
                        if (line.Contains(sd.searchWord))
                            sb.AppendLine(sd.searchWord + " found at line: " + linecount);
                    }
                }
            }
            // Send the finished report to the FileWriter process.
            fileWriterChannel.Write(sb.ToString());
        }
    }
}

WordFinder takes the user input and searches through all the files matching the file pattern for the specified word. The constructor takes two CSP.NET channels. The first one, called searchDataChannel, is defined as an input channel, meaning that it can only be used to read data from a channel; in this case it reads data of the type SearchData. The second channel, called fileWriterChannel, is an output channel that can write data of the type string.

As with the UI process all the actual code in the Run method is implemented inside an infinite while loop. That means that the program will continue running until it is terminated explicitly by closing the program. From a CSP.NET point of view only two lines of code are interesting.

The first line in the while loop reads SearchData from the searchDataChannel. When that is done, all the files matching the file pattern given by the user are searched for the word specified by the user. All matches are recorded using the StringBuilder. When all the files have been searched, the string built using the StringBuilder is written to the fileWriterChannel.

public class FileWriter : ICSProcess
{
    IChannelIn<string> fileWriterChannel;
    const string file = @"c:\testdir\results.txt";

    public FileWriter(IChannelIn<string> fileWriterChannel)
    {
        this.fileWriterChannel = fileWriterChannel;
    }

    public void Run()
    {
        while (true)
        {
            // Wait for the next report from any WordFinder process.
            string fileData = fileWriterChannel.Read();
            // The second argument (true) opens the file in append mode.
            using (StreamWriter sw = new StreamWriter(file, true))
            {
                sw.Write(fileData);
                sw.WriteLine();
            }
        }
    }
}

The FileWriter class, listed above, takes an input channel called fileWriterChannel and reads a string from the channel. This string is the one constructed in the WordFinder class, listing all the matches of a word search. The string is appended to a text file and that's it.

What remains in order to have a working CSP.NET program is to create objects of the three classes UI, WordFinder and FileWriter and connect them with CSP.NET channels. All that is done in the Main method shown below.

static void Main(string[] args)
{
    CspManager.InitStandAlone();
    Any2OneChannel<string> fileWriterChannel = Factory.GetAny2One<string>();
    One2AnyChannel<SearchData> searchDataChannel = Factory.GetOne2Any<SearchData>(new FifoBuffer<SearchData>(10));
    // One UI process, one FileWriter and one WordFinder per CPU core.
    ICSProcess[] processes = new ICSProcess[Environment.ProcessorCount + 2];
    processes[0] = new UI(searchDataChannel);
    processes[1] = new FileWriter(fileWriterChannel);
    for (int i = 0; i < Environment.ProcessorCount; i++)
        processes[i + 2] = new WordFinder(searchDataChannel, fileWriterChannel);
    Parallel par = new Parallel(processes);
    par.Run();
}

The first line CspManager.InitStandAlone() initializes the CSP.NET library and tells it that we are working with a standard CSP.NET program with no distributed processes.

The next line creates a CSP.NET channel, fileWriterChannel, of the type Any2OneChannel. As described in the section about CSP.NET, this channel can be connected to multiple processes wishing to write, but only one process can read from the channel. The Factory methods are part of CSP.NET and are used to create all the various types of channels. The fileWriterChannel is constructed as a channel to transport strings.

The next line creates another channel called searchDataChannel. This channel is created as a One2AnyChannel able to transport data of type SearchData. Remember from above that a One2AnyChannel can be connected to only one writing process but to multiple reading processes. Note that the searchDataChannel is created as a buffered channel, meaning that it doesn't block when a writing process tries to write without a reading process being ready. The buffer is created as a FifoBuffer, which means that the elements are retrieved in the same order as they were written. The capacity of the buffer is set to 10, which means that at most 10 elements can be written to the buffer before a further write blocks, provided that a reading process has not read any elements in the meantime.

Now it is time to create the processes making up our CSP.NET program. Remember that I said the program scales with the number of available CPU cores. As only the WordFinder process does intensive work, it is the only process worth multiplying in order to take advantage of multiple CPU cores. First an ICSProcess array is created which can hold as many processes as there are cores plus 2. The first two processes are the UI and the FileWriter.

Next, as many WordFinder processes as there are CPU cores are created. Regardless of the number of WordFinder processes created, the same two channels are used. That is possible because the channels are created as an Any2OneChannel and a One2AnyChannel respectively. Had we known that only one WordFinder process would ever be created, we could have used two One2OneChannels instead.

When the processes to execute have been created, we only need to create an instance of the Parallel class defined in CSP.NET. The processes to run in parallel are passed to its constructor. The last line starts our concurrent program by calling the Run method on the Parallel class.

Conclusion

This article demonstrates an alternative, and much easier, way to write multithreaded programs using a freely available library called CSP.NET. I have given a brief introduction to the most basic constructs and left out some very powerful features, such as the ability to write concurrent distributed programs with ease. The more advanced features will be the topic of another article. If you want to discuss CSP.NET, there is a forum on the CSP.NET website dedicated to the library.