
C# Advanced Tutorial - Advanced programming with C# [Advanced]


This tutorial aims to give a brief and advanced introduction to programming with C#. The prerequisites for understanding this tutorial are a working knowledge of programming, the C programming language and a little bit of basic mathematics. Some basic knowledge of C++ or Java could be helpful, but should not be required.

...

 

Events, asynchronous and dynamic programming, the TPL and reflection

This is the third part of a series of tutorials on C#. In this part we are going to discuss exciting features of C# like dynamic programming using the DLR or using the meta-data information known as reflection. We will also extend our knowledge on the .NET-Framework by getting to know the abstract Stream class, as well as the Windows Forms UI framework by diving into the event pattern. Finally we will also learn how to keep our application responsive by using asynchronous operations as well as multiple threads and tasks. Using the Task Parallel Library we will see how one can get optimal performance out of multi-core processors.

For further reading, a list of references will be given at the end. The references provide a deeper look at some of the topics discussed in this tutorial.

 

Events

In the previous tutorial we already started with Windows Forms development. A crucial concept in UI development is the running message loop. This loop connects our application to the operating system. The key question is how we can respond to certain messages in this loop. Of course the answer to that question is the concept of events.

We've already seen that we can store pointers to arbitrary functions in so-called delegates. A delegate type is defined by a name, a return type and a list of parameters, i.e. their types and names. This concept makes referencing methods easy and reliable. The concept of an event is quite closely related. Let's start with an example that does not use events, but goes into the direction of a message loop communication with external code:
static void Main()
{
    Application.callback = () => Console.WriteLine("Number hit");
    Application.Run();
}

static class Application
{
    public static Action callback;

    public static void Run()
    {
        Random r = new Random(14);

        while(true)
        {
            double p = r.NextDouble();

            if(p < 0.0001 && callback != null)
                callback();
            else if(p > 0.9999)
                break;
        }
    }
}
 
What is the code doing? Nothing too special; in fact we only created a static class Application with a Run method that contains a permanent loop. Now we have two special cases in there. In one case we want to finish the application (similar to when the user closes the program) and in the other we want to invoke an arbitrary piece of code.

In this sample code we choose a seed for the random number generator of 14. This is quite arbitrary. We only do this to get a reproducible result, that invokes the callback method more than once. The key question now is: How is this related to events?

An event is in fact a callback. However, there are a few (compiler-oriented) differences. The first difference is a language extension. In addition to just using a delegate, we also need to use the keyword event. Once a delegate variable is marked as being an event, we cannot set it directly from outside the defining class. Instead we can only add or remove event handlers.

We can draw a scheme representing this relation:

The relation between event raiser and event handler.

Let's modify our code in two parts:
static void Main()
{
    Application.callback += () => Console.WriteLine("Number hit");
    Application.Run();
}

static class Application
{
    public static event Action callback;

    public static void Run()
    {
        Random r = new Random(14);

        while(true)
        {
            double p = r.NextDouble();

            if(p < 0.0001 && callback != null)
                callback();
            else if(p > 0.9999)
                break;
        }
    }
}
 
Now we see that we need to use the add-assignment operator (+=) for adding an event handler. Removing an event handler is possible by using the subtract-assignment operator (-=). This only has an effect if a handler for the given method has been added before. Otherwise nothing can be removed, of course (this will not result in exceptions, but it could result in unexpected behavior, e.g. if we think we are removing the actual handler while actually removing a different one that just matches the required signature).

Obviously we could use more handlers for the same event. So the following is also possible in our Main method:
Application.callback += () => Console.WriteLine("Number hit");
Application.callback += () => Console.WriteLine("Aha! Another callback");
Application.Run();
 
Now the two methods would be invoked on calling the delegate instance inside our class Application. How is that possible? The magic lies in two things.
  1. The compiler creates methods that will be called on using += and -= in combination with our defined event. The corresponding method will be called once we use the variable with one of those operators.
  2. The compiler uses the Combine method of the Delegate class to combine multiple delegates into one delegate when += is used. Additionally, adding or removing handlers is thread-safe: the compiler generates thread-safe accessors (in current compilers a lock-free loop based on the CompareExchange instruction rather than actual lock statements).

The outcome is quite nice for us. Using the keyword event, we can not only mark delegates as something special (as events to be precise), but the compiler also constructs additional helpers that become quite handy.
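
To make the first point more tangible, an event can also be declared with explicit add and remove accessors. The following sketch is roughly what our field-like event expands to (ignoring the thread-safety details the compiler adds):
static class Application
{
    //The backing delegate field is now written out explicitly
    static Action _callback;

    //An event with explicit add / remove accessors
    public static event Action callback
    {
        add { _callback += value; }      //runs when += is used
        remove { _callback -= value; }   //runs when -= is used
    }

    public static void Run()
    {
        //Inside the class the event is raised via the backing field
        if(_callback != null)
            _callback();
    }
}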

We will see later on that while adding or removing event handlers is thread-safe, firing them is not. However, for the moment we are happy with the current state, being able to create our own events and wiring up event handlers to have callbacks once an event is being fired.
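
As a quick preview of that thread-safety issue: another thread could remove the last handler between our null check and the invocation. A common pattern is therefore to copy the delegate into a local variable first (or, in C# 6 and later, to use the null-conditional operator). A minimal sketch for the Run method above:
//Instead of: if(callback != null) callback();
var handler = callback;   //copy the delegate reference into a local variable

if(handler != null)
    handler();            //the local copy cannot be changed by another thread

//Or, equivalently, with C# 6 syntax:
//callback?.Invoke();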

 

The .NET standard event pattern

In theory an event could expect its handlers to return a value. However, this is only theory and related to the fact that an event only uses a delegate type instance. In reality an event is fired without expecting any return value, since the event's originator does not require any handlers to be attached.

In practice it is possible to re-use an event handler with different instances of the same type, or even with instances of different types that share the same event signature. While the latter might not be a good idea (depending on the scenario it can indeed be a good solution, but usually we want to avoid it), the first case happens quite often. Let's consider the following code snippet:
void CreateNumberButtons()
{
    for(int i = 1; i <= 9; i++)
    {
        Button bt = new Button();
        bt.Text = i.ToString();
        bt.Dock = DockStyle.Top;
        bt.Click += MyButtonHandler;
        this.Controls.Add(bt);
    }
}
 
Here we are creating 9 buttons, which will be added to the current Form's list of controls. We assign each button a handler for the event named Click. Instead of assigning different handlers, we always re-use the same handler. The method that should be called once the Click event is fired is named MyButtonHandler. The question now is: How can we distinguish between the various buttons in this handler? The answer is simple: Let the first argument of the handler be the sender (originator) of the event! This is what our method looks like:
void MyButtonHandler(object sender, EventArgs e)
{
    Button bt = sender as Button;

    if(bt != null)
        MessageBox.Show(bt.Text);
}
 
It is also possible to specialize this signature in two ways:
  1. We could use a more specialized type for the sender. Most .NET events will use Object as the sender's type, which allows any object to be the originator. It is important to realize that this only applies to the signature of the event, not the real event, e.g. the Click event of a Button.
  2. We could use a more specialized version of EventArgs. We will now discuss what this type represents.

The second argument is an object which transports variables / a state from the event's origin to the handler. Some events just use a dummy type called EventArgs, while others use a more specialized version of EventArgs, which contains some properties (or even methods). In theory this argument is not required to derive from EventArgs; in practice, however, it is a good way of marking a type as being used as a transport package.

Now we've already seen what the .NET standard event pattern is. It is a delegate in form of
delegate void EventHandler(object sender, EventArgs e);
 
where Object and EventArgs might be more specialized depending on the event. Let's have a look at an example of a more specialized version. Every form has an event called MouseMove. This event uses another delegate named MouseEventHandler. The definition is as follows:
delegate void MouseEventHandler(object sender, MouseEventArgs e);
 
This handler does not look much different. The only difference is that a different type of package is used. Instead of the dummy (empty) EventArgs package, it is using the derived MouseEventArgs type. This package contains properties, which are filled with the corresponding values when firing the event.
class Form1 : Form
{
    Label info;

    public Form1()
    {
        info = new Label();
        info.Dock = DockStyle.Bottom;
        info.AutoSize = false;
        info.Height = 15;
        info.TextAlign = ContentAlignment.MiddleCenter;
        this.Controls.Add(info);
        this.MouseMove += HandleMove;
    }

    void HandleMove(object sender, MouseEventArgs e)
    {
        info.Text = string.Format("Current position: ({0}, {1}).", e.X, e.Y);
    }
}
 
In the given example we are creating a new Form called Form1. We add a Label to it, which will be docked at the bottom of the form. Now we are wiring up an event handler for the MouseMove event of the form. The last part is crucial, since it will not work when the mouse is moving over the Label. While some UI frameworks (like HTML, WPF, ...) have the notion of bubbling events, i.e. events that will be fired on all qualified layers and not just the top-most layer, we have to live without this feature in Windows Forms.

Now our event handler is able to retrieve information related to the event. In this case we have access to properties like X and Y, which give us the X (from left) and Y (from top) coordinates relative to the control that raised the event, which is the Form itself in this case.
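
To tie the pattern together, here is a small sketch of how we could publish our own event following the standard pattern, with a specialized EventArgs package (the class and property names are made up for this example):
class TemperatureChangedEventArgs : EventArgs
{
    public double NewTemperature { get; set; }
}

class Thermometer
{
    //An event following the .NET standard event pattern
    public event EventHandler<TemperatureChangedEventArgs> TemperatureChanged;

    public void Measure(double value)
    {
        var handler = TemperatureChanged;

        //Fire the event, passing ourselves as sender and the data as the package
        if(handler != null)
            handler(this, new TemperatureChangedEventArgs { NewTemperature = value });
    }
}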

 

Reflection

A programmer's job description usually does not say a word about efficient or effective code. Also, pay rates are usually not on a per-line-of-code basis. So copy / paste is always an option! Nevertheless, most programmers are lazy and tend to search for more efficient ways, which result in fewer lines of code (no copy / paste) and more robust code (one change in the code triggers all other required changes - nothing breaks).

The CLR stores assemblies in a special way. Besides the actual (MSIL) code, a set of metadata information related to the assembly is saved as well. This metadata includes information about our defined types and methods. It does not include the exact algorithms, but the scheme. This information can be accessed and used with a concept called reflection. There are multiple ways of using reflection:
  1. Getting a Type instance at runtime by calling GetType() of an arbitrary object (instance).
  2. Getting a Type instance at compile-time by using typeof() of an arbitrary type, e.g. typeof(int).
  3. Using the Assembly class to load an assembly (the current one, a loaded assembly or an arbitrary CLR assembly from the file system).

Of course there are also other ways, but in this tutorial we are only interested in those three. Of those three we can skip the second one, since (in the end) it will boil down to the first one. So let's dive into this with a simple example in the form of the so-called Factory design pattern. This pattern is used to create a specialized version of a type depending on some parameters. Let's start by defining some classes:
class HTMLElement
{
    string _tag;

    public HTMLElement(string tag)
    {
        _tag = tag;
    }

    public string Tag
    {
        get { return _tag; }
    }
}

class HTMLImageElement : HTMLElement
{
    public HTMLImageElement() : base("img")
    {
    }
}

class HTMLParagraphElement : HTMLElement
{
    public HTMLParagraphElement() : base("p")
    {
    }
}
 
We now have three classes, with the HTMLElement class being independent and the other two being derived from it. The scenario should now be quite simple: Another programmer should not have to worry about which class to create for what kind of parameter (which will be a simple string in this case), but should just call another static method called CreateElement in a class called Document:
class Document
{
    public static HTMLElement CreateElement(string tag)
    {
        /* code to come */
    }
}
A classical way to implement this factory method would be the following code:
switch(tag)
{
    case "img":
        return new HTMLImageElement();
    case "p":
        return new HTMLParagraphElement();
    default:
        return new HTMLElement(tag);
}
 
Now the problem with this code is that we have to specify the tag name over and over again. Of course we could change the "img" or "p" strings to constants; however, we would still have to maintain a growing switch-case block. Just adding new classes is only half of the job. This results in a maintainability problem. Good code would maintain itself. This is where reflection comes to help.
Let's rewrite the implementation using reflection:
class Document
{
    //A (static) key-value dictionary to store string - constructor information.
    static Dictionary<string, ConstructorInfo> specialized;

    public static HTMLElement CreateElement(string tag)
    {
        //Has the key-value dictionary been initialized yet? If not ...
        if(specialized == null)
        {
            //Create the dictionary before filling it
            specialized = new Dictionary<string, ConstructorInfo>();

            //Get all types from the current assembly (that includes those HTMLElement types)
            var types = Assembly.GetCallingAssembly().GetTypes();

            //Go over all types
            foreach(var type in types)
            {
                //If the current type is derived from HTMLElement
                if(type.IsSubclassOf(typeof(HTMLElement)))
                {
                    //Get the constructor of the type - with no parameters
                    var ctor = type.GetConstructor(Type.EmptyTypes);

                    //If there is an empty constructor (otherwise we do not know how to create an object)
                    if(ctor != null)
                    {
                        //Call that constructor and treat it as an HTMLElement
                        var element = ctor.Invoke(null) as HTMLElement;

                        //If all this succeeded add a new entry to the dictionary using the constructor and the tag
                        if(element != null)
                            specialized.Add(element.Tag, ctor);
                    }
                }
            }
        }

        //If the given tag is available in the dictionary then call the stored constructor to create a new instance
        if(specialized.ContainsKey(tag))
            return specialized[tag].Invoke(null) as HTMLElement;

        //Otherwise this is an object without a special implementation; we know how to handle this!
        return new HTMLElement(tag);
    }
}
 
It is obvious that the code got a lot longer. However, we will also realize that this is a robust solution that works perfectly for this special case. What does the code do exactly? Most of the code is actually spent building up a dictionary, which is then used to map certain scenarios (in this case certain tags) to a proper type (in this case a proper method in form of the corresponding constructor). After this is done (it has to be invoked only once), the former switch-case reduces to these three lines:
    if(specialized.ContainsKey(tag))
        return specialized[tag].Invoke(null) as HTMLElement;
    return new HTMLElement(tag);
 
That's short and easy, isn't it? That's the beauty and magic of reflection! The code now extends itself when adding new classes:
class HTMLDivElement : HTMLElement
{
    public HTMLDivElement() : base("div")
    {
    }
}

class HTMLAnchorElement : HTMLElement
{
    public HTMLAnchorElement() : base("a")
    {
    }
}
 
We now just added two more classes, but we do not have to care about the maintenance of our factory method. In fact everything will work out of the box!
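
A short usage sketch of the finished factory (the resulting types follow from the constructors defined above):
var img = Document.CreateElement("img");   //an HTMLImageElement instance
var div = Document.CreateElement("div");   //an HTMLDivElement instance
var foo = Document.CreateElement("foo");   //a plain HTMLElement with the tag "foo"

Console.WriteLine(img.GetType().Name);     //prints "HTMLImageElement"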

Let's step aside for one second and consider another example of using reflection. In the previous tutorial we had a look at anonymous objects. One way to use anonymous objects has been to initialize them with the var keyword (for enabling type inference). Therefore we could do the following:
var person = new { Name = "Florian", Age = 28 };
 
This works perfectly and we can access the members of the anonymous object within the current scope. However, once we have to pass this kind of object to another method we are missing the correct type. We now have three options available:
  1. We do not use an anonymous type, but create a class to cover the data encapsulation.
  2. We use the DLR as presented in the next section.
  3. We change the specific argument type of the caller method to be a very general Object type and use reflection.

Since the current section discusses reflection we will try option number three. Let's have a look at the code snippet in a bigger context:
void CreateObject()
{
    var person = new { Name = "Florian", Age = 28 };
    AnalyzeObject(person);
}

void AnalyzeObject(object o)
{
    /* use reflection here */
}
 
The question now is: What's the purpose of the AnalyzeObject method? Let's assume that we are only interested in the properties of the given object. We want to list their name, type and current value. Of course the GetType() method will play a very important role here. The implementation could look like the following code snippet:
//Get the type information
Type type = o.GetType();
//Get an array with property information
PropertyInfo[] properties = type.GetProperties();

//Iterate over all properties
foreach(var property in properties)
{
    //Get the name of the property
    string propertyName = property.Name;
    //Get the name of the type of the property
    string propertyType = property.PropertyType.Name;
    //Get the value of the property given in the instance o
    object propertyValue = property.GetValue(o);
    Console.WriteLine("{0}\t{1}\t{2}", propertyName, propertyType, propertyValue);
}
 
This all works quite nicely. The lesson here concerns the GetValue method of the PropertyInfo class. This method needs an instance that has this specific property in order to read out its value. It is important to differentiate between the pure type information, obtained by using GetType on an instance, and the instance itself. The instance is built upon the scheme described by a type. The type itself does not know about any particular instance.

However, there is a special case in which it is sufficient to pass in null as the instance. Consider the following case:
class MyClass : IDisposable
{
    static int instances = 0;
    bool isDisposed;

    public static int Instances
    {
        get { return instances; }
    }

    public MyClass()
    {
        instances++;
    }

    public void Dispose()
    {
        isDisposed = true;
        instances--;
    }

    ~MyClass()
    {
        if(!isDisposed)
            Dispose();
    }
}
 
This class keeps track of its instances. What if we want to get the value of the property Instances by using reflection? Since Instances is a static property, the property itself is independent of a particular class instance. So the following code would work in this case:
var propertyInfo = typeof(MyClass).GetProperty("Instances");
var value = propertyInfo.GetValue(null);
 
Reflection requires null to be passed in as an argument quite often; however, one should always read the documentation of a method before deciding which parameters are the best choice for a particular case.

Before stepping into the world of dynamic programming we will investigate one other interesting option with reflection: obtaining method information. Of course there are several other interesting possibilities, e.g. reading attributes or creating new types on the fly with Reflection.Emit.

Method information is quite similar to obtaining property information and even more similar to obtaining constructor information. In fact PropertyInfo, MethodInfo and ConstructorInfo all inherit from MemberInfo, with MethodInfo and ConstructorInfo indirectly inheriting from it while directly inheriting from MethodBase.

Let's do the same thing as before with the anonymous object, but now reading out all available methods:
//Get the type information
Type type = o.GetType();
//Get an array with method information
MethodInfo[] methods = type.GetMethods();

//Iterate over all methods
foreach(var method in methods)
{
    //Get the name of the method
    string methodName = method.Name;
    //Get the name of the return type of the method
    string methodReturnType = method.ReturnType.Name;
    Console.WriteLine("{0}\t{1}", methodName, methodReturnType);
}
 
Reading out a value is much harder in this case, since we can only do this if a method has no parameters; otherwise we would first have to find out which parameters are actually required. Nevertheless this is possible and could lead to a very simple unit testing tool, which only looks at public methods and tries to call them with default values (a default value would be null for any class and a logical default value for any structure, e.g. Int32 has a default value of 0).
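
A rough sketch of that idea, re-using the parameter o and the type variable from above (to keep it simple we skip methods that take parameters instead of constructing default values for them):
//Invoke every public method that does not require any arguments
foreach (var method in type.GetMethods())
{
    if (method.GetParameters().Length == 0)
    {
        //Invoke returns the result as an object (null for void methods)
        object result = method.Invoke(o, null);
        Console.WriteLine("{0} returned {1}", method.Name, result);
    }
}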

If we execute the method-listing code from above we will be surprised. Knowing that every type derives from Object, which already gives us 4 methods, this is the output we probably expected:
Equals  Boolean
GetHashCode Int32
ToString    String
GetType Type
 
However, this is the output that is actually displayed:
get_Name    String
get_Age Int32
Equals  Boolean
GetHashCode Int32
ToString    String
GetType Type
 
We see that two new methods have been inserted by the compiler. One method is called get_Name and returns a String object, while the other one is called get_Age and returns an integer. It is quite obvious that the compiler transformed our properties into methods. So overall any property is still a method - the GetProperty() or GetProperties() methods are just shortcuts to access them without iterating over all methods.

In the end reflection can therefore also teach us a lot of things about MSIL and the C# compiler. All the things we investigated here are reflected (pun intended) in the following scheme, which shows the object tree used by reflection.


 

Dynamic programming

In the previous section we've already mentioned that another possibility to pass anonymous objects would be to use dynamic programming. Let's see some code before we will actually dive into the Dynamic Language Runtime (DLR), which enables us to use dynamic programming in contrast to static programming with the CLR:
void CreateObject()
{
    var person = new { Name = "Florian", Age = 28 };
    UseObject(person);
}

void UseObject(object o)
{
    Console.Write("The name is . . . ");
    //This will NOT work!
    Console.WriteLine(o.Name);
}
 
We are doing the exact same thing as before, but now we are not interested in analyzing information about the type of the given instance - we are interested in actually using some properties or methods of the given Object. Of course the code above does not compile, since Object does not have a Name property. But what if the actual type has a property with this name? Is there a way to tell the compiler that it should ignore the error and that we will try to map it at runtime? The answer is of course yes. As with var, the magic lies in a keyword-like type called dynamic. Let's change our code:
void UseObject(dynamic o)
{
    Console.Write("The name is . . . ");
    //This will work!
    Console.WriteLine(o.Name);
}
 
Everything works as before. All we had to do is change the signature of the method. If we now type o. in the body of the method UseObject, we will not get any IntelliSense support. This is a little bit annoying, but on the other hand we would not have IntelliSense support when using reflection either!

So is this the end of the story? Of course not! First of all we need to realize that every standard CLR object can be treated as a dynamic object. So the following all works:
int a = 1;//a is of type Int32
var b = 1;//b is of type Int32 (inferred)
dynamic c = 1;//c is of type dynamic -- only known at runtime (but will be Int32)
object d = 1;//d is of type object, but the actual type is Int32
 
This seems to make no difference between the various definitions. However, there are actually a lot of differences. Let's use those variables:
var a2 = a + 2;//works, Int32 + Int32 = Int32
var b2 = b + 2;//works, Int32 + Int32 = Int32
var c2 = c + 2;//works, dynamic + Int32 = dynamic
var d2 = d + 2;//does not work, object + Int32 = undef. 
 
While int is a real type (mapped to Int32), var is (in this case) only a keyword telling the compiler to infer the type (which will be Int32). object is a real type, which in this case boxes the actual Int32 value. Now the first three operations worked - are they equal? Again we have to answer with no. Let's have a look at this code snippet:
a = "hi";//Ouch!
b = "hi";//Ouch!
c = "hi";//Works!
d = "hi";//Works!
 
Here the first two assignments will result in compilation errors. A string cannot be assigned to a variable of type Int32. However, a string can be cast to an Object. Also, dynamic means that the actual type might change to any other type at runtime.

So far we learned that dynamic variables are in fact quite dynamic. They provide all capabilities of the actual type behind a variable, without giving up the ability to change that type later. However, with great power comes great responsibility. This might result in problems as in the following code snippet:
dynamic a = "32";
var b = a * 5;
 
The compiler will not complain about using the multiplication with a String. However, at runtime we will get a really bad exception at this point. Detecting such lines might look easy in this example, but in reality the code looks much more like the following snippet:
dynamic a = 2;
/* lots of code */
a = "Some String";
/* some more code */
var b = 2;
/* and again more code */
var c = a * b;
 
Now it's not so obvious any more. The complication arises due to the number of possible code paths. Dynamic programming offers some advantages in the area of mapping functionality. For instance using methods with dynamic types will always result in taking the closest matching overload. Let's have a look at an example to see what this means:
var a = (object)2;//This will be inferred to be Object
dynamic b = (object)2;//This is dynamic and the actual type is Int32
Take(a); //Received an object
Take(b); //Received an integer

void Take(object o)
{
    Console.WriteLine("Received an object");
}

void Take(int i)
{
    Console.WriteLine("Received an integer");
}
 
Even though we assigned a variable of type Object to a dynamic variable the DLR still managed to pick the right overload. A question everyone has to answer for himself is: Is that what we really wanted? Usually if we already picked dynamic programming the answer is yes, but coming from static programming the answer is no. Still it's nice to know that such a behavior is possible with C# and the DLR.

Up until here, we should have learned the following key points:
  • dynamic tells the compiler to let the variable be handled at runtime by the DLR.
  • Dynamic variables can be combined with any other variable resulting in a dynamic instance again.
  • The CLR type is still available, however, only at runtime. This makes predictions at compile-time impossible.
  • If an operation or method call with a dynamic variable is not available, we will get ugly exceptions.

One thing we might be interested in right now: Where can we use this kind of dynamic programming? So let's have a look at an interesting picture:

The Dynamic Language Runtime and its relation to other languages.

We see that the DLR is the layer that connects .NET languages (like C#) or flavors (like IronRuby) to various kinds of objects (like CLR objects or Python dynamic objects etc.). This means that anything that is dynamic supplies a binding mechanism (we could also write our own) that could be supported in a .NET language. This means that we can actually write a C# program that interacts with a script written in Python!

There are two more key lessons that should be learned in this section. The first will show us how to create our own dynamic type. The second one will give us some insight about practical usage of the DLR.

The DLR defines its types via a special kind of interface. In this section we do not care about the exact details, but rather focus on some interesting classes that already implement this interface. Right now there are two really interesting classes, called ExpandoObject and DynamicObject. As an example we will now build our own type based on DynamicObject. Let's name this type Person.
class Person : DynamicObject
{
    //This will be responsible for storing the properties
    Dictionary<string, object> properties = new Dictionary<string, object>();

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        //This will get the corresponding value from the properties
        return properties.TryGetValue(binder.Name, out result);
    }

    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        //binder.Name contains the name of the variable
        properties[binder.Name] = value;
        return true;
    }

    public Dictionary<string, object> GetProperties()
    {
        return properties;
    }

    public override string ToString()
    {
        //Our object also has a specialized string output
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("--- Person attributes ---");

        foreach (var key in properties.Keys)
        {
            //We use the chaining property of the StringBuilder methods
            sb.Append(key).Append(": ").AppendLine(properties[key].ToString());
        }

        return sb.ToString();
    }
}
 
How can this type be used? Let's view an example:
dynamic person = new Person();
person.Name = "Florian";
person.Age = 28;
Console.WriteLine(person);
person.Country = "Germany";
Console.WriteLine(person);
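
The ExpandoObject class mentioned earlier gives us this behavior without writing a single line of such code ourselves. A minimal sketch (ExpandoObject lives in the System.Dynamic namespace, the dictionary interface in System.Collections.Generic):
dynamic person = new ExpandoObject();
person.Name = "Florian";
person.Age = 28;

//ExpandoObject implements IDictionary<string, object>, so we can enumerate its members
foreach (var pair in (IDictionary<string, object>)person)
    Console.WriteLine("{0}: {1}", pair.Key, pair.Value);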
 
This makes extending the existing object quite easy and everything works out-of-the-box like magic. Let's now go on and look at a practical example. Usually one would pick communication with a dynamic scripting language (JavaScript, Python, PHP, Ruby, ...), however, we will do something different.

Imagine a dynamic wrapper around XML documents (built on DynamicObject, similar to our Person class; the plain .NET-Framework class XmlDocument does not offer this behavior by itself). With such a wrapper we would be able to access the node members in a more elegant way. The usual, static way looks roughly like the following (simplified) snippet - the real XmlDocument API uses a parameterless constructor plus Load, and methods like SelectSingleNode for lookups:
var document = new XmlDocument("path/to/an/xml/document.xml");
var element = document.GetElement("root").GetElement("child");
 
Using the DLR and such a wrapper we could rewrite this to become:
dynamic document = new XmlDocument("path/to/an/xml/document.xml");
var element = document.root.child;
 
It is important to notice that element will be of type dynamic again, since document is dynamic. It should also be noted that if either root or child does not exist as a node, we will face some serious exceptions.

Another use-case of the DLR is in interop with COM-applications like Microsoft Office (Access, Excel, Word, ...).

 

Accessing the file system

Before we go over to the interesting topic of multi-threading, concurrent and asynchronous programming, we will start using the System.IO namespace of the .NET-Framework. All classes in this namespace deal with input / output, mostly with the file system.

Let's consider some simple tasks: We want to get information about a directory, or we want to know everything about a certain file. This means we need to read out information from the file system. However, due to a lucky coincidence Windows knows everything about its file system and has some good APIs for this communication. Thanks to the .NET-Framework those APIs are accessible for us in an object-oriented way.

Where should we start? The static class Path contains a lot of useful helpers and general variables like the path separator (Windows uses a backslash). We also have direct classes like Directory and File, which can be used to do some immediate actions. Additionally we have data encapsulations like DirectoryInfo, FileInfo or the DriveInfo class. The whole ACL (Access Control List) model can also be accessed using special classes and methods.

Let's create a sample project using Windows Forms. On the main window we place two buttons and a ListBox control. The two buttons should get an event handler for the Click event, the ListBox control should get an event handler for the SelectedIndexChanged event. This event will be fired once the user changes the currently selected item in this control.

Our sample application should load all files (we are only interested in the names of these files) from a certain directory, as well as all currently used drive letters. When we press the first button the ListBox control should be populated. The second button should only be enabled if we have selected a valid file in the ListBox control. This Button control should then trigger a MessageBox showing a text representation of the content of the selected file.
public partial class Form1 : Form
{
    public Form1()
    {
        InitializeComponent();
    }

    //What happens if the first button is clicked?
    private void button1_Click(object sender, EventArgs e)
    {
        //Verbatim string literals have an @ symbol before the string
        //There \ is not an escape sequence starter
        string path = @"C:\Windows";

        //Reading out the files of the given directory
        string[] files = Directory.GetFiles(path);

        //Reading out all drives
        DriveInfo[] drives = DriveInfo.GetDrives();

        //Adding all files
        foreach (var file in files)
        {
            //The listbox collection takes arbitrary objects as input
            //and uses the ToString() for drawing
            FileInfo fi = new FileInfo(file);
            listBox1.Items.Add(fi);
        }

        //Adding all drives, however, one-by-one.
        foreach (var drive in drives)
        {
            listBox1.Items.Add(drive);
        }
    }

    //What happens if the second button is clicked?
    private void button2_Click(object sender, EventArgs e)
    {
        //Just be sure that we really have an item selected
        var fi = listBox1.SelectedItem as FileInfo;

        //if we have an item selected AND that item is of type FileInfo then ...
        if (fi != null)
        {
            //Read that file and show it in the MessageBox (beware of large and non-text files!)
            string text = fi.OpenText().ReadToEnd();
            MessageBox.Show(text);
        }
    }

    //What if we change the selected index (i.e. pick another item)?
    private void listBox1_SelectedIndexChanged(object sender, EventArgs e)
    {
        //We only want to allow button2 to be enabled if a file is selected
        button2.Enabled = (listBox1.SelectedItem as FileInfo) != null;
    }
}
 
In this example we are already using various types from the System.IO namespace. We are actively using Directory to get a String array with file names. We are also using FileInfo to encapsulate a given filename (String) as a file object. Additionally we used DriveInfo to obtain an array of DriveInfo instances. A DriveInfo object is an encapsulation of everything related to a certain drive. It would also have been possible to use the GetFiles method of a DirectoryInfo instance. This method would have given us an array of FileInfo objects directly.
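
For completeness, a sketch of that DirectoryInfo based variant (it could replace the file loop in button1_Click):
var directory = new DirectoryInfo(@"C:\Windows");

//GetFiles on a DirectoryInfo instance returns FileInfo objects directly
foreach (var fi in directory.GetFiles())
{
    listBox1.Items.Add(fi);
}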

Now that we have an impression of what the communication with the file system looks like, it is time to start reading and writing files.

 

Streams

The reason that the System.IO namespace is not named System.FS or System.FileSystem is quite simple: Communication with the file system is just one use-case of input and output - there are many more. In fact placing some string in the console is already a form of output. Reading a string or arbitrary keyboard input from the console is also input.

Some input and output operations are managed in a stream, i.e. a sequence of data elements which is made available over time. A stream can be thought of as a conveyor belt that allows items to be processed one at a time rather than in large chunks. There are all kinds of streams, like a memory stream, a file stream, input stream (e.g. from the keyboard) or output stream (e.g. to the display). Every IO operation is about streaming data. This is the reason that the System.IO namespace is mostly about the Stream class and its implementations.

The Stream class itself is abstract, since every stream of data is dependent on the corresponding device, e.g. HDD, RAM, ethernet, modem. Therefore there has to be a specific implementation for the corresponding device. Sometimes it might make sense to implement our own Stream. Reading a file is possible by using a specialized class like FileStream. In the previous section we've already seen that there are some helper methods available. There we used the OpenText method of the FileInfo class to create a StreamReader object. This is a specialized version of the TextReader class, which requires a Stream. In the end we could just use the ReadToEnd method and did not have to worry about how to use a Stream.

Let's see how we could use the FileStream class. In the code snippet below we will open a file called test.txt:
//Open the file test.txt (for reading only)
FileStream fs = new FileStream("test.txt", FileMode.Open);
//Allocate some memory
byte[] firstTenBytes = new byte[10];

//Read up to ten bytes at a time and store them in the allocated memory,
//starting at offset 0, until the end of the stream is reached
while(fs.Read(firstTenBytes, 0, 10) != 0)
{
    Console.WriteLine("Could read some more bytes!");
}

//Quite important: Closing the stream will free the handle!
fs.Close();
 
The code above opens a file and reads 10 bytes until there are no more bytes to read. We could also advance byte-by-byte using the ReadByte method. While Read returns the actual number of bytes read (in this case it would be 10 until the end is reached, where any number from 0 to 10 is possible), ReadByte returns the actual value of the byte in form of an Int32. The reason for this is that the byte type is an unsigned 8 bit integer, which has no possibility of telling us that the end of the stream is reached. Using an Int32 we can check if we reached the end of the stream (in this case the end of the file) by checking if the result has been -1.

Writing a file is quite similar to reading it. Here we just use FileMode.Create to specify that we do not want to open an existing file, but create a new one. Now methods like WriteByte or Write can be invoked since the stream is writable. One thing we've seen above is now getting a lot more crucial: After our operations we have to close the file by disposing / closing the Stream object. This can be achieved with the method Close.
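
A small sketch of writing a few bytes to a new file; wrapping the stream in a using block guarantees that Dispose (and thereby Close) is called even if an exception occurs:
//The ASCII bytes for "Hello"
byte[] data = { 0x48, 0x65, 0x6C, 0x6C, 0x6F };

//FileMode.Create creates a new file (or overwrites an existing one)
using (FileStream fs = new FileStream("output.bin", FileMode.Create))
{
    fs.Write(data, 0, data.Length);
}   //the stream is flushed and closed automatically at this point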

There are two more important concepts in the Stream class:
  1. Depending on the device, bytes could be buffered before any actual write. Therefore if immediate writing is required we have to use the Flush method.
  2. Every stream has a Position property, which is the current insertion / marker point. Any write or read operation will take place starting at that point. Therefore if we used a stream to go from start to end, we have to reset the Position marker to the beginning in order to start again.

Before we go on we should have a closer look at the already mentioned TextReader and TextWriter classes. Those two classes do not derive from Stream since they serve a different purpose: they are specialized in reading or writing text. The text could be in a raw byte array, in a string or in a stream. For each scenario there is a specific implementation. Since this section is about streams, we will introduce the implementations in form of StreamReader and StreamWriter.

Why should we use a StreamReader instance for handling a FileStream of a text file? The magic word here is: Encoding! In order to save text we need to specify how characters are mapped to numbers. The first 128 numbers (0-127) are always mapped according to the ASCII standard. In this standard we have normal letters like a being 0x61 and A being 0x41, as well as digits like 0 being 0x30. However, we also have special characters like a new line \n being 0x0a or a backspace \b being 0x08. The problem now is that all other (usually more region-specific) characters depend on the chosen mapping. There are several encodings like UTF-8, UTF-16 or Windows-1252. The main questions are: How do we find out which encoding is used, and how do we use it?

The .NET-Framework has a (quite extensive) list of available encodings. Any Encoding instance has methods like GetString or GetChars; however, the TextReader / TextWriter implementations already use them internally. We can either specify an encoding when creating a StreamReader or StreamWriter or let the object detect the encoding. While the currency of a Stream object is a Byte, the currency of a TextReader is a Char.
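
To illustrate what an Encoding instance does, the following sketch converts a string to bytes and back again; note how the byte counts differ between the encodings:
string text = "äöü";

byte[] utf8 = Encoding.UTF8.GetBytes(text);       //6 bytes (2 per umlaut)
byte[] utf16 = Encoding.Unicode.GetBytes(text);   //6 bytes (2 per character, UTF-16)
byte[] ascii = Encoding.ASCII.GetBytes(text);     //3 bytes - but the umlauts are replaced by '?'

Console.WriteLine(Encoding.UTF8.GetString(utf8)); //prints "äöü" again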

Let's see how we can use the StreamWriter to create some text:
StreamWriter sw = new StreamWriter("myasciifile.txt", false, Encoding.ASCII);
sw.WriteLine("My First ASCII Line!");
sw.WriteLine("How does ASCII handle umlauts äöü?");
sw.Close();
 
From the first tutorial we know that the Char datatype is a 16-bit unsigned integer. Therefore we might already know that C# uses UTF-16 to store characters. While a displayed character can consist of 1 to 4 bytes in UTF-8, it consists of one or two 16-bit code units in UTF-16. The minimum payload here is 2 bytes, i.e. twice as much as with UTF-8. This should motivate us to think about encoding whenever we pass characters from a .NET in-memory object to other systems. If the encoding is different from the expected one, the (displayed) text will differ from the original one.

Now we are actually at the point where things are starting to get interesting. The Stream class also contains methods that end with Async. In some cases we might actually be more interested in using ReadAsync and WriteAsync than their sequential counterparts Read and Write. In the next sections we will dive into asynchronous programming using C#.
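
As a small preview of those asynchronous variants (the async and await keywords are a topic of their own), reading a text file without blocking the calling thread could look like this sketch:
static async Task<string> ReadFileAsync(string path)
{
    using (var fs = new FileStream(path, FileMode.Open))
    using (var reader = new StreamReader(fs))
    {
        //ReadToEndAsync returns a Task<string> that completes once the file has been read
        return await reader.ReadToEndAsync();
    }
}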

 

Threads

One of the first examples in this tutorial showed a simulation of the Application.Run() message loop. The code has been designed to exit quite quickly; however, commenting out the exit criteria will result in a permanent loop that fires a lot of events. If we placed a lot of code in our event handler, we would actually block the event loop from continuing.

This is exactly the problem that happens quite often in UI applications. An event (let's say a Click event of a Button control) has been fired and we perform a lot of operations in our handler. Worst of all we are not only performing some calculations, but some IO operations like writing some file or downloading some data. If the operation takes quite long (let's say more than a second) the user will experience that the application becomes unresponsive for that time. The reason is quite simple: as before, the message loop is blocked from continuation, which prevents those other events (like moving the form, clicking on some other button or something different) from being fired. After our operation, all queued messages will be handled.

Of course we therefore do not want to block the UI thread. Since the messages in the message loop get pumped once the application is idle, the idle state is very important. Doing too much work in the UI thread (like in an event handler) will result in a non-responsive app. Therefore the OS provides two models: threads and callbacks. Callbacks are just events, which we already know. If we can respond to a change of state with an event, then we should pick that way. Otherwise we might spawn a thread that uses polling to get notified in case of a change of state. In this section we will look at how we can create and manage threads in the .NET-Framework.

Every application (no matter if console or graphical) already comes with one thread: the application / GUI thread. If we use multiple threads we might have the advantage of faster execution on several (CPU) cores. The reason is that the OS distributes the threads across different cores (if available), since threads are considered independent units of work in the context of a process. Even on one core the OS handles threads by assigning them CPU time. So even on just one core we might have an advantage in form of a more responsive UI. When the OS schedules CPU time it can take into account that a thread is merely waiting (e.g. sitting in a spin-lock) and does not require the maximum computing time.

Now the remaining question is: How can we create threads? First of all we need a method that should run in that thread. The first thread in every application is started by the standard process model - its entry point is the Main method in C#. For the threads we create by hand, we can pick whatever method we like as the entry point.

The class Thread represents a thread, with the constructor requiring a delegate of the method to run. This class is available in the namespace System.Threading. Let's look at an example:
static void Main(string[] args)
{
    //Creating a thread and starting it is straight forward
    Thread t = new Thread(DoALotOfWork);
    //Just pass in a void Method(), or void Method(obj) and invoke start
    t.Start();

    //Wait for the process to be finished by pressing the ENTER key
    Console.ReadLine();
}

static void DoALotOfWork()
{
    double sum = 0.0;
    Random r = new Random();
    Debug.WriteLine("A lot of work has been started!");

    //some weird algorithm
    while(true)
    {
        double d = r.NextDouble();
        //The Math class contains a set of useful functions
        if (d < 0.333) sum += Math.Exp(-d);
        else if (d > 0.666) sum += Math.Cos(d) * Math.Sin(d);
        else sum = Math.Sqrt(sum);
    }

    Debug.WriteLine("The thread has been stopped!");
}
 
In the example above we start a new thread that uses the DoALotOfWork method as entry point. Now we could enter some text or stop the program while work is still being done. This is because threads run concurrently. If we have two threads, then two things can be done at the same time. No one can tell us anything about the order of work. However, one thing should be kept in mind: an unhandled exception in a worker thread is not swallowed - it will bring down the whole process!

Also the following considerations should be taken into account:
  • Spawning multiple threads results in overhead, which is why we should consider using the ThreadPool class for many threads.
  • Changing the UI from a worker thread is not possible and will result in exceptions.
  • We have to avoid race conditions, i.e. solving non-independent problems becomes challenging since the order of execution is not guaranteed.

Therefore one of the biggest problems is: How to communicate between threads?

 

Thread-communication

Right now we only know what threads are and how to start them. At this point the threading concept looks more like overhead, since we might gain a responsive application, however, we have no way to communicate back any result of the thread's execution.

In order to synchronize threads, C# offers the lock keyword. The lock statement guards a certain block of code so that only one thread at a time can execute it. It works like a barrier around a critical section and thereby helps to reduce race conditions. The lock is identified by an object reference, so any instance of a reference type can serve as the lock object.

Let's have a look at a short example using two threads:
//This object is only used for the locks
static Object myLock = new Object();

static void Main(string[] args)
{
    //Create the two threads
    Thread t1 = new Thread(FirstWorker);
    Thread t2 = new Thread(SecondWorker);

    //Run them
    t1.Start();
    t2.Start();   

    Console.ReadLine();
}

static void FirstWorker()
{
    //This will run without any rule
    Console.WriteLine("First worker started!");

    //The rule for the following block is: only enter
    //when myLock is not in use, otherwise wait
    lock (myLock)
    {
        Console.WriteLine("First worker entered the critical block!");
        Thread.Sleep(1000);
        Console.WriteLine("First worker left the critical block!");
    }

    //Finally print this
    Console.WriteLine("First worker completed!");
}

static void SecondWorker()
{
    Console.WriteLine("Second worker started!");

    //The rule for the following block is: only enter
    //when myLock is not in use, otherwise wait
    lock (myLock)
    {
        Console.WriteLine("Second worker entered the critical block!");
        Thread.Sleep(5000);
        Console.WriteLine("Second worker left the critical block!");
    }

    //Finally print this
    Console.WriteLine("Second worker completed!");
}
 
If we run the program multiple times we will (usually) get different outputs. This is normal since we cannot tell which thread will be started first by the operating system. In fact our program is just telling the OS to start a thread, i.e. the OS can decide when the operation is performed.

Using the lock statement it is quite simple to mark critical blocks and ensure coherence in a multi-threaded program. However, this does not solve our problem with GUI programming, where we are not allowed to change the UI from a different thread than the GUI thread.

To solve this problem every Windows Forms control has a method called Invoke. Also other UI frameworks like WPF have something similar. In WPF we can use the (more general) Dispatcher property. However, the most general way of doing thread-safe UI calls is over the SynchronizationContext class. This class is available everywhere, even for console applications. The idea is the following: A direct communication between threads is not possible (since it is not thread-safe), however, one thread might call (or start) a method in the context of the other thread.

What does that mean? Thinking of a GUI we can easily construct a use case. We create a Windows Forms application with a Label, a ProgressBar and a Button control. Once the user presses the button a new thread is started, which performs a (long-running) computation. This computation has some fixed points, where we know that some percentage of the overall computation has already been done. At those points we use a globally available SynchronizationContext instance to start a method in the GUI thread, which sets the ProgressBar value to a given value. At the end of the computation we again use the SynchronizationContext to change the Text property of the Label to a given value.

Let's have a look at the scheme of this example:
public class Form1 : Form
{
    SynchronizationContext context;
    bool running;

    public Form1()
    {
        InitializeComponent();
        //The Current property gets assigned when a Form / UI element is created
        context = SynchronizationContext.Current;
    }

    void ButtonClicked(object sender, EventArgs e)
    {
        //We only want to do the computation once at a time
        if(running)
            return;

        running = true;
        Thread t = new Thread(DoCompute);
        t.Start();
    }

    void DoCompute()
    {
        /* Start of long lasting computation */
        context.Send(_ => { 
            progressBar1.Value = 50;
        }, null);
        /* End of long lasting computation */
        context.Send(_ => { 
            progressBar1.Value = 100;
            label1.Text = "Computation finished!";
            running = false;
        }, null);
    }
}

The static property Current of the SynchronizationContext class carries the synchronization context (if set) of the current (!) thread. Therefore, if we want to use the value of Current that maps to the GUI thread, we need to store it while we are on the GUI thread. This property is not set automatically; it has to be set somewhere. In our case it is set by the Windows Forms Form instance.
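
For completeness, the Windows Forms specific alternative via the Invoke method could look like the following sketch (called from the worker thread; label1 is assumed to be a Label on the form):
void ReportFinished()
{
    //Invoke marshals the delegate onto the GUI thread that owns the control
    label1.Invoke((Action)(() =>
    {
        label1.Text = "Computation finished!";
    }));
}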

Now that we have a rough understanding how we can avoid race conditions and cross-threading exceptions we can move on to a much more powerful and general concept: Tasks!

 

The Task Parallel Library

In the .NET-Framework 4.0 a new library has been introduced: the Task Parallel Library (TPL). This has some powerful implications. The most notable for us is the new datatype named Task. Some people consider a Task to be a nicely wrapped Thread; however, a Task is much more. A Task could be a running thread, but a Task could also be all we need from a callback. In fact a Task does not say anything about the resource that is being used. If a Thread is used, it is used in a much more reliable and performance-optimized way: the TPL manages an optimized thread pool, which is specialized in creating and joining several threads within a short period of time.

So what is the TPL? It is a set of useful classes and methods for tasks, plus powerful (parallel) extensions to LINQ in form of PLINQ. The PLINQ part can be triggered by calling the AsParallel extension method before calling other LINQ extension methods. It should be noted that PLINQ queries often run slower than their sequential counterparts, since most queries do not involve enough computational work to justify the overhead of creating threads.
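
A minimal PLINQ sketch; whether this actually runs faster than the sequential query depends entirely on the amount of work per element:
var numbers = Enumerable.Range(1, 1000000);

//AsParallel distributes the following query operators over multiple threads
long sumOfSquares = numbers.AsParallel()
                           .Select(n => (long)n * n)
                           .Sum();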

The following picture illustrates the placement of the TPL and attached possibilities.

The TPL sits on top of the CLR threadpool and brings us some useful new types and methods.

The TPL gives us very elegant methods of parallelizing computationally challenging work across different threads. For instance, with Parallel.For() we can split loops into chunks, which are distributed among different cores. However, we need to be careful with race conditions and with the overhead due to creation and management of the corresponding threads. Therefore the best case is obviously a loop with many iterations and a huge workload in the loop body, which is independent of other iterations.

Let's see how the TPL would help us to parallelize a for-loop. We start with the sequential version:
int N = 10000000;
double sum = 0.0;
double step = 1.0 / N; 

for(var i = 0; i < N; i++)
{
    double x = (i + 0.5) * step;
    sum += 4.0 / (1.0 + x * x); 
}

return sum * step;
 
The simplest parallel version would be the following:
object _ = new object();
int N = 10000000;
double sum = 0.0;
double step = 1.0 / N; 

Parallel.For(0, N, i =>
{
    double x = (i + 0.5) * step;
    double y = 4.0 / (1.0 + x * x);
    lock(_) 
    {
        sum += y;
    }
});

return sum * step;
 
The reason for the required lock block is the necessary synchronization. Therefore this very simple version is not really performant, since the synchronization overhead costs more than we gain by using multiple processors (the workload besides the synchronization is just too small). A better version uses another overload of the For method, which allows the creation of a thread-local variable.
object _ = new object();
int N = 10000000;
double sum = 0.0;
double step = 1.0 / N; 

Parallel.For(0, N, () => 0.0, (i, state, local) =>
{
    double x = (i + 0.5) * step;
    return local + 4.0 / (1.0 + x * x);
}, local => 
{
    lock (_)
    {
        sum += local;
    }
});

return sum * step;
 
This does not look too different, so several questions arise:
  1. Why is this more efficient? Because the lock section is entered only once per thread instead of once per iteration, we drop a lot of the synchronization overhead.
  2. Why do we need to pass in another delegate as the third parameter? In this overload the third parameter is the delegate for creating the thread-local variable. In this case we create one double variable.
  3. Why can't we just pass in the thread-local variable? If the variable had already been created it would not be thread-local but shared. We would pass the same variable to every thread.
  4. How is the thread-local variable used? The signature of the delegate for the body changed as well: it receives the current thread-local value and returns the updated one.
  5. What are state and local? The state parameter gives us access to actions like breaking or stopping the loop execution (or finding out in what state we are), while the local parameter is our access point to the thread-local variable (in this case just a double).
  6. What if I need more thread-local variables? How about creating an anonymous object or instantiating a defined class? Since the TPL is no magic wand we still have to be aware of race conditions and shared resources. Nevertheless the TPL also introduces a new set of concurrent collection types, which is quite helpful in dealing with such problems (see the sketch below).
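
As an illustration of these concurrent types, the following sketch replaces the lock from above with a ConcurrentBag<double> (from the System.Collections.Concurrent namespace), which collects the per-thread partial sums in a thread-safe way:
int N = 10000000;
double step = 1.0 / N;
var partials = new ConcurrentBag<double>();   //thread-safe collection, no explicit lock needed

Parallel.For(0, N, () => 0.0, (i, state, local) =>
{
    double x = (i + 0.5) * step;
    return local + 4.0 / (1.0 + x * x);
}, local => partials.Add(local));             //one Add per thread-local accumulator

return partials.Sum() * step;                 //Sum comes from System.Linq
 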

In the last part of this section we should also discuss the consequences of the Task type. Tasks have some very nice features, most notably:
  • Tasks provide a much cleaner access to the current state.
  • The cancellation is much smoother and well-defined.
  • Tasks can be connected, scheduled and synchronized.
  • Exceptions from tasks do not bubble up unless requested (see the sketch after this list)!
  • A Task does not have a return type, but a Task<T> has return type T.
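
To illustrate the point about exceptions, here is a minimal sketch: the exception thrown inside the task is stored in the task and only rethrown (wrapped in an AggregateException) once somebody observes the task via Wait, Result or await:
var task = Task.Run(() =>
{
    throw new InvalidOperationException("Simulation failed");
});

//Nothing is thrown at this point - the exception is stored inside the task

try
{
    task.Wait();                                  //observing the task rethrows the exception
}
catch (AggregateException ex)
{
    Console.WriteLine(ex.InnerException.Message); //"Simulation failed"
}
 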

The last one is really important for us. If we start a new task that is computationally focused we might be interested in the result of the computation. While doing this with a Thread requires some work, we get this behavior out-of-the-box with a Task.

Nowadays everything is centered around the Task class. We will now look closer at some of those properties.

 

Tasks and threads

As already mentioned there is a big difference between a Task and a Thread. While a Thread is something from the OS (a kind of resource), a Task is just a class. In a way we might say that a Task is a specialization of a Thread, however, this would not be true, since not all running Task instances are based on a Thread. In fact all IO-bound asynchronous methods in the .NET-Framework that return a Task<T> do not use a thread of their own. They are all callback based, i.e. they use system notifications or already running threads from drivers or other processes.

Let's recap what we learned about using threads:
  • The workload has to be big enough, i.e. it should take at least as long as creating and ending the thread itself (roughly 100000 cycles or 1 ms, depending on architecture, system and OS).
  • Just running a problem on more cores does not equal more speed, i.e. if we want to write a huge file to the hard disk it does not make sense to do that with multiple threads, since the hardware might be already saturated with the amount of bytes that are being received from one core.
  • Always think about IO-bound vs CPU-bound. If the problem is CPU-bound then multiple threads might be a good idea. Otherwise we should look for a callback solution or (worst-case, but still much better than using the GUI thread) create only one thread.
  • Reducing the required communication to a minimum is essential when aiming for an improved performance when using multiple threads.

We can already see why Task based solutions are preferable in general. Instead of providing two ways of solving things (either by creating a new thread, or by using a callback) we only need to provide one way of interacting with our code: the Task class. This is also the reason why the first asynchronous programming models of the .NET-Framework are being replaced by methods that return corresponding Task instances. Now the actual resource (a callback handler or a thread) does not matter anymore.

Let's see how we can create a Task for computational purposes:
Task<double> SimulationAsync()
{
    //Create a new task with a lambda expression
    var task = new Task<double>(() =>
    {
        Random r = new Random();
        double sum = 0.0;

        for (int i = 0; i < 10000000; i++)
        {
            if (r.NextDouble() < 0.33)
                sum += Math.Exp(-sum) + r.NextDouble();
            else
                sum -= Math.Exp(-sum) + r.NextDouble();
        }

        return sum;
    });

    //Start it and return it
    task.Start();
    return task;
}
 
There is no strict requirement, however, the usual convention is to return a so-called hot task, i.e. we only return tasks that have already been started. Now that we have this code we could do a couple of things with the returned task:
var sim = SimulationAsync();
var res = sim.Result;//Blocks the current execution until the result is available
sim.ContinueWith(task =>
{
    //use task.Result here!
}, 
TaskScheduler.FromCurrentSynchronizationContext());  
//Continues with the given lambda from the current context

We could also spawn multiple simulations and use the one which finishes first:
var sim1 = SimulationAsync();
var sim2 = SimulationAsync();
var sim3 = SimulationAsync();

var firstTask = Task.WhenAny(sim1, sim2, sim3);//This creates another task! (callback)

firstTask.ContinueWith(task =>
{
    //task.Result is the first task that completed
    //use task.Result.Result to access its value
}, TaskScheduler.FromCurrentSynchronizationContext());

Unfortunately not all features can be covered in this tutorial. One feature we have to analyze in more detail, however, is the possibility to continue a task. In principle such continuations could solve all our problems.
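
As a small sketch of such chaining (reusing the SimulationAsync method and the label1 control from above), continuations can be strung together, with only the last one being scheduled back on the UI context:
SimulationAsync()
    .ContinueWith(task => Math.Round(task.Result, 3))        //runs on a worker once the simulation is done
    .ContinueWith(task =>
    {
        label1.Text = "Simulation result: " + task.Result;   //back on the UI thread
    }, TaskScheduler.FromCurrentSynchronizationContext());
 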

 

Awaiting async methods

In the latest version of C#, called C# 5, two new keywords have been introduced: await and async. By using async we mark methods as being asynchronous, i.e. the result of the method will be packaged in a Task (if nothing is returned) or Task<T> if the return type of the method would be T. So the following two methods,
void DoSomething()
{
}

int GiveMeSomething()
{
    return 0;
}
would transform to
async Task DoSomethingAsync()
{
}

async Task<int> GiveMeSomethingAsync()
{
    return 0;
}
 
The names have been suffixed with Async as well, however, this is just a useful convention and not a requirement. A very useful implication of transforming the inside of a method into a Task is that it can be continued with another Task instance. Now all our work here would be useless if there were not another keyword, which performs this continuation step automatically. This keyword is called await. It can only be used inside async marked methods, since only those methods will be transformed by the compiler. At this point it is important to emphasize again that a Task is not a Thread, i.e. we do not say anything here about spawning new threads or which resources to use.

The purpose is to write (what looks like) sequential code, which runs concurrently. Every async marked method is entered on the current thread and runs synchronously until the await statement triggers code execution in what could be another thread (but does not have to be - see: IO operations). Whatever happens, the UI stays responsive in that time, since the rest of the method has been transformed into a continuation of the awaited Task. The scheme is as follows:
async Task ReadFromNetwork(string url)
{
    //Create a HttpClient for webrequests
    HttpClient client = new HttpClient();
    //Do some UI (we are still in the GUI thread)
    label1.Text = "Requesting the data . . .";
    var sw = Stopwatch.StartNew();
    //Wait for the result with a continuation
    var result = await client.GetAsync(url);
    //Continue on the GUI thread no matter what thread has been used in the task
    sw.Stop();
    label1.Text = "Finished in " + sw.ElapsedMilliseconds + ".";
}

The big advantage is that the code reads very similar to a sequential code, while being responsive and running concurrently. Once the task is finished the method is resuming (usually we might want to do some UI related modifications in those sections).

We can also express this scheme in a picture (who knew!):

Using await / async to toggle between UI bound code and running tasks.

This alone is already quite handy, but it gets much better. Up to this point there is nothing that we could not also have solved with ContinueWith and a few more characters. So how about this: What if we still want to use try-catch for handling exceptions? With the ContinueWith method this does not work very well. We would have to use a different kind of pattern to catch exceptions, which would be more complicated and result in more lines of code. With await it is as simple as this:
async Task ReadFromNetwork(string url)
{
    /* Same stuff as before */
    try
    {
        await client.GetAsync(url);
    }
    catch
    {
        label1.Text = "Request failed!";
        return;
    }
    finally
    {
        sw.Stop();
    }

    label1.Text = "Finished in " + sw.ElapsedMilliseconds + ".";        
}
 
This all works and is so close to sequential code that no one should still have excuses for non-responsive applications. Even old legacy methods are quite easy to wrap in a Task. Let's say we have a computationally expensive method that should otherwise run in a Thread, but managing that thread manually would be too complicated. We could simply do the following:
Task WrappedLegacyMethod()
{
    return Task.Run(MyOldLegacyMethod);
}
 
This is called async-over-sync. The opposite, sync-over-async, is of course also possible: we just omit the await and call the Result property, as we've seen before.
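
A minimal sketch of sync-over-async, reusing the GiveMeSomethingAsync method from above (note the deadlock warning in the list below):
//Sync-over-async: block the calling thread until the asynchronous method has finished
int value = GiveMeSomethingAsync().Result;

//Alternative that rethrows the original exception instead of an AggregateException on failure
int value2 = GiveMeSomethingAsync().GetAwaiter().GetResult();
 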

What are the "keep-in-mind" things when thinking about awaiting tasks?
  1. async void should only be used with event handlers. Not returning a Task is an anti-pattern that has only been made possible to allow the usage of await in event handlers, where the signature is fixed. Returning void is a fire-and-forget mechanism that does not surface exceptions in try-catch-blocks and will usually result in faulty behavior.
  2. Using sync-over-async with an async function that switches back to the UI context will result in a deadlock, i.e. the UI is dead. This point is mostly important for people who want to develop APIs using async marked methods. Since the API will be UI-independent the context switch is unnecessary and a potential risk-factor. Avoid it by calling ConfigureAwait(false) on the awaited task (see the sketch after this list).
  3. Spawning too many (parallel) tasks will result in a vast overhead.
  4. When using lambda expressions we should check whether we are actually returning a Task.
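
As a sketch of the second point, a library-style method (the DownloadPageAsync name is made up) would await with ConfigureAwait(false), so that the continuation does not capture the UI context:
async Task<string> DownloadPageAsync(string url)
{
    HttpClient client = new HttpClient();
    //ConfigureAwait(false) tells the awaiter not to capture the current synchronization context
    var response = await client.GetAsync(url).ConfigureAwait(false);
    //From here on we are most likely on a thread-pool thread - no UI access in this method
    return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
}
 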

So what's the take-away from this section? C# makes programming robust (asynchronous) responsive applications nearly as easy as programming classic sequential (mostly non-responsive) applications.

 

Outlook

This concludes the third part of this tutorial series. In the next part we will have a look at powerful, yet lesser known (or used) features of C#. We will see how to easily construct IEnumerable<T> instances, what co- and contra-variance means and how to use it. Additionally we will look closely at attributes and at interop between native code (e.g. C or C++) and managed code in the form of C#.

Another focus in the next tutorial will be on more efficient code, as well as cleaner code, e.g. using elegant compiler-based attributes for getting information on the source.

 
