C# Advanced Tutorial - An Advanced Introduction To C# [Beginner]

This tutorial aims to give a brief and advanced introduction to programming with C#. The prerequisites for understanding this tutorial are a working knowledge of programming, the C programming language and a little bit of basic mathematics. Some basic knowledge of C++ or Java could be helpful, but is not required.

Basic concepts, object-oriented programming and using the .NET Framework

This is the first part of a series of tutorials on C#. In this part we introduce the fundamental concepts of the language and its output, the Microsoft Intermediate Language (MSIL). We will take a look at object-oriented programming (OOP) and what C# does to make OOP as efficient as possible to realize in practice.

For further reading a list of references will be given at the end. The references provide a deeper look at some of the topics discussed in this tutorial.

A little bit of warning before reading this tutorial: The way that this tutorial works is that it defines a path of usability. This part of the tutorial is essentially required before doing anything with C#. After this first tutorial everyone should be able to write a simple C# program using basic object-oriented principles. Hence more advanced or curious readers will probably miss some topics. As an example, we will not investigate exception handling, exciting possibilities of the .NET Framework like LINQ, or more advanced language features like generics or lambda expressions. Here our aim is to give a soft introduction to C# for people coming from other languages.

The right development environment

Before we can start doing something we should think about where we want to do it. A logical choice for writing a program is a text editor. In the case of C# there is an even better option: Microsoft Visual Studio (VS). However, there are more options that have been designed specifically with C# in mind. If we want a cross-platform IDE, then MonoDevelop is a good choice. On Windows, a free alternative to VS and MonoDevelop is SharpDevelop.

A short look at the Visual Studio 2012 IDE

Using a powerful IDE like Visual Studio will help us a lot when writing programs with the .NET Framework. Features like IntelliSense (an intelligent version of auto-complete, which shows us the possible method calls, variables and available keywords for a certain code position), breakpoints (the program's execution can be paused and inspected at desired code positions) and graphical designers for UI development let us program efficiently.

Additionally Visual Studio gives us integrated version control, an object browser and documentation tools. Another bonus for writing C# projects with Visual Studio is the concept of solutions and projects. In Visual Studio one usually creates a solution for a development project. A solution file can contain several projects, which can be compiled to libraries (.dll files) and executables (.exe files). The solution explorer enables us to efficiently manage even large-scale projects.

The project files are used by the MSBuild application to compile all required files and link to dependencies like libraries or the .NET Framework. As developers, we no longer need to worry about writing makefiles. Instead we just add files and references (dependencies) to projects, which will be compiled and linked in the right order automatically.

There are several shortcuts / functions that will make VS a real pleasure:

CTRL + SPACE, which forces IntelliSense to open.
CTRL + ., which opens the menu if VS shows a smart tag, i.e. the small marker that offers options (this will happen if a namespace is missing or if we rename a variable).
F5, to build and start debugging.
CTRL + F5, to build and execute without debugging.
F6, just build.
F10, to jump to the next line within the current function (step over) when debugging.
F11, to jump to the line in the next or current function (step into) when debugging.
F12, to go to the definition of the identifier at the caret's position.
SHIFT + F12, to find all references of the identifier at the caret's position.

Of course those keyboard shortcuts can be changed and are not required. Everything that can be done with shortcuts is also accessible by using the mouse. On the other hand there are more options and possibilities than shortcuts.

Where do we get Visual Studio? There are several options, some of them even free. Students enrolled at a university that participates in the DreamSpark / MSDNAA program usually have the option of downloading Visual Studio (up to Ultimate) for free. Otherwise one can download public beta versions or language-bound specialized versions of Visual Studio, called Express Edition.

In this tutorial we will only focus on console applications. GUI will be introduced in the next tutorial. To create a new console application project in VS we simply use the File menu and select New, Project. In the dialog we select C# on the left side and then Console application on the right side. Finally we can give our project a name like SampleConsoleApp. That's it! We have already created our first C# application.

Basic concepts

C# is a managed, static strongly-typed language with a C-like syntax and object-oriented features similar to Java. All in all one can say that C# is very close to Java to start with. There are some really great features in the current version of C#, but in this first tutorial we exclude them.

Managed means two things: First of all we do not need to care about the memory anymore. This means that people coming from C or C++ can stop worrying about freeing the memory allocated for their objects. We only create objects and do something with them. Once we have stopped using them, a smart program called the Garbage Collector (GC) will take care of them. The next figure shows approximately how the Garbage Collector works. It will detect unreferenced objects, collect them and free the corresponding memory. We do not have much control over the point in time when this happens. Additionally the GC will do some memory optimization; however, this is usually not done directly after freeing the memory.

This results in some overhead on the memory and performance side, but has the advantage that it is basically impossible to have segmentation faults in C#. However, memory leaks are still a problem if we keep references to objects that are no longer required.

Static strongly-typed means that the C# compiler needs to know the exact type of every variable and that the type system must be coherent. There is no cast to a void datatype, which would let us do basically anything; however, we have a datatype called Object on top, which might result in similar problems. We will discuss the consequences of the Object base type later on. The strong part gives us a hint that operations may only be used if the operation is defined for the elements. No conversion that could lose information happens without our knowledge; in such cases we have to write an explicit cast.

We have another consequence of C# being managed: C# is neither native nor interpreted, it is something in between. The compiler generates no assembly code, but the so-called Microsoft Intermediate Language (MSIL). This trick saves us the re-compilations for different platforms. In the case of C# we just compile once and obtain a so-called Common Language Runtime (CLR) assembly. This assembly will be Just-In-Time (JIT) compiled during runtime. Another feature is that optimizations will also take place during runtime. Frequently occurring method calls can be inlined and statements that are not required can be omitted automatically.

In theory this could result in (depending on the used references) platform-independent programs; however, this implies that all platforms have what is required to start and JIT compile CLR assemblies. At the time of writing Microsoft limits the .NET Framework to the Windows family; however, Xamarin offers a product called Mono, which gives us a solution that also works on Linux and Mac.

Coming back to the language itself we will see that the object-oriented features are inspired by those of Java. In C# only a slightly different set of keywords has been used. The extends keyword of Java has been replaced by the C++ operator colon :. There are other areas where the colon has been used as an operator with a different (but related) meaning.

Let's have a look at a sample Hello Tech.Pro code.

//Those are the namespaces that are (by default) included
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

//By default Visual Studio creates already a namespace
namespace HelloWorld
{
    //Remember: Everything (variables, functions) has to be encapsulated
    class Program
    {
        //The entry point is called Main (capital M) and has to be static
        static void Main(string[] args)
        {
            //Writing something to the console is easily possible using the
            //class Console with the static method Write or WriteLine
            Console.WriteLine("Hello Tech.Pro!");
        }
    }
}

This looks quite similar to Java, except for the casing. Comments can be placed by using two slashes (the comment goes until the end of the line with no possibility to switch back before), or using a slash and an asterisk character. In the latter case the comment has to be ended with the reverse, i.e. an asterisk and a slash. This kind of comment is called a block comment. Visual Studio also has a special mode of comments that is triggered once three slashes are entered: these are XML documentation comments.

Back to the code: While we do not need to place our classes in a namespace, Visual Studio creates one by default. The creation of a class is, however, required. Every method and variable needs to be encapsulated. This is why C# is considered to be strongly-object-oriented. Data encapsulation is, as we will see, an important feature of object-oriented programming and therefore necessary in C#.

There are already some other things that we can learn from this small sample. First, C# uses (as it has to) a class called Console for console based interaction. This class only contains so-called static members like the WriteLine function. A static member is a member that can be accessed without creating an instance of the class, i.e. static members are not duplicated per instance. They exist exactly once and exist without being explicitly created. Functions of a class are called methods.

The main entry point called Main can only exist once, which is why it has to be static. Another reason is that it has to live in a class. If it were not static, an instance of the class would first have to be created (as all non-static members can only be accessed through created objects, called instances of a class). Now we have a chicken-and-egg problem. How can we tell the program to build an instance of the class (such that the Main method can be accessed), without having a method where we start doing something? Hence the requirement that the Main method be static.

Another thing we can learn is that the parameter args is obviously an array. Here the array is part of the datatype, whereas in C the variable is the array and the type would be string. This is by design and has some useful implications. We will later see that every array is based on an object-oriented principle called inheritance, which specifies a basic set of members and implementations. Every array type has a property called Length, which contains the number of elements in the array. The length is therefore no longer hidden and does not need to be passed as an additional parameter, which is why the standard Main method of a C# program only has 1 parameter compared to the standard C main function with 2 parameters.
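As a minimal sketch of the point about Length, the following program summarizes its command line arguments without ever receiving a separate count. The class name ArgsDemo and the helper Describe are made up for this illustration.

```csharp
using System;

class ArgsDemo
{
    // Hypothetical helper: builds a one-line summary of the arguments.
    public static string Describe(string[] args)
    {
        // The array knows its own length, so no separate count is passed.
        return args.Length + " argument(s): " + string.Join(", ", args);
    }

    static void Main(string[] args)
    {
        Console.WriteLine(Describe(args));
    }
}
```

Started with the arguments a and b, the program prints "2 argument(s): a, b".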

Namespaces

Namespaces might be a new concept for people coming from the C programming language. Namespaces try to bring order to the world of types. C# allows us to have multiple methods with the same name (called method overloading) as long as each has a unique signature (the parameter list; the return type alone is not enough to distinguish two methods). It does not, however, allow us to use the same name twice for a type.

This restriction could result in serious problems. If we consider the case of using (requiring) two (independent) internal libraries, it is possible that both libraries define a type with the same name. Now there would be no way to use both types (if compiling were possible at all!). At best we could use one type.

This is where namespaces come to the rescue. A namespace is like a container in which we can place types. However, that container is only a string that will be used by the compiler to distinguish between types. Even though the dot (like in a.b) is usually used to create the impression of a relation between two strings (in this case a and b), there is no restriction that a namespace has to exist before creating another one in it, e.g. a does not have to be defined or used somewhere before using a.b.

Normally we would have to place the namespace in front of every type name. However, with C#'s using keyword we tell the compiler to implicitly do that for all types of the used namespace. Types of the current namespace are also implicitly used by the compiler. There is only one exception: if we want to use a type name that is defined in multiple used namespaces, we always have to explicitly specify the namespace of the type we want to use.
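The rules above can be sketched in one file. The namespaces LibraryA and LibraryB.Tools and their Logger types are hypothetical stand-ins for two independent libraries; they are not part of the .NET Framework.

```csharp
using System;

// Two hypothetical libraries that both define a type called Logger.
namespace LibraryA
{
    public class Logger
    {
        public string Name() { return "LibraryA.Logger"; }
    }
}

// LibraryB itself never has to be declared before LibraryB.Tools is used.
namespace LibraryB.Tools
{
    public class Logger
    {
        public string Name() { return "LibraryB.Tools.Logger"; }
    }
}

namespace NamespaceDemo
{
    using LibraryA;
    using LibraryB.Tools;

    class Program
    {
        static void Main()
        {
            // Writing just "Logger" would be ambiguous here (compiler error
            // CS0104), so the namespace has to be written out.
            LibraryA.Logger first = new LibraryA.Logger();
            LibraryB.Tools.Logger second = new LibraryB.Tools.Logger();

            Console.WriteLine(first.Name());  // LibraryA.Logger
            Console.WriteLine(second.Name()); // LibraryB.Tools.Logger
        }
    }
}
```

If only one of the two using lines were present, the short name Logger could be used without any qualification.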

Datatypes and operators

Before we can actually start doing something we need to introduce the basic set of datatypes and operators. As already mentioned C# is a static strongly-typed language, which is why we need to care about datatypes. In the end we will have to specify what's the type behind any variable.

There is a set of elementary datatypes, which contains the following types:

bool (1 byte) used for logical expressions (true, false)
char (2 bytes) used for a single (unicode) character
short (2 bytes) used for storing short integers
int (4 bytes) used for computations with integers
long (8 bytes) used for computations with long integers
float (4 bytes) used for computations with single precision floating point numbers
double (8 bytes) used for computations with double precision floating point numbers
decimal (16 bytes) used for computations with high precision decimal floating point numbers

There are also modified datatypes like unsigned integer (uint), unsigned long (ulong) and many more available. Additionally a very capable class for working with strings is included. The class is just called string and works as one would expect, except that a lot of useful helpers are directly available.

Types alone are quite boring, since we can only instantiate them, i.e. create objects based on their definition. It gets more interesting once we connect them by using operators. In C# we have the same set of operators (and more) as in C:

Comparison and logical operators like ==, !=, <, <=, >, >=, !, &&, ||
Bit operators like ^, &, |, <<, >>
Arithmetic operators like +, -, *, /, %, ++, --
The assignment operator = and combinations of the assignment operator with the binary operators
The ternary operator ? : (inline condition) to return either the left or right side of the colon depending on a condition specified before the question mark
Brackets () to change the operator hierarchy

Additionally C# has some built-in operators like typeof or sizeof and a set of really useful type operators:

The standard cast operator () as in C
The reference cast operator as
The type conformance checker is
The null-coalescing operator ??
The inheritance operator :

Let's see some of those types and operators in action:

using System;

class Program
{
    static void Main(string[] args)
    {
        //5 here is an integer literal
        int a = 5;

        //A double value is a floating point literal.
        double x = 2.5 + 3.7;
        //A single value is given by a floating point literal with the suffix f.
        float y = 3.1f;

        //Using quotes will automatically generate constant strings
        string someLiteral = "This is a string!";
        string input = Console.ReadLine();

        //int, string, double, float, ... are just keywords for the types
        //Int32, String, Double, Single - which have static and non-static members.
        a = int.Parse(input);//Here we use the static method of Int32.

        //Output of a mod operation - adding string and something (or something
        //and string) will always result in a string. Additionally we have the
        //regular operator ordering which performs a % 10 before the string and
        //the result get concatenated.
        Console.WriteLine("a % 10 = " + a % 10);
    }
}

We will see more operators and basic types in action throughout this series of tutorials. It should be noted that most operators return a new value and leave the original variable unchanged (the assignment operators as well as ++ and -- are the exceptions). Another important aspect is that there is a fixed operator hierarchy, which is basically like the operator hierarchy in C. We do not need to learn this by heart, since we can always use brackets to change the default hierarchy. Additionally the operator hierarchy follows quite natural rules, e.g. multiplication and division are performed before addition and subtraction.
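The type operators and the operator hierarchy mentioned above can be tried out with a small sketch. The class OperatorDemo and its helper Describe are invented for this illustration.

```csharp
using System;

class OperatorDemo
{
    // Hypothetical helper: describes a value using is, as, ?? and ? :.
    public static string Describe(object value)
    {
        // "is" checks whether the object conforms to a type.
        if (value is int)
        {
            // The cast operator () converts; it throws if the type does not match.
            int number = (int)value;
            return "int " + (number % 2 == 0 ? "even" : "odd");
        }

        // "as" tries a reference conversion and yields null instead of throwing.
        string text = value as string;

        // "??" returns the left side unless it is null, otherwise the right side.
        return text ?? "no string";
    }

    static void Main()
    {
        Console.WriteLine(Describe(4));       // int even
        Console.WriteLine(Describe("hello")); // hello
        Console.WriteLine(Describe(2.5));     // no string
        Console.WriteLine(Describe(null));    // no string

        // Multiplication binds more tightly than addition unless brackets are used.
        Console.WriteLine(2 + 3 * 4);         // 14
        Console.WriteLine((2 + 3) * 4);       // 20
    }
}
```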

Reference and value types

A very important concept for understanding C# is the difference between reference and value types. Once we use a class, we are using a reference type. On the other hand, any object that is a struct will be handled as a value type. The important difference is that for reference types only the reference is handed over, i.e. we pass a copy of the position of the object, and not a copy of the object itself. Any modification made through that reference will therefore modify the original object.

The other case is the one with value types. If we give a method an argument that is a struct (that includes integers, floating point numbers, a boolean, a character and more) we will get a copy of the passed object. This means that modifications on the argument will never result in a modification on the original object.

The whole concept is quite similar to the concept of pointers in C. The only difference is that we do not actually have access to the address and we do not have to dereference the variable to get access to the values behind it. One could say that C# handles some of the things automatically that we would have to write by hand in C/C++.
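A short sketch makes the difference visible. The types PointStruct and PointClass are hypothetical and differ only in being a struct or a class.

```csharp
using System;

// Hypothetical types for illustration only.
struct PointStruct { public int X; }
class PointClass { public int X; }

class PassingDemo
{
    // Receives a copy of the struct, so the caller's value stays untouched.
    public static void Change(PointStruct p) { p.X = 99; }

    // Receives a copy of the reference, so the caller's object is modified.
    public static void Change(PointClass p) { p.X = 99; }

    public static int StructAfterCall()
    {
        PointStruct s = new PointStruct();
        s.X = 1;
        Change(s);
        return s.X;
    }

    public static int ClassAfterCall()
    {
        PointClass c = new PointClass();
        c.X = 1;
        Change(c);
        return c.X;
    }

    static void Main()
    {
        Console.WriteLine(StructAfterCall()); // 1
        Console.WriteLine(ClassAfterCall());  // 99
    }
}
```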

In C# we have access to two keywords that can be placed in front of a parameter definition. One important keyword is called ref. This allows us to access the original variable in the case of passing in structures. For classes the consequence is that the original pointer is actually passed in, allowing us to change where it points. This sounds quite strange at first, but we will see that in most cases there is no difference between passing a class instance with an explicit ref and passing it without. The only difference is that we can actually reset the pointer, like in the following code.

using System;

class Program
{
    static void Main()
    {
        //A string is a class, which can be instantiated like this
        string s = "Hi there";
        Console.WriteLine(s);//Hi there
        ChangeString(s);
        Console.WriteLine(s);//Hi there
        ChangeString(ref s);
        Console.WriteLine(s);//s is now a null reference
    }

    static void ChangeString(string str)
    {
        str = null;
    }

    static void ChangeString(ref string str)
    {
        str = null;
    }
}

The other important keyword is called out. Basically out is like ref. There are a few differences that are worth mentioning:

out variables need to be assigned in the method that has them as parameters.
out variables do not need to be assigned before the calling method passes them as arguments.
out variables mark the variable as being used for (additional) outgoing information.

The main usage of out parameters is as the name suggests: We now have an option to distinguish between "just" references and parameters that will actually return something. The .NET Framework uses out parameters in some scenarios. Some of those scenarios are the TryParse methods found on the basic structures like int, double and others. Here a bool is returned, indicating whether the given string could be converted. In case of a successful conversion the variable given in form of an out parameter will be set to the corresponding value.
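As a small sketch of the TryParse pattern, the following helper converts a string to an int and falls back to a default if the conversion fails. The names TryParseDemo and ParseOrDefault are made up for this example; int.TryParse is the framework method described above.

```csharp
using System;

class TryParseDemo
{
    // Hypothetical helper: parses text or falls back to a default value.
    public static int ParseOrDefault(string text, int fallback)
    {
        int result; // no assignment needed before passing it as out

        // TryParse returns true on success and writes the number to result.
        if (int.TryParse(text, out result))
        {
            return result;
        }

        return fallback;
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));  // 42
        Console.WriteLine(ParseOrDefault("abc", -1)); // -1
    }
}
```

Compared to int.Parse, which throws an exception on invalid input, this variant lets the caller decide what to do with bad input by checking a simple bool.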

All the talk about reference (class) and value (struct) types is useless if we do not know how to create such types. Let's have a look at a simple example:

using System;

class Program
{
    static void Main(string[] args)
    {
        //Usually C# forbids us to use variables that are uninitialized
        SampleClass sampleClass;
        SampleStruct sampleStruct;

        //However, C# thinks that an out-Function will do the initialization
        HaveALook(out sampleClass);
        HaveALook(out sampleStruct);
    }

    static void HaveALook(out SampleClass c)
    {
        //Insert a breakpoint here to see the
        //value of c before the assignment:
        //It will be null...
        c = new SampleClass();
    }

    static void HaveALook(out SampleStruct s)
    {
        //Insert a breakpoint here to see the
        //value of s before the assignment:
        //It will NOT be null...
        s = new SampleStruct();
    }

    //In C# you can create nested classes
    class SampleClass
    {
    }

    //A structure always inherits ONLY from object,
    //we cannot specify other classes (more on that later)
    //However, an arbitrary number of interfaces can
    //be implemented (more on that later as well)
    struct SampleStruct
    {
    }
}

In the example above we are creating two types called SampleClass and SampleStruct. We can instantiate new objects from those type definitions using the new keyword. This is not new (pun intended) for programmers coming from Java or C++, but certainly something new for programmers coming from C. In C we would use the malloc function in case of a class (giving us a pointer) and nothing with a structure (giving us a value). There is, however, one big advantage of using that new keyword: It will not only do the right memory allocation (on the heap in case of a reference type, or typically on the stack in case of a local variable of a value type), but also call the corresponding constructor. We will later see what a constructor is, and what kind of benefits it gives us.

Coming back to our example we see that we do not instantiate anything with the new keyword in the Main method; however, we do use it in the methods called HaveALook, which differ by the parameter type they expect. Using breakpoints in those methods we can see that the class variable is actually NOT set (the passed in value of the variable is null, which is the constant for a pointer that is not set), while the structure already has a value.

Control flow

Now that we have introduced the basic concepts behind C#, as well as elementary datatypes and available operators, we just need one more thing before we can go ahead and write actual C# programs. We need to know how to control the program flow, i.e. how to introduce conditions and loops.

This is pretty much the same as in C, as we are dealing with a C-style syntax. That being said we have to follow these rules:

Conditions can be introduced by using if or switch.
A loop is possible by using for, while, do-while.
A loop will be stopped by using the break keyword (will stop the innermost loop).
A loop can skip the rest and return to the condition with the continue keyword.
C# also has an iterator loop called foreach.
Another possibility is the infamous goto statement - we will not discuss this.
There are other ways of controlling the program flow, but we will introduce those later.

The next program code will introduce a few of the mentioned possibilities.

using System;

class Program
{
    static void Main(string[] args)
    {
        //We get some user input with the ReadLine() method
        string input = Console.ReadLine();

        //Let's see if this is empty or not
        if(input == "")
        {
            Console.WriteLine("The input is empty!");
        }
        else
        {
            //A string behaves like a read-only char array and has a
            //Length property. It can also be indexed like a char array,
            //giving us single chars.
            for(int i = 0; i < input.Length; i++)
            {
                switch(input[i])
                {
                    case 'a':
                        Console.Write("An a - and not ... ");
                        //This is always required, we cannot just fall through.
                        goto case 'z';
                    case 'z':
                        Console.WriteLine("A z!");
                        break;
                    default:
                        Console.WriteLine("Whatever ...");
                        //This is required even in the last block
                        break;
                }
            }
        }
    }
}

The iterator loop called foreach is not available in C. It is possible to use foreach with every type that defines some kind of iterator. We will see what this means later on. For now we only have to know that every array already defines such an iterator. The following code snippet will use the foreach-loop to output each element of an array.

//Creating an array is possible by just appending [] to any datatype
int[] myints = new int[4];
//This is now a fixed array with 4 elements. Arrays in C# are 0-based
//hence the first element has index 0.
myints[0] = 2;
myints[1] = 3;
myints[2] = 17;
myints[3] = 24;

//This foreach construct is new in C# (compared to C):
foreach(int myint in myints)
{
    //Write will not start a new line after the string.
    Console.Write("The element is given by ");
    //WriteLine will do that.
    Console.WriteLine(myint);
}

There are some restrictions on foreach as compared to for. First of all, it can be less efficient than a for-loop, since it takes one more operation to start the loop and calls the iterator's MoveNext method to advance to each element. Second, we cannot assign to the current element. This is due to the fact that foreach operates on iterators, which in general only give read access to the elements, i.e. the iteration variable is read-only. This is also required to keep the iteration consistent.
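The second restriction can be sketched as follows: a for-loop overwrites elements through the index, while foreach only reads them. The class LoopDemo and its helpers are hypothetical.

```csharp
using System;

class LoopDemo
{
    // A for-loop works with the index, so elements can be overwritten.
    public static int[] Doubled(int[] values)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = values[i] * 2;
        }
        return values;
    }

    // A foreach-loop is sufficient for read-only access to the elements.
    public static int Sum(int[] values)
    {
        int sum = 0;
        foreach (int value in values)
        {
            // value = 0; // would not compile: the iteration variable is read-only
            sum += value;
        }
        return sum;
    }

    static void Main()
    {
        int[] numbers = { 2, 3, 17, 24 };
        Console.WriteLine(Sum(numbers));          // 46
        Console.WriteLine(Sum(Doubled(numbers))); // 92
    }
}
```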

Object-oriented programming

Object-oriented programming is a way of structuring programs around objects instead of functions. Therefore the declaration of types is a key aspect of object-oriented programming. Everything has to be part of a type, even if it is just static without any instance dependency.

There are downsides of this pattern of course. Instead of writing sin(), cos(), sign() etc. we have to write Math.Sin(), Math.Cos() and Math.Sign() since the (very helpful) math functions need to be inside a type (in this case the class Math) as well.

So what are the key aspects of object-oriented programming?

Data encapsulation
Inheritance
Relations between types
Declaring dependencies
Maintainability
Readability

By creating classes to carry large, reusable packages of data we provide encapsulation. The inheritance process helps us mark a strong relation between types and reuse the same basic structure. Encapsulating functions in types will group what belongs together and reduce code, since data the object already holds no longer has to be passed as parameters. Also misuse will be prevented by default. All in all, the main goal is to reduce maintenance efforts by improving readability and increasing the compiler's power in error detection.

The main concept for OOP is the type-focus. The central type is certainly a class. Structures are also important, but will only be used in edge cases. Structures make sense if we have only a small payload, or want to create quite elementary small types that will have stronger immutable features than classes.

Let's have a look again at how we create a class (the type) and how we create class objects (instances):

//Create a class definition
class MyClass
{
    public void Write(string name)
    {
        Console.WriteLine("Hi there... {0}!", name);
    }
}

//Create a class instance (code to be placed in a method)
MyClass instance = new MyClass();

A class makes sense once we want to reuse a set of methods with a fixed set of variables in addition to some parameters for those methods. A class is also very useful once we want to use an already existing set of variables and / or methods. If we just want a collection of functions that is unrelated to any set of fixed (called instance dependent) variables, then we create static classes in which we can only place static methods and variables. Good examples of such static classes are the Console and Math classes. They cannot be instantiated (instances of static classes, i.e. classes that do not contain instance dependent code, do not make any sense) and provide only functions with a set of parameters.
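A static class of our own could look like the following sketch. The class Geometry and its two methods are hypothetical; only Math.PI and Math.Sqrt come from the framework.

```csharp
using System;

// Hypothetical static class: a collection of functions without instance data.
static class Geometry
{
    public static double CircleArea(double radius)
    {
        return Math.PI * radius * radius;
    }

    public static double Hypotenuse(double a, double b)
    {
        return Math.Sqrt(a * a + b * b);
    }
}

class StaticDemo
{
    static void Main()
    {
        // Geometry g = new Geometry(); // would not compile: static classes cannot be instantiated
        Console.WriteLine(Geometry.Hypotenuse(3, 4)); // 5
        Console.WriteLine(Geometry.CircleArea(1));    // the value of Math.PI
    }
}
```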

Inheritance and polymorphism

Now we are coming to the inheritance issue. To simplify things we can think of inheritance as a recursive copy paste process by the compiler. All members of the parent (base) class will be copied.

class MySubClass : MyClass
{
    public void WriteMore()
    {
        Console.WriteLine("Hi again!");
    }
}

As already mentioned the inheritance operator is the colon (:). In this example we create a new type called MySubClass, which inherits from MyClass. MyClass has been defined in the previous section and does not define an explicit inheritance. Therefore MyClass inherits from Object. Object itself defines just four public instance methods:

ToString, which is a very comfortable way of defining how an instance of the type should be presented as a string.
Equals, which is a generic way of comparing two arbitrary objects for equality.
GetHashCode, which gets a numeric indicator if two objects could be equal.
GetType, which gets the meta-information on the specific type of the current instance.

These four methods are available on MyClass and MySubClass instances (copy paste!). Additionally MyClass defines a method called Write, which will also be available for all MySubClass instances. Finally MySubClass defines a method called WriteMore, which will only be available for MySubClass instances.
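Which members are available where can be checked with a short sketch. The two classes are repeated from the text so that the example is self-contained; the class InheritanceDemo is added only for this illustration.

```csharp
using System;

class MyClass
{
    public void Write(string name)
    {
        Console.WriteLine("Hi there... {0}!", name);
    }
}

class MySubClass : MyClass
{
    public void WriteMore()
    {
        Console.WriteLine("Hi again!");
    }
}

class InheritanceDemo
{
    static void Main()
    {
        MySubClass sub = new MySubClass();
        sub.Write("Flo");   // inherited from MyClass
        sub.WriteMore();    // defined in MySubClass

        // Members inherited from Object are available as well.
        Console.WriteLine(sub.GetType().BaseType); // MyClass
        Console.WriteLine(sub.Equals(sub));        // True

        MyClass parent = new MyClass();
        parent.Write("Flo");
        // parent.WriteMore(); // would not compile: MyClass does not know WriteMore
    }
}
```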

Right now the inheritance concept is already a little bit useful, but it is not very powerful. The concept of polymorphism will enable us to specialize objects using inheritance. First we will introduce the virtual keyword. This keyword lets us specify that a (virtual marked) method can be re-implemented by more specialized (or derived) classes.

class MyClass
{
    //This method is now marked as re-implementable
    public virtual void Write(string name)
    {
        //Using a placeholder in a string.Format()-able method
        Console.WriteLine("Hi {0} from MyClass!", name);
    }
}

If we now want to re-implement the Write method in the MySubClass class, then we have to do that explicitly by marking the re-implementation as override. Let's have a look:

class MySubClass : MyClass
{
    //This method is now marked as re-implemented
    public override void Write(string name)
    {
        Console.WriteLine("Hi {0} from MySubClass!", name);
    }
}

What is the great benefit of this? Let's check out some example code snippet:

//We create two variables of type MyClass
MyClass a = new MyClass();
MyClass b = new MySubClass();

//Now we call the Write() method on each of them, the only
//difference being that in fact b is a more specialized type
a.Write("Flo"); //Outputs ... from MyClass!
b.Write("Flo"); //Outputs ... from MySubClass!

So the trick is that without knowing about the more specialized instance behind it, we are able to access the specialized implementation available in MySubClass. This is called polymorphism and basically states that classes can re-implement certain methods, which can then be used again without knowing about the specialization or re-implementation at all.

Already here we can benefit from polymorphism, since we are able to override the four methods given by Object. Let's consider the following example:

using System;

class Program
{
    static void Main(string[] args)
    {
        MyClassOne one = new MyClassOne();
        MyClassTwo two = new MyClassTwo();

        Console.WriteLine(one);//Displays a strange string that is basically the type's name
        Console.WriteLine(two);//Displays "This is my own class output"
    }
}

class MyClassOne
{
    /* Here we do not override anything */
}

class MyClassTwo
{
    public override string ToString()
    {
        return "This is my own class output";
    }
}

Here the method WriteLine solves the problem of having to display any input as a sequence of characters by using the ToString method of Object. This enables WriteLine to output any object, even objects that are unknown. All that WriteLine cares about is that the given argument is actually an instance of Object (that applies to every object in C#), which means that the argument has a ToString method. Finally the specific ToString method of the argument is called.

Access modifiers

Access modifiers play an important role in forcing programmers to adhere to a given object-oriented design. They hide members to prevent unintended access, define which members take part in the inheritance process and which objects are visible outside of a library.

Right here we already have to note that all restrictions placed by modifiers are only artificial. The compiler is the main protector of those rules. This means that those rules will not prevent unauthorized access to e.g. a variable during runtime (for example via reflection). Therefore using access modifiers as some kind of security system is certainly a really bad idea. The main idea behind those modifiers is the same as with object-oriented programming: creating classes that encapsulate data and force other programmers into a certain pattern of access. This way, finding the right way of using certain objects should be simpler and more straightforward.
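To illustrate that modifiers are not a security boundary, here is a small sketch that reads a private variable at runtime with the reflection API from System.Reflection. The class Account and its field pin are made up for this illustration.

```csharp
using System;
using System.Reflection;

class Account
{
    //Hypothetical private field; direct access from outside does not compile
    private int pin = 1234;

    public bool Check(int guess)
    {
        return guess == pin;
    }
}

class Program
{
    static void Main()
    {
        Account account = new Account();

        //account.pin is rejected by the compiler, but reflection
        //can look up non-public instance fields by name
        FieldInfo field = typeof(Account).GetField(
            "pin", BindingFlags.NonPublic | BindingFlags.Instance);

        Console.WriteLine(field.GetValue(account)); //Prints 1234
    }
}
```

This is not something to do in regular code. It only shows why private should be read as a design statement and not as protection.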

C# knows a whole bunch of such modifier keywords. Let's have a look at them with a short description:

private, declares that a member is neither visible from outside the object, nor does it take part in the inheritance process.
protected, declares that a member is not visible from outside the object, however, the member takes part in the inheritance process.
internal, declares that a member or type is visible outside the object, but not outside the current library (assembly).
protected internal (the order internal protected is also accepted), has the meaning of internal OR protected.
public, declares that a member or type is visible everywhere.

Most of the time we can specify the modifier (there are some exceptions to this rule, as we will see later), however, we can also always omit it. For types directly placed in a namespace, the default modifier is internal. This makes quite some sense. For types and members placed in a type (like a class or structure), the default modifier is private.

This makes sense since it mirrors a best practice from C++: in a C++ struct we always should have started with a private declaration, otherwise every member would have been public (members of a C++ class, in contrast, are private by default). So while C++ structs take the weakest access modifier as standard (public), C# always uses the strongest one (internal or private).

using System;

//No modifier, i.e. the class Program is internal
class Program
{
    //No modifier, i.e. the method Main() is private
    static void Main()
    {
        MyClass c = new MyClass();
        //Works
        int num = c.WhatNumber();
        //Does not work
        //int num = c.RightNumber();
    }
}

//MyClass is visible from this library and other libraries
public class MyClass
{
    //This one can only be accessed from MyClass
    private int a;

    //Classes inheriting from MyClass can access b like MyClass can
    protected int b;

    //No modifier, i.e. the method RightNumber() is private
    int RightNumber()
    {
        return a;
    }

    //This will be seen from the outside
    public int WhatNumber()
    {
        //Access inside the class is possible
        return RightNumber();
    }
}

//MySubClass is only visible from this library
internal class MySubClass : MyClass
{
    int AnotherRightNumber()
    {
        //Works
        b = 8;
        //Does not work - a cannot be accessed since it is private
        //return a;
        return b;
    }
}

There are some restrictions that will be enforced by the compiler. The reverse case of the example above, where we set MyClass internal and MySubClass public, is not possible. The compiler detects that having MySubClass visible to the outside requires MyClass to also be visible to the outside. Otherwise we would have a specialization of a type where the base type is unknown.

The same is true in general, for example when a method that is visible to the outside (a public method of a public type) returns an instance of a type that is internal. In this case the compiler will also tell us that the returned type is less accessible than the method.

In C# every non-static method has access to the current class instance through the keyword this. Usually the keyword can be omitted before calling methods of the class instance, however, there are multiple scenarios where writing this explicitly is very useful.

One of those scenarios is to distinguish between local variables and instance variables (fields) of the same name. Consider the following example:

class MyClass
{
    string str;

    public void Change(string str)
    {
        //Here this.str is the instance variable (field) str and
        //str is the local (passed as parameter) variable
        this.str = str;
    }
}

Since methods marked as static are independent of instances, we cannot use the this keyword in them. In addition to this there is also the base keyword, which gives us access to all members of the base class that are accessible from the derived class. This way it is possible to call methods that have been re-implemented or hidden.

class MySubClass : MyClass
{
    public override void Write(string name)
    {
        //First we want to use the original implementation
        base.Write(name);
        //Then our own
        Console.WriteLine("Hi {0} from MySubClass!", name);
    }
}

In the example, we are accessing the original implementation of the Write method from the re-implementation.

Properties

People coming from C++ will know the problem of restricting access to variables of a class. In general one should never expose variables of a class, such that other classes could change them without the class being notified. Therefore the following piece of code was written quite often in C++ (code given in C#):

private int myVariable;

public int GetMyVariable()
{
    return myVariable;
}

public void SetMyVariable(int value)
{
    myVariable = value;
}

This is clean code and we (as developers) now have the possibility to react to external changes of the variable by inserting some lines of code before myVariable = value. The problem with this code is that

we really only want to show that this is just a wrapper around myVariable and that
we need to write too much code for this simple pattern.

Therefore the C# team introduced a new language feature called properties. Using properties the code above boils down to:

private int myVariable;

public int MyVariable
{
    get { return myVariable; }
    set { myVariable = value; }
}

This looks much cleaner now. Also the access changed. Before we accessed myVariable like a method (using a = GetMyVariable() or SetMyVariable(b)), but now we access myVariable like a variable (using a = MyVariable or MyVariable = b). This is more like the programmer's original intention and saves us some lines of code.

Internally the compiler will still create those (get / set) methods, but we do not care about this. We will just use properties with either a get block, a set block, or both, and everything will work.
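As a sketch of what the get and set blocks make possible, the following hypothetical Thermostat class validates the incoming value in its set block and offers a second property with only a get block. The class name and the limits 5 and 30 are made up for this illustration.

```csharp
using System;

class Thermostat
{
    private int temperature;

    //Read-write property: the set block validates before storing
    public int Temperature
    {
        get { return temperature; }
        set
        {
            //Clamp to a plausible range (the limits are made up)
            if (value < 5)
                temperature = 5;
            else if (value > 30)
                temperature = 30;
            else
                temperature = value;
        }
    }

    //Read-only property: only a get block, so assignments do not compile
    public bool IsHeating
    {
        get { return temperature > 20; }
    }
}

class Program
{
    static void Main()
    {
        Thermostat t = new Thermostat();
        t.Temperature = 100;              //Looks like a variable, runs the set block
        Console.WriteLine(t.Temperature); //Prints 30
        Console.WriteLine(t.IsHeating);   //Prints True
    }
}
```

Callers keep the simple variable-like syntax, while the class stays in control of what is actually stored.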

The constructor

The constructor is a special kind of method that can only be called implicitly and never explicitly. A constructor is automatically called when we allocate memory with the new keyword. In perfect alignment with standard methods, we can overload the constructor by having multiple definitions that differ by their parameters.

Every class (and structure) has at least one constructor. If we did not write one (until now we did not), then the compiler inserts a standard (no parameters, empty body) constructor. Once we define a constructor of our own in a class, the compiler no longer inserts this default constructor.

The signature of a constructor is special. It has no return type, since it does implicitly return the new instance, i.e. an instance of the class. Also a constructor is defined by its name, which is the same name as the class. Let's have a look at some constructors:

class MyClass
{
    public MyClass()
    {
        //Empty default constructor
    }

    public MyClass(int a)
    {
        //Constructor with one argument
    }

    public MyClass(int a, string b)
    {
        //Constructor with two arguments
    }

    public MyClass(int a, int b)
    {
        //Another constructor with two arguments
    }
}

This looks quite straightforward. In short, a constructor is a method with the name of the class that specifies no return value. Using the various constructors is possible when instantiating an object of the class.

MyClass a = new MyClass();//Uses the default constructor
MyClass b = new MyClass(2);//Uses the constructor with 1 argument
MyClass c = new MyClass(2, "a");//Uses the constructor with 2 arguments
MyClass d = new MyClass(2, 3);//Uses the other constructor with 2 arguments

Of course it could be that one constructor would need to do the same work as another constructor. In this case it seems like we only have two options:

Copy & paste the content.
Extracting the content into a method, which is then called by both constructors.

The first is a simple no-go according to the DRY (Don't Repeat Yourself) principle. The second one is maybe also not fine, since this could result in the method being misused in other locations. Therefore C# introduces the concept of chaining constructors. Before we actually execute instructions from one constructor, we call another constructor. The syntax relies on the colon : and the keyword this, which refers to the current class instance:

class MyClass
{
    public MyClass()
        : this(1, 1) //This calls the constructor with 2 arguments
    {
    }

    public MyClass(int a, int b)
    {
        //Do something with a and b
    }
}

Here the default constructor uses the constructor with 2 parameters to do some initialization work. The initialization work is the most popular use-case of a constructor. A constructor should be a lightweight method that does some preprocessing / setup / variable initialization.

The colon for the constructor chaining is used for a reason. As with inheritance, every constructor has to call another constructor. If no call is specified (i.e. no previous constructor), then the called constructor is the default constructor of the base class. Therefore the second constructor in the previous example does actually look like the following:

public MyClass(int a, int b)
    : base()
{
    //Do something with a and b
}

The additional line is, however, redundant, since the compiler will automatically insert this. There are only two cases where we have to specify the base constructor for the constructor chaining:

When we actually want to call another base constructor than the default constructor of the base class.
When there is no default constructor of the base class.

The reason for the constructor chaining with the base class constructor is illustrated in the next figure.

We see that in this class hierarchy in order to create an instance of Porsche, an instance of Car has to be created. This creation, however, requires the creation of an instance of a Vehicle, which requires the instantiation of Object. Each instantiation is associated with calling a constructor, which has to be specified. The C# compiler will automatically call the empty constructor, but this is only possible in case such a constructor exists. Otherwise we have to tell the compiler explicitly what to call.
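The chain of constructor calls can be made visible with a small sketch. The class names follow the hierarchy described above, while the wheels parameter and the printed texts are assumptions made for this illustration.

```csharp
using System;

class Vehicle
{
    //No default constructor: derived classes have to pass the wheels
    public Vehicle(int wheels)
    {
        Console.WriteLine("Vehicle with {0} wheels", wheels);
    }
}

class Car : Vehicle
{
    public Car()
        : base(4) //Required, since Vehicle has no default constructor
    {
        Console.WriteLine("Car");
    }
}

class Porsche : Car
{
    //No explicit chaining: the compiler inserts : base() for us
    public Porsche()
    {
        Console.WriteLine("Porsche");
    }
}

class Program
{
    static void Main()
    {
        new Porsche();
        //Prints:
        //Vehicle with 4 wheels
        //Car
        //Porsche
    }
}
```

The output order shows that the most general constructor body runs first. Removing the : base(4) call from Car results in a compiler error, because there is no parameterless Vehicle constructor that could be called automatically.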

There are also cases where other access modifiers for a constructor might make sense. If we want to prevent instantiation of a certain type (like with abstract), we could create one default constructor and make it protected. On the other hand the following is a simple so-called Singleton pattern:

class MyClass
{
    private static MyClass instance;

    private MyClass() { }

    public static MyClass Instance
    {
        get 
        {
            if(instance == null)
                instance = new MyClass();

            return instance;
        }
    }
}

Now we cannot create instances of the class, but we can access the static property Instance by using MyClass.Instance. This property not only has access to the static variable instance, but also has access to all private members like the private constructor. Therefore, it can create an instance and return the created instance.

This implementation has two main advantages:

Because the instance is created inside the Instance property method, the class can exercise additional functionality (for example, instantiating a subclass), even though it may introduce unwelcome dependencies.
The instantiation is not performed until an object asks for an instance. This approach is referred to as lazy instantiation. This avoids instantiating unnecessary singletons when the application starts.
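A short usage sketch of the class above shows that every access returns the same object. The method Increment and its counter are hypothetical additions that make the shared state visible. Note that this simple lazy version is not safe if several threads access Instance at the same time.

```csharp
using System;

class MyClass
{
    private static MyClass instance;

    //Hypothetical state, added only to show that it is shared
    private int counter;

    private MyClass() { }

    public static MyClass Instance
    {
        get
        {
            if (instance == null)
                instance = new MyClass();

            return instance;
        }
    }

    public int Increment()
    {
        counter = counter + 1;
        return counter;
    }
}

class Program
{
    static void Main()
    {
        MyClass first = MyClass.Instance;
        MyClass second = MyClass.Instance;

        //Both variables refer to one and the same object
        Console.WriteLine(object.ReferenceEquals(first, second)); //Prints True

        first.Increment();
        Console.WriteLine(second.Increment()); //Prints 2
    }
}
```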

We will not discuss other design patterns in this series of tutorials.

Abstract classes and interfaces

There is one more thing we need to discuss in this tutorial. Sometimes we want to create classes that should just be a sketch for some more specialized implementations. This is like creating a template for classes. We do not want to use the template directly (instantiate it), but we want to derive from the class (use the template), which should save us some time. The keyword for marking a class as being a template is abstract. Abstract classes cannot be instantiated, but can be used as types of course. Such a class can also mark members as being abstract. This will require derived classes to deliver the implementation:

abstract class MyClass
{
    public abstract void Write(string name);
}

class MySubClass : MyClass
{
    public override void Write(string name)
    {
        Console.WriteLine("Hi {0} from an implementation of Write!", name);
    }
}

Here we mark the Write method as being abstract, which has two consequences:

There is no method body (the curly brackets are missing) in the first method definition.
MySubClass is required to override the Write method (or in general: all methods that are marked abstract and not implemented yet).

Also the following code will fail, since we create an instance of MyClass, which is now marked as being abstract.

//Ouch, MyClass is abstract!
MyClass a = new MyClass();
//This works fine, MyClass can still be used as a type
MyClass b = new MySubClass();

An important restriction in doing OOP with C# is the limitation to inheritance from one class only. If we do not specify a base class, then Object will be used implicitly; otherwise the explicitly specified class will be used. The restriction to one class in the inheritance process makes sense, since it keeps everything well-defined and prohibits weird edge cases. There is an elegant way around this limitation, which builds upon using so-called interface types.

An interface is like a code contract. Interfaces define which functionality has to be provided by the classes or structures that implement them, but they do not say a word about what the exact implementation looks like. That being said, we can think of interfaces as abstract classes without variables and with only abstract members (methods, properties, ...).

Let's define a very simple interface:

interface MyInterface
{
    void DoSomething();

    string GetSomething(int number);
}

The defined interface contains two methods called DoSomething and GetSomething. The definitions of these methods look very similar to the definitions of abstract methods, except we are missing the keywords public and abstract. This is by design. The idea is that since every member of an interface is abstract (or to be more precise: misses an implementation), the keyword is redundant. Another feature is that every method is automatically being treated as public.

Implementing an interface is possible by using the same syntax as with classes. Let's consider two examples:

class MyOtherClass : MyInterface
{
    public void DoSomething()
    { }

    public string GetSomething(int number)
    {
        return number.ToString();
    }
}

class MySubSubClass : MySubClass, MyInterface
{
    public void DoSomething()
    { }

    public string GetSomething(int number)
    {
        return number.ToString();
    }       
}

This snippet should demonstrate a few things:

It is possible to implement only one interface and no class (this will result in inheriting directly from Object)
We can also implement one or more interfaces and additionally a class (explicit inheritance)
We always have to implement all methods of the "inherited" interface(s)
Also as a side note we do not need to re-implement the Write method on MySubSubClass, since MySubClass already implements it

It should be clear that we cannot instantiate interfaces (they are like abstract classes), but we can use them as types. Therefore it would be possible to do the following:

MyInterface myif = new MySubSubClass();

Usually interface types start with a big I in the .NET-Framework. This is a useful convention to recognize interfaces immediately. In our journey, we will discover some useful interfaces that are quite important for the .NET-Framework. Some of these interfaces are used by C# implicitly.

Interfaces also give us another option for implementing their methods. Since we can implement multiple interfaces, it is possible that two methods with the same name and signature will be included. In this case there must be a way to distinguish between the different implementations. This is possible by a so-called explicit implementation. An explicitly implemented interface member will not contribute to the class directly. Instead one has to cast the class to the specific interface type in order to access the members of the interface.

Here is an explicit implementation:

class MySubSubClass : MySubClass, MyInterface
{
    //Explicit (no public and MyInterface in front)
    void MyInterface.DoSomething()
    { }

    //Explicit (no public and MyInterface in front)
    string MyInterface.GetSomething(int number)
    {
        return number.ToString();
    }       
}

Explicit and implicit implementations of definitions from an interface can be mixed. Hence we can only be sure to get access to all members defined by an interface, if we cast an instance to that interface.
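The name clash mentioned above can be sketched with two hypothetical interfaces, IEnglish and IGerman, that both define a method Greet with the same signature. All names in this example are made up for the illustration.

```csharp
using System;

//Hypothetical interfaces that happen to share a method signature
interface IEnglish
{
    string Greet();
}

interface IGerman
{
    string Greet();
}

class Greeter : IEnglish, IGerman
{
    //Explicit implementations: no access modifier, interface name in front
    string IEnglish.Greet()
    {
        return "Hello";
    }

    string IGerman.Greet()
    {
        return "Hallo";
    }
}

class Program
{
    static void Main()
    {
        Greeter g = new Greeter();
        //g.Greet() would not compile, since Greeter itself has no Greet method

        IEnglish en = g;
        Console.WriteLine(en.Greet());          //Prints Hello
        Console.WriteLine(((IGerman)g).Greet()); //Prints Hallo
    }
}
```

Which implementation is called depends only on the interface type through which the instance is accessed.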

Exception handling

There are many things that have been designed with OOP in mind in C#. One of those things is exception handling. Every exception has to derive from the Exception class, which has been placed in the System namespace. In general we should always try to avoid exceptions, however, there are cases where an exception could easily happen. One such example is found in communication with the file system. Here we are talking to the OS, which sometimes has no other choice than to throw an exception. There could be various reasons, e.g.:

The given path is invalid.
The file cannot be found.
We do not have sufficient rights to access the files.
The file is corrupt and cannot be read.

Of course the OS API could just return a pseudo file or pseudo content and everything would work. The problem with such a handling is that this does not represent reality, and we would have no way to detect that obviously something went wrong. Another option would be to return an error code, but this would result in a C-like API and it would leave the handling to the programmer. If the programmer then did a bad job (like ignoring the returned error code), the user would never see that something went wrong.

Here is where exceptions come into play. The important thing about an exception is that once an exception is possible, we should think about handling it. In order to handle such an exception, we need a way to react to it. The construct is the same as in C++ or Java: Thrown exceptions can be caught.

try
{
    FunctionWhichMightThrowException();
}
catch
{
    //React to it
}

In the example, we call a method named FunctionWhichMightThrowException. Calling this method might result in an exception, which is why we put it in a try-block. The catch-block is only entered if an exception is thrown, otherwise it will be ignored. What this example is not capable of doing is reacting to the specific exception. Right now we just react to any exception, without touching the exception that has been thrown. This is, however, very important and should therefore be done:

try
{
    FunctionWhichMightThrowException();
}
catch(Exception ex)
{
    //React to it e.g.
    Console.WriteLine(ex.Message);
}

Since every exception has to derive from Exception, this will always work and we will always be able to access the Message property. This is a so-called catch-'em-all block. Sometimes, however, we want to distinguish between the various exceptions. Coming back to our example with the file system above, we can expect that every unique scenario (e.g. path invalid, file not found, insufficient rights, ...) will throw a different kind of exception. We could differentiate between those exceptions by defining more catch-blocks:

byte[] content = null;

try
{
    content = File.ReadAllBytes(/* ... */);
}
catch (PathTooLongException)
{
    //React if the path is too long
}
catch (FileNotFoundException)
{
    //React if the file has not been found
}
catch (UnauthorizedAccessException)
{
    //React if we have insufficient rights
}
catch (IOException)
{
    //React to a general IO exception
}
catch (Exception)
{
    //React to any exception that is not yet handled
}

There should be two lessons from this example.

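We can specify multiple catch-blocks, each with its own handling. The only limitation is that we have to specify them in such an order that the most specific exception is caught first, while the most general one is caught last. The compiler reports an error if a catch-block for a base type precedes one for a derived type.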
We do not need to name the variable of the exception. If we name it we will get access to the Exception object, but sometimes we do not care about the specific object. Instead we just want to differentiate between the various exceptions.
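Both lessons can be tried out without touching the file system. The following sketch uses int.Parse, which throws a FormatException for text that is not a number and an OverflowException for values that do not fit into an int. The class NumberReader and the returned texts are assumptions made for this illustration.

```csharp
using System;

class NumberReader
{
    public static string Describe(string text)
    {
        try
        {
            int number = int.Parse(text);
            return "Parsed " + number;
        }
        catch (FormatException)
        {
            //Most specific handlers first, no variable name needed
            return "Not a number";
        }
        catch (OverflowException)
        {
            return "Too large";
        }
        catch (Exception ex)
        {
            //Most general handler last, here we use the exception object
            return "Unexpected: " + ex.Message;
        }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(NumberReader.Describe("42"));          //Prints Parsed 42
        Console.WriteLine(NumberReader.Describe("abc"));         //Prints Not a number
        Console.WriteLine(NumberReader.Describe("99999999999")); //Prints Too large
    }
}
```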

Now that we can catch those nasty exceptions, we may want to throw exceptions ourselves. This is done by using the throw keyword. Let's see some sample code:

void MyBuggyMethod()
{
    Console.WriteLine("Entering my method");
    throw new Exception("This is my exception");
    Console.WriteLine("Leaving my method");
}

If we call this method we will see that the second WriteLine method will not be called. Once an exception is thrown the method is left immediately. This continues up the chain of callers until a try-catch-block that wraps one of the calls is found. If no such block is found then the application will crash. This behavior is called bubbling. Alternatively we could have also written our own class that derives from Exception:

class MyException : Exception
{
    //Passes the message on to the constructor of Exception
    public MyException()
        : base("This is my exception")
    {
    }
}

Now our code above could have been changed to become the following:

void MyBuggyMethod()
{
    Console.WriteLine("Entering my method");
    throw new MyException();
    Console.WriteLine("Leaving my method");
}
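The bubbling described above can be sketched with two nested method calls. The methods Inner and Outer are hypothetical, and MyException is repeated here so that the example is complete on its own.

```csharp
using System;

class MyException : Exception
{
    public MyException()
        : base("This is my exception")
    {
    }
}

class Program
{
    static void Inner()
    {
        Console.WriteLine("Entering Inner");
        throw new MyException();
    }

    static void Outer()
    {
        Inner();
        //Never reached, because Inner is left by an exception
        Console.WriteLine("Leaving Outer");
    }

    static void Main()
    {
        try
        {
            Outer();
        }
        catch (MyException ex)
        {
            //The exception passed through Outer before arriving here
            Console.WriteLine("Caught: {0}", ex.Message);
        }
        //Prints:
        //Entering Inner
        //Caught: This is my exception
    }
}
```

Neither Inner nor Outer contains a try-catch-block, so the exception travels up to Main, where the first matching catch-block is found.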

Let's come back once more to our example that plays around with the file system. In this scenario we might end up with an open file handle. Therefore, whether we get an exception or not, we want to close that handle to clean up the open resources. In this scenario another block would be very helpful: a block that performs a final action that does not depend on the actions in the try or any catch block. Of course such a block exists and is called a finally-block.

FileStream fs = null;

try
{
    fs = new FileStream("Path to the file", FileMode.Open);
    /* ... */
}
catch(Exception)
{
    Console.WriteLine("An exception occurred.");
    return;
}
finally
{
    if(fs != null)
        fs.Close();
}

Here we should note that return in one block will still call the code in the finally-block. So in total we have the option of using a try-catch, a try-catch-finally or a try-finally block. The last one will not catch the exception (i.e. the exception will bubble up), but still invoke the code that is given in the finally-block (no matter what happens in the try-block).
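The try-finally variant can be sketched without any file handling. In the following example the hypothetical method Divide does not catch anything itself, yet its finally-block runs both after a normal return and while an exception bubbles up to the caller.

```csharp
using System;

class Calculator
{
    public static int Divide(int a, int b)
    {
        try
        {
            return a / b; //Throws a DivideByZeroException if b is 0
        }
        finally
        {
            //Runs after the return and also when an exception bubbles up
            Console.WriteLine("Leaving Divide");
        }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(Calculator.Divide(6, 3));

        try
        {
            Calculator.Divide(1, 0);
        }
        catch (DivideByZeroException)
        {
            Console.WriteLine("Division by zero");
        }
        //Prints:
        //Leaving Divide
        //2
        //Leaving Divide
        //Division by zero
    }
}
```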

Outlook

In the next tutorial we will learn about more advanced features in C# and extend our knowledge in object-oriented programming. With our knowledge in C# improving, we are ready to dive more into the .NET-Framework.
