
C# Snippet Tutorial - Using StringBuilder [Beginner]


Everyone loves strings. You use them all the time when writing code, especially code that needs to interact with the user. And C# makes it really easy to work with strings: the useful operators (like '+' and '+=') just work, and every object has the extremely useful ToString method. In fact, strings are so easy to work with in C# that one might almost say they are too easy to work with.

Why would someone say that? Well, because of some of the underlying performance characteristics of the String class. Did you know that strings in C# are immutable, so every time you manipulate one you get a new string object back? This isn't that bad when dealing with short strings, or a small number of operations per string, but when you start concatenating a bunch of strings together it can get out of hand quickly. For instance, take a look at the following code:
string[] myKeywordArray = new string[100];

//
//myKeywordArray gets populated
//

string userOutput = "Keywords: ";
for (int i = 0; i < myKeywordArray.Length - 1; i++)
  userOutput += myKeywordArray[i] + ", ";
userOutput += myKeywordArray[myKeywordArray.Length - 1];

//
//userOutput is given to user
//
 
There are a large number of string concatenations going on here (about 200) - and every time that a concatenation occurs, a new string is allocated and the data is copied. While a loop like this won't kill a program (in fact, a loop like this will probably complete in no time at all), you can see how easy it is to start racking up the string allocations.
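To see the "new string every time" behavior directly, here is a minimal sketch. The variable names are made up for illustration. It keeps a second reference to the original string and checks whether '+=' changed that object or produced a different one.

```csharp
using System;

string original = "Keywords: ";
string alias = original;          // second reference to the same string object

original += "csharp";             // builds a brand new string and reassigns 'original'

Console.WriteLine(alias);                                   // Keywords: 
Console.WriteLine(original);                                // Keywords: csharp
Console.WriteLine(ReferenceEquals(original, alias));        // False
```

The old string is untouched; the variable simply points at a freshly allocated one, which is why a long run of concatenations leaves a trail of temporary strings behind.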

But never fear! StringBuilder is here! The System.Text.StringBuilder class is exactly what you want to use in this type of situation. When you have a string that you are going to be doing a bunch of concatenation operations on (or other operations like insert, replace, or remove), you want to use a StringBuilder instance. The StringBuilder class is optimized for these sorts of operations: it does not allocate a new string and copy all of the data on every operation the way the standard String class does. Instead, it keeps a buffer, and only when that buffer is used up (because of data being appended to the string) does it allocate some more buffer space.

Let's take a look at what the above code would look like using a StringBuilder:
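As a rough sketch of those in-place operations, the snippet below appends, inserts, replaces, and removes on a single StringBuilder. The sample text and the starting capacity of 64 are arbitrary choices for illustration, not recommendations.

```csharp
using System;
using System.Text;

// Ask for an initial buffer of 64 characters up front (an arbitrary choice here).
var sb = new StringBuilder(capacity: 64);

sb.Append("Hello world");      // "Hello world"
sb.Insert(0, ">> ");           // ">> Hello world"
sb.Replace("world", "C#");     // ">> Hello C#"
sb.Remove(0, 3);               // "Hello C#"

string result = sb.ToString();
Console.WriteLine(result);                     // Hello C#
Console.WriteLine(sb.Capacity >= sb.Length);   // True
```

All four calls work against the same internal buffer, and a string is only produced at the end by ToString. If you have a rough idea of the final length, passing a capacity to the constructor can save the builder from growing its buffer along the way.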
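using System.Text; // StringBuilder lives in this namespace

string[] myKeywordArray = new string[100];

//
//myKeywordArray gets populated
//

// Start with the same "Keywords: " prefix as the string version
StringBuilder outputBuilder = new StringBuilder("Keywords: ");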
for (int i = 0; i < myKeywordArray.Length - 1; i++)
{
  outputBuilder.Append(myKeywordArray[i]);
  outputBuilder.Append(", ");
}
outputBuilder.Append(myKeywordArray[myKeywordArray.Length - 1]);

string userOutput = outputBuilder.ToString();

//
//userOutput is given to user
//
 
Nothing really surprising here. Instead of concatenating strings together, we are calling Append on the StringBuilder. Once everything is done, we tell the outputBuilder to hand us a string by calling ToString. The StringBuilder class even has an AppendFormat method, which takes the same kind of format string as String.Format. So you could potentially replace these two lines:
outputBuilder.Append(myKeywordArray[i]);
outputBuilder.Append(", ");
 
With a single line like this:
outputBuilder.AppendFormat("{0}, ", myKeywordArray[i]);
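Here is a small runnable sketch comparing the two styles side by side. The three keywords are hypothetical sample data, and the builder names are invented for this example. It builds the same output once with chained Append calls and once with AppendFormat, then checks that the results match.

```csharp
using System;
using System.Text;

string[] keywords = { "alpha", "beta", "gamma" }; // hypothetical sample data

var withAppend = new StringBuilder("Keywords: ");
var withFormat = new StringBuilder("Keywords: ");

for (int i = 0; i < keywords.Length - 1; i++)
{
    // Append returns the builder itself, so calls can be chained.
    withAppend.Append(keywords[i]).Append(", ");

    // Same result, expressed as a format string.
    withFormat.AppendFormat("{0}, ", keywords[i]);
}

// The last keyword is added without a trailing separator.
withAppend.Append(keywords[keywords.Length - 1]);
withFormat.Append(keywords[keywords.Length - 1]);

string appended = withAppend.ToString();
string formatted = withFormat.ToString();

Console.WriteLine(appended);                // Keywords: alpha, beta, gamma
Console.WriteLine(appended == formatted);   // True
```

Which one you pick is mostly a matter of readability. AppendFormat has to parse the format string on each call, so plain Append is usually the safer bet inside a hot loop.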
 
So now you are probably wondering how much faster it is to use a StringBuilder instance instead of a String. Guess what? You don't want to use a StringBuilder all the time. In some instances, it will be faster to use a regular string. It all depends on what you are doing with the string. Here is some code I put together to run some timing tests:
using System;
using System.Diagnostics; // Stopwatch
using System.Text;        // StringBuilder

int maxConcatenations = 15;
int timesToTry = 1000000;
Stopwatch sw;

for (int concat = 0; concat < maxConcatenations; concat++)
{
  Console.WriteLine("Testing for {0} concatenations....", 
    concat);
  sw = Stopwatch.StartNew();
  for (int i = 0; i < timesToTry; i++)
  {
    string test = "";
    for (int j = 0; j < concat; j++)
      test += "1";
  }

  sw.Stop();
  Console.WriteLine("For String:        {0} milliseconds", 
    sw.Elapsed.TotalMilliseconds);

  sw = Stopwatch.StartNew();
  for (int i = 0; i < timesToTry; i++)
  {
    StringBuilder foo = new StringBuilder();
    for (int j = 0; j < concat; j++)
      foo.Append("1");
  }
  sw.Stop();
  Console.WriteLine("For StringBuilder: {0} milliseconds", 
    sw.Elapsed.TotalMilliseconds);

  Console.WriteLine();
}
 
What this code does is test how long it takes to build up a string using both a standard String and a StringBuilder. It initially tries with no concatenations (and tries this a million times), and reports the total time for a String and for a StringBuilder. Then it tries again with a single concatenation, then with two, all the way up to 14. Let's take a look at the output:
Testing for 0 concatenations....
For String:        3.5055 milliseconds
For StringBuilder: 53.3664 milliseconds

Testing for 1 concatenations....
For String:        8.8345 milliseconds
For StringBuilder: 72.5983 milliseconds

Testing for 2 concatenations....
For String:        45.1594 milliseconds
For StringBuilder: 96.4409 milliseconds

Testing for 3 concatenations....
For String:        82.5788 milliseconds
For StringBuilder: 119.0281 milliseconds

Testing for 4 concatenations....
For String:        119.7253 milliseconds
For StringBuilder: 145.4168 milliseconds

Testing for 5 concatenations....
For String:        158.5999 milliseconds
For StringBuilder: 167.7985 milliseconds

Testing for 6 concatenations....
For String:        244.0745 milliseconds
For StringBuilder: 190.5011 milliseconds

Testing for 7 concatenations....
For String:        295.2766 milliseconds
For StringBuilder: 214.9981 milliseconds

Testing for 8 concatenations....
For String:        357.8609 milliseconds
For StringBuilder: 237.7211 milliseconds

Testing for 9 concatenations....
For String:        395.8825 milliseconds
For StringBuilder: 261.7619 milliseconds

Testing for 10 concatenations....
For String:        473.3714 milliseconds
For StringBuilder: 285.9038 milliseconds

Testing for 11 concatenations....
For String:        534.1772 milliseconds
For StringBuilder: 308.9129 milliseconds

Testing for 12 concatenations....
For String:        621.7295 milliseconds
For StringBuilder: 333.1688 milliseconds

Testing for 13 concatenations....
For String:        685.6693 milliseconds
For StringBuilder: 355.3806 milliseconds

Testing for 14 concatenations....
For String:        757.2249 milliseconds
For StringBuilder: 379.9797 milliseconds
 
So it looks like 6 concatenations is the break-even point in this test. From that point on, it takes longer with a standard string than with a StringBuilder. By 14 concatenations, the StringBuilder is about twice as fast. So that's a pretty decent rule of thumb: if you are doing more than 6 concatenation operations on a string, you should probably be using a StringBuilder. And these performance gains just continue to grow the more operations you perform. Here are the numbers for 200 concatenations:
Testing for 200 concatenations....
For String:        23780.9262 milliseconds
For StringBuilder: 5608.5013 milliseconds
 
Now it's up to a 4x improvement!

That's it for this introduction to the StringBuilder class. And even if you already knew about the class, I hope you found the performance numbers fun and interesting!
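If you want to try the 200-concatenation case on your own machine, here is a rough sketch. The TimeMs helper and the constant names are invented for this example, and the repetition count is cut down from the million used above so it finishes quickly. Expect different absolute numbers from the ones shown here. Calling through a delegate also adds a small overhead to both measurements.

```csharp
using System;
using System.Diagnostics;
using System.Text;

const int concatenations = 200;
const int repetitions = 10000; // far fewer than the 1,000,000 used above, to keep the run short

// Hypothetical helper: runs 'body' the given number of times and returns elapsed milliseconds.
static double TimeMs(Action body, int times)
{
    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 0; i < times; i++)
        body();
    sw.Stop();
    return sw.Elapsed.TotalMilliseconds;
}

double stringMs = TimeMs(() =>
{
    string test = "";
    for (int j = 0; j < concatenations; j++)
        test += "1";
}, repetitions);

double builderMs = TimeMs(() =>
{
    StringBuilder foo = new StringBuilder();
    for (int j = 0; j < concatenations; j++)
        foo.Append("1");
}, repetitions);

Console.WriteLine("For String:        {0} milliseconds", stringMs);
Console.WriteLine("For StringBuilder: {0} milliseconds", builderMs);
```

Run it in a Release build, and run it more than once, since the first pass includes JIT warm-up.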
