Everyone loves strings. You use them all the time when writing code
(especially code that needs to interact with the user). And C# makes it
really easy to work with strings: the useful operators just work (like
'+' and '+=') and every object has the extremely useful ToString method.
In fact, strings are so easy to work with in C#, one might almost say
that they are too easy to work with.
Why would someone say that? Well, because of some of the underlying
performance characteristics of the String class. Did you know that
every time you manipulate a string in C#, you get a new string object
back? This isn't that bad when dealing with short strings, or a small
number of operations per string, but when you start concatenating a
bunch of strings together it can get out of hand quickly. For instance,
take a look at the following code:

string[] myKeywordArray = new string[100];
//
// myKeywordArray gets populated
//
string userOutput = "Keywords: ";
for (int i = 0; i < myKeywordArray.Length - 1; i++)
    userOutput += myKeywordArray[i] + ", ";
userOutput += myKeywordArray[myKeywordArray.Length - 1];
//
// userOutput is given to the user
//

There are a large number of string concatenations going on here (about
200) - and every time that a concatenation occurs, a new string is
allocated and the data is copied. While a loop like this won't kill a
program (in fact, a loop like this will probably complete in no time at
all), you can see how easy it is to start racking up the string
allocations.
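
To see the "new string object" behavior directly, here is a minimal sketch. The class and method names (ImmutabilityDemo, AppendCreatesNewObject) are made up for illustration, and the check assumes a non-empty suffix, since appending an empty string can hand back the same object.

```csharp
using System;

// Hypothetical demo class; not part of the original example.
public static class ImmutabilityDemo
{
    // Returns true when "+=" produced a different object than the one we
    // started with, and the starting string itself was left unchanged.
    // Assumes 'suffix' is non-empty.
    public static bool AppendCreatesNewObject(string start, string suffix)
    {
        string original = start;   // second reference to the same string object
        string current = start;
        current += suffix;         // builds a new string; 'original' is untouched

        return !ReferenceEquals(original, current) && original == start;
    }
}
```

Calling ImmutabilityDemo.AppendCreatesNewObject("Keywords: ", "alpha") returns true: the variable now points at a freshly allocated string, and the old one is left for the garbage collector.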
But never fear! StringBuilder is here! The System.Text.StringBuilder
class is exactly what you want to use in this type of situation. When
you have a string that you are going to be doing a bunch of
concatenation operations on (or other operations like insert, replace or
remove), you want to use a StringBuilder instance. The StringBuilder
class is optimized for these sorts of operations: it does not allocate a
new string and copy all of the data on every operation the way the
standard String class does. Instead it keeps a buffer, and when that
buffer is used up (because of data being appended to the string), it
allocates some more buffer space.
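
As a rough sketch of the other operations mentioned above (insert, replace and remove), here is what editing a single StringBuilder in place might look like. The class name BuilderEditDemo, the sample text and the starting capacity of 64 are all assumptions chosen for illustration.

```csharp
using System.Text;

// Hypothetical demo class; the names here are illustrative only.
public static class BuilderEditDemo
{
    public static string EditInPlace()
    {
        // The second argument is an optional starting capacity; it just
        // reserves buffer space up front.
        StringBuilder sb = new StringBuilder("Keywords: alpha, beta", 64);

        sb.Insert(0, "> ");            // "> Keywords: alpha, beta"
        sb.Replace("beta", "gamma");   // "> Keywords: alpha, gamma"
        sb.Remove(0, 2);               // "Keywords: alpha, gamma"
        sb.Append('.');                // "Keywords: alpha, gamma."

        // Only this call produces the final string object.
        return sb.ToString();
    }
}
```

All four edits work on the same builder instance, so no intermediate strings are handed back along the way.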
Let's take a look at what the above code would look like using a
StringBuilder:

// requires "using System.Text;" at the top of the file
string[] myKeywordArray = new string[100];
//
// myKeywordArray gets populated
//
// start with the same "Keywords: " prefix as the string version
StringBuilder outputBuilder = new StringBuilder("Keywords: ");
for (int i = 0; i < myKeywordArray.Length - 1; i++)
{
    outputBuilder.Append(myKeywordArray[i]);
    outputBuilder.Append(", ");
}
outputBuilder.Append(myKeywordArray[myKeywordArray.Length - 1]);
string userOutput = outputBuilder.ToString();
//
// userOutput is given to the user
//

Nothing really surprising here. Instead of concatenating strings
together, we are calling Append on the StringBuilder. Once everything is
done, we tell the outputBuilder to hand us a string by calling ToString.
The StringBuilder class even has an AppendFormat call, which acts just
like the String.Format call. So you could potentially replace these two
lines:

outputBuilder.Append(myKeywordArray[i]);
outputBuilder.Append(", ");

With a single line like this:

outputBuilder.AppendFormat("{0}, ", myKeywordArray[i]);
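
If you want to reuse the loop above, one way to package it is as a small helper. This is only a sketch: the KeywordFormatter class and BuildKeywordLine method are hypothetical names, and it assumes an empty array should produce just the "Keywords: " prefix. It writes the separator before every item except the first, so it does not need to treat the last element specially.

```csharp
using System.Text;

// Hypothetical helper; not part of the original example.
public static class KeywordFormatter
{
    public static string BuildKeywordLine(string[] keywords)
    {
        StringBuilder outputBuilder = new StringBuilder("Keywords: ");

        for (int i = 0; i < keywords.Length; i++)
        {
            // Separator goes in front of every keyword except the first.
            if (i > 0)
                outputBuilder.Append(", ");

            outputBuilder.Append(keywords[i]);
        }

        return outputBuilder.ToString();
    }
}
```

For example, KeywordFormatter.BuildKeywordLine(new[] { "alpha", "beta", "gamma" }) returns "Keywords: alpha, beta, gamma", and an empty array returns "Keywords: " instead of throwing.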
So now you are probably wondering how much faster it is to use a
StringBuilder instance instead of a String. Guess what? You don't
want to use a StringBuilder all the time. In some instances, it will be
faster to use a regular string. It all depends on what you are doing
with the string. Here is some code I put together to do some timing
tests:

// requires "using System;", "using System.Diagnostics;" and "using System.Text;"
int maxConcatenations = 15;
int timesToTry = 1000000;
Stopwatch sw;
for (int concat = 0; concat < maxConcatenations; concat++)
{
    Console.WriteLine("Testing for {0} concatenations....", concat);

    // time plain string concatenation
    sw = Stopwatch.StartNew();
    for (int i = 0; i < timesToTry; i++)
    {
        string test = "";
        for (int j = 0; j < concat; j++)
            test += "1";
    }
    sw.Stop();
    Console.WriteLine("For String: {0} milliseconds",
        sw.Elapsed.TotalMilliseconds);

    // time the same number of appends on a StringBuilder
    sw = Stopwatch.StartNew();
    for (int i = 0; i < timesToTry; i++)
    {
        StringBuilder foo = new StringBuilder();
        for (int j = 0; j < concat; j++)
            foo.Append("1");
    }
    sw.Stop();
    Console.WriteLine("For StringBuilder: {0} milliseconds",
        sw.Elapsed.TotalMilliseconds);
    Console.WriteLine();
}

What this code does is test how long it takes to concatenate a string
using both a standard String and a StringBuilder. It initially tries
with no concatenations (and tries this a million times), and reports the
total time for a String and for a StringBuilder. Then it tries this
again with a single concatenation, and then with two, all the way up to
14. Let's take a look at the output:

Testing for 0 concatenations....
For String: 3.5055 milliseconds
For StringBuilder: 53.3664 milliseconds
Testing for 1 concatenations....
For String: 8.8345 milliseconds
For StringBuilder: 72.5983 milliseconds
Testing for 2 concatenations....
For String: 45.1594 milliseconds
For StringBuilder: 96.4409 milliseconds
Testing for 3 concatenations....
For String: 82.5788 milliseconds
For StringBuilder: 119.0281 milliseconds
Testing for 4 concatenations....
For String: 119.7253 milliseconds
For StringBuilder: 145.4168 milliseconds
Testing for 5 concatenations....
For String: 158.5999 milliseconds
For StringBuilder: 167.7985 milliseconds
Testing for 6 concatenations....
For String: 244.0745 milliseconds
For StringBuilder: 190.5011 milliseconds
Testing for 7 concatenations....
For String: 295.2766 milliseconds
For StringBuilder: 214.9981 milliseconds
Testing for 8 concatenations....
For String: 357.8609 milliseconds
For StringBuilder: 237.7211 milliseconds
Testing for 9 concatenations....
For String: 395.8825 milliseconds
For StringBuilder: 261.7619 milliseconds
Testing for 10 concatenations....
For String: 473.3714 milliseconds
For StringBuilder: 285.9038 milliseconds
Testing for 11 concatenations....
For String: 534.1772 milliseconds
For StringBuilder: 308.9129 milliseconds
Testing for 12 concatenations....
For String: 621.7295 milliseconds
For StringBuilder: 333.1688 milliseconds
Testing for 13 concatenations....
For String: 685.6693 milliseconds
For StringBuilder: 355.3806 milliseconds
Testing for 14 concatenations....
For String: 757.2249 milliseconds
For StringBuilder: 379.9797 milliseconds

So it looks like 6 concatenations is the break-even point. At that point
it takes longer with a standard string than with a StringBuilder. By 14
concatenations, the StringBuilder is about twice as fast. So that's a
pretty decent rule of thumb: if you are doing more than 6 concatenation
operations on a string, you should probably be using a StringBuilder.
And these performance gains just continue to grow the more operations
you perform. Here are the numbers for 200 concatenations:

Testing for 200 concatenations....
For String: 23780.9262 milliseconds
For StringBuilder: 5608.5013 milliseconds

Now it's up to a 4x improvement!
That's it for this introduction to the StringBuilder class. And even if
you already knew about the class, I hope you found the performance
numbers fun and interesting!