C# Fast Collections : Performance Tips


If we want to store a collection of values there are many possibilities in the .net framework.
we have

  • collection classes in System.collections
  • Generic collections in system.collections.generic
  • Regular Typed arrays

so which approach is the fastest… lets find out

Below is the standard setup I have used to measure the performance


const int IterateMilliontimes = 1000000;
public static long MeasureArrayList() {
ArrayList list = new ArrayList();
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int i = 0; i < IterateMilliontimes; i++) {
list.Add(i);
}
stopwatch.Stop();
return stopwatch.ElapsedMilliseconds;
}

public static long MeasureGenericList() {
List < int > list = new List < int > ();
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int i = 0; i < IterateMilliontimes; i++) {
list.Add(i);
}
stopwatch.Stop();
return stopwatch.ElapsedMilliseconds;
}
public static long MeasureNativeIntegerArray() {
int[] list = new int[numElements];
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int i = 0; i < IterateMilliontimes; i++) {
list[i] = i;
}
stopwatch.Stop();
return stopwatch.ElapsedMilliseconds;
}

and then of course the main()

long ArrayListDuration = MeasureArrayList();
long GenericListDuration = MeasureGenericList();
long NatIntArrayDuration = MeasureNativeIntegerArray();
Console.WriteLine("ArrayList: {0}", ArrayListDuration);
Console.WriteLine("List<int>: {0}", GenericListDuration);
Console.WriteLine("int[]: {0}", NatIntArrayDuration);

so now If I were to measure the performance for arraylist vs genericlist the results are

ArrayList:  254 ms
GenericList: 42 ms

arraylist

The image above depicts the memory layout for Arraylist after 3 iterations. This is because many collections in the system.collection namespace (Arraylist) uses object arrays for internal storage. So each element in the Arraylist is referenced to a boxed integer which is else where relived. Boxing and unboxing introduces  a significant performance overload to your code.

Compare this to the memory layout of a generic list

genericlist

The list is on the heap, but notice that  the integers are also stored directly inside the list array, and this is because a generic list of type integer uses native integer array for internal storage and not an object array.  This eliminates all the object references from the list elements and also removes the need for boxing.

So the main take away is

  • Always use the generic collection classes from the system.collections.generic namespace for mission critical code.
  • Avoid the classes in the system.collection namespace in mission critical code

 

But wait we are not done here….If we take a look at the code that’s highlighted in Orange… which is essentially  a native integer array pre-initialized to 1 million elements and assigning values one at a time.

Lets  compare the performance with the other two we ran earlier

Arraylist : 257 ms
List<int> : 52 ms
int[] : 11 ms

The native array only took only 11 ms to run!!!  The reason for this excellent performance is because the intermediate language that the c# compiler compiles to has native support for arrays.

I am sure at least a few of you must have a nagging question regarding the performance results, after all the array list of the generic list classes uses an internal array that has to continuously resize the array as it overflows its bounds, this compared to a pre-initialized array of native integers..

So lets run the test by pre-initializing all the arrays  and the results are……

wait for it…… !!!!

Arraylist : 187 ms
List<int> : 31 ms (40% faster than the previous run)
int[] : 12 ms

The built in support for arrays in the intermediate language will always ensure that they will outperform collection classes. The generic list and arrays simply cannot  compete with the raw performance.

performance
Collection Performance Results

The image above gives a representation of the collection performance results. The array list is the most expensive due to the boxing and unboxing of integer data. Presizing the array yields better results but it is not comparable to the array of native integers.

The generic list is a lot faster due to the fact that it can store the integers directly in the list element in the heap without needing an extra boxed reference. Pre-sizing yeilds even better results

So when should you use arrays?

  • If you have mission critical code, and the number of elements is known in advance, use an array
  • If you have mission critical code, and the number of elements is not known in advance, use a generic list
  • Avoid the classes in system.collections. They use boxing and are superseded by the generic classes in system.collections.generic

 

6 thoughts on “C# Fast Collections : Performance Tips

Add yours

  1. I’m crazy jealous of your nerdspeak. (Btw that’s a total compliment and I’m genuinely seriously jealous.) This post made me think of the penultimate scene in the last episode / first season of Silicon Valley. I won’t go into details but it cracked me up to think about it. Very clever post. Again I’m super green with envy but in a total hats off to you kind of way. 😉

    Liked by 2 people

Leave a comment

Website Powered by WordPress.com.

Up ↑