Review of VB.NET Data Storage Structures
Part eight of an eight-part series of blogs

There is a bewildering array (excuse the pun) of data storage structures available to you in Visual Basic. Choose from arrays, ArrayLists, SortedLists, Dictionaries, HashTables, Lists and DataTables, among others. This series gives an example of each type of structure, and benchmarks them to show which perform best and worst.

  1. VB.NET Data Storage Types Compared and Benchmarked
  2. VB.NET Benchmarking Test of speeds of data structures
  3. Arrays - Visual Basic data structures
  4. ArrayLists and SortedLists - Visual Basic data structures
  5. Dictionaries and HashTables - Visual Basic data structures
  6. Lists - Visual Basic data structures
  7. Using data tables - Visual Basic data structures
  8. Benchmark Results and Recommendations (this blog)

Posted by Andy Brown on 24 August 2011


Benchmark Results and Recommendations

Let's start by recording the test results for 10, 100, 1,000, 10,000 and 100,000 test values.

For anything more than 100,000 test values, you probably shouldn't be storing the data in memory anyway (and if you are, you can probably pick up on the trends from the results below and draw your own conclusions).

Results with 10 Values

Here are the results of running the tests for 10 items:

[Image: The results of running the tests for 10 values]

Results with 100 Values

Here are the results of running the tests for 100 items:

[Image: The results of running the tests for 100 values]

Results with 1,000 Values

Here are the results of running the tests for 1,000 items:

[Image: The results of running the tests for 1,000 values]

Results with 10,000 Values

Here are the results of running the tests for 10,000 items:

[Image: The results of running the tests for 10,000 values]

Results with 100,000 Values

Here are the results of running the tests for 100,000 items:

[Image: The results of running the tests for 100,000 values]

My Conclusions

From the data above, three obvious conclusions stand out to me:

  • Don't use dynamic arrays for large data sets. Dynamic arrays perform fine for small sets of data, but for anything more than about 1,000 items the cost of writing to them becomes high. I suspect this is because whenever you expand the array by one element, internally the code creates a new, slightly larger array and copies the old elements into it (the sketch after this list illustrates the pattern).
  • Data tables are slow. The test is slightly unfair to data tables, but nevertheless they are clearly the slowest way to write, read and sort name/value pairs.
  • Otherwise, it really doesn't matter! I was surprised by this result: provided that you avoid large dynamic arrays and data tables, it really doesn't matter much which data structure you use!
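To see why the first point holds, here's a minimal sketch (the module name and loop sizes are mine for illustration, not from the benchmark code) comparing one-element-at-a-time array growth with a List(Of T):

```vb
Imports System.Collections.Generic

Module ArrayGrowthDemo
    Sub Main()
        ' Growing a dynamic array one element at a time: each
        ' ReDim Preserve allocates a brand-new array and copies every
        ' existing element into it, so filling n items performs
        ' roughly n * n / 2 element copies in total.
        Dim values() As Integer = {}
        For i As Integer = 0 To 9999
            ReDim Preserve values(i)  ' new array + full copy, every pass
            values(i) = i
        Next

        ' A List(Of T) doubles its internal buffer when it runs out of
        ' room, so the same fill only reallocates a handful of times
        ' (amortised O(n) work overall).
        Dim fastValues As New List(Of Integer)
        For i As Integer = 0 To 9999
            fastValues.Add(i)
        Next
    End Sub
End Module
```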

Recommendations

Given the above, I'm going to be using:

  • hash tables or dictionaries when I want to store data, and retrieve it by name only; and
  • lists otherwise (a quick sketch of both follows below).
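For example, here's a minimal sketch of both recommendations (the names and values are made up for illustration):

```vb
Imports System
Imports System.Collections.Generic

Module RecommendationDemo
    Sub Main()
        ' Dictionary: best when you store values and retrieve them by name.
        Dim ages As New Dictionary(Of String, Integer)
        ages("Andy") = 47
        ages("Bob") = 32
        Console.WriteLine(ages("Andy"))  ' prints 47

        ' List: best for ordered data you loop over, index into or sort.
        Dim names As New List(Of String) From {"Bob", "Andy", "Carol"}
        names.Sort()  ' Andy, Bob, Carol
        For Each name As String In names
            Console.WriteLine(name)
        Next
    End Sub
End Module
```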

If you've read this far, I'm sorry the conclusions weren't more dramatic!

Or maybe I'll just carry on using what I already know - the speed differences above may look big, but they're measured in ticks. With 10,000 ticks in a millisecond, and 1,000 milliseconds in a second, the difference between most times is under a tenth of a second.
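To put that in perspective, here's the arithmetic as a quick sketch (assuming the standard .NET tick, the unit TimeSpan uses):

```vb
Imports System

Module TickDemo
    Sub Main()
        ' TimeSpan.TicksPerMillisecond is 10,000, so even a difference
        ' of 500,000 ticks between two structures is only 50 ms.
        Dim tickDifference As Long = 500000
        Dim milliseconds As Double = tickDifference / TimeSpan.TicksPerMillisecond
        Console.WriteLine("{0:N0} ticks = {1} ms", tickDifference, milliseconds)
    End Sub
End Module
```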
