Dictionary Index Lookup Vs Contains Key Vs List Contains Vs Linq… Speed Test/Texas Tornado Match

Ok so with MVC comes the use of Routes which calls in the need to compare request values to see which route to use. Now before I even bother with that headache (Although it’s getting better AND will be a post) I ran into a situation where I would have to check a passed in string against a list of strings to see if it matches any of them.

One thing I like to do is use Dictionaries. They are just plain convenient when it comes to looking things up or matching values to get methods. But what if I don’t really have a value to find with a key? What if finding the key is all that matters? Say I have a list of strings and I just want to know if the list contains that string, sounds like a job for an array or list right? Wouldn’t it be silly to create a dictionary like:

  Dictiontary<String, String> someList = new Dictiontary<String, String>();
  someList.Add("INeedThis", ""); someList.Add("ThisToo", "");

and do this:

  if(someList.ContainsKey("INeedThis"))

If I don’t actually care about the attached value? I’m sure I’m breaking a rule somewhere… but what if it was faster overall? What if ContainsKey is faster than a list using Any, Contains, FirstOrDefault, or where? Turns out it is. Here’s the method I used.

  public void TimeRun(Holder toHold)
  {
    Int32 maxLength = 1000;

    Dictionary<String, String> stringDictionary = Enumerable.Range(0, maxLength).Select(item => RandomTool.RandomString(item)).ToDictionary(item => item, item => item);
    List<String> stringList = stringDictionary.Select(item => item.Key).ToList();

    String chosenString = stringList[RandomTool.RandomInt32(0, maxLength)];

    Stopwatch runTime = new Stopwatch();

    runTime.Start();
    stringDictionary.ContainsKey(chosenString);
    runTime.Stop();
    toHold.DictionaryContainsKeyTime = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    String junk = stringDictionary[chosenString];
    runTime.Stop();
    toHold.DictionaryStraightIndexCheck = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    Boolean junkThree = stringList.Contains(chosenString);
    runTime.Stop();
    toHold.ListContains = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    Boolean junkTwo = stringList.Any(item => item == chosenString);
    runTime.Stop();
    toHold.ListLinqAny = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    String junkFour = stringList.First(item => item == chosenString);
    runTime.Stop();
    toHold.ListLinqFirst = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    IEnumerable<String> junkFive = stringList.Where(item => item == chosenString);
    if (junkFive.FirstOrDefault() != String.Empty)
    {

    }
    runTime.Stop();
    toHold.ListLinqWhere = runTime.ElapsedTicks;
    runTime.Reset();
  }

Crazy simple, and why shouldn’t it? Am I right? Am I right? Ok. As you can see, I gave all the methods a run and timed them using StopWatch. And then I ran it a given amount of times, 200 in this code but I tried up to 10000 also. (I’ll put the test code at the end) The test was to go through a list of a thousand strings, each string increasing in length. (Yeah I could have done random size strings but I’m lazy)

What did I find out? Well if it didn’t throw an exception, a straight index search on a dictionary is fastest:

someList["INeedThis"]

And pretty consistently fast. Around 2600 ticks or so on average on multiple runs. (so 10 iterations of parent method running 200-10000 interations of the test method) Next fastest was the ContainsKey method on the dictionary, usually around 2-4 times faster than the next in line good old List.Contains. What I did find surprising is that all the Linq methods failed on this one. I figured that once the first run was through, it would be at least as fast as Contains. (Linq always sucks the first time through) Yeah not so much though. Contains was always faster. Sometimes it was close. Sometimes not even. Here are some example runs:

Dictionary_ContainsKey: 15805
Dictionary_StraightIndexCheck: 2926
List_Contains: 34559
List_LinqAny: 96575
List_LinqFirst: 56541
List_LinqWhere: 64678 

Dictionary_ContainsKey: 7264
Dictionary_StraightIndexCheck: 2676
List_Contains: 29970
List_LinqAny: 41280
List_LinqFirst: 58313
List_LinqWhere: 45669 

Dictionary_ContainsKey: 6773
Dictionary_StraightIndexCheck: 2636
List_Contains: 32366
List_LinqAny: 38670
List_LinqFirst: 33859
List_LinqWhere: 41288

All in ticks. Now mind you, none of these are horribly slow so it probably just comes down to reability and ease of understanding. Personally I like the Dictionary way, so at least speed wise I’m on track. As for looks? That’s a personal thing.

Rest of the code. Here is the parent method. This is a unit test hense the .Assert but it could easily be adapted to any output.

  [TestMethod]
  public void RunTime()
  {
    Int64 overallDictionaryContainsKeyTime = 0;
    Int64 overallDictionaryStraightIndexCheck = 0;
    Int64 overallListContains = 0;
    Int64 overallListLinqAny = 0;
    Int64 overallListLinqFirst = 0;
    Int64 overallListLinqWhere = 0;
    Int32 loopMax = 200;

    for (Int32 loopCounter = 0; loopCounter < loopMax; loopCounter++)
    {
      Holder currentHolder = new Holder();

      TimeRun(currentHolder);
      overallDictionaryContainsKeyTime += currentHolder.DictionaryContainsKeyTime;
      overallDictionaryStraightIndexCheck += currentHolder.DictionaryStraightIndexCheck;
      overallListContains += currentHolder.ListContains;
      overallListLinqAny += currentHolder.ListLinqAny;
      overallListLinqFirst += currentHolder.ListLinqFirst;
      overallListLinqWhere += currentHolder.ListLinqWhere;
    }

    Assert.IsTrue
    (
      false,
      " Dictionary_ContainsKey: " + (overallDictionaryContainsKeyTime / loopMax) +
      " Dictionary_StraightIndexCheck: " + (overallDictionaryStraightIndexCheck / loopMax) +
      " List_Contains: " + (overallListContains / loopMax) +
      " List_LinqAny: " + (overallListLinqAny / loopMax) +
      " List_LinqFirst: " + (overallListLinqFirst / loopMax) +
      " List_LinqWhere: " + (overallListLinqWhere / loopMax)
    );
  }

And the holder class which is a nothing class. I just didn’t care for having to add parameters to the child mehod.

  public class Holder
  {
    public Int64DictionaryContainsKeyTime { get; set; }
    public Int64DictionaryStraightIndexCheck { get; set; }
    public Int64ListLinqAny { get; set; }
    public Int64ListContains { get; set; }
    public Int64ListLinqFirst { get; set; }
    public Int64ListLinqWhere { get; set; }
  }

Couple Notes:

  • StopWatch is in System.Diagnostics
  • RandomTool is actual a class of mine. Nothing special about it. Just makes a string of X length with all random letters.
  • This can not be rebroadcast or retransmitted without the express written permission of my mom.