Create An X Delimited String From A Char List Using Linq Aggregate

A quick example of how to use the Aggregate method to create a string of delimited members, or in this case characters. You might wonder why this example, or at least you should. It’s true, the character list to delimited string is pretty useless, but some idiot from where I work needed it.

[TestMethod]
public void GetStringFromCharacters()
{
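    // Ten copies of 'a'
    var charList = Enumerable.Range(0, 10).Select(x => 'a').ToList();

    charList
        // The "" seed makes inner a string; the comma only goes between items, never after the last one
        .Aggregate("", (inner, outer) => inner.Length == 0 ? outer.ToString() : inner + "," + outer)
        .Should()
        .Be("a,a,a,a,a,a,a,a,a,a");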
}

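The big thing here is the "" passed as the first argument (the seed) to Aggregate. That basically says that inner, the running result, is a string, while outer is each character in turn. If I didn’t have that specified, both inner and outer would be characters, and the lambda would have to return a character too.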
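
For anyone who wants to poke at it outside a test project, here is a minimal sketch of the same idea. The class and method names (DelimitedExample, JoinWithAggregate) are made up for illustration, and string.Join is shown next to it because it gives the same result with less ceremony.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DelimitedExample
{
    // Hypothetical helper: joins characters with a delimiter using Aggregate and a string seed.
    public static string JoinWithAggregate(IEnumerable<char> characters, string delimiter)
    {
        return characters.Aggregate(
            "",                                   // seed: fixes the accumulator type as string
            (soFar, next) => soFar.Length == 0
                ? next.ToString()                 // first item: no leading delimiter
                : soFar + delimiter + next);      // later items: delimiter, then the item
    }

    public static void Main()
    {
        var charList = Enumerable.Repeat('a', 3).ToList();

        Console.WriteLine(JoinWithAggregate(charList, ","));   // a,a,a
        Console.WriteLine(string.Join(",", charList));         // a,a,a
    }
}
```

Aggregate builds a new string on every step, so for big lists string.Join is probably the better pick.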

YAY

Linq and Stack… Take versus Pop

So this might be filed under “Who f-ing cares” but I thought it was somewhat interesting. If you’ve ever used a Stack, you should be familiar with Pop and Peek. If not, here’s a little diddy from a guy named diddy. Actually that’s a lie. I have no affiliation with Sean “Puffy” “Puff Daddy” “P Diddy” “Whatever he’s called now” Combs. We do share the same first name though. (Annnnnd wait for incoming lawsuit over using his name on this site)

A stack is a last in, first out structure that in the .Net world uses two methods to get values back from it: Pop and Peek. Pop will give you the item AND remove it from the stack. Peek will merely give you the item but leave it safely on the stack.

What’s the point of this post? I’ll tell you when I find out.

Now when using Linq with a stack, you might get in trouble if you assume the Take method uses Pop to get the value:

  return stackToUse.Take(count).ToList();

You would think that this would use Pop since Pop really is the “natural” (For lack of a better word) function of a stack. Most languages can guarantee a Push and Pop for stacks, but not all languages have a Peek. So it would be normal to assume the default is Pop. Problem is: It’s not. Take doesn’t remove anything. It walks the stack’s enumerator, which hands the items back top first (the same order Pop would) but leaves them on the stack, so in effect it acts like a Peek that keeps going. So these two methods return the same items but leave the stack in completely different states:

    ///Uses Pop
    ///  Return list will have "count" number of items and stackToUse will have the original
    ///    count of items minus "count"
    public static List<Object> CreateListFromPopOnStack(Stack<Object> stackToUse, Int32 count)
    {
      return Enumerable.Range(0, count).Select(item => stackToUse.Pop()).ToList();
    }

    ///Uses Take (non-destructive, like Peek)
    ///  Return list will have "count" number of items and stackToUse will still have
    ///     the same number of items it came in with.
    public static List<Object> CreateListFromTakeOnStack(Stack<Object> stackToUse, Int32 count)
    {
      return stackToUse.Take(count).ToList();
    }
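
To see the difference without writing a test, here is a small sketch using a Stack of integers. The class name StackTakeExample and the BuildStack helper are made up for this example; Stack, Take, and Pop are the real framework members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StackTakeExample
{
    // Hypothetical helper: 1 is pushed first and 3 last, so 3 sits on top.
    public static Stack<int> BuildStack()
    {
        var stack = new Stack<int>();
        stack.Push(1);
        stack.Push(2);
        stack.Push(3);
        return stack;
    }

    public static void Main()
    {
        // Take enumerates top first and removes nothing.
        Stack<int> first = BuildStack();
        List<int> taken = first.Take(2).ToList();
        Console.WriteLine(string.Join(",", taken) + " left: " + first.Count);    // 3,2 left: 3

        // Pop hands back the same items in the same order, but they are gone afterwards.
        Stack<int> second = BuildStack();
        var popped = new List<int> { second.Pop(), second.Pop() };
        Console.WriteLine(string.Join(",", popped) + " left: " + second.Count);  // 3,2 left: 1
    }
}
```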

In the end, this is a rare case you will actually need to know, but what the hell? Why not know it?

Dictionary Index Lookup Vs Contains Key Vs List Contains Vs Linq… Speed Test/Texas Tornado Match

Ok so with MVC comes the use of Routes which calls in the need to compare request values to see which route to use. Now before I even bother with that headache (Although it’s getting better AND will be a post) I ran into a situation where I would have to check a passed in string against a list of strings to see if it matches any of them.

One thing I like to do is use Dictionaries. They are just plain convenient when it comes to looking things up or matching values to get methods. But what if I don’t really have a value to find with a key? What if finding the key is all that matters? Say I have a list of strings and I just want to know if the list contains that string, sounds like a job for an array or list right? Wouldn’t it be silly to create a dictionary like:

  Dictionary<String, String> someList = new Dictionary<String, String>();
  someList.Add("INeedThis", "");
  someList.Add("ThisToo", "");

and do this:

  if(someList.ContainsKey("INeedThis"))

if I don’t actually care about the attached value? I’m sure I’m breaking a rule somewhere… but what if it was faster overall? What if ContainsKey is faster than a list using Any, Contains, FirstOrDefault, or Where? Turns out it is. Here’s the method I used.

  public void TimeRun(Holder toHold)
  {
    Int32 maxLength = 1000;

    Dictionary<String, String> stringDictionary = Enumerable.Range(0, maxLength).Select(item => RandomTool.RandomString(item)).ToDictionary(item => item, item => item);
    List<String> stringList = stringDictionary.Select(item => item.Key).ToList();

    String chosenString = stringList[RandomTool.RandomInt32(0, maxLength)];

    Stopwatch runTime = new Stopwatch();

    runTime.Start();
    stringDictionary.ContainsKey(chosenString);
    runTime.Stop();
    toHold.DictionaryContainsKeyTime = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    String junk = stringDictionary[chosenString];
    runTime.Stop();
    toHold.DictionaryStraightIndexCheck = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    Boolean junkThree = stringList.Contains(chosenString);
    runTime.Stop();
    toHold.ListContains = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    Boolean junkTwo = stringList.Any(item => item == chosenString);
    runTime.Stop();
    toHold.ListLinqAny = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    String junkFour = stringList.First(item => item == chosenString);
    runTime.Stop();
    toHold.ListLinqFirst = runTime.ElapsedTicks;
    runTime.Reset();

    runTime.Start();
    IEnumerable<String> junkFive = stringList.Where(item => item == chosenString);
    if (junkFive.FirstOrDefault() != String.Empty)
    {

    }
    runTime.Stop();
    toHold.ListLinqWhere = runTime.ElapsedTicks;
    runTime.Reset();
  }

Crazy simple, and why shouldn’t it be? Am I right? Am I right? Ok. As you can see, I gave all the methods a run and timed them using Stopwatch. And then I ran it a given number of times, 200 in this code but I tried up to 10000 also. (I’ll put the test code at the end) The test was to go through a list of a thousand strings, each string increasing in length. (Yeah I could have done random size strings but I’m lazy)

What did I find out? Well if it didn’t throw an exception, a straight index search on a dictionary is fastest:

someList["INeedThis"]

And pretty consistently fast. Around 2600 ticks or so on average on multiple runs. (So 10 iterations of the parent method, each running 200-10000 iterations of the test method.) Next fastest was the ContainsKey method on the dictionary, usually around 2-4 times faster than the next in line, good old List.Contains. What I did find surprising is that all the Linq methods failed on this one. I figured that once the first run was through, they would be at least as fast as Contains. (Linq always sucks the first time through) Yeah not so much though. Contains was always faster. Sometimes it was close. Sometimes not even. Here are some example runs:

Dictionary_ContainsKey: 15805
Dictionary_StraightIndexCheck: 2926
List_Contains: 34559
List_LinqAny: 96575
List_LinqFirst: 56541
List_LinqWhere: 64678 

Dictionary_ContainsKey: 7264
Dictionary_StraightIndexCheck: 2676
List_Contains: 29970
List_LinqAny: 41280
List_LinqFirst: 58313
List_LinqWhere: 45669 

Dictionary_ContainsKey: 6773
Dictionary_StraightIndexCheck: 2636
List_Contains: 32366
List_LinqAny: 38670
List_LinqFirst: 33859
List_LinqWhere: 41288

All in ticks. Now mind you, none of these are horribly slow so it probably just comes down to readability and ease of understanding. Personally I like the Dictionary way, so at least speed wise I’m on track. As for looks? That’s a personal thing.
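
If the empty values in that dictionary bother you, HashSet gives the same kind of hashed lookup on keys alone. It was not part of the timing runs above, so no speed claims here; this is only a sketch of the shape, with the class and method names (KeyOnlyLookupExample, IsNeeded, IndexerThrows) invented for the example. It also shows the "if it didn't throw an exception" catch with the straight index lookup.

```csharp
using System;
using System.Collections.Generic;

public static class KeyOnlyLookupExample
{
    // A set of keys with no dummy values attached.
    private static readonly HashSet<string> Needed =
        new HashSet<string> { "INeedThis", "ThisToo" };

    private static readonly Dictionary<string, string> SomeList =
        new Dictionary<string, string> { { "INeedThis", "" }, { "ThisToo", "" } };

    // Hashed membership check, no value involved.
    public static bool IsNeeded(string candidate)
    {
        return Needed.Contains(candidate);
    }

    // The indexer is only safe when the key exists; a missing key throws.
    public static bool IndexerThrows(string key)
    {
        try
        {
            string junk = SomeList[key];
            return false;
        }
        catch (KeyNotFoundException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsNeeded("INeedThis"));             // True
        Console.WriteLine(IsNeeded("NotThis"));               // False
        Console.WriteLine(SomeList.ContainsKey("NotThis"));   // False
        Console.WriteLine(IndexerThrows("NotThis"));          // True
    }
}
```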

Rest of the code. Here is the parent method. This is a unit test, hence the Assert, but it could easily be adapted to any output.

  [TestMethod]
  public void RunTime()
  {
    Int64 overallDictionaryContainsKeyTime = 0;
    Int64 overallDictionaryStraightIndexCheck = 0;
    Int64 overallListContains = 0;
    Int64 overallListLinqAny = 0;
    Int64 overallListLinqFirst = 0;
    Int64 overallListLinqWhere = 0;
    Int32 loopMax = 200;

    for (Int32 loopCounter = 0; loopCounter < loopMax; loopCounter++)
    {
      Holder currentHolder = new Holder();

      TimeRun(currentHolder);
      overallDictionaryContainsKeyTime += currentHolder.DictionaryContainsKeyTime;
      overallDictionaryStraightIndexCheck += currentHolder.DictionaryStraightIndexCheck;
      overallListContains += currentHolder.ListContains;
      overallListLinqAny += currentHolder.ListLinqAny;
      overallListLinqFirst += currentHolder.ListLinqFirst;
      overallListLinqWhere += currentHolder.ListLinqWhere;
    }

    Assert.IsTrue
    (
      false,
      " Dictionary_ContainsKey: " + (overallDictionaryContainsKeyTime / loopMax) +
      " Dictionary_StraightIndexCheck: " + (overallDictionaryStraightIndexCheck / loopMax) +
      " List_Contains: " + (overallListContains / loopMax) +
      " List_LinqAny: " + (overallListLinqAny / loopMax) +
      " List_LinqFirst: " + (overallListLinqFirst / loopMax) +
      " List_LinqWhere: " + (overallListLinqWhere / loopMax)
    );
  }

And the holder class which is a nothing class. I just didn’t care for having to add parameters to the child method.

  public class Holder
  {
    public Int64 DictionaryContainsKeyTime { get; set; }
    public Int64 DictionaryStraightIndexCheck { get; set; }
    public Int64 ListLinqAny { get; set; }
    public Int64 ListContains { get; set; }
    public Int64 ListLinqFirst { get; set; }
    public Int64 ListLinqWhere { get; set; }
  }

Couple Notes:

  • Stopwatch is in System.Diagnostics
  • RandomTool is actually a class of mine. Nothing special about it. Just makes a string of X length with all random letters.
  • This can not be rebroadcast or retransmitted without the express written permission of my mom.

Paging and the Entity Framework, Skip, and Take Part 3

Ok so the last two posts have been arguably useless, maybe more so than anything else here, but they were somewhat needed because now I am going to show how to page with Linq, the Entity Framework, and well that’s it I think.

public static IList<ToolItem> GetSomeTools(String name, Int32 numberToShow, Int32 pageNumber, out Int32 realPage, out Int32 totalCountOfPages)
{
  //EntityContext.Context is just a singletonish version of the
  //Entities class.  Most people would use
  //  using (ToolEntities context = new ToolEntities())
  Int32 totalCount = EntityContext.Context.ToolItems
                     .Count(item => item.Name == name);
  //This is the method from the first post of this series
  //Just getting the count of pages based on numberToShow and
  //item totalCount
  totalCountOfPages = TotalCountOfPages(totalCount, numberToShow);
  //This is the method from the second post of this series
  //Basically getting the best possible page if the page number
  //is higher than the totalCountOfPages or lower than 0
  realPage = GetRealPage(totalCountOfPages, pageNumber);

  //Same set and same filter as the count above.
  //Linq to Entities only allows Skip on sorted input, hence the OrderBy
  IList<ToolItem> returnValue = EntityContext.Context.ToolItems
                                .Where(item => item.Name == name)
                                .OrderBy(item => item.Name)
                                .Skip(numberToShow * realPage)
                                .Take(numberToShow)
                                .ToList();

  return returnValue;
}

Really simple yes? It follows like this:

Say I’m on page 1, which for this would be zero or pageNumber - 1. So I want to grab the first 20 items from the database. Well that means I want to start at 0 and grab 20. Now instead of some kind of conditional thing that handles the first page one way and the other pages another way, you actually want to skip the same way no matter what the page number is. This is taken care of by numberToShow * realPage since even at 0 this works. After all 0 * anything is 0 and therefore you will be Skipping 0 items. So in other words, you’re at the start. Next you want to Take the amount of items you need, which is 20. Next time around you’ll start at 20, that is Skip(numberToShow (20) * realPage (1)), and take the next 20. Nice thing is, even if you say Take 20 and there are only 2 left, it doesn’t care. It will just grab those two.
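
Here is the same arithmetic as a small in-memory sketch, no database needed. TotalCountOfPages and GetRealPage below are NOT the versions from the earlier posts (those aren't shown here); they are hypothetical stand-ins that assume a zero-based page number and simple clamping. GetPage and PagingExample are made-up names too.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingExample
{
    // Hypothetical stand-in: how many pages totalCount items need.
    public static int TotalCountOfPages(int totalCount, int numberToShow)
    {
        return (totalCount + numberToShow - 1) / numberToShow;   // integer ceiling
    }

    // Hypothetical stand-in: clamps a zero-based page number into the valid range.
    public static int GetRealPage(int totalCountOfPages, int pageNumber)
    {
        if (totalCountOfPages == 0) return 0;
        return Math.Max(0, Math.Min(pageNumber, totalCountOfPages - 1));
    }

    // Same Skip/Take shape as the Entity Framework query, run over a plain sequence.
    public static List<int> GetPage(IEnumerable<int> items, int numberToShow, int pageNumber)
    {
        int totalCount = items.Count();
        int realPage = GetRealPage(TotalCountOfPages(totalCount, numberToShow), pageNumber);

        return items.OrderBy(item => item)
                    .Skip(numberToShow * realPage)   // page 0 skips nothing
                    .Take(numberToShow)              // the last page may come up short
                    .ToList();
    }

    public static void Main()
    {
        var items = Enumerable.Range(1, 42);   // 42 items at 20 per page is 3 pages

        Console.WriteLine(GetPage(items, 20, 0).Count);    // 20
        Console.WriteLine(GetPage(items, 20, 2).Count);    // 2
        Console.WriteLine(GetPage(items, 20, 99).Count);   // 2 (clamped to the last page)
    }
}
```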

And there you have it, how to page with the Entity Framework and minimal amount of work. I know I hate taking other people’s methods (Like the TotalCountOfPages and GetRealPage methods), don’t know why. So sorry if I am forcing you to do so. However, the two methods I gave are semi important to this.

You might wonder why realPage and totalCountOfPages get passed back out. Well, this is useful stuff for knowing what page is next for paging controls. Next post I’ll show those off but I’ll warn you, they are nothing spectacular.

Linq Join Extension Method and How to Use It…

I don’t like using the query syntax when it comes to Linq to AnythingButTheKitchenSink. Not sure why. Mostly, I guess, it’s that I seem to have a liking for Funcs and Actions to the point of stupidity and although you can work them into the query syntax, it just doesn’t look right.

Now with most of the Linq methods like Where or First, it’s simple once you understand lambda expressions:

.SomeMethod(someField => someField.Property == value);

Now what about join?

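public static IEnumerable<TResult> Join<TOuter, TInner, TKey, TResult>(
    this IEnumerable<TOuter> outer,
    IEnumerable<TInner> inner,
    Func<TOuter, TKey> outerKeySelector,
    Func<TInner, TKey> innerKeySelector,
    Func<TOuter, TInner, TResult> resultSelector)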

So inner selector with an outer selector and a selector selects a selecting selector. Right got it.

Well let’s try to break it down. First part is

this IEnumerable<TOuter> outer

Being that this is an extension method, this is the collection you are calling the method on.

IEnumerable<TInner> inner

So the second parameter must be the list you want to join to. Ok so far.

Func<TOuter, TKey> outerKeySelector

Now this is where it gets a little odd looking. We know we have Outer and Inner lists so there needs to be a way to join on something. Say Outer is User and Inner is UserAddress. Most likely you will have a UserID on both lists. If not, you do now. So basically what this part of the method is saying is “Give me the stupid key on the Outer (User) list that I should care about.”

, user => user.UserID,

Next part:

Func<TInner, TKey> innerKeySelector

Pretty much the same thing, except now it needs the key from the Inner list (UserAddress):

, address => address.UserID,

Now for the fun part:

Func<TOuter, TInner, TResult> resultSelector

Sa…say what? Ok this may look weird at first but you’ll hate yourself for not seeing it. It’s just asking you what to select from the two lists as some kind of hybrid object. See, you have to remember that with these linq methods, each method will produce a list. You can’t just chain them together and have it remember every list you’ve made:

   users.Where(user => user.UserID > 1) // gives me a list of users
         .Select(user => new { user.UserName, user.UserAddress, user.UserID });
         //Gives me new items with user name, address, and user id

From this simple method chain, the end list is NOT the same as the one you started with or the one produced by the Where method.

The last part of the Join method needs you to tell it what it’s going to produce from this join. Now it probably could just guess and include both lists, but that could be seen as sloppy and ultimately this gives you the choice of what exactly needs to be taken after the join. So:

, (user, address) => new { user, address});

So in this case, the newly created and joined list will be a list of items that each have a user and an address attached, much like if you had a list of:

class UserAddressHybrid
{
    public User user { get; set; }
    public UserAddress userAddress { get; set; }
}

So in other words, WHAT DO YOU WANT YOUR RESULTS TO LOOK LIKE?

In full it would look something like:

users.Join(addresses,  //IEnumerable<TInner> inner
             user => user.UserID,  //Func<TOuter, TKey> outerKeySelector
             address => address.UserID,  //Func<TInner, TKey> innerKeySelector
             (user, address) => new { user, address});  //Func<TOuter, TInner, TResult> resultSelector

Not so hard anymore, is it? You can start kicking yourself now.
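
If you want something you can actually run, here is a minimal sketch of that Join against two in-memory lists. The User and UserAddress classes are guesses at the shape (the City property and the DescribeUsers method are invented for the example); only the Join call itself mirrors the walkthrough above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes, just enough to join on UserID.
public class User
{
    public int UserID { get; set; }
    public string UserName { get; set; } = "";
}

public class UserAddress
{
    public int UserID { get; set; }
    public string City { get; set; } = "";
}

public static class JoinExample
{
    public static List<string> DescribeUsers(IEnumerable<User> users, IEnumerable<UserAddress> addresses)
    {
        return users.Join(
                addresses,                                  // inner: the list to join to
                user => user.UserID,                        // outerKeySelector
                address => address.UserID,                  // innerKeySelector
                (user, address) => new { user, address })   // resultSelector: the hybrid item
            .Select(pair => pair.user.UserName + " lives in " + pair.address.City)
            .ToList();
    }

    public static void Main()
    {
        var users = new List<User>
        {
            new User { UserID = 1, UserName = "Ann" },
            new User { UserID = 2, UserName = "Bob" },
            new User { UserID = 3, UserName = "Cy" }    // no address, so the join drops this one
        };
        var addresses = new List<UserAddress>
        {
            new UserAddress { UserID = 1, City = "Oslo" },
            new UserAddress { UserID = 2, City = "Lima" }
        };

        foreach (string line in DescribeUsers(users, addresses))
        {
            Console.WriteLine(line);   // Ann lives in Oslo, then Bob lives in Lima
        }
    }
}
```

Join is an inner join, so any user without a matching address simply doesn't show up in the result.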

Use Linq to Split a List: Skip and Take

Say what? Ok this is simple, and probably useless for most people but I thought I’d post it anyhow. Basically, say you have a huge list of something and you need to split it into smaller lists of something. This might be the case if you want to use parameterized SQL or something like HQL to send in a list full of somethings. Problem? SQL Server will only allow so many parameters to be sent in (2,100 per request). Now you could send in a string in some cases, but meh. Kind of sloppy. So what do you do? You come here and you gank this method.

        public static IList<IList<T>> SplitList<T>
          (IList<T> listToSplit, Int32 countToTake)
        {
            IList<IList<T>> splitList = new List<IList<T>>();
            Int32 countToSkip = 0;

            //while instead of do/while so an empty list gives back no chunks
            //rather than one empty chunk
            while (countToSkip < listToSplit.Count)
            {
                splitList.Add(listToSplit.Skip(countToSkip)
                 .Take(countToTake).ToList());
                countToSkip += countToTake;
            }

            return splitList;
        }

Pretty simple. It takes in a list of whatever and gives you back a list of lists of whatever. The fun part is using Skip and Take. Two methods I have come to love.

Basically you start out skipping nothing and taking a set amount… say 2000. Next time through, you start by skipping 2000 and taking the next 2000. Beauty of Take is it won’t just die on you if you don’t have enough items. It’ll just grab what’s left. Yay for take.
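
Here is a self-contained sketch of the method in use. It is a copy of SplitList written with a for loop, plus a guard against a chunk size below 1 (my addition, since a size of 0 would never advance). The class name and the sample numbers are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitListExample
{
    public static IList<IList<T>> SplitList<T>(IList<T> listToSplit, int countToTake)
    {
        // Assumption: a chunk size below 1 is a caller mistake, so fail loudly.
        if (countToTake < 1) throw new ArgumentOutOfRangeException("countToTake");

        IList<IList<T>> splitList = new List<IList<T>>();
        for (int countToSkip = 0; countToSkip < listToSplit.Count; countToSkip += countToTake)
        {
            // Take grabs whatever is left when fewer than countToTake items remain.
            splitList.Add(listToSplit.Skip(countToSkip).Take(countToTake).ToList());
        }
        return splitList;
    }

    public static void Main()
    {
        IList<int> ids = Enumerable.Range(1, 5).ToList();
        IList<IList<int>> chunks = SplitList(ids, 2);

        Console.WriteLine(string.Join(" | ", chunks.Select(chunk => string.Join(",", chunk))));
        // 1,2 | 3,4 | 5
    }
}
```

Each chunk can then go out as its own parameterized query.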

Linq Extension Methods Versus Linq Query Language… DEATHMATCH

Today I was writing out an example of why the extension methods are for the most part better to use than the querying language. Go figure I would find a case where that’s not entirely true. Say you are using these three funcs:

    Func<User, String> userName = user => user.UserName;
    Func<User, Boolean> userIDOverTen = user => user.UserID > 10;
    Func<User, Boolean> userIDUnderTen = user => user.UserID < 10;

As you can see the first one replaces the lambda expression to get the user name, the second replaces a lambda expression used to check if the ID is higher than 10, and let’s face it, the third should be pretty easy to understand now.

NOTE: This is a silly example but it works.

    var queryList =
      from user in userList
      where userIDOverTen(user)
      select userName(user);

Versus

    var otherList =
      userList
      .Where(userIDOverTen)
      .Select(userName);

In this example, the second is a little less verbose since the extension method can make full use of the Func, but the Linq query can’t, since its where clause is looking for a Boolean expression rather than a Func that returns a Boolean. However, this is where it might be better to use the query language. Say you already had a method that takes in more than just a user:

    private Boolean IDIsBelowNumber(User user, Int32 someNumber, Boolean doSomething)
    {
      return user.UserID < someNumber;
    }

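Note: doSomething is just there because the Where extension method has an overload that is ok with a method that takes in a user and an integer (the item’s index) and returns a Boolean. Without the third parameter, IDIsBelowNumber would slot straight into Where and quietly get compared against the index. Kind of annoying for this example.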

Now if you look at the Linq query:

    var completeList =
      from user in userList
      where IDIsBelowNumber(user, 10, true)
      select userName(user);

You’re good for it. Now the Extension Method:

    var otherList =
      userList
      .Where(IDIsBelowNumber????)
      .Select(userName)

Without a lambda expression, I really can’t call that method. So now what I have to do is create a method that creates a Func based off the original method call.

    private Func<User, Boolean> IDIsBelowNumberFunc(Int32 number)
    {
      return user => IDIsBelowNumber(user, number, true);
    }

And then plug it in:

    var otherList =
      userList
      .Where(IDIsBelowNumberFunc(10))
      .Select(userName);
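
Put together as something that compiles, the two approaches look roughly like this. The User class here is a guess at the shape, and the wrapper class and method names (QueryVersusMethodExample, WithQuerySyntax, WithExtensionMethods) are invented for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape, just enough for the example.
public class User
{
    public int UserID { get; set; }
    public string UserName { get; set; } = "";
}

public static class QueryVersusMethodExample
{
    private static readonly Func<User, string> userName = user => user.UserName;

    private static bool IDIsBelowNumber(User user, int someNumber, bool doSomething)
    {
        return user.UserID < someNumber;
    }

    // Wraps the three-parameter method in a Func that Where can accept.
    private static Func<User, bool> IDIsBelowNumberFunc(int number)
    {
        return user => IDIsBelowNumber(user, number, true);
    }

    // Query syntax can call the method directly because "where" takes any Boolean expression.
    public static List<string> WithQuerySyntax(IEnumerable<User> userList)
    {
        return (from user in userList
                where IDIsBelowNumber(user, 10, true)
                select userName(user)).ToList();
    }

    // Extension methods need a Func, so the wrapper does the adapting.
    public static List<string> WithExtensionMethods(IEnumerable<User> userList)
    {
        return userList.Where(IDIsBelowNumberFunc(10)).Select(userName).ToList();
    }

    public static void Main()
    {
        var userList = new List<User>
        {
            new User { UserID = 3, UserName = "Ann" },
            new User { UserID = 12, UserName = "Bob" }
        };

        Console.WriteLine(string.Join(",", WithQuerySyntax(userList)));        // Ann
        Console.WriteLine(string.Join(",", WithExtensionMethods(userList)));   // Ann
    }
}
```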

What does this all mean? You just lost 5 minutes of your life. I hope it was worth it.

Cannot Resolve Method, Can’t Infer Return Type, and Funcs

So ran into this today and the answer was actually a lot easier to understand than I thought it would be.

Say you want to order a list of objects by a number. Seems simple. Now if you have been paying attention you would know I like using Funcs.

  Func<SomeClass, Int32> orderByNumber =
    currentClass =>  currentClass.SomeNumber;

  anotherCollection = someCollection.OrderBy(orderByNumber);

Seems simple, but what if you wanted to use a method already defined in the class?

  private Int32 ReturnNumber(SomeClass currentClass)
  {
    return currentClass.SomeNumber;
  }

It seems like this could be the way to go, right?

  someCollection.OrderBy(ReturnNumber);

Compile and BOOOOOOM you get an error. It says it can’t infer the return type of the method. Wait what? It’s pretty obvious, right? It’s an integer. It had no problem inferring from the Func, and you have to figure that the method itself is “typed” also. Well here’s the problem (And you’re dumb for not knowing this, but I’m not because I’m immune to dumb): ReturnNumber isn’t a method, it’s a method group. You can have a million (well maybe not that many) methods named ReturnNumber, all with different parameters. (Newer C# compilers are better at inferring through a method group, so you may not get this error on a current one.) Why is this a problem? Well let’s use lambda expressions:

  someCollection.OrderBy(currentClass =>  currentClass.SomeNumber);

At this point it knows two things: currentClass is a SomeClass and there is a Method that takes in a SomeClass and returns something. So with that in mind, it looks for such a method and finds the return type. This is no different with the Func since the Func is basically unique due to it being a named field. After all you can’t have two fields named orderByNumber, but you can have many methods named ReturnNumber. That is where the problem is. When you use the second example:

  someCollection.OrderBy(ReturnNumber);

It can infer the SomeClass from the list and it sees the method. From there it has to find the method’s return type. Wait, which method? If I have 10 overloads, each with different return types, how does it know what type to use? Well the answer is, it doesn’t. So basically you’re screwed. Sucks, huh?

Side note: This works

  Func<SomeClass, Int32> orderByNumber = ReturnNumber;
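
Two ways around it, sketched as a small program: hand the method group to a typed Func first (the side note above), or spell out the type arguments on OrderBy so nothing has to be inferred. SomeClass here is a bare-bones stand-in and the wrapper names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in with just the one property.
public class SomeClass
{
    public int SomeNumber { get; set; }
}

public static class MethodGroupExample
{
    private static int ReturnNumber(SomeClass currentClass)
    {
        return currentClass.SomeNumber;
    }

    // Option 1: the Func's declared type picks the overload and the return type.
    public static List<int> SortViaFunc(IEnumerable<SomeClass> someCollection)
    {
        Func<SomeClass, int> orderByNumber = ReturnNumber;
        return someCollection.OrderBy(orderByNumber).Select(item => item.SomeNumber).ToList();
    }

    // Option 2: explicit type arguments, so the compiler has nothing to infer.
    public static List<int> SortViaTypeArguments(IEnumerable<SomeClass> someCollection)
    {
        return someCollection.OrderBy<SomeClass, int>(ReturnNumber)
                             .Select(item => item.SomeNumber)
                             .ToList();
    }

    public static void Main()
    {
        var someCollection = new List<SomeClass>
        {
            new SomeClass { SomeNumber = 3 },
            new SomeClass { SomeNumber = 1 },
            new SomeClass { SomeNumber = 2 }
        };

        Console.WriteLine(string.Join(",", SortViaFunc(someCollection)));            // 1,2,3
        Console.WriteLine(string.Join(",", SortViaTypeArguments(someCollection)));   // 1,2,3
    }
}
```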

Uhg It Won’t End

Still on the readability thing, but there was a second argument in the post that inspired what is now three posts of my own here. The question was whether you should use Linq just because people say it’s more readable, which would make it nothing more than syntax sugar.

  foreach(Item current in itemList)
  {
     itemNameList.Add(current.Name);
  }

Versus

 var itemNameList = from item in itemList
                    select item.Name;

Or

  Func<Item, String> itemName = current => current.Name;
  var itemNameList = itemList.Select(itemName);

So at this point it’s really a matter of preference. Problem is, you have to look closer to why the third is so much more than syntax yummies.

Say you want a method that takes in a UserList and you want to select all the users that have a property (Could be name, address, whatever) that matches a string. Well you could do this:

 public IList<User> AllUsersThatMatch(IList<User> userList, NeededProperty property, String value)
 {
    IList<User> returnList;

    returnList = new List<User>();

    foreach(User currentUser in userList)
    {
        switch(property)
        {
            case(NeededProperty.Name):
                if(currentUser.Name == value)
                {
                    //Add to the list being built, not the one being looped over
                    returnList.Add(currentUser);
                }
                break;
            case(NeededProperty.Phone):
                if(currentUser.Phone == value)
                {
                    returnList.Add(currentUser);
                }
                break;
        }
    }

    return returnList;
 }

Or you could do this:

 public Func<User, Boolean> MatchesProperty(NeededProperty property, String value)
 {
    Func<User, Boolean> returnValue;

    switch(property)
    {
        case NeededProperty.Name:
            returnValue = currentItem => currentItem.Name == value;
            break;
        case NeededProperty.Phone:
            returnValue = currentItem => currentItem.Phone == value;
            break;
        default:
            //Unknown property: match nothing (also keeps returnValue assigned)
            returnValue = currentItem => false;
            break;
    }
    return returnValue;
  }

 public IList<User> AllUsersThatMatch(IList<User> userList, NeededProperty property, String value)
 {
    IList<User>  returnList;

    //Where hands back an IEnumerable, so ToList is needed to get an IList
    returnList = userList.Where(MatchesProperty(property, value)).ToList();
    return returnList;
 }

Now which do you think is easier to upkeep? For those of you wondering what I did, I simply used a method that would return the Func I needed for the passed in Enum and called it in the Where clause. The amount of code is probably close to the same right now, but add in 5 more values for the NeededProperty enum and you’ll see the code amount differing more and more.

I realize this isn’t the best of examples, and probably the first way could be refactored but the idea is still there. The Linq Method approach gives you a lot more flexibility in the long run with dynamic stuff like this.
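
For anyone who wants to run the Func-returning version, here is a compact sketch. The User class and NeededProperty enum are guesses at the shape, the methods are made static so they can be called from Main, and throwing on an unknown enum value is my choice rather than anything from the original.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum NeededProperty { Name, Phone }

// Hypothetical shape with the two properties the example matches on.
public class User
{
    public string Name { get; set; } = "";
    public string Phone { get; set; } = "";
}

public static class MatchExample
{
    // Picks the comparison once, up front, instead of switching inside a loop.
    public static Func<User, bool> MatchesProperty(NeededProperty property, string value)
    {
        switch (property)
        {
            case NeededProperty.Name:
                return currentItem => currentItem.Name == value;
            case NeededProperty.Phone:
                return currentItem => currentItem.Phone == value;
            default:
                // Assumption: an unhandled enum value is a programming error.
                throw new ArgumentOutOfRangeException(nameof(property));
        }
    }

    public static IList<User> AllUsersThatMatch(IList<User> userList, NeededProperty property, string value)
    {
        return userList.Where(MatchesProperty(property, value)).ToList();
    }

    public static void Main()
    {
        var users = new List<User>
        {
            new User { Name = "Ann", Phone = "555-0100" },
            new User { Name = "Bob", Phone = "555-0101" }
        };

        IList<User> matches = AllUsersThatMatch(users, NeededProperty.Phone, "555-0101");
        Console.WriteLine(matches.Count + " match: " + matches[0].Name);   // 1 match: Bob
    }
}
```

Adding another property to match on means one more case in MatchesProperty and nothing else.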

What Is Readable Addon

Quick thought too about which to use due to readability:

var you = from factor in seansAwesomeness
          select new FactorLite
          {
             Amount = factor.Amount
          };

or you could do:

Func<Person, FactorLite> selectFactorLite = currentFactor => new FactorLite { Amount = currentFactor.Amount };

seansAwesomeness.Select(selectFactorLite);

I guess it’s a matter of preference, but the first seems way too verbose for something so simple.