C# Linq Except trap!

A few days ago I solved a simple kata on Codewars: "Disemvowel Trolls".

This particular kata is of the "remove vowels from the string" type, easy peasy. There are many approaches to this kind of problem, for example:

  • Regex
  • String replacing
  • Loop with check
  • Lambda expression (in this case Linq)
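
For comparison, here is a minimal sketch of the first three approaches. The class and method names are made up for illustration, and removing uppercase vowels as well is an assumption about what the kata expects.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the kata or of any library.
public static class DisemvowelSketches
{
    // Regex: delete every vowel, ignoring case.
    public static string WithRegex(string str) =>
        Regex.Replace(str, "[aeiou]", "", RegexOptions.IgnoreCase);

    // String replacing: one Replace call per vowel, in both cases.
    public static string WithReplace(string str)
    {
        foreach (var vowel in "aeiouAEIOU")
            str = str.Replace(vowel.ToString(), "");
        return str;
    }

    // Loop with check: copy every character that is not a vowel.
    public static string WithLoop(string str)
    {
        var result = new StringBuilder();
        foreach (var c in str)
            if ("aeiouAEIOU".IndexOf(c) < 0)
                result.Append(c);
        return result.ToString();
    }

    public static void Main()
    {
        const string input = "And what is the result?";
        Console.WriteLine(WithRegex(input));   // nd wht s th rslt?
        Console.WriteLine(WithReplace(input)); // nd wht s th rslt?
        Console.WriteLine(WithLoop(input));    // nd wht s th rslt?
    }
}
```

All three look at each character on its own, so repeated letters and spaces stay in place.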

I decided to give the Linq approach a shot this time, and what first came to mind? Use Except! Treat the string as a collection of chars and remove a second set of chars (the vowels) from it.

#1 solution

public static string Disemvowel(string str)
{
    var vowels = new[] {'a', 'e', 'i', 'o', 'u' };
    return new string(str.Except(vowels).ToArray());
}

And the result?

Failed
Expected: "nd wht is the result?"
But was: "And whtsrl?"

Say what?! I read that message two, three times. And then once more. I almost raised an issue against this kata 🙂 My intuition (and my experience with SQL) told me that Except should just strip the vowels and leave every other character, repeats included, where it was.

But thanks to referencesource.microsoft.com I looked into the implementation of the Except method. The first things I noticed were the use of a Set and a strange 'if' condition:

if (set.Add(element)) yield return element;

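Got you! Set.Add returns true only when the element was not already in the set (compared with the default comparer in this case), and only then is the element returned. The set starts out holding the elements to exclude, and every returned element is added to it as well, so Except also drops duplicates of the source's own elements. Just try this example: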

var letters = new [] { "a", "b", "c", "a", "b", "c"};
var exclude = new [] { "b" };
letters.Except(exclude);

Result:

<enumerable Count: 2>
  [0] = a
  [1] = c

Where are the second 'a' and 'c'?
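
The answer is easier to see in a simplified re-implementation. This is only a sketch of the idea, not the actual framework code: ExceptLike is a made-up name, and HashSet<T> stands in for the internal Set class.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Enumerable.Except, for illustration only.
public static class ExceptSketch
{
    public static IEnumerable<T> ExceptLike<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // Everything that should be excluded goes into the set up front.
        var set = new HashSet<T>(second);
        foreach (var element in first)
        {
            // Add returns false when the element is excluded
            // OR when it has already been returned once.
            if (set.Add(element))
                yield return element;
        }
    }

    public static void Main()
    {
        var letters = new[] { "a", "b", "c", "a", "b", "c" };
        var exclude = new[] { "b" };
        Console.WriteLine(string.Join(", ", ExceptLike(letters, exclude))); // a, c
    }
}
```

The first 'a' and 'c' are added to the set as they are returned, so the second ones are rejected exactly like 'b'.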

For me, it's a kind of side effect or a misleading API (or at least a misleading method description). The MSDN documentation should include an example like this to expose the behaviour. I also ran a poll on Twitter. There were not many answers, but at first sight you can see that it's not so obvious.
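
If the goal is to drop the vowels but keep everything else, repeats included, one possible Linq alternative is a Where filter. This is a minimal sketch; the class name is made up, and lowercasing each character so that 'A' is removed too is an assumption about what the kata expects.

```csharp
using System;
using System.Linq;

// Hypothetical class name, for illustration only.
public static class DisemvowelFix
{
    public static string Disemvowel(string str)
    {
        var vowels = new[] { 'a', 'e', 'i', 'o', 'u' };
        // Where tests each character on its own, so repeated
        // consonants and spaces survive, unlike with Except.
        return new string(
            str.Where(c => !vowels.Contains(char.ToLowerInvariant(c))).ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(Disemvowel("And what is the result?")); // nd wht s th rslt?
    }
}
```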

Visual Studio Tip
Thanks to the Ref12 extension you can jump directly to the reference source using F12 (Go To Definition). No more "From Metadata" tabs!

1 thought on “C# Linq Except trap!”

  1. The Enumerable.Except docs say: https://msdn.microsoft.com/pl-pl/library/bb300779(v=vs.110).aspx
    Produces the *set difference* of two sequences by using the default equality comparer to compare values.

    I agree that the method name is misleading. It should rather be named "Complement" (https://en.wikipedia.org/wiki/Set_(mathematics)#Complements), or even "SetDifference". That would better hint that this is a *set* operation and save people from future surprises.
