Using a straightforward Distinct() would work if the elements share the same memory address.
using System; using System.Linq; using System.Collections.Generic; public class Test { public static void Main() { var rose = new Fact { Thing = "Roses are", Color = 0xFF0000 }; var violet = new Fact { Thing = "Violets are", Color = 0x0000FF }; var sugar = new Fact { Thing = "Sugar is", Color = 0x1337 }; var list = new List<Fact> () { rose, rose, violet, sugar, sugar }; foreach (var item in list.Distinct()) { Console.WriteLine ("{0:x} {1:x}", item.Thing, item.Color); } } } public class Fact { public string Thing { get; set; } public int Color { get; set; } }
Live code: http://ideone.com/psEvjS
Output:
Roses are ff0000 Violets are ff Sugar is 1337
In real application, it is not that simple, even some elements have the same content, those elements are not slotted to same memory, by default the Distinct() would not work on that kind of scenario. To wit, the following doesn't work:
using System; using System.Linq; using System.Collections.Generic; public class Test { public static void Main() { var list = new List<Fact> () { new Fact { Thing = "Roses are", Color = 0xFF0000 }, new Fact { Thing = "Roses are", Color = 0xFF0000 }, new Fact { Thing = "Violets are", Color = 0x0000FF }, new Fact { Thing = "Sugar is", Color = 0x1337 }, new Fact { Thing = "Sugar is", Color = 0x1337 }, }; foreach (var item in list.Distinct()) { Console.WriteLine ("{0:x} {1:x}", item.Thing, item.Color); } } } public class Fact { public string Thing { get; set; } public int Color { get; set; } }
Live code: http://ideone.com/58gynx
Thus the above Distinct() incorrectly returns duplicate objects:
Roses are ff0000 Roses are ff0000 Violets are ff Sugar is 1337 Sugar is 1337
You can use anonymous type on Distinct though. Anonymous types has implicit implementation of Equals and GetHashCode, that's why Distinct() on anonymous type can detect objects with similar content:
using System; using System.Linq; using System.Collections.Generic; public class Test { public static void Main() { var list = new List<Fact> () { new Fact { Thing = "Roses are", Color = 0xFF0000 }, new Fact { Thing = "Roses are", Color = 0xFF0000 }, new Fact { Thing = "Violets are", Color = 0x0000FF }, new Fact { Thing = "Sugar is", Color = 0x1337 }, new Fact { Thing = "Sugar is", Color = 0x1337 }, }; foreach (var item in list.Select(x => new { x.Thing, x.Color } ).Distinct()) { Console.WriteLine ("{0} {1}", item.Thing, item.Color); } } } public class Fact { public string Thing { get; set; } public int Color { get; set; } }
Live code: http://ideone.com/8K79Dn
Hence that shall return distinct elements:
Roses are ff0000 Violets are ff Sugar is 1337
However, you cannot use that technique when you need to return a list from a function, a function requires a non-anonymous type. Thus you cannot do the following, this will have a compilation error:
public static IEnumerable<Fact> GetFacts(IEnumerable<Fact> list) { return list.Select(x => new { x.Thing, x.Color } ).Distinct(); }
Live Code: http://ideone.com/1v6x0i
The following works, it felt kludgy and DRY-violating though:
public static IEnumerable<Fact> GetFacts(IEnumerable<Fact> list) { return list.Select(x => new { x.Thing, x.Color }).Distinct() .Select(x => new Fact { Thing = x.Thing, Color = x.Color } ); }
Live Code: http://ideone.com/N4iKTx
Output:
Roses are ff0000 Violets are ff Sugar is 1337
So what does a good developer shall do to be able to write an elegant code? The dev shall learn how to do a proper object similarity comparison. Do you remember the implicit implementation of Equals and GetHashCode on anonymous type I mentioned a while ago? That's what we will do, we will do it explicitly on the concerning class. To wit, we need to have this code on Fact class:
public override bool Equals (object obj) { if (obj == null) return false; var x = obj as Fact; if (x == null) return false; return this.Thing == x.Thing && this.Color == x.Color; } public override int GetHashCode () { // http://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode unchecked { int hash = 17; hash = hash * 23 + this.Thing.GetHashCode (); hash = hash * 23 + this.Color.GetHashCode (); return hash; } }
When we use Distinct() on that class, that returns the distinct elements even each object has different memory addresses:
Live Code: http://ideone.com/eMrO4y
Output:
Roses are ff0000 Violets are ff Sugar is 1337
That's it folks!
Happy Coding! ツ
No comments:
Post a Comment