Sunday, July 6, 2008

Join Operators in LINQ

Join Operators in LINQ

There are two join operators: Join and GroupJoin. Join and GroupJoin provide an alternative strategy to Select and SelectMany.

1. Join

The Join operator performs an inner join, emitting a flat output sequence. The SQL equivalents of JOIN is INNER JOIN .

For a sample, conside the people and roles objects below and see how to fetch the data from both the objects using JOIN.


List people = new List {
{ ID = 1, LastName = "Kannan", FirstName= "Arjun", RoleId = 1},
{ ID = 2, LastName = "Heema", FirstName ="Sekar", RoleId = 2}
{ ID = 3, LastName = "Seema", FirstName ="Sekar", RoleId = 2}
{ ID = 4, LastName = "Jillary", FirstName ="Sekar", RoleId = 3}
};

List roles = new List {
{ ID = 1, RoleDescription = "Manager" },
{ ID = 2, RoleDescription = "Team Leaeder" },
{ ID = 3, RoleDescription = "Developer" }
};

var query = from p in people
join r in roles on p.RoleId equals r.ID
select new { p.FirstName, p.LastName, r.RoleDescription };

ObjectDumper.Write(query);


The output is
    LastName = Kannan  FirstName = Arjun  RoleDescription = Manager
LastName = Heema FirstName = Sekar RoleDescription = Team Leaeder
LastName = Seema FirstName = Sekar RoleDescription = Team Leaeder
LastName = Jillary FirstName = Sekar RoleDescription = Developer
2. GroupJoin

GroupJoin does the same work as Join, but instead of yielding a flat result, it yields a hierarchical result, grouped by each outer element. It also allows left outer joins. The SQL equivlant are INNER JOIN, LEFT OUTER JOIN .

The comprehension syntax for GroupJoin is the same for Join, but it is followed by the into keyword.

For an example, take the same object initialized above with little changes. Now we will see how to use Group JOIN.

List people = new List {
{ ID = 1, LastName = "Kannan", FirstName= "Arjun", RoleId = 1},
{ ID = 2, LastName = "Heema", FirstName ="Sekar", RoleId = 2}
{ ID = 3, LastName = "Seema", FirstName ="Sekar", RoleId = 2}
{ ID = 4, LastName = "Jillary", FirstName ="Sekar", RoleId = 4}
};

List roles = new List {
{ ID = 1, RoleDescription = "Manager" },
{ ID = 2, RoleDescription = "Team Leaeder" },
{ ID = 3, RoleDescription = "Developer" }
};

var query = from p in people
join r in roles on p.RoleId equals r.ID into pr
from r in pr.DefaultIfEmpty()
select new {
p.FirstName,
p.LastName,
RoleDescription = r == null ? "No Role" : r.RoleDescription
};


The output is
    LastName = Kannan  FirstName = Arjun  RoleDescription = Manager
LastName = Heema FirstName = Sekar RoleDescription = Team Leaeder
LastName = Seema FirstName = Sekar RoleDescription = Team Leaeder
LastName = Jillary FirstName = Sekar RoleDescription = No Role

In the code above, the join … into query expression is used to group the join into a new sequence called pr. Since the new element we introduced in the people sequence has a role identifier that doesn’t correspond to any of Role elements in the roles sequence, an empty element is returned.
In role description, the RoleId 4 is undefined so that the we need to pass 'No Role' as result.

Using the DefaultIfEmpty method, we can replace each empty element with the given ones. In this case no parameter has been provided, so the empty element will be replaced with a null value. By checking this value in the select command we can provide a custom description ("No Role" in our case) when the code encounters null elements.

Note:

  • The advantage of Join and GroupJoin is that they execute efficiently over local in-memory collections because they first load the inner sequence into a keyed lookup, avoiding the need to repeatedly enumerate over every inner element.
  • The disadvantage is that they offer the equivalent of inner and left outer joins only; cross joins and non-equi joins must still be done with Select /SelectMany. With LINQ to SQL queries, Join and GroupJoin offer no real benefits over Select and SelectMany.

Click here to see more operator available in LINQ.

Happy coding!!!