Remember good developers don't just write source code, they also READ it. You don't just become a great poet by writing lots of poems. Read and absorb as well. Do check out the Source Code category of my blog here, there is (as of today) 15 pages of posts on Source Code you can check out.
Recently my friend Jonathan Carter (OData Dude, my name for him) was working with a partner on some really weird stuff that was happening with a LINQ to SQL query. Remember that every abstraction sometimes leaks and that the whole port of an abstraction is "raise the level" so you don't have to worry about something.
Plumbing is great because it abstracts away water delivery. For all I know, there's a dude with a bucket who runs to my house when I turn on the tap. Doesn't matter to me, as long as I get water. However, sometimes something goes wrong with that dude, and I don't understand what's up with my water. This happened to JC and this partner.
In this example, we're using the AdventureWorks Sample Database to make this point. Here's some sample code the partner sent us to reproduce the weirdness.
protected virtual Customer GetByPrimaryKey(Func<customer, bool> keySelection)
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
return (from r in context.Customers select r).SingleOrDefault(keySelection);
}
[TestMethod]
public void CustomerQuery_Test_01()
{
Customer customer = GetByPrimaryKey(c => c.CustomerID == 2);
}
[TestMethod]
public void CustomerQuery_Test_02()
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
Customer customer = (from r in context.Customers select r).SingleOrDefault(c => c.CustomerID == 2);
}
CustomerQuery_Test_01 calls the GetByPrimaryKey method. That method takes a Func as a parameter. He's actually passing in a lamdba expression into the GetByPrimaryKey function. That makes the method reusable and is the beginning of some nice helper functions for his DAL (Data Access Layer). He's split up the query into two places. Seems reasonable, right?
Well, if you run this in Visual Studio - and in this example, I'll use the Intellitrace feature to see the actual SQL that was executed, although you can also use SQL Profiler - we see:
Here's the query in text:
SELECT [t0].[CustomerID], [t0].[NameStyle], [t0].[Title],
[t0].[FirstName], [t0].[MiddleName], [t0].[LastName],
[t0].[Suffix], [t0].[CompanyName], [t0].[SalesPerson],
[t0].[EmailAddress], [t0].[Phone], [t0].[PasswordHash],
[t0].[PasswordSalt], [t0].[rowguid], [t0].[ModifiedDate]
FROM [SalesLT].[Customer] AS [t0]
Um, where's the WHERE clause? Will LINQ to SQL kill my pets and cause me to lose my job? Does Microsoft suck? Let's take a look at the second query, called in CustomerQuery_Test_02():
SELECT [t0].[CustomerID], [t0].[NameStyle], [t0].[Title],
[t0].[FirstName], [t0].[MiddleName], [t0].[LastName],
[t0].[Suffix], [t0].[CompanyName], [t0].[SalesPerson],
[t0].[EmailAddress], [t0].[Phone], [t0].[PasswordHash],
[t0].[PasswordSalt], [t0].[rowguid], [t0].[ModifiedDate]
FROM [SalesLT].[Customer] AS [t0]
WHERE [t0].[CustomerID] = @p0
OK, there it is, but why does the second LINQ query cause a WHERE clause to be emitted but the first doesn't? They look like basically the same code path, just one is broken up.
The first query is clearly returning ALL rows to the caller, which then has to apply the LINQ operators to do the WHERE in memory, on the caller. The second query is using the SQL Server (as it should) to do the filter, then returns WAY less data.
Here's the deal. Remember that LINQ cares about two things, IEnumerable stuff and IQueryable. The first lets you foreach over a collection, and the other includes all sorts of fun stuff that lets you query that stuff. Folks build on top of those with LINQ to SQL, LINQ to XML, LINQ to YoMomma, etc.
When you are working with something that is IQueryable; that is, the source is IQueryable, you need to make sure you are actually usually the operators for an IQueruable, otherwise you might fall back onto an undesirable result, as in this database case with IEnumerable. You don't want to return more data from the database to a caller than is absolutely necessary.
From JC, with emphasis mine:
The IQueryable version of SingleOrDefault, that takes a lambda, actually takes an Expression>, whereas the IEnumerable version, takes a Func. Hence, in the below code, the call to SingleOrDefault, is treating the query as if it was LINQ To Objects, which executes the query via L2S, then performs the SingleOrDefault on the in memory collection. If they changed the signature of GetByPrimaryKey to take an Expression>, it would work as expected.
What's a Func and what's an Expression? A Func<> (pronounced "Funk") represents a generic delegate. Like:
Func<int,int,double> divide=(x,y)=>(double)x/(double)y;
Console.WriteLine(divide(2,3));
And an Expression<> is a function definition that can be compiled and invoked at runtime. Example"
Expression<Func<int,int,double>> divideBody=(x,y)=>(double)x/(double)y;
Func<int,int,double> divide2=divideBody.Compile();
write(divide2(2,3));
So, the partner doesn't want a Func (a Func that takes a customer and returns a bool, they want a compliable Expression with a Func that takes a Customer and returns a bool. I'll have to add "using System.Linq.Expressions;" as well.
protected virtual Customer GetByPrimaryKey(Expression<Func<customer,bool>> keySelection)
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
return (from r in context.Customers select r).SingleOrDefault(keySelection);
}
[TestMethod]
public void CustomerQuery_Test_01()
{
Customer customer = GetByPrimaryKey(c => c.CustomerID == 2);
}
[TestMethod]
public void CustomerQuery_Test_02()
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
Customer customer = (from r in context.Customers select r).SingleOrDefault(c => c.CustomerID == 2);
}
Just changed that one line, so that GetByPrimaryKey takes a Expression> and I get the SQL I expected:
Someone famous once said, "My code has no bugs, it runs exactly as I wrote it."
Layers of Abstraction are tricky, and you should always assert your assumptions and always look at the SQL that gets generated/created/executed by your DAL before you put something into production. Trust no one, except the profiler.
Hosting By