Scott Hanselman

Hanselminutes Podcast 102 - Mike Pizzo on the ADO.NET Entity Framework

February 29, '08 Comments [13] Posted in ASP.NET | Learning .NET | LINQ | Podcast | Tools
Sponsored By

figure1 My one-hundred-and-second podcast is up. In this episode, I sit down with Michael Pizzo, the Principal Architect of the ADO.NET Entity Framework. He gets technically down and dirty pretty fast and I get answers to all the hard questions like "Are LINQ to SQL and LINQ to Entities competing?" and "Which one should I use?" A very cool guy and a fun interview that finally set my head straight about the data stack.

Subscribe: Subscribe to Hanselminutes | Subscribe to my Podcast in iTunes

If you have trouble downloading, or your download is slow, do try the torrent with µtorrent or another BitTorrent Downloader.

Do also remember the complete archives are always up and they have PDF Transcripts, a little-known feature that shows up a few weeks after each show.

Telerik is our sponsor for this show.

Check out their UI Suite of controls for ASP.NET. It's very hardcore stuff. One of the things I appreciate about Telerik is their commitment to completeness. For example, they have a page about their Right-to-Left support, while some vendors have zero support or don't bother testing. They're also committed to XHTML compliance and publish their roadmap. It's nice when your controls vendor is very transparent.

As I've said before, this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple: avoid wasting the listener's time (and make the commute less boring).

Enjoy. Who knows what'll happen in the next show?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is also a failed stand-up comic, a cornrower, and a book author.

Hosting By
Dedicated Windows Server Hosting by ORCS Web
Friday, February 29, 2008 9:13:07 AM UTC
Hi Scott,

How about an interview with someone from the LINQ to SQL team? I've heard quite a bit about the Entity Framework, but how do the people working on the LINQ to SQL project see the relationship between the two products? Would they say something different from what Mike says?

Best regards,
Richard
Friday, February 29, 2008 1:53:46 PM UTC
One observation: the podcast's page says "ADO.NET Entity Framework (formerly Astoria)", which is not correct, because ADO.NET Data Services was formerly Astoria, not the Entity Framework.
Friday, February 29, 2008 5:17:51 PM UTC
Richard - Great idea!
Stefan - Sorry, copy paste mistake!
Friday, February 29, 2008 5:30:05 PM UTC
Great podcast! I love the "Database to end all databases" discussion.
Friday, February 29, 2008 10:23:40 PM UTC
It seemed that Michael was dodging the performance question as far as how you are able to "tune" the framework itself. Are there any benchmarks comparing the Entity Framework with traditional ADO.NET? It seems that with all of the layers of abstraction there would be a significant hit on response times, so much so that for any enterprise-level project the complexity savings would not outweigh the loss in performance.
Marcus King
Friday, February 29, 2008 11:55:40 PM UTC
Great show. I finally grok, I think, the massive flood of database related alphabet soup coming out of MS.

My concern about the proliferation of these technologies is the sheer number of layers between you and what you want:
My application -> Linq -> Linq 2 Entities -> Entities Framework -> ADO.NET -> ADO.NET/Entities Provider -> Database.

I'd be very interested in seeing benchmarks.
Saturday, March 01, 2008 12:17:02 AM UTC
@Marcus and Robert G

I dunno... it's true that the more we abstract, the more cost we add to the activity. But this has been true forever; unless you are writing in assembly, right on top of the CPU, you are abstracting.

But in my mind, the most costly step in the chain of events is the last: the actual query. All the other steps (pardon me if I am oversimplifying) are just code talking to code. So while there are more steps, their cost is largely negligible compared to the cost of the query, and the benefits outweigh the costs. That's a fundamental argument for ORMs anyway (which, let's face it, the Entity Framework is). Having said that, there are limits too. :-)

It all comes down, though, to the right tool for the job: if your app is heavily reliant on extreme response times and performance, then abstraction in any amount is probably not acceptable. But if you are writing an app where a hundred milliseconds here and there isn't an issue (which is probably a fair amount of business apps), then it's no biggie in the long run.



Ryan Smith
Saturday, March 01, 2008 1:38:09 AM UTC
Will this work on large tables that are related? You can't compete with sending the query to the database, which is designed for this. Are these combinations of the Entity Framework and other layers truly optimized for querying? In other words, will it slow down on large tables and large volumes of data? Also, from the development environment, managing these hybrid entities, and the many other forms/views of them created by various developers, can lead to complexity and maintenance issues, especially when a field on the table changes. I wish I had more time to explore and test out these new extensions.
Sunday, March 02, 2008 5:13:05 AM UTC
There is some discussion of the performance of the EF on the ADO.NET blog: http://blog.msdn.com/adonet/

I've recently deployed a smallish EF web app to a fairly high-traffic site and have been very impressed thus far. Since the small project has gone so well, I'm seriously considering it for a larger project with a couple dozen entities. Ease of development and being able to query my data in an ad hoc manner has been a great time saver.

Currently the designer tools are very good, but don't yet expose all of the features you'll commonly need when designing an entity model. For example, if you have a field that gets automatically populated from the DB that isn't the primary key, you have to edit the XML to set an attribute to pull it back. I'm hoping for a new beta soon that might expose this common scenario in the designer, but there's been no official word other than "it's coming."
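To make the "edit the XML" remark concrete: in the v1 schema files, a non-key column whose value the database generates is marked with the `StoreGeneratedPattern` attribute on the property in the store schema (SSDL). A hypothetical fragment (the `LastModified` column name is made up for illustration):

```xml
<!-- Hypothetical SSDL fragment: marks a non-key column whose value the
     database computes, so the Entity Framework reads it back after an
     insert or update instead of trying to write it. -->
<Property Name="LastModified" Type="datetime" Nullable="false"
          StoreGeneratedPattern="Computed" />
```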
Brian Vallelunga
Tuesday, March 11, 2008 5:50:12 AM UTC
Enjoyed the show, great interview Scott!
Friday, March 14, 2008 7:53:54 PM UTC
Excellent show. Listened to it twice! Like everyone else, I think it all sounds great, but am concerned about performance. It would be great to see benchmarks of the same queries of the same database for both Linq to SQL and Linq to Entities. If the difference is significant, you'd hope that the vendors that are on board to create providers for the entity framework would also look into creating implementations of the IQueryable interface for their specific datastores (i.e. Linq to Oracle).

Tom
Friday, March 14, 2008 11:25:53 PM UTC
Thanks to Scott for an enjoyable interview, and for everyone who listened and took the time to send the great feedback. A few comments in response to some of the previous posts...

>>Richard Bushnell writes: How about an interview with a guy from the LINQ-to-SQL team? <snip> Would they say something else, different to what Mike says?

Note that the Data Programmability Team owns both LINQ to SQL and LINQ to Entities, so yes, hopefully we’d say the same thing... :-) That said, I was disappointed that we didn’t have more time to talk about some of the cool stuff in LINQ to SQL, and Scott and I have discussed having someone from the team working on the next version of LINQ to SQL talk about that technology in more detail.


>>Blah blah blah Performance blah blah blah… :-)

Several people wrote about performance. As I mentioned, there is overhead for first-time execution in loading metadata, generating a provider query from either an EntitySQL or LINQ expression, and generating the Lightweight CodeGen to construct objects and get/set persistent properties. The good news is that, for repeated execution, especially if you use CompiledQueries, that cost is only paid once; we cache the actual generated store query and execute it directly on subsequent requests. There is still some overhead with materializing results as domain objects, but with lightweight codegen this is as efficient as if you wrote the code to explicitly construct and set properties yourself.
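The CompiledQuery pattern Mike describes can be sketched in C#. This is a minimal sketch under assumptions: the `NorthwindEntities` context and `Orders` entity set are hypothetical names, not anything from the show, and it needs a generated object context and a live database to actually run.

```csharp
using System;
using System.Data.Objects; // CompiledQuery lives here in .NET 3.5 SP1
using System.Linq;

static class OrderQueries
{
    // Compile once: the LINQ expression is translated into a store query
    // a single time; subsequent invocations reuse the cached translation
    // and skip the metadata/expression-compilation overhead.
    static readonly Func<NorthwindEntities, decimal, IQueryable<Order>> OrdersOver =
        CompiledQuery.Compile((NorthwindEntities ctx, decimal minTotal) =>
            ctx.Orders.Where(o => o.Total >= minTotal));

    public static void Run()
    {
        using (var ctx = new NorthwindEntities())
        {
            // Each call here executes the already-compiled store query.
            foreach (var order in OrdersOver(ctx, 100m))
                Console.WriteLine(order.OrderID);
        }
    }
}
```

The design point is that the delegate is stored in a static field, so the translation cost is paid once per AppDomain rather than once per request.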

Regarding Adnan’s question on large tables: note that the query is always evaluated in the database (which, as you say, is what the database was designed for). On the client we simply take the user query, written in terms of the conceptual model, and expand it to be in terms of the actual storage model, then send the expanded query to the back-end to evaluate. As above, this expansion process is done once for any EntitySQL query or compiled LINQ query, and repeated execution simply executes the same query over and over again against the store (which most databases are *really* good at optimizing). In any case, we *never* do filtering, joining, etc. on the client.

I am working on pulling together some benchmarks that we can share on performance, so please stay tuned. The short answer to Tom, however, is that we don’t believe it would be worth it for other relational providers (such as Oracle) to implement their own IQueryable.

Thanks again for listening and taking the time to send great comments!

-Mike
Thursday, April 24, 2008 12:27:51 PM UTC
Performance-wise, I've just rewritten my data-access code for a data import so that it uses stored procs called from partial entity classes. Using the Entity Framework it took nearly 30 seconds to do the two-stage import (bare and parsed). Using direct proc calls (wrapped in two separate transactions) it takes less than 10.

However, it took me less than a day to learn how to make use of the entity stuff and write all the code that saves the read data to the DB. It took another 1.5-2 days to rewrite it using the stored procs and I'm a very fast typist. Additionally, data imports tend to have a lot more data in a lot more tables than your average data access method.

Rule of thumb: for heavy data access, like that used in a large data import, go back to basics. For everything else, use LINQ and the generated entities, because it's fast enough and seriously reduces development time.
Andy Polshaw
Comments are closed.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.