Scott Hanselman

Most Common ASP.NET Support issues - Reporting from deep inside Microsoft Developer Support

May 18, '10 Comments [21] Posted in ASP.NET | Bugs
Microsoft Developer Support (or "CSS", Customer Support Services) is where you're sent within Microsoft when you've got problems. They see the most interesting bugs, thousands of issues and edge cases, and collect piles of data. They report this data back to the ASP.NET team (and other teams) for product planning. Dwaine Gilmer, Principal Escalation Engineer, and I thought it would be interesting to get some of that good internal information out to you, Dear Reader. With all those cases and all the projects, there are basically two top things that cause trouble in production ASP.NET web sites. Long story short: Debug Mode and Anti-Virus software.

Thanks to Dwaine Gilmer, Doug Stewart and Finbar Ryan for their help on this post! It's all them!

#1 Issue - Configuration

Seems the #1 issue in support for problems with ASP.NET 2.x and 3.x is configuration.

Symptoms:

  • OOM
  • Performance
  • High memory
  • Hangs
  • Deadlocks

Notes: There are more debug=true cases than there should be.

People continue to deploy debug versions of their sites to production. I talked about how to automatically transform your web.config and change it to a release version in my Mix talk on Web Deployment Made Awesome. If you want to save yourself a headache, release with debug=false.

Additionally, if you leave debug=true on individual pages, note that this will override the application level setting.
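Here's a minimal sketch of what that transform could look like. It assumes a Visual Studio 2010 web application project with a Web.Release.config sitting next to your Web.config; your file names and build configurations may differ.

```xml
<?xml version="1.0"?>
<!-- Web.Release.config: applied on top of Web.config when you publish
     using the Release configuration (assumed project layout). -->
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <system.web>
    <!-- Removes the debug attribute from <compilation>, so the published
         Web.config falls back to the default, which is debug="false". -->
    <compilation xdt:Transform="RemoveAttributes(debug)" />
  </system.web>
</configuration>
```

The nice part of doing it this way is that nobody has to remember to flip the flag by hand before a deployment. The transform only touches the application's Web.config, though, so it won't help with a debug setting left on an individual page.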

Here's why debug="true" is bad. Seriously, we're not kidding.

  • Overrides request execution timeout making it effectively infinite
  • Disables both page and JIT compiler optimizations
  • In 1.1, leads to excessive memory usage by the CLR for debug information tracking
  • In 1.1, turns off batch compilation of dynamic pages, leading to 1 assembly per page.
  • For VB.NET code, leads to excessive usage of WeakReferences (used for edit and continue support).

An important note: Contrary to what is sometimes believed, setting retail="true" in a <deployment/> element is not a direct antidote to having debug="true"!
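For reference, this is the setting in question. It's a sketch of the relevant fragment only, and it lives in machine.config rather than in your site's web.config.

```xml
<!-- machine.config (fragment) -->
<configuration>
  <system.web>
    <!-- Machine-wide "retail" switch. As the team explains in the comments
         below, it reverses some debug compilation behaviour but does not
         restore the normal request execution timeout, so it is not a
         substitute for debug="false" in each site's web.config. -->
    <deployment retail="true" />
  </system.web>
</configuration>
```

Treat it as a safety net on top of debug="false", not as a replacement for it.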

#2 Issue - Problems with an External (non-ASP.NET) Root Cause

Sometimes when you're having trouble with an ASP.NET site, the problem turns out not to be ASP.NET itself. Here are the top three issues and their causes. This category is for cases that were closed for external reasons that support can't directly affect. The subcategories are 3rd party software, anti-virus software, hardware, virus attacks, DOS attacks, etc.

If you've ever run a production website you know there's always that argument about whether to run anti-virus software in production. It's not like anyone's emailing viruses and saving them to production web servers, but you want to be careful. Sometimes IT or security insists on it. However, this means you'll have software that is not your website software trying to access files at the same time your site is trying to access them.

Here's the essence as a bulleted list:

  • Concurrency while under pressure: This causes problems in big software. Make sure your anti-virus software is configured appropriately and that you're aware of which processes are accessing which files, as well as how, why and when.
  • Profile your applications: .NET and the Web are not black boxes. You can see what's happening if you look. Know what bytes are going out the wire. Know who is accessing the disk. Measure twice, cut once, they say? I say measure a dozen times. You'd be surprised how often folks put an app in production and they've never once profiled it.
  • Anti-Virus Software: It can't be emphasized enough that site owners should ensure they are running the latest AV engine and definitions from their chosen anti-malware vendor. They've seen folks hitting hangs due to flaky AV drivers that are over two years out of date. Another point about AV software is that it is not just about old-school AV scanning of file access. Many products now do low level monitoring of port activity, script activity within processes and memory allocation activity, and do not always do these things 100% correctly. Stay up to date!
  • Know where you're calling out to: Watch your connections to remote endpoints, like calling web services or accessing file systems. All of this can slow you down if you're not paying attention. Is your DNS correct? Did you add your external hosts to a hosts file to remove DNS latency?
  • processModel autoConfig=true: This is in machine.config and folks always mess with it. Don't assume that you know better than the defaults. Everyone wants to change the defaults, add threads, remove threads, change the way the pool works because they think their textboxes-over-data application is special. Chances are it's not, and you'd be surprised how often people will spend days on the phone with support and discover that the defaults were fine and they had changed them long ago and forgotten. Know what you've changed away from the defaults, and know why. Don't program by coincidence.
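If you're not sure what the untouched setting looks like, here's a sketch of the fragment as it ships in machine.config for .NET 2.0 and later, to the best of my knowledge. Compare it against your own machine.config before assuming anything.

```xml
<!-- machine.config (fragment) -->
<configuration>
  <system.web>
    <!-- With autoConfig on, ASP.NET tunes its thread and connection limits
         for you. Hand-set thread counts on this element are the kind of
         forgotten change that tends to show up later in a support case. -->
    <processModel autoConfig="true" />
  </system.web>
</configuration>
```

If your file has extra attributes on this element, find out who added them and why before you touch anything else.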

...and here's the table of details:

Issue: Anti-virus software
Product: All
Description: Anti-virus software is installed onto servers and causes all kinds of problems.
Symptoms:

  • Application restarting
  • Slow performance
  • Session variables are null
  • Cannot install hotfix
  • Intermittent time outs
  • High memory
  • Session lost
  • IDE hangs
  • Deadlocks

Notes: This consists of all AV software reported by our customers. Not all cases report the AV software that is being used, so the manufacturer is not always known. See KB821438, KB248013, KB295375, KB817442.

Issue: 3rd party vendors
Product: All
Description: This is a category of cases where the failure was due to a 3rd party manufacturer.
Symptoms:

  • Crash
  • 100% CPU
  • High memory
  • Framework errors
  • Hang

Notes: The top culprits are 3rd party database systems and 3rd party internet access management systems.

Issue: Microsoft component
Product: All
Description: Microsoft software
Symptoms:

  • Intermittent time outs
  • High memory
  • Deadlocks
  • 100% CPU
  • Crash

Notes: Design issues that cause performance issues like sprocs, deadlocks, etc. Profile your applications and the database! (Pro tip: select * from authors doesn't scale.) Pair up DBAs and programmers and profile from end to end.

Spread the word! What kinds of common issues do YOU run into when running production sites, Dear Reader?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

Tuesday, May 18, 2010 9:33:10 PM UTC
I have seen cases where improper ACLs have caused apps to misbehave (especially when running the app pool under a different identity).

These are good reads:
http://msdn.microsoft.com/en-us/library/ff649309.aspx
http://msdn.microsoft.com/en-us/library/ms178699.aspx
Tuesday, May 18, 2010 9:35:42 PM UTC
And what is yours or Microsoft's recommendation on installing AV on production web/database servers if there isn't a security team insisting on it?
Tuesday, May 18, 2010 9:38:02 PM UTC
I don't personally like the idea of AV software on web servers. I prefer to have lots of backups and if I find there are infected files, I'd remove the machine. I don't know what MSFT's opinion is. I'd *guess* they'd say use a top vendor, exclude the website, and keep it up to date.
Tuesday, May 18, 2010 9:53:04 PM UTC
Re: Debug=true

One thing that really, really bothers me about IIS6 and 7 is just how much work it is to speed up your website. IIS7 needlessly moved things all over the place, when in reality all most people really want is an "Optimize" tab. Throw "prevent debug=true", all Caching, HTTP Header editing and Gzipping/Compression in one place so it's a one stop shop.

Call it "Uber Retail" setting if you want, but just don't make me go to the Machine.Config to do it, let me do it on a site-by-site basis, or if I choose apply it to the entire server.

Andrew
Tuesday, May 18, 2010 10:23:07 PM UTC
In Visual Studio, using the "Copy Web Site" feature with a checkmark in [x] Passive Mode.

Removing that checkmark gets rid of a lot of FTP connection issues with certain routers.
This is one case where "Works on my computer" may not be working on the others. ;-)
Tuesday, May 18, 2010 10:26:25 PM UTC
I'm guilty of the debug=true issue ... sad how easy it is to forget that it's even there in the config file :( . I second the "Optimize" tab and the "prevent debug=true" :)
Wednesday, May 19, 2010 12:24:34 AM UTC
What about image processing on a web server? Any tips for that? Perhaps there is some internal documentation or support resolutions floating around in CSS that hasn't been made public yet? Or perhaps it has and I just haven't found it yet :).

We have a need to resize images on the fly (we also need to apply some other effects as well) but the .NET documentation specifically says the System.Drawing namespace shouldn't be used in ASP.NET, and it doesn't mention what should be used instead.

We currently do image processing in classic ASP (we bought a 3rd party component to do it) and occasionally get OOM errors (plenty of RAM, not sure what the error actually refers to). We're in the process of converting our application to ASP.NET (MVC2) and I'm concerned that we will have the same problems or, considering the warning, possibly worse problems when we do.
Wednesday, May 19, 2010 1:48:10 AM UTC
Wednesday, May 19, 2010 9:10:03 AM UTC
Scott,

We always end up deploying debug versions of our platform libraries because we need to reference debug versions during development, and don't have an easy way to switch all the references to release versions and back for builds. And no, we can't change them all to project references.

Do you know a way to resolve this?
Jon
Wednesday, May 19, 2010 11:32:57 AM UTC
One large issue that stood out for me was dealing with a user base that worked entirely in a Citrix environment.

For some reason, when browsing an ASP.NET site with a large view state (Not out of control, just large) it would get corrupted every time. The fix was simple in the end, but it took a lot of banging of my head against the wall to figure it out.

All I did was break up the view state by setting maxPageStateFieldLength="5120" and voilà, everything was all better.
Wednesday, May 19, 2010 5:33:45 PM UTC
Jon,
We handle different build configurations using Nant scripts and CruiseControl to call the scripts. Since you can pass the configuration to msbuild we can call it with an execute command and change the build configuration per project, like this:

<exec program="C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319\msbuild.exe" commandline='/nologo /p:Configuration=Release "<project path here>"' />

Bryan Rhea
Wednesday, May 19, 2010 6:08:26 PM UTC
Thanks @Raj. That seems to be exactly what I was looking for.
Thursday, May 20, 2010 3:36:22 PM UTC
Scott

What exactly do you mean by this line:

> An important note: Contrary to what is sometimes believed, setting retail="true" in a <deployment/> element is not a direct antidote to having debug="true"!

Setting deployment retail to "true" will disable debugging at all levels in ASP.NET Runtime 2.0 and higher. So would you care to explain what you mean by this statement?

Cheers
Friday, May 21, 2010 9:59:52 AM UTC
@Bryan

Yeah, we have a build process that changes the build configuration, but that doesn't address the problem I described. If you add an assembly reference to a library there is no way, as far as I can see, to say "use the debug version of this library in development, then switch to the release version when you produce a live build".
Jon
Saturday, May 22, 2010 8:55:22 AM UTC
Gabriel, this from the team:

"For a normal debug="false" page or site, request execution timeout is 110s or whatever alternative timeout you have specified. If you deploy with debug="true" the timeout is effectively disabled (it is actually set to about 5 hours). The intention of this is that when you are debugging you don’t want the request timing out while you are debugging it.
Setting retail="true" does reverse some of the debug compilation behaviour that results from having debug="true" but it will not revert the ASP.NET runtime to enforcing the correct timeout. So if you have debug="true" in production and your ASP.NET page happens to call something that blocks indefinitely (such as a wayward web service or database stored procedure) the ASP.NET request is not going to timeout for a very, very long time. You would be dependent on the thing you are calling timing out."
Sunday, May 23, 2010 12:34:51 AM UTC
Well in that case, this is very poorly documented and the possibility to use the element <deployment retail="true"/> should either be removed from ASP.NET or this value should override the value specified for the debug element in web.config. IMHO There should semantically be no difference between having debug="false" in web.config and having debug="true" in web.config + deployment retail="true" in machine.config.

At the same time, some parts of the MSDN documentation need to be changed, because on some pages the technical writer explicitly refers to the debug element in the web.config but in many cases the technical writer just mentions "this is only true when ASP.NET debugging is disabled". So when reading this, we basically don't know anymore whether the technical writer means "this is only true when the debug attribute is set to false". So I vote for removing the possibility to use <deployment retail="true"/> and never look back again...
Monday, May 24, 2010 8:27:29 AM UTC
Scott,

Can you expand on "profile your applications". What is this and how do you do it?

Tuesday, May 25, 2010 11:06:40 AM UTC
Scott,

I would also like to know how to profile one's web application. Any tips or hints? Some pointers would be deeply appreciated!
Matts B
Thursday, June 03, 2010 2:21:42 PM UTC
Scott,

Can you please explain more on how to profile web applications and databases. Thanks!
Friday, June 18, 2010 1:26:38 PM UTC
Hi Scott,

Here is another reason why debug=true is evil on production sites :
Scripts and images downloaded from the WebResources.axd handler are not cached

I got it from Guthrie's post : http://weblogs.asp.net/scottgu/archive/2006/04/11/442448.aspx

(The Gu also advocated retail=true without warning that it does NOT enable the timeout again.)

A Fan, Tom
Tom Pester

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.