Scott Hanselman

Release IS NOT Debug: 64bit Optimizations and C# Method Inlining in Release Build Call Stacks

October 25, '07 Comments [14] Posted in ASP.NET | Learning .NET | Microsoft | Programming
Sponsored By

Just a reminder: Release builds are not Debug builds. Seems obvious, but it's worth saying again. Release builds are optimized for speed and debug builds are optimized for, well, debug-ability. However, most of the optimizations in managed code are done by the JIT compiler rather than the language compiler.

However, sometimes in our zealous attempts to make Release builds easier to debug we can inadvertently (or totally "vertently") remove a lot of the optimizations that make Release builds faster.

Introduction

I was talking to Jeremy and he had come upon some code that was inserting a try{} catch() { throw; } block in every method. The rationale was that "they wanted the complete stack trace in their logging after an exception was thrown." That's a noble goal, but their technique of adding these guards is what's called an "anti-pattern" or as Egon would say "it would be bad." An anti-pattern is "a pattern that tells how to go from a problem to a bad solution."

First, let's start by exploring why they aren't getting "a complete stack trace" before we tackle why they believe they want/need a complete one.

Here's a simple program:

using System;

class NormalProgram
{
    static void Main(string[] args)
    {
        try
        {
            methodA();
        }
        catch (System.Exception e)
        {
            Console.WriteLine(e.ToString());
        }
    }
    static void methodA() { methodB(); }
    static void methodB() { methodC(); }
    static void methodC() { badMethod(); }
    static void badMethod() { throw new ApplicationException("generic bad thing"); }
}

In this program, Main calls A, then A calls B, which calls C, which calls badMethod which throws an exception. That exception is caught up at Main. So far, makes sense right?

We'll compile two versions, one in Debug mode and one in Release and run them on a regular 32-bit machine. Notice we're saying /o for enable optimizations for the Release build. We are making debug symbols for Release via /debug:pdbonly, but it doesn't affect performance in a significant way.

"%FXROOT%\csc.exe" /t:exe /out:NormalRelease.exe /debug:pdbonly /o NormalProgram.cs
"%FXROOT%\csc.exe" /t:exe /out:NormalDebug.exe /debug NormalProgram.cs

We run Debug on 32-bit and we see:

System.ApplicationException: generic bad thing
   at NormalProgram.badMethod() in NormalProgram.cs:line 24
   at NormalProgram.methodC() in NormalProgram.cs:line 23
   at NormalProgram.methodB() in NormalProgram.cs:line 21
   at NormalProgram.methodA() in NormalProgram.cs:line 19
   at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Looks good, makes sense. Now we run Release on 32-bit:

System.ApplicationException: generic bad thing
   at NormalProgram.badMethod() in NormalProgram.cs:line 24
   at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Our call stack has been "collapsed"? Does it make it harder to debug? Not really, because we knew where we ended up and the path between the two is clear (otherwise this "inlining" wouldn't have happened.)

However, it is confusing and there's a perception that "information is missing." It's important to note that information isn't missing but rather this stack trace is showing the runtime reality. What you write (the Programmer's Intent) isn't exactly what runs, especially when things are optimized. As programmers we are doing our best to tell the machine what to do and it is doing its best to do that fast and correct - let it do what it does best. When we see a stack like this our first reaction is that information is there, but has been hidden. In fact, the code we wrote ran, but not every method got their own local stack frame as they were optimized into fewer methods.

Inlining

Disclaimer: I'm showing you this for informational purposes. Doing this without thinking or having a good reason is usually  a bad idea and could confuse other issues. The only reason you might want to do this would be if you were doing some kind of funky thing that made your method require its own stack space. DON'T ADD THIS ATTRIBUTE WILLY-NILLY. Don't Program By Coincidence. Remember the point of this post is that Release and Debug are different. Use them differently.

What if I tell the runtime JIT-ter to not inline? I'll add one line and one using:

using System;
using System.Runtime.CompilerServices;

class NormalProgram
{
    static void Main(string[] args)
    {
        try
        {
            methodA();
        }
        catch (System.Exception e)
        {
            Console.WriteLine(e.ToString());
        }
    }
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void methodA() { methodB(); }
    static void methodB() { methodC(); }
    static void methodC() { badMethod(); }
    static void badMethod() { throw new ApplicationException("generic bad thing"); }
}

And run on 32-bit:

System.ApplicationException: generic bad thing
   at NormalProgram.badMethod() in NormalProgram.cs:line 24
   at NormalProgram.methodA() in NormalProgram.cs:line 19
   at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Well, that stopped the inlining of method A, but of course B and C are still inlined. You can see how this is going to get icky.

ASIDE: We as programmers have all reached this moment. This is the moment when we decide whether or not to slap an attribute on every method in our program. (This attribute is just an example, but you know this moment...there's still time to actually stop and try to understand the problem...)

...but we have to ship. So, let's try this:

using System;
using System.Runtime.CompilerServices;

class NormalProgram
{
    [MethodImpl(MethodImplOptions.NoInlining)] 
    static void Main(string[] args)
    {
        try
        {
            methodA();
        }
        catch (System.Exception e)
        {
            Console.WriteLine(e.ToString());
        }
    }
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void methodA() { methodB(); }
    [MethodImpl(MethodImplOptions.NoInlining)] 
static void methodB() { methodC(); } [MethodImpl(MethodImplOptions.NoInlining)] static void methodC() { badMethod(); } static void badMethod() { throw new ApplicationException("generic bad thing"); } }

And compile and run as Release on 32-bit...

System.ApplicationException: generic bad thing
   at NormalProgram.badMethod() in NormalProgram.cs:line 24
   at NormalProgram.methodC() in NormalProgram.cs:line 23
   at NormalProgram.methodB() in NormalProgram.cs:line 21
   at NormalProgram.methodA() in NormalProgram.cs:line 19
   at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Now we're back where we were before, right? Well, not really. We've got the full stack again, but we've neutered the build such that the JITter can't really do it's job.

Plus, we haven't tried 64-bit yet. Let's run the Release build on 64-bit:

System.ApplicationException: generic bad thing
   at NormalProgram.methodC() in NormalProgram.cs:line 23
   at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Weird. One conclusion to draw would be that 64-bit CLR ignores method inlining attributes. That's not necessarily the case. Another would be that 64-CLR is more aggressive about inlining. The deal is that there are MANY optimizations that the JITter can to. MANY. They wouldn't give me a number because it's Special Sauce, but le's just say it's not one or two. It's more than that. ;)

Inlining isn't the only thing that can "collapse" a call stack like this. This also isn't exactly true. In this case, remember that x64 is a whole new processor architecture while x86 has been around for, let's say, a while. Turns out that x64 really likes to optimize and this is an example of a "Tail Call Optimization" rather than explicit Method Inlining.

Be sure to read David Broman's detailed blog post on exactly x64 tail-call optimizes and when it doesn't.

A good explanation from The Original Wiki at the Portland Pattern Wiki:

"Tail-call optimization (or tail-call merging or tail-call elimination) is a generalization of TailRecursion: If the last thing a routine does before it returns is call another routine, rather than doing a jump-and-add-stack-frame immediately followed by a pop-stack-frame-and-return-to-caller, it should be safe to simply jump to the start of the second routine, letting it re-use the first routine's stack frame (environment)...."

"However, TailCallOptimization has some drawbacks. The runtime environment will have a confusing stack during execution of tail-called routines, which can make debugging difficult ("How did I get here? baz() never even calls foo()!")."

The wiki explanation was written with C in mind, but the concept is universal and it's the "confusing stack during execution of tail-called routines" that's got me writing this blog post today.

There's some controversy on whether Tail Call Optimizations are a subset of Method Inlining. That may never be resolved, but for now they are discrete and different. Eric Gunnerson has a good article on some of the heuristics that the JIT uses in its inlining decision making process.

For some hardcore details, be sure to read David Notario's excellent series on x86 JIT inlining from 2004, both Part 1 and Part 2. Perhaps start by reading his "The CLR x86 JIT, an overview" as he explains seven basic stages the x86JIT.

An Anti-Pattern

Now, back to our customer from the beginning who was inserting a try{} catch() { throw; } block in every method. Let's try that (don't do this at home):

using System;

class UglyProgram
{
    static void Main(string[] args)
    {
        try
        {
            methodA();
        }
        catch (Exception e)
        {
            Console.WriteLine(e);                
        }
    }
    static void methodA() { try { methodB(); } catch { throw; } }
    static void methodB() { try { methodC(); } catch { throw; } }
    static void methodC() { try { badMethod(); } catch { throw; } }
    static void badMethod(){throw new ApplicationException("generic bad thing");}
}

...and compile both Debug and Release and this produces the same output on both 32-bit and 64-bit:

System.ApplicationException: generic bad thing
   at UglyProgram.badMethod() in UglyProgram.cs:line 20
   at UglyProgram.methodC() in UglyProgram.cs:line 19
   at UglyProgram.methodB() in UglyProgram.cs:line 18
   at UglyProgram.methodA() in UglyProgram.cs:line 17
   at UglyProgram.Main(String[] args) in UglyProgram.cs:line 10

Why? Because by setting up the try's you're introducing the potential for other ways out of each method so the JIT'ter can't optimize anything that would collapse the call stack like we saw earlier. This isn't a good pattern as it doesn't deal with the underlying issue in a simple way. The customer wants to be able to debug Release builds easier.

Options

Here's some options we could try:

  • Try/catch every method - Not a good idea. Not only are you cluttering your code, but you're paying a cost all the time when you only need to debug in Production/Release occasionally.
  • Put [MethodImpl(MethodImplOptions.NoInlining)] on every method - Also not a good idea, not only because of the operative "every method" violating the DRY Principle but also because it only deals with one specific optimization out of many.
  • Turn off Optimizations in the compiler - This particular kind of inlining appears to be a JITter thing, not a language compiler thing so turns out that doesn't work in this instance.

Ok, so the customer wants a fuller call stack. I would propose that they in fact don't want this. One of my old bosses used to say:

"Customer walks in with a cell phone and says 'this thing needs a bigger antenna.' We have to ask ourselves does he want a bigger antenna or better cell phone reception" - Mark Klein

This is a great pithy analogy. Everyone comes in with both Problems and Solutions. This customer wants to be able to debug Release builds. They surely don't want full call stacks in Release Mode if it means turning off all optimizations.

Best Solution

One thing the customer could do that would give them the best of both worlds is something Mark Pearce told me about a while back, the [.NET Framework Debugging Control] section of an .ini file you've probably never used:

This JIT configuration has two aspects:

  • You can request the JIT-compiler to generate tracking information. This makes it possible for the debugger to match up a chain of MSIL with its machine code counterpart, and to track where local variables and function arguments are stored.
  • You can request the JIT-compiler to not optimize the resulting machine code.

If you have a file Foo.exe you can create a Foo.ini with these contents:

[.NET Framework Debugging Control]
GenerateTrackingInfo=1
AllowOptimize=0

Implementation Detail: Why is it an INI file? Well, the .NET runtime hasn't started up yet, so we don't have all that XML-parsey goodness lying around. Instead they need to use the Win32 method GetPrivateProfileString() in order to retrieve those.

This solution assumes you compiled with /debug:pdbonly in Release mode in order to generate the full stack when needed.

DICSLAIMER: This isn't something you want to put into production! You'd only use this for debugging. It's something you'd enable on a case-by-case basis.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by ORCS Web
Thursday, October 25, 2007 8:13:12 PM UTC
Nice writeup!

Posted in ______. You forgot some tags.
Thursday, October 25, 2007 8:27:46 PM UTC
Scott, "advertently" is a word, it's just note widely used. "Vertently" certainly sounds better though.
Matt
Thursday, October 25, 2007 10:42:57 PM UTC
Wonderful article! I didn't know the inlining was NOT done in Debug mode..

Good to know!
Friday, October 26, 2007 1:28:45 AM UTC
One guy in my team always use Release Build to do debugging. I must let him read this post today. Nice post.
Friday, October 26, 2007 8:49:24 AM UTC
Well thought out and elequantly explained article! Nice post mate, my new "learn" for the day ;)
Damian Gonsalves
Friday, October 26, 2007 9:00:48 AM UTC
Damien - Thanks, that's a very nice thing to say.
Friday, October 26, 2007 3:49:01 PM UTC
Excellent post scott!!

P. S. there is one good use for disallowing method inlining in release builds. When using a try catch block and in the catch is a call to an 'extremely expensive' method, you mighy want to restrict the compiler from inlining the expensive method, otherwise the JIT optimizer will try to inline the expensive method on every call, even if an exception is not thrown.

But that should only be used when the expensive error handler is only used in catch blocks, otherwise every call to that method - even from normal calls - will loose performance.

-simon f
Simon F
Friday, October 26, 2007 3:49:16 PM UTC
nice posting!
Dominik
Friday, October 26, 2007 5:16:57 PM UTC
"they wanted the complete stack trace in their logging after an exception was thrown."

I wonder what a "five whys" exercise would have learned here? I'd guess that we might discover that the code has a tendency to crash a lot in production? So some important testing is actually occurring in the wild?

So we might find we're in "two wrongs don't make a right" territory. Which is as good a description of "antipattern" as any.
Mike Woodhouse
Friday, October 26, 2007 5:59:49 PM UTC
Brilliant presentation of the problem. I appreciate the time you take in research and explaining things to we mere mortals. As always, thank's for saving my butt.
On an interesting corrolary, we were able to remove 1800 instances of this :)

If I could also convince the world that Hungarian Notation in .Net Code is also just as large of a party foul all will be well with the universe.
Friday, October 26, 2007 7:30:48 PM UTC
Building release builds with pdbonly compiler switch would inline and optimise the code with the detail stack trace information containing the line numbers

I blogged about this back while: http://blog.vuscode.com/malovicn/archive/2007/08/05/releasing-the-build.aspx
Friday, October 26, 2007 9:02:55 PM UTC
Scott,

This is all well and good... and I have many times wanted to know if it would be able to get line numbers in the stack trace rather than offsets running with a "release" build. However, what you stated here sounds good, but what do you do for ASP.Net?

In ASP.Net you do not deploy an EXE. Any advice in that case?

BOb
Bob Archer
Monday, October 29, 2007 4:49:32 PM UTC
Brilliant post Scott! A must read for everyone here in the office. Debug, release builds, and optimizations seem to be common topics of confusion and heated political debate and the number of developers I've come across who are "trying to outwit the JITter" is astounding. Thank you for taking the time to develop clear examples and for presenting the argument in an easily digestible (and repeatable) fashion. I shall henceforth quote thee in meetings when this topic arises.
Saturday, November 10, 2007 12:20:59 AM UTC
Hi Scott

Great article. I know it's only example code (but so many people read your blog), but the Framework Design Guidelines (7.3.2) states that you should not throw or derive from System.ApplicationException. ;)

Nice tip about [.NET Framework Debugging Control] in the .ini file.
BTW, the link to Mark Pearce's article seems to be broken.
Comments are closed.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.