Scott Hanselman

A Smarter (or Pure Evil) ToString with Extension Methods

April 27, '08 Comments [24] Posted in Programming

Three years ago I postulated about a ToString implementation for C# that seemed useful to me, and a few days later I threw it out on the blog. We used it at my old company for a number of things.

Just now I realized that it'd be even more useful (or more evil) with Extension Methods, so I opened up the old project, threw in some MbUnit tests and changed my implementation to use extension methods.

So, like bad take-out food, here it is again, updated as ToString() overloads:

[Test]
public void MakeSimplePersonFormattedStringWithDoubleFormatted()
{
    Person p = new Person();
    string foo = p.ToString("{Money:C} {LastName}, {ScottName} {BirthDate}");
    Assert.AreEqual("$3.43 Hanselman, {ScottName} 1/22/1974 12:00:00 AM", foo);
}

[Test]
public void MakeSimplePersonFormattedStringWithDoubleFormattedInHongKong()
{
    Person p = new Person();
    string foo = p.ToString("{Money:C} {LastName}, {ScottName} {BirthDate}", new System.Globalization.CultureInfo("zh-hk"));
    Assert.AreEqual("HK$3.43 Hanselman, {ScottName} 1/22/1974 12:00:00 AM", foo);
}
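The tests assume a Person class that isn't shown here; this is a minimal sketch, with member names inferred from the format strings and defaults picked so the assertions pass:

public class Person
{
    public Person()
    {
        // Defaults chosen to satisfy the assertions above.
        LastName = "Hanselman";
        Money = 3.43m;
        BirthDate = new DateTime(1974, 1, 22);
    }

    public string LastName { get; set; }
    public decimal Money { get; set; }
    public DateTime BirthDate { get; set; }
}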

It's moderately well covered for all of the hour it took to write it, and I find it useful.


Here's all it is.

using System;
using System.Reflection;
using System.Text;
using System.Text.RegularExpressions;

public static class FormattableObject
{
    public static string ToString(this object anObject, string aFormat)
    {
        return FormattableObject.ToString(anObject, aFormat, null);
    }

    public static string ToString(this object anObject, string aFormat, IFormatProvider formatProvider)
    {
        StringBuilder sb = new StringBuilder();
        Type type = anObject.GetType();
        Regex reg = new Regex(@"({)([^}]+)(})", RegexOptions.IgnoreCase);
        MatchCollection mc = reg.Matches(aFormat);
        int startIndex = 0;
        foreach (Match m in mc)
        {
            Group g = m.Groups[2]; //it's second in the match, between { and }
            int length = g.Index - startIndex - 1;
            sb.Append(aFormat.Substring(startIndex, length));

            string toGet = String.Empty;
            string toFormat = String.Empty;
            int formatIndex = g.Value.IndexOf(":"); //formatting would be to the right of a :
            if (formatIndex == -1) //no formatting, no worries
            {
                toGet = g.Value;
            }
            else //pick up the formatting
            {
                toGet = g.Value.Substring(0, formatIndex);
                toFormat = g.Value.Substring(formatIndex + 1);
            }

            //first try properties
            PropertyInfo retrievedProperty = type.GetProperty(toGet);
            Type retrievedType = null;
            object retrievedObject = null;
            if (retrievedProperty != null)
            {
                retrievedType = retrievedProperty.PropertyType;
                retrievedObject = retrievedProperty.GetValue(anObject, null);
            }
            else //try fields
            {
                FieldInfo retrievedField = type.GetField(toGet);
                if (retrievedField != null)
                {
                    retrievedType = retrievedField.FieldType;
                    retrievedObject = retrievedField.GetValue(anObject);
                }
            }

            if (retrievedType != null) //Cool, we found something
            {
                string result = String.Empty;
                if (toFormat == String.Empty) //no format info, call plain ToString()
                {
                    result = retrievedType.InvokeMember("ToString",
                        BindingFlags.Public | BindingFlags.NonPublic |
                        BindingFlags.Instance | BindingFlags.InvokeMethod | BindingFlags.IgnoreCase,
                        null, retrievedObject, null) as string;
                }
                else //format info, call ToString(format, formatProvider)
                {
                    result = retrievedType.InvokeMember("ToString",
                        BindingFlags.Public | BindingFlags.NonPublic |
                        BindingFlags.Instance | BindingFlags.InvokeMethod | BindingFlags.IgnoreCase,
                        null, retrievedObject, new object[] { toFormat, formatProvider }) as string;
                }
                sb.Append(result);
            }
            else //didn't find a property with that name, so be gracious and put it back
            {
                sb.Append("{");
                sb.Append(g.Value);
                sb.Append("}");
            }
            startIndex = g.Index + g.Length + 1;
        }
        if (startIndex < aFormat.Length) //include the rest (end) of the string
        {
            sb.Append(aFormat.Substring(startIndex));
        }
        return sb.ToString();
    }
}
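Because these methods extend System.Object, they'll work on anything with public properties or fields, provided the type doesn't already define its own ToString(string) overload (instance methods win over extension methods, so DateTime and the numeric types keep their built-in behavior). A quick sketch of using it on something other than Person, assuming FormattableObject is in scope:

using System;
using System.IO;

class Demo
{
    static void Main()
    {
        FileInfo notepad = new FileInfo(@"C:\Windows\notepad.exe");
        // The names inside the braces are resolved against FileInfo's public properties by reflection.
        Console.WriteLine(notepad.ToString("{Name} is {Length} bytes, last written {LastWriteTime:d}"));
    }
}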

Enjoy.


Greatest Hits

April 27, '08 Comments [12] Posted in Musings

I've officially been blogging for six years this week. Yikes! It didn't all suck, but it was pretty rough there early on, and I've finally found my blog's heartbeat.

Here's a little about me if you're interested, but more importantly, here are some posts that are the most often read according to the last several years' web server logs. If you enjoy them, consider subscribing via RSS or email. I also do a weekly podcast that you might like. It's available on iTunes and Zune, and you can get the complete MP3 here. I also publish an Ultimate Tools List each year at http://www.hanselman.com/tools.

Blogging

Twitter

Programming

.NET

Gadgets

Africa, Travel and Language

Books and Music

Diabetes

Babies and Children

Family

General Coolness

Staying Organized and Home Office

Reviews


Hack: Parallel MSBuilds from within the Visual Studio IDE

April 26, '08 Comments [21] Posted in ASP.NET | MSBuild | Programming | Tools

There were a number of interesting comments from my post on Faster Builds with MSBuild using Parallel Builds and Multicore CPUs.

Here are a few good ones, and some answers. First:

Is the developer that sits inside of VS all day debugging able to take advantage of the same build performance improvements?

Yes, I think. This post is about my own hack to make it work.

@Scott, there are still people using VS2005 and .Net 2.0 (our employers are either too cheap to upgrade or we work in non-profit[my case]) So it would be nice for posts like this to clearly denote that these are things only available in the 3.0/3.5+ world. Or should I just assume that .Net 2.0 is dead?

Certainly 2.0 is not dead, considering that 3.5 is still built on the 2.0 CLR. Additionally, if you're a 2.0 shop, there's really no reason not to upgrade to VS2008 and get all the IDE improvements while still building for .NET 2.0 SP1, as MSBuild supports multitargeting.

However, the underlying question here is, "why are you teasing me?" as this gentleman clearly wants faster builds on .NET 2.0. Well, this trick under VS2008 still works when building 2.0 stuff.

If you're using a Solution File (SLN) it's automatic, and if you're using custom MSBuild files, you can build .NET 2.0 stuff with the 3.5 MSBuild by putting

<TargetFrameworkVersion>v2.0</TargetFrameworkVersion>

under a <PropertyGroup> in your C# project. Then you can use /m and get multiproc builds. Not much changes, though, because either way your build output is running on CLR 2.0 (.NET 3.0 and .NET 3.5 use CLR 2.0 underneath). You will get some warnings if you try to use v3.0/v3.5 types.
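For context, that property sits in the project file roughly like this (a trimmed sketch, not a complete .csproj):

<Project ToolsVersion="3.5" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <OutputType>Library</OutputType>
    <TargetFrameworkVersion>v2.0</TargetFrameworkVersion>
  </PropertyGroup>
  <!-- Compile items, references, and the usual CSharp targets import follow -->
</Project>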

It's true that parallel managed code compilation lags behind unmanaged:

Welcome, msbuild, to the 21st century. ;-) GNU make has the -j switch (the equivalent to /m here) _at least_ since the early-to-mid 1990's. We use its mingw port to drive Visual Studio 2003's commandline interface to build vcproj files and to handle dependencies, and since we got multi-core build machines, our build times have improved tremendously (we're talking about roughly 2,5 million SLOCs).

There are a number of reasons for that. For example, Visual Basic is compiling in the background all the time while you're typing in VS, using a non-reentrant in-proc compiler. The in-proc C# compiler is the same way.

The Multicore MSBuild with VS Hack

This trick is unsupported because it hasn't been tested. Off the top of our collective heads, we can't think of why it wouldn't work OK, at least if your solution is all VB/C#. Having VC projects in your solution, especially if there are project references with the VB/C# projects, would probably not work so well. That's because VC projects are not in native MSBuild format in VS2008.

To start with, this is a totally obvious hack. We'll add an External Tool to Visual Studio to call MSBuild, then add some Toolbar buttons.

First, Tools | External Tools. We'll add MSBuild, pointing it to the v3.5 .NET Framework folder. Then, we'll add /m to the arguments for multi-proc and /v:m (where the v is verbosity and the m is minimal) to tone down MSBuild's chattiness. Finally, we'll check the Use Output Window box and set the initial directory to the current Solution's Directory.
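For reference, the External Tool entry I ended up with looks roughly like this (the MSBuild path is the default install location for the 3.5 Framework; adjust for your machine, and name the entry whatever you like):

Title:             MSBuild (Parallel)
Command:           C:\Windows\Microsoft.NET\Framework\v3.5\MSBuild.exe
Arguments:         /m /v:m
Initial directory: $(SolutionDir)
Options:           Use Output window (checked)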

Adding MSBuild to External Tools

Next, we'll right click on the Toolbar, and hit Customize.

External Tools in Visual Studio

Then, from the Commands tab, drag the number of the newly added External Tool (mine is #3) to the Toolbar. I also added a little Lode Runner icon to the button. Additionally, I added a Clean Solution option, just because I like it handy.

Parallel MSBuild Button in Visual Studio

You can also re-map Ctrl-Shift-B if it makes you happy, but I used Ctrl-Shift-M, mapping it to the corresponding Tools.ExternalCommand (again, in my case it's #3).


At this point, I can now run MSBuild out-of-proc from inside Visual Studio.

What DOESN'T Work

The only thing that doesn't work for me so far is that the errors reported by MSBuild don't appear neatly parsed in the Error List. For some, this may be a deal breaker. However, the good news is that if you double-click on the errors in the Output Window you WILL get taken automatically to the right file and line number, which is all I personally care about. Still, it sucks.


Gotchas for this Hack

If you're building against a SLN file, use Project References or manually specified project dependencies; otherwise MSBuild has no way to figure out the dependencies, and you might get "file locked" or other weird concurrency errors.

Also, even if Project to Project references are set up, sometimes project authors are a bit sloppy (I've done this) and inadvertently access the same file in multiple projects, which can cause spurious file locking/access errors. If you see this behavior, turn off multiproc.

Note also that while you can do builds with MSBuild like this, if you hit F5 and start a debug session, the VS in-process compilers will do a quick build. VS WILL figure out that it doesn't need to do anything, but it will check.

Hey, it's NOT faster for me?

If you have a small solution (<10 projects) and/or your solution is designed such that each project "lines up" in a dependency chain rather than a dependency tree, then there will be less parallelism possible.

If you're using a small number of projects, or you're using Visual Basic on small projects (which is compiling in the background anyway), you might not see big gains. Additionally, remember that we're shelling out n new instances of MSBuild, while VS compiles using the in-process versions of the language compilers. With a small number of projects, in-proc single-core might be faster.

I've got 900 Projects and Visual Studio is Slow, what's the deal?

"Doctor, doctor, it's hurts what I do that!"

"Then don't do that"

Seriously, yes, there are limits. Personally, if you have more than 50 or so projects, it's time to break them out into sub-system SLN files, and keep the über-solution just for the build server.

The team is aware that if you have hundreds of projects in a solution, VS totally breaks down. I, and many others, have made suggestions of things like a "tree of projects" where you could build from one point down, and there are ways you can simulate this with multiple configurations, but it's kinda lame right now if you have a buttload of projects.

Here's a post I did three years ago called How do you organize your code? and I think it still mostly works today. I was using NAnt at the time, but the same concepts apply to MSBuild. Forgive me now, as I quote myself:

Every 'subsystem' has a folder, and usually (not always) a Subsystem is a bigger namespace. In this picture there's "Common" and it might have 5 projects under it. There's a peer Common.Test. Under CodeGeneration, for example, there are 14 projects. Seven are sub-projects within the CodeGeneration subsystem, and seven are the NUnit tests for each. The reason that you see Common and Common.Test at this level is that they are the 'highest.' But the pattern follows as you traverse the depths. Each subsystem has its own NAnt build file, and these are used by the primary default.build.

This is not only a nicer way to lay things out for the project, but also for the people on the project, most of whom are not making cross-subsystem changes.

Action Item for you, Dear Reader

If you have feedback then email the MSBuild team at msbuild@ (you know the company name) and they will listen. I warned them you might be emailing, so be nice. If you DEEPLY want multi-proc builds in the next version of Visual Studio, flood them with your (kind) requests. They are aware of ALL these issues and really want to make things better.

It's interesting, having been at MSFT about 7 months now, to realize that for something like multi-proc builds to work, it involves (all) the Compiler Teams, the VS Tooling Teams, and the MSBuild folks. Each has to agree it's important and work towards the goal. Sometimes this is obvious and easy, and other times, less so.

Thanks to Dan Moseley, Cliff Hudson and Cullen Waters for their help!



Hanselminutes Podcast 109 - ASP.NET Dynamic Data with Scott Hunter

April 26, '08 Comments [2] Posted in ASP.NET | ASP.NET Dynamic Data | Podcast

My one-hundred-and-ninth podcast is up. Recorded at the Microsoft MVP Summit, this show is about the new ASP.NET Dynamic Data subsystem that's being added to ASP.NET. I sit down with Scott Hunter, a Program Manager for ASP.NET (everyone in the division is required to be named Scott, BTW ;) ) and dig deeper into Dynamic Data.

Subscribe: Subscribe to Hanselminutes | Subscribe to my Podcast in iTunes

If you have trouble downloading, or your download is slow, do try the torrent with µtorrent or another BitTorrent Downloader.

Do also remember the complete archives are always up and they have PDF Transcripts, a little-known feature that shows up a few weeks after each show.

Telerik is our sponsor for this show.

Check out their UI Suite of controls for ASP.NET. It's very hardcore stuff. One of the things I appreciate about Telerik is their commitment to completeness. For example, they have a page about their Right-to-Left support while some vendors have zero support, or don't bother testing. They also are committed to XHTML compliance and publish their roadmap. It's nice when your controls vendor is very transparent.

As I've said before, this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple: avoid wasting the listener's time (and make the commute less boring).

Enjoy. Who knows what'll happen in the next show?


Faster Builds with MSBuild using Parallel Builds and Multicore CPUs

April 24, '08 Comments [38] Posted in ASP.NET | Programming | Tools

UPDATE: I've written a follow-up on how to get MSBuild building using multiple cores from within Visual Studio. You might check that out when you're done here.

Jeff asked the question "Should All Developers Have Manycore CPUs?" this week. There are a number of things in his post I disagree with.

First, "dual-core CPUs protect you from badly written software" is just a specious statement as it's the OS's job to protect you, it shouldn't matter how many cores there are.

Second, "In my opinion, quad-core CPUs are still a waste of electricity unless you're putting them in a server" is also silly. Considering that modern CPUs slow down when not being used, and use minimal electricity when compared to your desk lamp and monitors, I can't see not buying the best (and most) processors) that I can afford. The same goes with memory. Buy as much as you can comfortably afford. No one ever regretted having more memory, a faster CPU and a large hard drive.

Third, he says, "there are only a handful of applications that can truly benefit from more than 2 CPU cores," but of course, if you're running a handful of applications, you can benefit even if they are not multi-threaded. Just yesterday I was rendering a DVD, watching Arrested Development, compiling an application, and reading email, all while my system was being backed up by Home Server. This isn't an unreasonable amount of multitasking, IMHO, and it's why I have a quad-proc machine.

That said, how much a machine can multi-task is often limited by the bottleneck that sits between the chair and the keyboard. Jeff, of course, must realize this, so I'm just taking issue with his phrasing more than anything.

He does add the disclaimer, which is totally valid: "All I wanted to do here is encourage people to make an informed decision in selecting a CPU" and that can't be a bad thing.

MSBuild

Now, enough picking on Jeff; let's talk about my reality as a .NET Developer and a concrete reason I care about multi-core CPUs. Jeff compiled SharpDevelop using 2 cores and said "I see nothing here that indicates any kind of possible managed code compilation time performance improvement from moving to more than 2 cores."

When I compiled SharpDevelop via "MSBuild SharpDevelop.sln" (which uses one core) it took 11 seconds:

TotalMilliseconds : 11207.7979

Adding the /m:2 parameter to MSBuild yielded a 35% speed up:

TotalMilliseconds : 7190.3041

And adding /m:4 yielded a 59% speed up over one core:

TotalMilliseconds : 4581.4157

Certainly when doing a command line build, why WOULDN'T I want to use all my CPUs? I can detect how many there are using an Environment Variable that is set automatically:

C:\>echo %NUMBER_OF_PROCESSORS%
4
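If you'd rather ask from managed code than from the shell, the same number is available via Environment.ProcessorCount; a trivial sketch:

using System;

class Cores
{
    static void Main()
    {
        // Reports the same value as %NUMBER_OF_PROCESSORS%.
        Console.WriteLine(Environment.ProcessorCount);
    }
}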

But if I just say /m to MSBuild like

MSBuild /m

It will automatically use all the cores on the system to create that many MSBuild processes in a pool as seen in this Task Manager screenshot:

Four MSBuild Processes

The MSBuild team calls these "nodes" because they are cooperating and act as a pool, building projects as fast as they can to the point of being I/O bound. You'll notice that their PIDs (Process IDs) won't change while they are living in memory. This means they are recycled, saving startup time over running MSBuild over and over (which you wouldn't want to do, but which I've seen in the wild).

You might wonder, why do we not just use one multithreaded process for MSBuild? Because each building project wants its own current directory (and potentially custom tasks expect this) and each PROCESS can only have one current directory, no matter how many threads exist.
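If you want to convince yourself of that last point, here's a tiny sketch: a current-directory change made on one thread is visible on every other thread, because the current directory is process-wide state.

using System;
using System.Threading;

class CurrentDirectoryDemo
{
    static void Main()
    {
        Console.WriteLine(Environment.CurrentDirectory);

        // Change the current directory from a second thread...
        Thread t = new Thread(delegate() { Environment.CurrentDirectory = @"C:\Windows"; });
        t.Start();
        t.Join();

        // ...and the main thread sees the change too; it's per-process, not per-thread.
        Console.WriteLine(Environment.CurrentDirectory);
    }
}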

When you run MSBuild on a SLN (Solution File), which is NOT an MSBuild file, MSBuild will create a ".sln.cache" file that IS an MSBuild file.

Some folks like to custom-craft their MSBuild files and others like to use the auto-generated one. Regardless, when you're calling an MSBuild task, one of the options that gets set is (from an auto-generated file):

<MSBuild Condition="'@(BuildLevel1)' != ''" Projects="@(BuildLevel1)" Properties="Configuration=%(Configuration); Platform=%(Platform); ...snip..." BuildInParallel="true" UnloadProjectsOnCompletion="$(UnloadProjectsOnCompletion)" UseResultsCache="$(UseResultsCache)"> ...

When you indicate BuildInParallel, you're asking for parallelism in building your Projects. It doesn't cause Task-level parallelism, as that would require a task dependency tree, and you could get some tricky problems as copies, etc., happened simultaneously.

However, Projects DO often have dependencies on each other, and the SLN file captures that. If you're using a Visual Studio Solution and you've used Project References, you've already given the system enough information to know which projects to build first and which to wait on.
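A Project Reference is just an item in the referencing project's file; that Include path is the dependency edge MSBuild uses to schedule projects correctly when building in parallel (the path here is hypothetical):

<ItemGroup>
  <ProjectReference Include="..\Common\Common.csproj" />
</ItemGroup>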

More Granularity (if needed)

If you are custom-crafting your MSBuild files, you could turn off parallelism on just certain MSBuild tasks by adding:

BuildInParallel="$(BuildInParallel)"

to specific MSBuild Tasks and then just those sub-projects wouldn't build in parallel if you passed in a property from the command line:

MSBuild /m:4 /p:BuildInParallel=false

But this is an edge case as far as I'm concerned.

How does BuildInParallel relate to the MSBuild /m Switch?

Certainly, if you've got a lot of projects that are mostly independent of each other, you'll get more speed up than if your solution's dependency graph is just a queue of one project depending on another all the way down the line.

In conclusion, BuildInParallel allows the MSBuild task to process the list of projects which were passed to it in a parallel fashion, while /m tells MSBuild how many processes it is allowed to start.

If you have multiple cores, you should be using this feature on big builds from the command line and on your build servers.

Thanks to Chris Mann and Dan Moseley for their help and corrections.



About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.