Scott Hanselman

Given I like reading Source Code by the fire with my smoking jacket and brandy snifter, a list of books

April 17, '12 Comments [34] Posted in Open Source | Source Code
Sponsored By

Jeff had a blog post yesterday (seems everything he writes in his retirement gets on Hacker News as well) about reading source code. While Jeff's post is largely a pull-quote of a post on Hacker News by Brandon Bloom, one bit stuck out to me, as I'm sure it did to others.

"The idea that you'd settle down in a deep leather chair with your smoking jacket and a snifter of brandy for a fine evening of reading through someone else's code is absurd." - Jeff Atwood.

Absurd? Hardly. Nearly every programmer I've ever spoken to enjoys reading and discovering new code. I've been advocating that Developers need to read as much code as they write for at least half the time I've been blogging (10 years now, as of yesterday.) How could you not be excited about reading source with all the wonderful open source that's available in the world today?

In fact I have an entire category of my blog called the "Weekly Source Code" with 58 different specific entries at last count. That's 58 different great opportunities to read and learn from another programmer, some good, some bad.

The idea that reading source code is absurd is really the wrong message to send. Here's a list of interesting books about source and source code that I'd recommend you settle down in your leather chair, stoke the fire and read.

Of course, you don't need to buy any of these books or pay for anything. Just read code. Read your coworkers' code, your company's code, your favorite open source library's code. Don't stop reading code.

The Weekly Source Code was weekly but then became "whenever I get the time." Because of Jeff's article I'm going to get a smoking jacket and brandy snifter and start doing new Weekly Source Code posts every week. Ok, it will be a Code Zero snifter but you get the idea. Because you can't be a good coder if you aren't a good reader.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by ORCS Web

The Weekly Source Code 59 - An Open Source Treasure: Irony .NET Language Implementation Kit

October 14, '11 Comments [19] Posted in Learning .NET | Open Source | Source Code

One of the best ways, if not the best way, to sharpen the saw and keep your software development skills up to date is by reading code. Sure, write lots of code, but don't forget to explore other people's code. There's always fifteen different ways to create a "textboxes over data" application, and while it's interesting to take a look at whatever the newest way to make business software is, sometimes it's nice to relax by looking at some implementations of classic software issues like parsers, lexers, and abstract syntax trees. If you didn't go to school or failed to take a compilers class, at least knowing that this area of software engineering exists and is accessible to you is very important.

It's so nice to discover open source projects that I didn't know existed. One such project I just stumbled upon while doing research for a customer is "Irony," a .NET language implementation kit. From the CodePlex site:

Irony is a development kit for implementing languages on .NET platform. Unlike most existing yacc/lex-style solutions Irony does not employ any scanner or parser code generation from grammar specifications written in a specialized meta-language. In Irony the target language grammar is coded directly in c# using operator overloading to express grammar constructs. Irony's scanner and parser modules use the grammar encoded as c# class to control the parsing process. See the expression grammar sample for an example of grammar definition in c# class, and using it in a working parser.

Irony includes "simplified grammars for C#, Scheme, SQL, GwBasic, JSON and others" to learn from. There are different kinds of parser generators you might be familiar with. For example, ANTLR is what's called an LL(*) parser generator, while Irony builds an LALR (Look-Ahead LR) parser, where LR means the input is scanned Left to right and the parse is a Rightmost derivation in reverse.

Here's a very basic SQL statement for getting a show from my Podcast database:

SELECT ID, Title FROM Shows WHERE ID = 1

Here's the Irony Parse Tree as viewed in the Irony Grammar Explorer:

A complete parse tree of the SQL statement with every node expanded

Typically, in my experience, when creating a parser one will use a DSL (Domain Specific Language) like the GOLD Meta Language building on the BNF (Backus-Naur Form) expression grammar. These domain specific languages are tightly optimized to express exactly how a language is structured and how it should be parsed. You learn a language to create languages.

Remember the Irony intro text earlier? Let me repeat:

Unlike most existing yacc/lex-style solutions Irony does not employ any scanner or parser code generation from grammar specifications written in a specialized meta-language. In Irony the target language grammar is coded directly in c# using operator overloading to express grammar constructs.

What Roman from Irony has done here is use C# language constructs as if it's a DSL. A fluent parser, as it were. So he's using C# classes and methods to express the language grammar. It's a very interesting and powerful idea if you are interested in creating DSLs but not interested in learning other parsers like GOLD. Plus, it's just fun.

The Irony Grammar Explorer

He has a very rich base class called Grammar that you derive from, like:

[Language("SQL", "89", "SQL 89 grammar")]
public class SqlGrammar : Grammar {
public SqlGrammar() : base(false) { //SQL is case insensitive
...

But instead of a grammar language like this (simplified by me) to express a SQL SELECT Statement:

! =============================================================================
! Select Statement
! =============================================================================
<SELECT Stm> ::= SELECT <COLUMNS> <INTO Clause> <FROM Clause> <WHERE Clause>
                 <GROUP Clause> <HAVING Clause> <ORDER Clause>

<COLUMNS> ::= <RESTRICTION> '*'
            | <RESTRICTION> <COLUMN List>

...snip for clarity...

<RESTRICTION> ::= ALL
                | DISTINCT
                |

<AGGREGATE> ::= Count '(' '*' ')'
              | Count '(' <EXPRESSION> ')'
              | Avg '(' <EXPRESSION> ')'
              | Min '(' <EXPRESSION> ')'
              | Max '(' <EXPRESSION> ')'
              | StDev '(' <EXPRESSION> ')'
              | StDevP '(' <EXPRESSION> ')'
              | Sum '(' <EXPRESSION> ')'
              | Var '(' <EXPRESSION> ')'
              | VarP '(' <EXPRESSION> ')'

<INTO Clause> ::= INTO Id
                |

<FROM Clause> ::= FROM <ID List> <JOIN Chain>

<JOIN Chain> ::= <JOIN> <JOIN Chain>
               |

...snip for clarity...

You'd have something like this instead, again, simplified so this doesn't turn into a giant listing of code rather than a blog post.

//Select stmt
selectStmt.Rule = SELECT + selRestrOpt + selList + intoClauseOpt + fromClauseOpt + whereClauseOpt + groupClauseOpt + havingClauseOpt + orderClauseOpt;
selRestrOpt.Rule = Empty | "ALL" | "DISTINCT";
selList.Rule = columnItemList | "*";
columnItemList.Rule = MakePlusRule(columnItemList, comma, columnItem);
columnItem.Rule = columnSource + aliasOpt;
aliasOpt.Rule = Empty | asOpt + Id;
asOpt.Rule = Empty | AS;
columnSource.Rule = aggregate | Id;
aggregate.Rule = aggregateName + "(" + aggregateArg + ")";
aggregateArg.Rule = expression | "*";
aggregateName.Rule = COUNT | "Avg" | "Min" | "Max" | "StDev" | "StDevP" | "Sum" | "Var" | "VarP";
intoClauseOpt.Rule = Empty | INTO + Id;
fromClauseOpt.Rule = Empty | FROM + idlist + joinChainOpt;
joinChainOpt.Rule = Empty | joinKindOpt + JOIN + idlist + ON + Id + "=" + Id;
joinKindOpt.Rule = Empty | "INNER" | "LEFT" | "RIGHT";
whereClauseOpt.Rule = Empty | "WHERE" + expression;
groupClauseOpt.Rule = Empty | "GROUP" + BY + idlist;
havingClauseOpt.Rule = Empty | "HAVING" + expression;
orderClauseOpt.Rule = Empty | "ORDER" + BY + orderList;

Here the variables and terms that are being used to build the grammar were defined earlier like this, as an example:

var SELECT = ToTerm("SELECT");
var FROM = ToTerm("FROM");
var AS = ToTerm("AS");
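If you're wondering how a line like whereClauseOpt.Rule = Empty | "WHERE" + expression can even compile, here's a minimal sketch of the operator overloading trick. To be clear, this is not Irony's code: Term, Literal, Sequence, Choice and GrammarSketch are hypothetical names I made up for illustration, and the real library does a great deal more.

```csharp
using System;

// Hypothetical mini grammar model, for illustration only. None of these types are Irony's.
public abstract class Term
{
    // "a + b" builds a sequence node: a followed by b.
    public static Term operator +(Term left, Term right) { return new Sequence(left, right); }

    // "a | b" builds a choice node: either a or b.
    public static Term operator |(Term left, Term right) { return new Choice(left, right); }

    // Lets a plain string such as "WHERE" stand in for a literal token.
    public static implicit operator Term(string text) { return new Literal(text); }
}

public sealed class Literal : Term
{
    private readonly string text;
    public Literal(string text) { this.text = text; }
    public override string ToString() { return "'" + text + "'"; }
}

public sealed class Sequence : Term
{
    private readonly Term left, right;
    public Sequence(Term left, Term right) { this.left = left; this.right = right; }
    // string.Format is used on purpose: "left + right" would call the overloaded operator.
    public override string ToString() { return string.Format("{0} {1}", left, right); }
}

public sealed class Choice : Term
{
    private readonly Term left, right;
    public Choice(Term left, Term right) { this.left = left; this.right = right; }
    public override string ToString() { return string.Format("({0} | {1})", left, right); }
}

public static class GrammarSketch
{
    public static string BuildWhereClause()
    {
        Term empty = new Literal("");
        Term expression = new Literal("expr");

        // "+" binds tighter than "|", so this reads as: empty | ("WHERE" + expression)
        Term whereClauseOpt = empty | "WHERE" + expression;
        return whereClauseOpt.ToString();
    }
}
```

The point is that the "grammar" is just an object graph built by ordinary C# expressions, which a parser builder can walk later. Calling GrammarSketch.BuildWhereClause() hands back a printable version of that graph.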

You might immediately declare, Dear Reader, that this is blasphemy!  How can C# compete with a specialized DSL like the BNF? This is a C# shaped peg being shoved into a round hole. Well, maybe, but it's interesting to point out that the SQL GOLD Grammar is 259 lines and the C# version of essentially the same thing is 247 lines. Now, I'm not pointing out line counts to imply that this is a better way or that this is even a valid 1:1 comparison. But, it's interesting that the C# class is even close. You might have assumed it would be much much larger. I think it's close because Roman, the Irony developer, has a very well factored and specialized base class for the derived class to "lean on." Each of his sample grammars is surprisingly tight.

For example:

  • "Mini" Python - ~140 lines
  • Java - ~130 lines
  • Scheme - ~200 lines
  • JSON - 39 lines

To conclude, here's the JSON grammar generator. 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Irony.Parsing;

namespace Irony.Samples.Json
{
[Language("JSON", "1.0", "JSON data format")]
public class JsonGrammar : Grammar
{
public JsonGrammar()
{
//Terminals
var jstring = new StringLiteral("string", "\"");
var jnumber = new NumberLiteral("number");
var comma = ToTerm(",");

//Nonterminals
var jobject = new NonTerminal("Object");
var jobjectBr = new NonTerminal("ObjectBr");
var jarray = new NonTerminal("Array");
var jarrayBr = new NonTerminal("ArrayBr");
var jvalue = new NonTerminal("Value");
var jprop = new NonTerminal("Property");
//Rules
jvalue.Rule = jstring | jnumber | jobjectBr | jarrayBr | "true" | "false" | "null";
jobjectBr.Rule = "{" + jobject + "}";
jobject.Rule = MakeStarRule(jobject, comma, jprop);
jprop.Rule = jstring + ":" + jvalue;
jarrayBr.Rule = "[" + jarray + "]";
jarray.Rule = MakeStarRule(jarray, comma, jvalue);
//Set grammar root
this.Root = jvalue;
MarkPunctuation("{", "}", "[", "]", ":", ",");
this.MarkTransient(jvalue, jarrayBr, jobjectBr);
}
}
}

Pretty clever stuff, and a well-structured, well put together project and solution. I could see myself using this in a C# or Compiler class to teach some of these concepts. It's also a great little tool for creating small languages of your own. Perhaps you have a Wiki-dialect that's specific to your company and you want to get rid of all that nasty manual parsing? Or maybe you have an old custom workflow engine or custom expression system embedded in your application and never got around to changing all your parsing to a proper grammar? Maybe now is the time to get that little language you've been thinking about off the ground!

I encourage you, Dear Reader, to support open source projects like this. Why not go leave a comment today on your favorite open source project's site and just let them know you appreciate what they're doing?


The Weekly Source Code 58 - Generating (Database) Test Data with AutoPoco and Entity Framework Code First

January 5, '11 Comments [26] Posted in Open Source | Source Code

I was messing around with Entity Framework Code First for data access recently (what I like to call EF Magic Unicorn) and had a need to create a bunch of test data. Such a chore. It's totally no fun and I always end up slapping the keyboard thinking that someone else should be slapping the keyboard. Test data for 10 instances of a class is easy, but 1000 is somehow less inspiring.

Sure, there's lots of software I could buy to solve a problem like that, but meh. I dug around some and found an open source framework called AutoPoco from Rob Ashton (@robashton). First, awesome name. I like using open source projects with cool names. It's kind of like reading a book with a great cover. Makes you feel good jumping in. Then, I started working with it and realized there's some substance to this modest little library. All the better, right?

AutoPoco's CodePlex project says:

AutoPoco is a highly configurable framework for the purpose of fluently building readable test data from Plain Old CLR Objects

Here's the general idea. You need 100 of something (or 5, or whatever) set up a certain way, so you ask AutoPoco to construct a bunch for you, and it's so. It's that easy.

For example, please get me 1000 objects of type SimpleUser and make all their FirstNames "Bob":

IList<SimpleUser> users = session.List<SimpleUser>(1000).Impose(x => x.FirstName, "Bob").Get();

Here's where it really shines, though:

session.List<SimpleUser>(100)
.First(50)
.Impose(x => x.FirstName, "Rob")
.Impose(x => x.LastName, "Ashton")
.Next(50)
.Impose(x => x.FirstName, "Luke")
.Impose(x => x.LastName, "Smith")
.All().Random(25)
.Impose(x => x.Role,roleOne)
.Next(25)
.Impose(x => x.Role,roleTwo)
.Next(50)
.Impose(x => x.Role, roleThree)
.All()
.Invoke(x => x.SetPassword("Password1"))
.Get();

This says:

Create 100 users
The first 50 of those users will be called Rob Ashton
The last 50 of those users will be called Luke Smith
25 Random users will have RoleOne
A different 25 random users will have RoleTwo
And the other 50 users will have RoleThree
And set the password on every single user to Password1

Effectively, the sky's the limit. You can also give AutoPoco more advanced requirements like "make emails meet these requirements," or "call this method when it's time to get a password." The idea being not just to make test data, but to make somewhat meaningful test data.
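If you're curious how a First/Next/Impose chain like the one above could hang together, here's a rough sketch of the fluent idea. This is emphatically not AutoPoco's implementation: TinyListBuilder and this SimpleUser class are hypothetical names of mine, and I've left out Random(), Invoke() and all the convention machinery. It only assumes that Impose takes a property lambda and a value, as in the calls above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical POCO used only by this sketch.
public class SimpleUser
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical fluent builder; not part of AutoPoco.
public class TinyListBuilder<T> where T : class, new()
{
    private readonly List<T> items;
    private int start;   // index of the first item in the current selection
    private int count;   // number of items in the current selection

    public TinyListBuilder(int size)
    {
        items = Enumerable.Range(0, size).Select(_ => new T()).ToList();
        start = 0;
        count = size;
    }

    // Each selector just moves the selection window and returns "this" so calls can chain.
    public TinyListBuilder<T> First(int n) { start = 0; count = n; return this; }
    public TinyListBuilder<T> Next(int n) { start += count; count = n; return this; }
    public TinyListBuilder<T> All() { start = 0; count = items.Count; return this; }

    // Reads the property out of the lambda (x => x.FirstName) and sets it on the selected items.
    public TinyListBuilder<T> Impose<TProp>(Expression<Func<T, TProp>> property, TProp value)
    {
        var member = (MemberExpression)property.Body;
        var info = (PropertyInfo)member.Member;
        foreach (var item in items.Skip(start).Take(count))
        {
            info.SetValue(item, value, null);
        }
        return this;
    }

    public IList<T> Get() { return items; }
}
```

Usage would look something like new TinyListBuilder<SimpleUser>(4).First(2).Impose(x => x.FirstName, "Bob").Next(2).Impose(x => x.FirstName, "Alice").Get(). The expression tree is what lets the builder know which property you meant without any magic strings.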

This got me thinking that since we're using POCOs (Plain Ol' CLR Objects) that I could use this not only for Unit Tests but also Integration Tests and Smoke Tests. I could use this to generate test data in a database. All the better to use this with the new Entity Framework stuff that also uses POCOs.

For example:

public void MakeTestData()
{
IGenerationSessionFactory factory = AutoPocoContainer.Configure(x =>
{
x.Conventions(c => { c.UseDefaultConventions(); });
x.AddFromAssemblyContainingType<SimpleUser>();
});

IGenerationSession session = factory.CreateSession();

IList<SimpleUser> users = session.List<SimpleUser>(1000)
.First(500)
.Impose(x => x.FirstName, "Bob")
.Next(500)
.Impose(x => x.FirstName, "Alice")
.All()
.Impose(x => x.LastName, "Hanselman")
.Random(250)
.Impose(x => x.LastName, "Blue")
.All().Random(400)
.Impose(x => x.LastName, "Red")
.All()
.Get();

SimpleUserDatabase db = new SimpleUserDatabase();
foreach (SimpleUser s in users)
{
db.Users.Add(s);
}
db.SaveChanges();
}

And boom, I've got 1000 users in my little database.

I've talked to the author, Rob, and I think that the Session creation and Factory stuff could be made smaller, and the loop at the bottom could be reduced a line or two. Rob's a practical guy, and I look forward to where AutoPoco goes next! All in all, what a useful library. I can see myself using this a lot.

You can get AutoPoco from CodePlex, or even better, from inside Visual Studio using NuGet (http://nuget.org) via "install-package autopoco."

Enjoy! This library has 36 downloads, but deserves a few orders of magnitude more. I'll do my best to showcase more open source libraries that deserve attention (and more importantly, to be used!) going forward. Feel free to email me suggestions of insanely useful libraries.


The Weekly Source Code 57 - Controlling an Eagletron TrackerPod with C# 4, ASP.NET MVC and jQuery

November 22, '10 Comments [12] Posted in Channel9 | Hardware | Open Source | Remote Work | Source Code | VS2010

I have a 42" HDTV in Seattle that's hooked up all the time as an "Embodied Social Proxy." It's a portal between the Microsoft Redmond campus and my house here in Oregon. I've blogged about Embodied Social Proxies before as well as shot video of one for Channel 9. It's called the "HanselCart" around work, although recently it's stopped being a cart and now it's a whole office that folks in Building 42 can drop by and see me, er, the Virtual Me.

One of the things that hasn't been 100% smooth is that while the LifeCam Cinema HD 720p is a nice camera, I can't MOVE it. I have to ask folks to move it for me, which is a slight irritant.

I'm getting ready to head up to Seattle for a meeting. While I was packing I found this TrackerPod motorized WebCam pan/tilt/zoom in my junk closet. I must have purchased it a long time ago and forgotten. I drilled a hole into the metal base of the LifeCam Cinema HD and superglued it while half-threading it on the TrackerPod's standard tripod-style male screw.

It's late, but I figured if I was going up to Seattle tomorrow, maybe I could hack something together quickly with this device and take it with me. There's a Custom Programming API page on the TrackerPod site with a TrackerPod COM Client.

This is what I built in action:

Here's how I did it in 40 minutes. First, I made a new ASP.NET MVC 3 web project, keeping the default template. This is quick and dirty, right?

Yes, I used a <TABLE>, sue me.


Here's the complete Razor View. I knew that I'd want a bunch of buttons to move the camera, and I assumed I would use jQuery to make an AJAX call to the server side running ASP.NET MVC. I'm using the latest jQuery 1.4.4 and I'm getting it from the updated Microsoft cookieless CDN (Content Delivery Network.)

Rather than making a complex switch statement for the different buttons or different event handlers, I decided to use arbitrary HTML5 data attributes. Each INPUT Button has attributes like data-xvalue and data-yvalue.

There's one Click() handler hooked up to all Buttons. It gets the values of those data attributes, then POSTs the data to the Move method of the Home Controller.

@{
View.Title = "Home Page";
}
<script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.4.4.min.js"
type="text/javascript"></script>
<table border="0">
<tr>
<td></td><td>
<input type="button" value=" up " name="up"
data-xvalue="0" data-yvalue="-10" data-method="0" />
</td><td></td>
</tr>
<tr>
<td>
<input type="button" value="left" name="left"
data-xvalue="10" data-yvalue="0" data-method="0" />
</td>
<td>
<input type="button" value="home" name="home"
data-xvalue="0" data-yvalue="0" data-method="1" />
</td>
<td>
<input type="button" value="right" name="right"
data-xvalue="-10" data-yvalue="0"
data-method="0" />
</td>
</tr>
<tr>
<td>
</td>
<td>
<input type="button" value="down" name="down"
data-xvalue="0" data-yvalue="10" data-method="0" />
</td>
<td>
</td>
</tr>
</table>
<script type="text/javascript">
//<![CDATA[
$(document).ready(function () {
$('input').click(function (event) {
var target = event.target;
var x = $(target).data('xvalue');
var y = $(target).data('yvalue');
var m = $(target).data('method');
$.post("/Home/Move", { x: x, y: y, method: m });
}
);
});
//]]>
</script>

In the Home Controller, there's a method called Move(int x, int y, int method) where method is the way to move the camera - relative is 0 or absolute is 1. That's part of the camera's calling convention.

using System.Web.Mvc;
using TRACKERPOD_DUAL_COMLib;

namespace TrackerPodWeb.Controllers
{
public class HomeController : Controller
{
private dynamic cam = MvcApplication.myCameraInstance;

[HttpPost]
public void Move(int x, int y, int method)
{
cam.x = x;
cam.y = y;
cam.move_method = method;
cam.move();
}

public ActionResult Index()
{
return View();
}
}
}

See that dynamic object? That was the part that blew me away. I'm so used to COM Interop being a freaking nightmare from .NET that I spent most of the time messing with COM Interfaces and Type Libraries and exceptions when I realized that C# 4 was supposed to fix all that.

Like I've been saying about Razor - "stop thinking about syntax and just use it." - the same applies to COM interop in .NET 4 (Remember that the Visual Basic guys have had this nice experience for years...that's why VB is such a popular business automation language.)

Just use the dynamic keyword and start calling COM methods. Seriously, it just worked. I was copy/pasting code from the TrackerCam's VB6 (yes Visual Basic 6) samples into C#4 and other than a few semicolons, it was working directly!

Here's my Web Application's startup code:

public static dynamic myCameraInstance { get; set; }

protected void Application_Start()
{
//snip the MVC init stuff...
myCameraInstance = new TrackerPod();
myCameraInstance.app_name = "hanselcam";
myCameraInstance.initialize();
}

Here I hang on to the COM object for the camera as a poor man's singleton for use elsewhere. I should probably put guard-code around this to make sure it doesn't disappear or something but it's working so far. It should be a proper singleton I think.
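For the "proper singleton" I mention, one option in .NET 4 is Lazy<T>, which creates the value on first use and only once. Here's a sketch under some assumptions: FakeCamera is a hypothetical stand-in for the COM TrackerPod object (so the sample runs without the hardware or the type library), and CameraHolder is a name I invented. I haven't tried this against the real COM object.

```csharp
using System;

// Hypothetical stand-in for the COM camera object, so this sketch runs anywhere.
public class FakeCamera
{
    public static int InstancesCreated;

    public string AppName { get; set; }
    public bool Initialized { get; private set; }

    public FakeCamera() { InstancesCreated++; }
    public void Initialize() { Initialized = true; }
}

public static class CameraHolder
{
    // By default Lazy<T> runs the factory at most once, even if several
    // request threads ask for Value at the same moment.
    private static readonly Lazy<FakeCamera> camera = new Lazy<FakeCamera>(() =>
    {
        var cam = new FakeCamera { AppName = "hanselcam" };
        cam.Initialize();
        return cam;
    });

    // Nothing is created until the first caller touches Instance.
    public static FakeCamera Instance
    {
        get { return camera.Value; }
    }
}
```

With something like this, the controller would read CameraHolder.Instance instead of a settable static property, so nothing else in the app can accidentally replace or null out the camera.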

Then I use that instance in my HomeController and call the COM methods in Move(). ASP.NET MVC takes care of the binding from jQuery to the Action Method, and .NET 4, C# and the DLR take care of the call into the COM TrackerCam stuff.

HTML5+jQuery -> ASP.NET MVC -> C# 4 dynamic keyword -> DLR COM Binder -> COM Library = It just works.

There's some HTML5 attributes, five lines of JS here and basically four lines of COM interop on my Move() method.

Now I'll be able to control my Seattle WebCam from Oregon. I may make it so I can control it from the Lync chat client or something. It'd also be nice if someone wrapped up the TrackerPod as a nice C# library and put it on CodePlex.

I'll add that to my ToDo list, or perhaps you will, Dear Reader. ;)


Back to (Parallel) Basics: Don't Block Your Threads, Make Async I/O Work For You

November 15, '10 Comments [11] Posted in Back to Basics | Learning .NET | Programming | Source Code

Stephen Toub is one of my favorite folks at Microsoft. I've asked him questions before, sometimes for myself, sometimes on your behalf, Dear Reader, and I've always received thoughtful and well reasoned answers. Because I believe strongly in Jon Udell's "Conserve Your Keystrokes" philosophy, I always try to get great information out to the community, especially when it's been emailed. Remember, when you slap the keyboard and write an epic email to just a few people, there's millions of people out there that miss out. Email less, blog more. More on this in a moment.

TIP: If you're interested in Patterns for Parallel Programming, run, don't walk, and download the FREE and extensive eBook called, yes, you guessed it, Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4. Yes, that title is long, but it feels shorter if you process it in parallel. Seriously, it's free and there's a C# and Visual Basic version. It's brilliant.
Now, if you're REALLY interested in the topic, go get the book Parallel Programming with Microsoft .NET by Stephen Toub, Ade Miller, Colin Campbell, and Ralph Johnson. The complete book as HTML is also hosted here.

I recently noticed a blog post from my friend Steve Smith where he shares some quick sample code to "Verify a List of URLs in C# Asynchronously." As I know Steve wouldn't mind me digging into this, I did. I started by asking Stephen Toub in the Parallel Computing group at Microsoft.

Steve Smith wanted to verify a list of URLs for existence. This is the basic synchronous code:

private static bool RemoteFileExists(string url)
{
try
{
var request = WebRequest.Create(url) as HttpWebRequest;
request.Method = "HEAD";
var response = request.GetResponse() as HttpWebResponse;
return (response.StatusCode == HttpStatusCode.OK);
}
catch
{
return false;
}
}

Then Steve changed the code to be Parallel using the new Parallel features of .NET 4 as Stephen Toub helped me explain in "Back to Parallel Basics" in April.

var linkList = GetLinks();

Action<int> updateLink = i =>
{
UpdateLinkStatus(linkList[i]); //fetch URL and update its status in a shared list
};
Parallel.For(0, linkList.Count, updateLink);

Using Parallel.For is a really great way to introduce some basic naive parallelism into your applications.

I'm no expert in parallelism (I've read a great whitepaper...) but I asked Stephen Toub if this was the best and recommended way to solve this problem. Stephen responded from a plane using (his words) "email compiled and tested" examples. With his permission, I've included a derivation of his response here in this blog post for my own, and possibly your, edification.

From Stephen:

First, it looked like the author was proposing using a parallel loop to handle this.  That's ok, and certainly easy, but that’s the kind of thing you’d only really want to do in a client application and not a server application.  The issue here is that, while easy, it blocks threads; for a client application, having a few more threads that are blocked typically isn’t a big deal; for a server app, though, if for example you were doing this in response to an incoming ASP.NET or WCF service request, you'd be blocking several threads per request, which will greatly hinder scalability.  Still, to get up and running quickly, and if the extra few threads isn’t problematic, this is a fine way to go. 

Assuming you want to "fan out" quickly and easily and it's OK to block a few threads, you can either use a parallel loop, tasks directly, or Stephen's personal favorite, a PLINQ query, e.g. if I have a function "bool ValidateUrl(string url);", I can use PLINQ to process up to N at a time:

bool [] results = (from url in urls.AsParallel() select ValidateUrl(url)).ToArray();

In this example, PLINQ will use up to N threads from the ThreadPool, where N defaults to Environment.ProcessorCount, but you can tack on .WithDegreeOfParallelism(N) after the AsParallel() and provide your own N value.
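Here's roughly what tacking on WithDegreeOfParallelism looks like. A few assumptions: PlinqSketch and ValidateAll are names I made up, and this ValidateUrl is a fake that just checks the scheme so the sample runs without touching the network. I've also added AsOrdered() so the results line up with the input URLs.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical stand-in for the real check: no network I/O, just a prefix test.
    public static bool ValidateUrl(string url)
    {
        return url.StartsWith("http://", StringComparison.OrdinalIgnoreCase);
    }

    public static bool[] ValidateAll(string[] urls, int maxThreads)
    {
        return urls.AsParallel()
                   .AsOrdered()                          // keep results in input order
                   .WithDegreeOfParallelism(maxThreads)  // cap the number of concurrent workers
                   .Select(url => ValidateUrl(url))
                   .ToArray();
    }
}
```

So PlinqSketch.ValidateAll(urls, 4) would process at most four URLs at a time. With a real, blocking ValidateUrl, each of those still ties up a thread while it waits on the network.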

If Steve was doing this in a console app, which is likely, then as Stephen points out, that's no big deal. You've usually got threads to spare on the client. On the server side, however, you want to avoid blocking threads as much as you can.

A better solution from a scalability perspective, says Stephen, is to take advantage of asynchronous I/O.  When you're calling out across the network, there's no reason (other than convenience) to block threads while waiting for the response to come back. Unfortunately, in the past it's been difficult to do this kind of aggregation of async operations.  We'd need to rewrite our ValidateUrl method to be something more like:

public void ValidateUrlAsync(string url, Action<string,bool> callback);

where the method returns immediately and later calls back through the provided callback to alert whether a given URL is valid or not.  We'd then change our usage of this to be more like this. Notice the use of System.Collections.Concurrent.ConcurrentQueue, which represents a thread-safe first-in, first-out (FIFO) collection, and CountdownEvent, which represents a synchronization primitive that is signaled when its count reaches zero.

using (var ce = new CountdownEvent(urls.Length))
{
    var results = new ConcurrentQueue<Tuple<string,bool>>();

    Action<string,bool> callback = (url, valid) =>
    {
        results.Enqueue(Tuple.Create(url, valid));
        ce.Signal();
    };

    foreach (var url in urls) ValidateUrlAsync(url, callback);

    ce.Wait();
}

Assuming ValidateUrlAsync is written to use async, e.g. (you'd really want the following to do better error-handling, but again, this is email-compiled):

public void ValidateUrlAsync(string url, Action<string,bool> callback)
{
var request = (HttpWebRequest)WebRequest.Create(url);
try
{
request.BeginGetResponse(iar =>
{
HttpWebResponse response = null;
try
{
response = (HttpWebResponse)request.EndGetResponse(iar);
callback(url, response.StatusCode == HttpStatusCode.OK);
}
catch { callback(url, false); }
finally { if (response != null) response.Close(); }
}, null);
}
catch { callback(url, false); }
}

This would then end up blocking only the main thread, first while it launches all of the requests and then while it waits for all of the responses, rather than blocking one thread per request.  With a slight change, we could also make the launcher async, for example:

public static void ValidateUrlsAsync(string[] urls, Action<IEnumerable<Tuple<string,bool>>> completed)
{
    var ce = new CountdownEvent(urls.Length);
    var results = new ConcurrentQueue<Tuple<string,bool>>();
    Action<string,bool> callback = (url, valid) =>
    {
        results.Enqueue(Tuple.Create(url, valid));
        // Signal() returns true when the count reaches zero, i.e. on the last response
        if (ce.Signal()) completed(results);
    };
    foreach (var url in urls) ValidateUrlAsync(url, callback);
}

Still, this is all really complicated, and much more difficult than the original one-liner using PLINQ. 

This is where Tasks and the new Async CTP come in really handy. Imagine that instead of

void ValidateUrlAsync(string url, Action<string,bool> callback);

we instead had

Task<bool> ValidateUrlAsync(string url);

The Task<bool> being returned is much more composable, and represents the result (both the successful completion case and the exceptional case) of the async operation. 

BETA NOTE: It's not possible to have both ASP.NET MVC 3 and the Async CTP installed at the same time. This is a beta conflict thing, it'll be fixed, I'm sure.

If we had such an operation, and if we had a Task.WhenAll method that took any number of tasks and returned a task to represent them all, then we can easily await all of the results, e.g.

bool [] results = await Task.WhenAll(from url in urls select ValidateUrlAsync(url));

Nice and simple, entirely asynchronous, no blocked threads, etc. 

(Note that in the Async CTP, Task.WhenAll is currently TaskEx.WhenAll, because since it was an out-of-band CTP we couldn't add the static WhenAll method onto Task like we wanted to.)
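For what it's worth, WhenAll did end up on Task itself once async and await shipped in .NET 4.5, so on a current framework the one-liner above works as written. Here's a small sketch assuming .NET 4.5 or later. WhenAllSketch is a made-up name, and this ValidateUrlAsync is a fake that waits briefly and checks the scheme rather than making a real request.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-in for the real check: completes asynchronously, no network I/O.
    public static async Task<bool> ValidateUrlAsync(string url)
    {
        await Task.Delay(10);
        return url.StartsWith("http://", StringComparison.OrdinalIgnoreCase);
    }

    // Starts every validation, then returns one task that completes when they all have.
    // The result array is in the same order as the input URLs.
    public static Task<bool[]> ValidateAllAsync(string[] urls)
    {
        return Task.WhenAll(urls.Select(url => ValidateUrlAsync(url)));
    }
}
```

From an async method you'd write bool[] results = await WhenAllSketch.ValidateAllAsync(urls); and no thread sits blocked while the checks are in flight.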

With the Async CTP and the await keyword, it's also much easier to implement the ValidateUrlAsync method, and to do so with complete support for exception handling (which I didn't do in my previous example, i.e. if something fails, it doesn't communicate why).

public async Task<bool> ValidateUrlAsync(string url)
{
using(var response = (HttpWebResponse)await WebRequest.Create(url).GetResponseAsync())
return response.StatusCode == HttpStatusCode.OK;
}

Even without the Async CTP, though, it's still possible to implement ValidateUrlAsync with this signature.

Notice the use of System.Threading.Tasks.TaskCompletionSource. From MSDN:

In many scenarios, it is useful to enable a Task<TResult> to represent an external asynchronous operation. TaskCompletionSource<TResult> is provided for this purpose. It enables the creation of a task that can be handed out to consumers, and those consumers can use the members of the task as they would any other.

public Task<bool> ValidateUrlAsync(string url)
{
var tcs = new TaskCompletionSource<bool>();
var request = (HttpWebRequest)WebRequest.Create(url);
try
{
request.BeginGetResponse(iar =>
{
HttpWebResponse response = null;
try
{
response = (HttpWebResponse)request.EndGetResponse(iar);
tcs.SetResult(response.StatusCode == HttpStatusCode.OK);
}
catch(Exception exc) { tcs.SetException(exc); }
finally { if (response != null) response.Close(); }
}, null);
}
catch(Exception exc) { tcs.SetException(exc); }
return tcs.Task;
}

So, with this method, even without the Async CTP, we can use existing .NET 4 support to handle this relatively easily:

Task.Factory.ContinueWhenAll(
(from url in urls select ValidateUrlAsync(url)).ToArray(),
completedTasks => { /* do some end task */ });

Now, using just what comes with .NET 4 proper I get the best of all worlds.
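To make the "do some end task" part a bit more concrete, here's a sketch of reading the results inside the continuation, using only .NET 4 APIs. The assumptions: ContinueWhenAllSketch and CountValidAsync are names I invented, and this ValidateUrlAsync fakes the work on the thread pool and completes a TaskCompletionSource instead of calling BeginGetResponse.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ContinueWhenAllSketch
{
    // Hypothetical stand-in: does its "work" on the thread pool and reports
    // the outcome through a TaskCompletionSource, with no network I/O.
    public static Task<bool> ValidateUrlAsync(string url)
    {
        var tcs = new TaskCompletionSource<bool>();
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try { tcs.SetResult(url.StartsWith("http://", StringComparison.OrdinalIgnoreCase)); }
            catch (Exception exc) { tcs.SetException(exc); }
        });
        return tcs.Task;
    }

    // Returns a task whose result is the number of URLs that validated successfully.
    public static Task<int> CountValidAsync(string[] urls)
    {
        Task<bool>[] tasks = urls.Select(url => ValidateUrlAsync(url)).ToArray();

        // The continuation runs once, after every task has finished.
        // Faulted tasks are skipped so that reading Result never throws here.
        return Task.Factory.ContinueWhenAll<bool, int>(
            tasks,
            completed => completed.Count(t => t.Status == TaskStatus.RanToCompletion && t.Result));
    }
}
```

The continuation receives the same array of tasks, all finished, so it can check each task's Status before touching Result. That's how failures show up with this shape, instead of being swallowed as a plain false.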

Big thanks to Stephen Toub. There's lots of new high- and low-level constructs for Task creation, Threading, and Parallelism in .NET 4. While the naive solution is often the right one, the components we have to work with in .NET 4 (and the even newer ones in the Visual Studio 2010 Async CTP adding the 'await' and 'async' keywords) will give you surprisingly fine-grained control over your multi-threaded parallel systems without a whole lot of code.


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.