Scott Hanselman

Back to Basics: Moving beyond for, if and switch

April 26, '12 Comments [72] Posted in Back to Basics

I visit a lot of customers and look at a lot of code. I also worked with a number of large production code bases in my previous jobs and I see a lot of ifs, fors and switches. I see loops inside of loops with ifs inside them, all doing various transformations of data from one form to another. I see strings getting parsed to pull bits of data out in ways that are easy to say in English but take 100 lines to say in code.

Should they? When we are just getting started programming we learn about if first, then for, then the much abused switch statement.

I saw this little snippet on Miguel's blog a few weeks ago:

var biggerThan10 = new List<int>();
for (int i = 0; i < array.Length; i++) {
    if (array[i] > 10)
        biggerThan10.Add(array[i]);
}

It's straightforward. Take an array of ints and make a new list with those that are larger than 10. We've all seen code like this a million times. Here's the same thing in a few other languages.

C#

var a = from x in array where x > 10 select x; 
var b = array.Where(x => x > 10);

Ruby

a = array.select{|x| x >10}

JavaScript

a = array.filter(function(x){return x > 10});

I'd much rather write these one line operations than the loop and if above. I still see this out in the world, so perhaps people haven't seen enough examples. I asked friends on Twitter to submit their examples. Thank you Twitter friends!

Here's a few nice examples. Iron Shay has some nice LINQ examples on his blog. Please do share yours in the comments. Be sure to use <pre> tags.

NOTE: This is NOT about "shoving stuff into one line" but rather looking at solutions that are equally readable but also simpler, terser, and less error prone than loops of loops.


def calculate_primes(n):
    no_primes = []
    primes = []

    # sieve with every candidate factor up to sqrt(n); a fixed upper
    # bound of 8 would miss composites such as 121 = 11 * 11
    for i in range(2, int(n ** 0.5) + 1):
        for j in range(i*2, n, i):
            no_primes.append(j)

    for x in range(2, n):
        if x not in no_primes:
            primes.append(x)

    return primes


calculate_primes(500)


# Can be like this instead!

(lambda n: [x for x in range(2, n) if x not in [j for i in range(2, int(n ** 0.5) + 1) for j in range(i*2, n, i)]])(500)

From Aaron Bassett


foreach (var i in categories) {
    foreach (var x in GetAllChildCategories(i.Id)) {
        yield return x;
    }
}

//Can be...

return categories.SelectMany(i => this.GetAllChildCategories(i.Id));

From James Hull


var inputNumbersInString = Console.ReadLine();
var inputNumbersStringArray = inputNumbersInString.Split(' ');
var inputNumbers = new List<int>();

for (int i = 0; i < inputNumbersStringArray.Length; ++i) {
    inputNumbers.Add(int.Parse(inputNumbersStringArray[i]));
}

int maxNumber = inputNumbers[0];

for (int i = 1; i < inputNumbers.Count; ++i)
    if (inputNumbers[i] > maxNumber)
        maxNumber = inputNumbers[i];

Console.WriteLine(maxNumber);

//Or rather...

Console.WriteLine(Console.ReadLine().Split(' ').Select(t => int.Parse(t)).ToList().Max());

From Amit Saraswat


// create a poker deck as a list of two-character strings:
// rank, suit

char[] figures = "23456789TJQKA".ToCharArray();
char[] suites = "SHDC".ToCharArray();
List<string> deck = new List<string>();

foreach (var figure in figures) {
    foreach (var suite in suites) {
        deck.Add(string.Format("{0}{1}", figure, suite));
    }
}

//Or, neatly
var cards = from r in "23456789TJQKA" from s in "SHDC" select "" + r + s;

From Jack Nova


bool include = false;
if (op == Operator.And) {
    bool current = true;
    foreach (var item in Items) {
        current = current & item.Process();
    }
    include = current;
}
else {
    bool current = false;
    foreach (var item in Items) {
        current = current | item.Process();
    }
    include = current;
}
return include;

//Or this lovely Aggregate

return op == Operator.And ?
    Items.Aggregate(true, (current, item) => current & item.Process()) :
    Items.Aggregate(false, (current, item) => current | item.Process());

From Kevin Meiresonne
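
If Process() has no side effects that must run for every item, the same intent can arguably be expressed with All and Any. A minimal sketch, assuming a hypothetical Item class and Operator enum shaped like the ones in the snippet above (neither is defined in the original). Note the behavioral difference: the & and | operators evaluate Process() on every item, while All and Any stop as soon as the answer is known. For an empty collection the results match the Aggregate seeds (true for And, false for Or).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the types used in the snippet above.
public enum Operator { And, Or }

public class Item
{
    private readonly bool _result;
    public int Calls { get; private set; }
    public Item(bool result) { _result = result; }
    public bool Process() { Calls++; return _result; }
}

public static class Combiner
{
    // All/Any short-circuit: they stop calling Process() once the answer is known.
    public static bool Include(Operator op, IEnumerable<Item> items)
    {
        return op == Operator.And
            ? items.All(item => item.Process())
            : items.Any(item => item.Process());
    }
}

public static class Program
{
    public static void Main()
    {
        var items = new List<Item> { new Item(false), new Item(true) };
        Console.WriteLine(Combiner.Include(Operator.And, items)); // False
        Console.WriteLine(items[1].Calls);                        // 0: second item never processed
    }
}
```

If every Process() call really does need to happen, the Aggregate version (or the original loops) is the one to keep.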


sbyte[] sByteArray = new sbyte[100];
byte[] uByteArray = new byte[sByteArray.Length];

for (int i = 0; i < sByteArray.Length; i++) {
    uByteArray[i] = (byte)sByteArray[i];
}

//Or, instead of the loop above
byte[] uByteArray1 = Array.ConvertAll(sByteArray, x => (byte)x);

From Fahad Mustafa


Scott: I have to say here that I prefer the first option. ;)

// This is the "classic" solution to the FizzBuzz problem.
for (int i = 1; i <= 100; i++) {
    if (i % 3 == 0 && i % 5 == 0) {
        Console.WriteLine("FizzBuzz");
    }
    else if (i % 3 == 0) {
        Console.WriteLine("Fizz");
    }
    else if (i % 5 == 0) {
        Console.WriteLine("Buzz");
    }
    else {
        Console.WriteLine(i.ToString());
    }
}

// One line
Enumerable.Range(1, 100).ToList().ForEach(n => Console.WriteLine((n % 3 == 0) ? (n % 5 == 0) ? "FizzBuzz" : "Fizz" : (n % 5 == 0) ? "Buzz" : n.ToString()));

From Craig Phillips


A good one...I'm surprised more people don't use this.

var temp = String.Empty;
foreach (var entry in myStringList) {
    if (String.IsNullOrEmpty(temp)) {
        temp = entry;
    }
    else {
        temp += ", " + entry;
    }
}

//becomes

var temp = String.Join(", ", myStringList);

From Holger Adam


A class with properties in one line of F#. That'd be a dozen or more lines of C#.

type Person = { Name:string; Age:int }

From Phillip Trelford


/// Input is a string with numbers : 10+20+30+40
/// Output is integer with required sum (100)
string input = "10+20+30+40";
var result = Regex.Split(input, @"\D+").Select(t => int.Parse(t)).Sum();
Console.WriteLine("Result is {0}", result);

From Srinivas Iyengar


There are a million things available to the programmer beyond the first three keywords we learn. What are your favorite patterns (doesn't matter what language) that have helped you break away from the basics and move to the next level?



About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


Back to Basics: Dynamic Image Generation, ASP.NET Controllers, Routing, IHttpHandlers, and runAllManagedModulesForAllRequests

April 7, '12 Comments [33] Posted in ASP.NET | ASP.NET MVC | Back to Basics | Learning .NET

Warning, this is long but full of info. Read it all.

Often folks want to dynamically generate stuff with ASP.NET. They want to dynamically generate PDFs, GIFs, PNGs, CSVs, and lots more. It's easy to do this, but there's a few things to be aware of if you want to keep things as simple and scalable as possible.

You need to think about the whole pipeline as any HTTP request comes in. The goal is to have just the minimum number of things run to do the job effectively and securely, but you also need to think about "who sees the URL and when."

A timeline representation of the ASP.NET pipeline

 

This diagram isn't meant to be exhaustive, but rather give a general sense of when things happen.

Modules can see any request if they are plugged into the pipeline. There are native modules written in C++ and managed modules written in .NET. Managed modules are run anytime a URL ends up being processed by ASP.NET or if "RAMMFAR" is turned on.

RAMMFAR means "runAllManagedModulesForAllRequests" and refers to this optional setting in your web.config.

<system.webServer>
    <modules runAllManagedModulesForAllRequests="true" />
</system.webServer>

You want to avoid having this option turned on if your configuration and architecture can handle it. This does exactly what it says. All managed modules will run for all requests. That means *.* folks. PNGs, PDFs, everything including static files ends up getting seen by ASP.NET and the full pipeline. If you can let IIS handle a request before ASP.NET sees it, that's better.

Remember that the key to scaling is to do as little as possible. You can certainly make a foo.aspx page in ASP.NET Web Forms and have it dynamically generate a graphic, but there's some non-zero amount of overhead involved in the creation of the page and its lifecycle. You can make a MyImageController in ASP.NET MVC but there's some overhead in the Routing that chopped up the URL and decided to route it to the Controller. You can create just an HttpHandler or ashx. The result in all these cases is that an image gets generated but if you can get in and get out as fast as you can it'll be better for everyone. You can route the HttpHandler with ASP.NET Routing or plug it into web.config directly.

Works But...Dynamic Images with RAMMFAR and ASP.NET MVC

A customer wrote me who was using ASP.NET Routing (which is an HttpModule) and a custom routing handler to generate images like this:

routes.Add(new Route("images/mvcproducts/{ProductName}/default.png",
    new CustomPNGRouteHandler()));

Then they have an IRouteHandler that just delegates to an HttpHandler anyway:

public class CustomPNGRouteHandler : IRouteHandler
{
    public System.Web.IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        return new CustomPNGHandler(requestContext);
    }
}

Note the {ProductName} route data in the route there. The customer wants to be able to put anything in that bit. If I visit http://localhost:9999/images/mvcproducts/myproductname/default.png I see this image...

A dynamically generated PNG from ASP.NET Routing, routed to an IHttpHandler

Generated from this simple HttpHandler:

public class CustomPNGHandler : IHttpHandler
{
    public bool IsReusable { get { return false; } }
    protected RequestContext RequestContext { get; set; }

    public CustomPNGHandler():base(){}

    public CustomPNGHandler(RequestContext requestContext)
    {
        this.RequestContext = requestContext;
    }

    public void ProcessRequest(HttpContext context)
    {
        using (var rectangleFont = new Font("Arial", 14, FontStyle.Bold))
        using (var bitmap = new Bitmap(320, 110, PixelFormat.Format24bppRgb))
        using (var g = Graphics.FromImage(bitmap))
        {
            g.SmoothingMode = SmoothingMode.AntiAlias;
            var backgroundColor = Color.Bisque;
            g.Clear(backgroundColor);
            g.DrawString("This PNG was totally generated", rectangleFont, SystemBrushes.WindowText, new PointF(10, 40));
            context.Response.ContentType = "image/png";
            bitmap.Save(context.Response.OutputStream, ImageFormat.Png);
        }
    }
}

The benefit of using MVC is that the handler is integrated into your routing table. The bad thing is that doing this simple thing requires RAMMFAR to be on. Every module sees every request now so you can generate your graphic. Did you want that side effect? The bold is to make you pay attention, not scare you. But you do need to know what changes you're making that might affect the whole application pipeline.

(As an aside, if you're a big site doing dynamic images, you really should have your images on their own cookieless subdomain in the cloud somewhere with lots of caching, but that's another article).

So routing to an HttpHandler (or an MVC Controller) is an OK solution but it's worth exploring to see if there's an easier way that would involve fewer moving parts. In this case they really want the file to have the extension *.png rather than *.aspx (page) or *.ashx (handler) as they believe it affects their image's SEO in Google Image search.

Better: Custom HttpHandlers

Remember that HttpHandlers are targeted to a specific path, file or wildcard and HttpModules are always watching. Why not use an HttpHandler directly and plug it in at the web.config level and set runAllManagedModulesForAllRequests="false"?

<system.webServer>
    <handlers>
        <add name="pngs" verb="*" path="images/handlerproducts/*/default.png"
             type="DynamicPNGs.CustomPNGHandler, DynamicPNGs" preCondition="managedHandler"/>
    </handlers>
    <modules runAllManagedModulesForAllRequests="false" />
</system.webServer>

Note how I have a * there in part of the URL? Let's try hitting http://localhost:37865/images/handlerproducts/myproductname/default.png. It still works.

A dynamically generated PNG from an ASP.NET IHttpHandler

This lets us not only completely bypass the managed ASP.NET Routing system but also remove RAMMFAR so fewer modules are involved for other requests. By default, managed modules will only run for requests that ended up mapped to the managed pipeline and that's almost always requests with an extension. You may need to be aware of routing if you have a "greedy route" that might try to get ahold of your URL. You might want an IgnoreRoute. You also need to be aware of modules earlier in the process that have a greedy BeginRequest.

The customer could set up ASP.NET and IIS to route requests for *.png to ASP.NET, but why not be as specific as possible so that the minimum number of requests is routed through the managed pipeline? Don't do more work than you need to.

What about extensionless URLs?

Getting extensionless URLs working on IIS6 was tricky before and lots has been written on it. Early on in IIS6 and ASP.NET MVC you'd map everything *.* to managed code. ASP.NET Routing used to require RAMMFAR set to true until the Extensionless URL feature was created.

Extensionless URL support was added in this KB http://support.microsoft.com/kb/980368 and ships with ASP.NET MVC 4. If you have ASP.NET MVC 4, you have Extensionless URLs on your development machine. But your server may not. You may need to install this hotfix, or turn on RAMMFAR. I would rather you install the update than turn on RAMMFAR if you can avoid it. The Run All Modules option is really a wildcard mapping.

Extensionless URLs exist so you can have URLs like /home/about and not /home/about.aspx. The feature exists to get URLs without extensions to be seen by the managed pipeline while URLs with extensions are not seen any differently. The performance benefits of Extensionless URLs over RAMMFAR are significant.

If you have static files like CSS, JS and PNG files you really want those to be handled by IIS (and HTTP.SYS) for speed. Don't let your static files get mapped to ASP.NET if you can avoid it.

Conclusion

When you're considering any solution within the ASP.NET stack (or "One ASP.NET" as I like to call it)...

The complete ASP.NET stack with MVC, Web Pages, Web Forms and more called out in a stack of boxes

...remember that it's things like IHttpHandler that sit at the bottom and serve one request (everything comes from IHttpHandler) while it's IHttpModule that's always watching and can see every request.

In other words, an HttpHandler sees the ExecuteRequestHandler event which is just one event in the pipeline, while HttpModules can see every event they subscribe to.

HttpHandlers and Modules are at the bottom of the stack

I hope this helps!




Back to Basics: Daylight Savings Time bugs strike again with SetLastModified

November 6, '11 Comments [32] Posted in ASP.NET | Back to Basics | Learning .NET

CC BY-NC 2.0 Creative Commons Clock Photo via Flickr ©Thomas Hawk

No matter how well you know a topic, or a codebase, it's never too late (or early) to get nailed by a latent bug over a half-decade old.

DasBlog, the ASP.NET 2 blog engine that powers this blog, is done. It's not dead, but it's done. It's very stable. We had some commits last year, and I committed a bug fix in February, but it's really well understood and very baked. My blog hasn't been down for traffic spike reasons in literally years as DasBlog scales nicely on a single machine.

It was 10:51pm PDT (that's Pacific Daylight Time) and I was writing a blog post about the clocks in my house, given that PST (that's Pacific Standard Time) was switching over soon. I wrote it up in Windows Live Writer, posted it to my blog, then hit Hanselman.com to check it out.

Bam. 404.

What? 404? Nonsense. Refresh.

404.

*heart in chest* Have I been hacked? What's going on? OK, to the logs!

l2    time    2011-11-06T05:36:31    code    1    message    Error:System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values.
Parameter name: utcDate
at System.Web.HttpCachePolicy.UtcSetLastModified(DateTime utcDate)
at System.Web.HttpCachePolicy.SetLastModified(DateTime date)
at newtelligence.DasBlog.Web.Core.SiteUtilities.GetStatusNotModified(DateTime latest) in C:\dev\DasBlog\source\newtelligence.DasBlog.Web.Core\SiteUtilities.cs:line 1253
at newtelligence.DasBlog.Web.Core.SharedBasePage.NotModified(EntryCollection entryCollection) in C:\dev\DasBlog\source\newtelligence.DasBlog.Web.Core\SharedBasePage.cs:line 1182
at newtelligence.DasBlog.Web.Core.SharedBasePage.Page_Load(Object sender, EventArgs e) in C:\dev\DasBlog\source\newtelligence.DasBlog.Web.Core\SharedBasePage.cs:line 1213
at System.EventHandler.Invoke(Object sender, EventArgs e)
at System.Web.UI.Control.OnLoad(EventArgs e)
at System.Web.UI.Control.LoadRecursive()
at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) while processing http://www.hanselman.com/blog/default.aspx.

What's going on? Out of range? What's out of range. Ok, my site is down for the first time in years. I must have messed it up with my clock post. I'll delete that. OK, delete. Whew.

Refresh.

404.

WHAT?!?

Logs, same error, now the file is a meg and growing, as this message is happening hundreds of times a minute. OK, to the code!

UtcSetLastModified is used for setting cache-specific HTTP headers and for controlling the ASP.NET page output cache. It lets me tell HTTP that something hasn't been modified since a certain time. I've got a utility that figures out which post was the last modified or most recently had comments modified, then I tell the home page, then the browser, so everyone can decide if there is fresh content or not.

public DateTime GetLatestModifedEntryDateTime(IBlogDataService dataService, EntryCollection entries)
{
    //figure out if send a 304 Not Modified or not...
    return latest; //the lastTime anything interesting happened.
}

In the BasePage we ask ourselves, can we avoid work and give a 304?

//Can we get away with an "if-not-modified" header?
if (SiteUtilities.GetStatusNotModified(SiteUtilities.GetLatestModifedEntryDateTime(dataService, entryCollection)))
{
    //snip
}

However, note that I have to call SetLastModified, though. Seems that UtcSetLastModified is private. (Why?) When I call SetLastModified it does this:

public void SetLastModified(DateTime date)
{
    DateTime utcDate = DateTimeUtil.ConvertToUniversalTime(date);
    this.UtcSetLastModified(utcDate);
}

Um, OK. Lame. So that means I have to work in local time. I retrieve dates and convert them ToLocalTime().

At this point, you might say, Oh, I get it, he's called ToLocalTime() too many times and double converted his times. That's what I thought. However, after .NET 2 that isn't possible.

The value returned by the conversion is a DateTime whose Kind property always returns Local. Consequently, a valid result is returned even if ToLocalTime is applied repeatedly to the same DateTime.

But. We originally wrote DasBlog in .NET 1.1 first and MOVED it to .NET 2 some years later. I suspect that I'm actually counting on some incorrect behavior deep in our own (Clemens Vasters and mine) TimeZone and Data Access code that worked with that latent incorrect behavior (overconverting DateTimes to local time) and now that's not happening. And hasn't been happening for four years.

Hopefully you can see where this is going.

It seems a comment came in around 5:36am GMT or 10:36pm PDT which is 1:36am EST. That became the new Last Modified Date. At some point an hour was added in conversion as PDT wasn't PST yet but EDT was EST.

Your brain exploded yet? Hate Daylight Saving Time? Ya, me too.

Anyway, that DateTime became 2:36am EST rather than 1:36am. Problem is, 2:36am EST is/was the future as 6:46 GMT hadn't happened yet.

A sloppy 5 year old bug that has been happening for an hour each year that was likely always there but counted on 10 year old framework code that was fixed 7 years ago. Got Unit Tests for DST? I don't.

My server is in the future, but actually not as far in the future as it usually is. My server is on the East Coast and it was 1:51am. However, the reason my posts sometimes look like they are from the future is that I store everything in the neutral UTC/GMT zone, so it was 5:51am the next day on my file system.

Moral of the story?

I need to confirm that my server is on GMT time and that none of my storage code is affected by Daylight Saving Time.

Phrased differently, don't use DateTime.Now for ANY date calculations or to store anything. Use DateTime.UtcNow and be aware that some methods will freak out if you send them future dates, as they should. Avoid doing ANYTHING in local time until that last second when you show the DateTime to the user.
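
One way to sketch that advice: keep timestamps in UTC, clamp anything that claims to be in the future before handing it to an API that rejects future dates, and convert to local time only for display. The helper name ClampToNow is made up for illustration; it is not part of DasBlog or ASP.NET, and this is not the fix that went into the blog engine.

```csharp
using System;

public static class LastModifiedHelper
{
    // Hypothetical helper: never report a last-modified time later than "now".
    // Both arguments are expected to be UTC values.
    public static DateTime ClampToNow(DateTime lastModifiedUtc, DateTime utcNow)
    {
        return lastModifiedUtc > utcNow ? utcNow : lastModifiedUtc;
    }
}

public static class Program
{
    public static void Main()
    {
        DateTime utcNow = DateTime.UtcNow;       // store and compare in UTC
        DateTime bogus = utcNow.AddHours(1);     // e.g. the result of an off-by-one-hour DST bug
        DateTime safe = LastModifiedHelper.ClampToNow(bogus, utcNow);

        Console.WriteLine(safe == utcNow);       // True
        Console.WriteLine(safe.ToLocalTime());   // local time only at display time
    }
}
```

Clamping hides the symptom rather than the double conversion that caused it, so it is a safety net, not a substitute for finding the real bug.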

In my case, in the nine minutes it took to debug this, it resolved itself. The future became the present and the future last modified DateTime became valid. Is there a bug? There sure is, at least, there is for an hour, once a year. Now the real question is, do I fix it and possibly break something that works the other 8759 hours in a year? Hm, that IS still four 9's of uptime. (Ya, I know I need to fix it.)

"My code has no bugs, it runs exactly as it was written." - Some famous programmer

Until next year. ;)


Back to Basics: Big O notation issues with older .NET code and improving for loops with LINQ deferred execution

September 15, '11 Comments [58] Posted in Back to Basics | Learning .NET | LINQ

Earlier today Brad Wilson and I were discussing a G+ post by Jostein Kjønigsen where he says, "see if you can spot the O(n^2) bug in this code."

public IEnumerable<Process> GetProcessesForSession(string processName, int sessionId)
{
    var processes = Process.GetProcessesByName(processName);
    var filtered = from p in processes
                   where p.SessionId == sessionId
                   select p;
    return filtered;
}

This is a pretty straightforward method that calls a .NET BCL (Base Class Library) method and filters the result with LINQ. Of course, when any function calls another one that you can't see inside (which is basically always) you've lost control. We have no idea what's going on in GetProcessesByName.

Let's look at the source to the .NET Framework method in Reflector. Our method calls Process.GetProcessesByName(string).

public static Process[] GetProcessesByName(string processName)
{
    return GetProcessesByName(processName, ".");
}

Looks like this one is an overload that passes "." into the next method Process.GetProcessesByName(string, string) where the second parameter is the machineName.

This next one gets all the processes for a machine (in our case, the local machine) then spins through them doing a compare on each one in order to build a result array to return up the chain.

public static Process[] GetProcessesByName(string processName, string machineName)
{
    if (processName == null)
    {
        processName = string.Empty;
    }
    Process[] processes = GetProcesses(machineName);
    ArrayList list = new ArrayList();
    for (int i = 0; i < processes.Length; i++)
    {
        if (string.Equals(processName, processes[i].ProcessName, StringComparison.OrdinalIgnoreCase))
        {
            list.Add(processes[i]);
        }
    }
    Process[] array = new Process[list.Count];
    list.CopyTo(array, 0);
    return array;
}

If we look inside GetProcesses(string), it's another loop. This is getting close to where .NET calls Win32 and as these classes are internal there's not much I can do to fix this function other than totally rewrite the internal implementation. However, I think I've illustrated that we've got at least two loops here, and more likely three or four.

public static Process[] GetProcesses(string machineName)
{
    bool isRemoteMachine = ProcessManager.IsRemoteMachine(machineName);
    ProcessInfo[] processInfos = ProcessManager.GetProcessInfos(machineName);
    Process[] processArray = new Process[processInfos.Length];
    for (int i = 0; i < processInfos.Length; i++)
    {
        ProcessInfo processInfo = processInfos[i];
        processArray[i] = new Process(machineName, isRemoteMachine, processInfo.processId, processInfo);
    }
    return processArray;
}

This code is really typical of .NET circa 2002-2003 (not to mention Java, C++ and Pascal). Functions return arrays of stuff and other functions higher up filter and sort.

When using this .NET API and for looping over the results several times, I'm going for(), for(), for() in a chain, like O(4n) here.

Note: To be clear, it can be argued that O(4n) is just O(n), cause it is. Adding a number like I am isn't part of the O notation. I'm just saying we want to avoid O(cn) situations where c is a large enough number to affect perf.


Sometimes you'll see nested for()s like this, so O(n^3) here where things get messy fast.

Squares inside squares inside squares representing nested fors

LINQ is more significant than people really realize, I think. When it first came out some folks said "is that all?" I think that's unfortunate. LINQ and the concept of "deferred execution" is just so powerful but I think a number of .NET programmers just haven't taken the time to get their heads around the concept.

Here's a simple example juxtaposing spinning through a list vs. using yield. The array version does all the work up front, while the yield version can calculate each value as it is requested. Imagine a GetFibonacci() method. A yield version could calculate values "just in time" and yield them, while an array version would have to pre-calculate and pre-allocate.

public void Consumer()
{
    foreach (int i in IntegersList()) {
        Console.WriteLine(i.ToString());
    }

    foreach (int i in IntegersYield()) {
        Console.WriteLine(i.ToString());
    }
}

public IEnumerable<int> IntegersYield()
{
    yield return 1;
    yield return 2;
    yield return 4;
    yield return 8;
    yield return 16;
    yield return 16777216;
}

public IEnumerable<int> IntegersList()
{
    return new int[] { 1, 2, 4, 8, 16, 16777216 };
}
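
The GetFibonacci() idea mentioned above could be sketched like this. The method name comes from that paragraph, but the body and the Sequences class are only an illustration. It yields values lazily, so the caller decides how many get computed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Sequences
{
    // Lazily yields 0, 1, 1, 2, 3, 5, ... Each value is computed only when requested.
    // Uses long, so values overflow after roughly 90 terms; fine for a sketch.
    public static IEnumerable<long> GetFibonacci()
    {
        long current = 0, next = 1;
        while (true)
        {
            yield return current;
            long sum = current + next;
            current = next;
            next = sum;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Take(10) pulls only ten values out of the otherwise endless sequence.
        Console.WriteLine(string.Join(", ", Sequences.GetFibonacci().Take(10)));
        // prints: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
    }
}
```

An array-returning version would have to pick a length up front and fill it before the caller saw a single value.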

Back to our GetProcess example. There's two issues at play here.

First, the underlying implementation where GetProcessInfos eventually gets called is a bummer but it's that way because of how P/Invoke works and how the underlying Win32 API returns the data we need. It would certainly be nice if the underlying API was more granular. But that's less interesting to me than the larger meta-issue of having (or in this case, not having) a LINQ-friendly API.

The second and more interesting issue (in my opinion) is the idea that the 2002-era .NET Base Class Library isn't really set up for LINQ-friendliness. None of the APIs return LINQ-friendly stuff or IEnumerable<anything> so that when you chain together filters and filters of filters of arrays you end up with O(cn) issues as opposed to nice deferred LINQ chains.

When you find yourself returning arrays of arrays of arrays of other stuff while looping and filtering and sorting, you'll want to be aware of what's going on and consider that you might be looping inefficiently and it might be time for LINQ and deferred execution.


Here's a simple conversion attempt to change the first implementation from this classic "Array/List" style:

ArrayList list = new ArrayList();
for (int i = 0; i < processes.Length; i++)
{
    if (string.Equals(processName, processes[i].ProcessName, StringComparison.OrdinalIgnoreCase))
    {
        list.Add(processes[i]);
    }
}
Process[] array = new Process[list.Count];
list.CopyTo(array, 0);
return array;

To this more LINQy way. Note that returning from a LINQ query defers execution as LINQ is chainable. We want to assemble a chain of sorting and filtering operations and execute them ONCE rather than for()ing over many lists many times.

if (processName == null) { processName = string.Empty; }

Process[] processes = Process.GetProcesses(machineName); //stop here...can't go farther?

return from p in processes
       where String.Equals(p.ProcessName, processName, StringComparison.OrdinalIgnoreCase)
       select p; //the value of the LINQ expression being returned is an IEnumerable<Process> object that uses "yield return" under the hood

Here's the whole thing in a sample program.

static void Main(string[] args)
{
    var myList = GetProcessesForSession("chrome", 1); //ProcessName does not include the ".exe" extension
}

public static IEnumerable<Process> GetProcessesForSession(string processName, int sessionID)
{
    //var processes = Process.GetProcessesByName(processName);
    var processes = HanselGetProcessesByName(processName); //my LINQy implementation
    var filtered = from p in processes
                   where p.SessionId == sessionID
                   select p;
    return filtered;
}

private static IEnumerable<Process> HanselGetProcessesByName(string processName)
{
    return HanselGetProcessesByName(processName, ".");
}

private static IEnumerable<Process> HanselGetProcessesByName(string processName, string machineName)
{
    if (processName == null)
    {
        processName = string.Empty;
    }
    Process[] processes = Process.GetProcesses(machineName); //can't refactor farther because of internals.

    //"the value of the LINQ expression being returned is an IEnumerable<Process> object that uses "yield return" under the hood" (thanks Mehrdad!)

    return from p in processes where String.Equals(p.ProcessName, processName, StringComparison.OrdinalIgnoreCase) select p;

    /* the stuff above replaces the stuff below */
    //ArrayList list = new ArrayList();
    //for (int i = 0; i < processes.Length; i++)
    //{
    //    if (string.Equals(processName, processes[i].ProcessName, StringComparison.OrdinalIgnoreCase))
    //    {
    //        list.Add(processes[i]);
    //    }
    //}
    //Process[] array = new Process[list.Count];
    //list.CopyTo(array, 0);
    //return array;
}
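
To see the deferral for yourself, here is a small self-contained sketch. It uses plain integers instead of Process objects, and the DeferredDemo class, its counter, and the BiggerThan method are invented for the demo. The point is only that the predicate does not run until something enumerates the query, and that chained filters are applied in a single pass over the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static int PredicateCalls;

    // Returns a deferred query: nothing is filtered until the result is enumerated.
    public static IEnumerable<int> BiggerThan(IEnumerable<int> source, int limit)
    {
        return source.Where(x => { PredicateCalls++; return x > limit; });
    }
}

public static class Program
{
    public static void Main()
    {
        var numbers = new[] { 5, 12, 30, 7, 42 };

        // Chain two filters; no work has happened yet.
        var query = DeferredDemo.BiggerThan(numbers, 10).Where(x => x % 2 == 0);
        Console.WriteLine(DeferredDemo.PredicateCalls); // 0

        // Enumerating walks the source array once, pushing each item through both filters.
        var result = query.ToList();
        Console.WriteLine(string.Join(", ", result));   // 12, 30, 42
        Console.WriteLine(DeferredDemo.PredicateCalls); // 5
    }
}
```

The flip side is that enumerating the same query twice runs the filters twice, so call ToList() or ToArray() when you need a snapshot.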

This is a really interesting topic to me and I'm interested in your opinion as well, Dear Reader. As parts of the .NET Framework are being extended to include support for asynchronous operations, I'm wondering if there are other places in the BCL that should be updated to be more LINQ friendly. Or, perhaps it's not an issue at all.

Your thoughts?


Good Exception Management Rules of Thumb - Back to Basics Edition

March 23, '11 Comments [58] Posted in Back to Basics | Learning .NET

Almost five years ago I posted this "Good Exception Management Rules of Thumb" and had some interesting and useful comments. As with all lists of rules of thumb, they may no longer be valid. What do you think?

Cori Drew commented to me in an email that she often sees code like this in the wild (changed to protect the innocent 3rd parties). This kind of code inevitably shows up in a file called Utils.cs, which may be your first clue there's trouble.

public void HandleException(String strMessage)
{
    //Log to trace file only if app settings are true
    if (Properties.Settings.Default.TraceLogging == true)
    {
        try
        {
            TraceSource objTrace = new TraceSource("YadaYadaTraceSource");
            objTrace.TraceEvent(TraceEventType.Information, 5, strMessage.ToUpper());
            objTrace.Flush();
            objTrace.Close();
        }
        catch
        {
            //do nothing if there was an error
        }
    }
    throw new Exception(Environment.NewLine + strMessage);
}

What's wrong with this code, Dear Reader?

There's a number of things that we can learn from, so have at it in the comments in nice lists bulleted with asterisks, won't you? You can comment on lines, or the larger strategy or both. I'll update the post with a roll-up once you come up for breath.


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.