I just had a great one on one coding learning session with a good friend of mine over lunch. He's trying to take his coding skills to the "next level." Just as we plateau when we work out physically, I think we can plateau when coding and solving problems. That one of the reasons I try to read a lot of code to be a better developer and why I started The Weekly Source Code. He said it was OK to write about this as someone else might benefit from our discussion. The code and problem domain have been changed to protect the not-so-innocent.
One of the things that we talked about was that some programmers/coders/developers have just a few tools in their toolbox, namely, if, for, and switch.
I'm not making any judgements about junior devs vs. senior devs. I'm talking about idioms and "vocab." I think that using only if, for and switch is the Computer Programmer equivalent of using "like" in every sentence. Like, you know, like, he was all, like, whatever, and I was like, dude, and he was, like, ewh, and I was like meh, you know?
When speaking English, by no means do I have a William F Buckley, Jr.-sized vocabulary, nor do I believe in being sesquipedal for sesquipedalianism's sake, but there is a certain sparkly joyfulness in selecting the right word for the right situation. I'm consistently impressed when someone can take a wordy paragraph and condense it into a crisp sentence without losing any information.
Refactoring code often brings me the same shiny feeling. Here's a few basic things that my friend and I changed in his application over lunch, turning wordy paragraphs into crisp sentences. Certainly this isn't an exhaustive list of anything, but perhaps it can act as a reminder to myself and others to be mindful and think about solving problems beyond, like, if and for and, like, switch, y'know?
Massives Ifs are Sometimes Maps
He had some code that parsed a ridiculous XML document that came back from a Web Server. That the format of the XML was insane wasn't his fault, to be sure. We all have to parse crap sometimes. He had to check for the existence of a certain value and turn it into an Enum.
if (xmlNode.Attributes["someAttr"].Value.ToLower().IndexOf("fog") >= 0)
{
wt = MyEnum.Fog;
}
if (xmlNode.Attributes["someAttr"].Value.ToLower().IndexOf("haze") >= 0)
{
wt = MyEnum.Haze;
}
if (xmlNode.Attributes["someAttr"].Value.ToLower().IndexOf("dust") >= 0)
{
wt = MyEnum.Haze;
}
if (xmlNode.Attributes["someAttr"].Value.ToLower().IndexOf("rain") >= 0)
{
wt = MyEnum.Rainy;
}
...and this went on for 40+ values. There's a few problems with this.
First, he's using IndexOf() and ToLower() to when he's trying to say "ignoring case, does this string contain this other string?" Using ToLower() for a string comparison is always a bad idea, and not just because of the Turkish i problem (details here, here, here and here). Be sure to check out the Recommendations for Strings in .NET.
We could make this simpler and more generic with a helper method that we'll use later. It expresses what we want to do pretty well. If we were using .NET 3.5 we could make this an extension method, but he's on 2.0.
private static bool ContainsIgnoreCase(string s, string searchingFor)
{
return s.IndexOf(searchingFor, StringComparison.OrdinalIgnoreCase) >= 0;
}
Second, he's indexing into the Attributes collection over and over again, and he's hasn't "else cased" the other ifs, so every indexing into Attributes and every other operation runs every time. He can do the indexing once, pull it out, then check his values.
string ws = xmlNode.Attributes["someAttr"].Value;
if (ContainsIgnoreCase(ws, "cloud"))
wt = MyEnum.Cloudy;
else if (ContainsIgnoreCase(ws, "fog"))
wt = MyEnum.Fog;
else if (ContainsIgnoreCase(ws, "haze"))
wt = MyEnum.Haze;
else if (ContainsIgnoreCase(ws, "dust"))
wt = MyEnum.Dust;
else if (ContainsIgnoreCase(ws, "rain"))
wt = MyEnum.Rainy;
else if (ContainsIgnoreCase(ws, "shower"))
...and again, as a reminder, this goes on for dozens and dozens of lines.
We were talking this through step by step to explain my "from point a to point d" wait of thinking. I tend to skip b and c, so it's useful to show these incremental steps, kind of like showing all your work in Math class when doing long division.
At this point, I pointed out that he was clearly mapping the strings to the enums. Now, if the mapping was direct (and it's not for a variety of horrible many-to-one reasons that are spec-specific as well as that this a "contains" operations rather than a direct equality comparison) he could have parsed the string an enum like this where the last parameter ignores case:
wt = (MyEnum)Enum.Parse(typeof(MyEnum), ws, true);
However, his mapping has numerous exceptions and the XML is messy. Getting one step simpler, I suggested making a map. There's a lot of folks who use Hashtable all the time, as they have for years in .NET 1.1, but haven't realized how lovely Dictionary<TKey, TValue> is.
var stringToMyEnum = new Dictionary<string, MyEnum>()
{
{ "fog", MyEnum.Fog},
{ "haze", MyEnum.Fog},
{ "fred", MyEnum.Funky},
//and on and on
};
foreach (string key in stringToMyEnum.Keys)
{
if (ContainsIgnoreCase(ws, key))
{
wt = stringToMyEnum[key];
break;
}
}
Because his input data is gross, he spins through the Keys collection and calls ContainsIgnoreCase on each key until settling on the right Enum. I suggested he clean up his input data to avoid the for loop, turning the whole operation into a simple mapping. At this point, of course, the Dictionary can be shoved off into some global readonly static resource.
string ws = xmlNode.Attributes["someAttr"].Value;
//preproccessing over ws to tidy up
var stringToMyEnum = new Dictionary<string, MyEnum>()
{
{ "fog", MyEnum.Fog},
{ "haze", MyEnum.Fog},
{ "fred", MyEnum.Funky},
//and on and on
};
wt = stringToMyEnum[key];
When Switches Attack
Often large switches "just happen." What I mean is that one introduces a switch to deal with some uncomfortable and (apparently) unnatural mapping between two things and then it just gets out of hand. They tell themselves they'll come back and fix it soon, but by then it's grown into a hairball.
My buddy had a method that was supposed to remove an icon from his WinForms app. The intent is simple, but the implementation became another mapping between a fairly reasonable Enum that he couldn't control, and a number of named icons that he could.control.
The key here is that he could control the icons and he couldn't easily control the enum (someone else's code, etc). That the mapping was unnatural was an artifact of his design.
The next thing he knew he was embroiled in a switch statement without giving it much thought.
private void RemoveCurrentIcon()
{
switch (CurrentMyEnumIcon)
{
case MyEnum.Cloudy:
CustomIcon1 twc = (CustomIcon1)FindControl("iconCloudy");
if (twc != null)
{
twc.Visible = false;
this.Controls.Remove(twc);
}
break;
case MyEnum.Dust:
CustomIcon2 twd = (CustomIcon2)FindControl("iconHaze");
if (twd != null)
{
twd.Visible = false;
this.Controls.Remove(twd);
}
break;
//etc...for 30+ other case:
There's a few things wrong here, besides the switch is yucky. (Yes, I know there are uses for switches, just not here.) First, it's useful to remember when dealing with a lot of custom classes, in this case CustomIcon1 and CustomIcon2 that they likely have a common ancestor. In this case a common ancestor is Control.
Next, he can control the name of his controls. For all intents in his WinForms app, the Control's name is arbitrary. Because his controls map directly to his Enum, why not literally map them directly?
private void RemoveCurrentIcon(MyEnumIcon CurrentMyEnumIcon)
{
Control c = FindControl("icon" + CurrentMyEnumIcon);
if(c != null)
{
c.Visible = false;
Controls.Remove(c);
}
}
The recognition of the shared base class along with the natural Enum to Control name mapping turns 160 lines of switch into a single helper method.
It's very easy, seductive even, to try to explain to the computer to "do it like this" using if, for and switch. However, using even the basic declarative data structures to describe the shape of the data can help you avoid unnecessary uses of the classic procedural keywords if, for and switch.
FINAL NOTE: We worked for a while and turned and 8100 line long program into 3950 lines. I think we can squeeze another 1500-2000 lines out of it. We did the refactoring with the new Resharper 4.0 and a private daily build of CodeRush/Refactor!Pro. I'll do some writeups on both wonderful tools in the coming weeks.
Hosting By