Scott Hanselman

Fun with Noun Pluralization libraries and the .NET Framework

February 2, '11 Comments [15] Posted in BCL
Sponsored By

Add Entity Pluralization There was interesting discussion about noun pluralization on a mailing list today. One of the fun demos I throw into some talks is the automatic pluralization (taking something singular and making it plural) that's built into the Entity Framework design time tooling. For example, look at the screenshot to the right where a table of type "Goose" is called "Geese" as a set.

Dmitry released a nice "Simple English Noun Pluralizer" a few years back. His code is fun to read and clever for under 400 lines.

In addition to Dmitry's, in .NET 4 (Full Framework, not Client Profile) there's a little known PluralizationService inside of System.Data.Entity.Design that supports this user interface.

At this point there is only support for English and while the class is abstract, there's no proper provider model or factory. Also, personally I think service should be in the Base Class Library (BCL) and should been available as an extension method on string, and I've told them so. That said, it's still pretty cool and useful.

Create a new project in Visual Studio 2010. Right click on the project in the Solution Explorer, and from Properties select ".NET Framework 4" as the Target Framework. Otherwise you won't be able to see System.Data.Entity.Design in the list of .NET Assemblies from the Add Reference dialog. Add a reference, and go to town.

Here's some of the things I tried. I was impressed and confused when it changed Hanselman to Hanselmen. ;)

[Test]
public void AnimalsAreFunWhenPluralized()
{
var p = PluralizationService.CreateService(new CultureInfo("en-US"));
Assert.AreEqual("geese", p.Pluralize("goose"));
Assert.AreEqual("deer", p.Pluralize("deer"));
Assert.AreEqual("sheep", p.Pluralize("sheep"));
Assert.AreEqual("wolves", p.Pluralize("wolf"));
Assert.AreEqual("volcanoes", p.Pluralize("volcano"));
Assert.AreEqual("aircraft", p.Pluralize("aircraft"));
Assert.AreEqual("alumnae", p.Pluralize("alumna"));
Assert.AreEqual("alumni", p.Pluralize("alumnus"));
Assert.AreEqual("houses", p.Pluralize("house"));
Assert.AreEqual("fungi", p.Pluralize("fungus"));
Assert.AreEqual("Hanselmen", p.Pluralize("Hanselman"));
Assert.AreEqual("Hanselman", p.Singularize("Hanselmen"));
}

As I mentioned, If you wanted, you could derive from PluralizationService and create a Spanish or whatever-you-like service and plug it into the Entity Framework. Dmitry could even swap out the included service and broker calls to his own.

Note that this class was really just meant to be used from the EF Tooling, but I'd like to put gentle pressure on the Powers That Be (if this is useful) to put a little more work into it, make it a core thing, and make it work in more languages.

UPDATED: An interesting point from Jon Galloway. If you are using Entity Framework Code First, you can effectively disable the Pluralization Convention for Table Names by, ahem, removing the PluralizingTableNameConvention. Funny how that works, eh?

namespace MvcMusicStore.Models
{
public class MusicStoreEntities : DbContext
{
public DbSet<Album> Albums { get; set; }
public DbSet<Genre> Genres { get; set; }
public DbSet<Artist> Artists { get; set; }
public DbSet<Cart> Carts { get; set; }
public DbSet<Order> Orders { get; set; }
public DbSet<OrderDetail> OrderDetails { get; set; }

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();
}
}
}

Cool.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by ORCS Web

The Weekly Source Code 56 - Visual Studio 2010 and .NET Framework 4 Training Kit - Code Contracts, Parallel Framework and COM Interop

August 12, '10 Comments [11] Posted in ASP.NET | ASP.NET Ajax | ASP.NET Dynamic Data | ASP.NET MVC | BCL | Learning .NET | LINQ | OData | Open Source | Programming | Source Code | VB | Web Services | Win7 | Windows Client | WPF
Sponsored By

Do you like a big pile of source code? Well, there is an imperial buttload of source in the Visual Studio 2010 and .NET Framework 4 Training Kit. It's actually a 178 meg download, which is insane. Perhaps start your download now and get it in the morning when you get up. It's extremely well put together and I say Kudos to the folks that did it. They are better people than I.

I like to explore it while watching TV myself and found myself looking through tonight. I checked my blog and while I thought I'd shared this with you before, Dear Reader, I hadn't. My bad, because it's pure gold. With C# and VB, natch.

Here's an outline of what's inside. I've heard of folks setting up lunch-time study groups and going through each section.

C# 4 Visual Basic 10 
F# Parallel Extensions
Windows Communication Foundation Windows Workflow
Windows Presentation Foundation ASP.NET 4
Windows 7 Entity Framework
ADO.NET Data Services (OData) Managed Extensibility Framework
Visual Studio Team System RIA Services
Office Development  

I love using this kit in my talks, and used it a lot in my Lap Around .NET 4 talk.

There's Labs, Presentations, Demos, Labs and links to online Videos. It'll walk you step by step through loads of content and is a great starter if you're getting into what's new in .NET 4.

Here's a few of my favorite bits, and they aren't the parts you hear the marketing folks gabbing about.

Code Contracts

Remember the old coding adage to "Assert Your Expectations?" Well, sometimes Debug.Assert is either inappropriate or cumbersome and what you really need is a method contract. Methods have names and parameters, and those are contracts. Now they can have conditions like "don't even bother calling this method unless userId is greater than or equal to 0 and make sure the result isn't null!

Code Contracts continues to be revised, with a new version out just last month for both 2008 and 2010. The core types that you need are included in mscorlib with .NET 4.0, but you do need to download the tools to see them inside Visual Studio. If you have VS Pro, you'll get runtime checking and VS Ultimate gets that plus static checking. If I have static checking and the tools I'll see a nice new tab in Project Properties:

Code Contracts Properties Tab in Visual Studio

I can even get Blue Squigglies for Contract Violations as seen below.

A blue squigglie showing that a contract isn't satisfied

As a nice coincidence, you can go and download Chapter 15 of Jon Skeet's C# in Depth for free which happens to be on Code Contracts.

Here's a basic idea of what it looks like. If you have static analysis, you'll get squiggles on the lines I've highlighted as they are points where the Contract isn't being fulfilled. Otherwise you'll get a runtime ContractException. Code Contracts are a great tool when used in conjunction with Test Driven Development.

using System;
using System.Collections.Generic;
using System.Text;
using System.Diagnostics.Contracts;

namespace ContractsDemo
{
[ContractVerification(true)]
class Program
{
static void Main(string[] args)
{
var password = GetPassword(-1);
Console.WriteLine(password.Length);
Console.ReadKey();
}

#region Header
/// <param name="userId">Should be greater than 0</param>
/// <returns>non-null string</returns>
#endregion
static string GetPassword(int userId)
{
Contract.Requires(userId >= 0, "UserId must be");
Contract.Ensures(Contract.Result<string>() != null);

if (userId == 0)
{
// Made some code to log behavior

// User doesn't exist
return null;
}
else if (userId > 0)
{
return "Password";
}

return null;
}
}
}

COM Interop sucks WAY less in .NET 4

I did a lot of COM Interop back in the day and it sucked. It wasn't fun and you always felt when you were leaving managed code and entering COM. You'd have to use Primary Interop Assemblies or PIAs and they were, well, PIAs. I talked about this a little bit last year in Beta 1, but it changed and got simpler in .NET 4 release.

Here's a nice little sample I use from the kit that gets the Processes on your system and then makes a list with LINQ of the big ones, makes a chart in Excel, then pastes the chart into Word.

If you've used Office Automation from managed code before, notice that you can say Range[] now, and not get_range(). You can call COM methods like ChartWizard with named parameters, and without including Type.Missing fifteen times. As an aside, notice also the default parameter value on the method.

static void GenerateChart(bool copyToWord = false)
{
var excel = new Excel.Application();
excel.Visible = true;
excel.Workbooks.Add();

excel.Range["A1"].Value2 = "Process Name";
excel.Range["B1"].Value2 = "Memory Usage";

var processes = Process.GetProcesses()
.OrderByDescending(p => p.WorkingSet64)
.Take(10);
int i = 2;
foreach (var p in processes)
{
excel.Range["A" + i].Value2 = p.ProcessName;
excel.Range["B" + i].Value2 = p.WorkingSet64;
i++;
}

Excel.Range range = excel.Range["A1"];
Excel.Chart chart = (Excel.Chart)excel.ActiveWorkbook.Charts.Add(
After: excel.ActiveSheet);

chart.ChartWizard(Source: range.CurrentRegion,
Title: "Memory Usage in " + Environment.MachineName);

chart.ChartStyle = 45;
chart.CopyPicture(Excel.XlPictureAppearance.xlScreen,
Excel.XlCopyPictureFormat.xlBitmap,
Excel.XlPictureAppearance.xlScreen);

if (copyToWord)
{
var word = new Word.Application();
word.Visible = true;
word.Documents.Add();

word.Selection.Paste();
}
}

You can also embed your PIAs in your assemblies rather than carrying them around and the runtime will use Type Equivalence to figure out that your embedded types are the same types it needs and it'll just work. One less thing to deploy.

Parallel Extensions

The #1 reason, IMHO, to look at .NET 4 is the parallelism. I say this not as a Microsoft Shill, but rather as a dude who owns a 6-core (12 with hyper-threading) processor. My most favorite app in the Training Kit is ContosoAutomotive. It's a little WPF app that loads a few hundred thousand cars into a grid. There's an interface, ICarQuery, that a bunch of plugins implement, and the app foreach's over the CarQueries.

This snippet here uses the new System.Threading.Task stuff and makes a background task. That's all one line there, from StartNew() all the way to the bottom. It says, "do this chunk in the background." and it's a wonderfully natural and fluent interface. It also keeps your UI thread painting so your app doesn't freeze up with that "curtain of not responding" that one sees all the time.

private void RunQueries()
{
this.DisableSearch();
Task.Factory.StartNew(() =>
{
this.BeginTiming();
foreach (var query in this.CarQueries)
{
if (this.searchOperation.Token.IsCancellationRequested)
{
return;
}

query.Run(this.cars, true);
};
this.EndSequentialTiming();
}, this.searchOperation.Token).ContinueWith(_ => this.EnableSearch());
}

StartNew() also has a cancellation token that we check, in case someone clicked Cancel midway through, and there's a ContinueWith at the end that re-enables or disabled Search button.

Here's my system with the queries running. This is all in memory, generating and querying random cars.12% CPU across 12 processors single threaded

And the app says it took 2.3 seconds. OK, what if I do this in parallel, using all the processors?

2.389 seconds serially

Here's the changed code. Now we have a Parallel.ForEach instead. Mostly looks the same.

private void RunQueriesInParallel()
{
this.DisableSearch();
Task.Factory.StartNew(() =>
{
try
{
this.BeginTiming();
var options = new ParallelOptions() { CancellationToken = this.searchOperation.Token };
Parallel.ForEach(this.CarQueries, options, (query) =>
{
query.Run(this.cars, true);
});
this.EndParallelTiming();
}
catch (OperationCanceledException) { /* Do nothing as we cancelled it */ }
}, this.searchOperation.Token).ContinueWith(_ => this.EnableSearch());
}

This code says "go do this in a background thread, and while you're there, parallelize this as you like." This loop is "embarrassingly parallel." It's a big for loop over 2 million cars in memory. No reason it can't be broken apart and made faster.

Here's the deal, though. It was SO fast, that Task Manager didn't update fast enough to show the work. The work was too easy. You can see it used more CPU and that there was a spike of load across 10 of the 12, but the work wasn't enough to peg the processors.

19% load across 12 processors 

Did it even make a difference? Seems it was 5x faster and went from 2.389s to 0.4699 seconds. That's embarrassingly parallel. The team likes to call that "delightfully parallel" but I prefer "you're-an-idiot-for-not-doing-this-in-parallel parallel," but that was rejected.

0.4699 seconds when run in parallel. A 5x speedup.

Let's try something harder. How about a large analysis of Baby Names. How many Roberts born in the state of Washington over a 40 year period from a 500MB database?

Here's the normal single-threaded foreach version in Task Manager:

One processor chilling.

Here's the parallel version using 96% CPU.

6 processes working hard!

And here's the timing. Looks like the difference between 20 seconds and under 4 seconds.

PLINQ Demo

You can try this yourself. Notice the processor slider bar there at the bottom.

ProcessorsToUse.Minimum = 1;
ProcessorsToUse.Maximum = Environment.ProcessorCount;
ProcessorsToUse.Value = Environment.ProcessorCount; // Use all processors.

This sample uses "Parallel LINQ" and here's the two queries. Notice the "WithDegreeofParallelism."

seqQuery = from n in names
where n.Name.Equals(queryInfo.Name, StringComparison.InvariantCultureIgnoreCase) &&
n.State == queryInfo.State &&
n.Year >= yearStart && n.Year <= yearEnd
orderby n.Year ascending
select n;

parQuery = from n in names.AsParallel().WithDegreeOfParallelism(ProcessorsToUse.Value)
where n.Name.Equals(queryInfo.Name, StringComparison.InvariantCultureIgnoreCase) &&
n.State == queryInfo.State &&
n.Year >= yearStart && n.Year <= yearEnd
orderby n.Year ascending
select n;

The .NET 4 Training Kit has Extensibility demos, and Office Demos and SharePoint Demos and Data Access Demos and on and on. It's great fun and it's a classroom in a box. I encourage you to go download it and use it as a teaching tool at your company or school. You could do brown bags, study groups, presentations (there's lots of PPTs), labs and more.

Hope you enjoy it as much as I do.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by ORCS Web

Hanselminutes Podcast 196 - .NET 4 CLR, Framework and Language Chat with Jason Olson

February 4, '10 Comments [1] Posted in ASP.NET | BCL | DLR | Learning .NET | Podcast
Sponsored By

image My one-hundred-and-ninety-sixth podcast is up. Jason Olson works (or worked, as you'll hear) for Microsoft in DPE. In this episode he takes Scott a little deeper into some of the new features in .NET 4, including security, CLR changes, C# 4 and VB 10 improvements and the new Task Parallel Library.

Subscribe: Subscribe to Hanselminutes Subscribe to my Podcast in iTunes

Download: MP3 Full Show

Links from the Show

Do also remember the complete archives are always up and they have PDF Transcripts, a little known feature that show up a few weeks after each show.

Telerik is our sponsor for this show.

Check out their UI Suite of controls for ASP.NET. It's very hardcore stuff. One of the things I appreciate aboutTelerik is their commitment tocompleteness. For example, they have a page about their Right-to-Left support while some vendors have zero support, or don't bother testing. They also are committed to XHTML compliance and publish their roadmap. It's nice when your controls vendor is very transparent.

As I've said before this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple. Avoid wasting the listener's time. (and make the commute less boring)

Enjoy. Who knows what'll happen in the next show?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by ORCS Web

CLR and DLR and BCL, oh my! - Whirlwind Tour around .NET 4 (and Visual Studio 2010) Beta 1

May 20, '09 Comments [18] Posted in BCL | Learning .NET | TechEd
Sponsored By

Just got a great tweet from Jeremiah Morrill about .NET 4.

"Office and COM Interop that is actually fun to do" Show me! Until then, I'm calling shenanigans.

I love a challenge! In my first Whirlwind Tour post on ASP I mentioned how COM Interop and Office Interop was fun with 4.

I've done a lot of COM Interop with C# and a LOT of Office Automation. Once upon a time, I worked at a company called Chrome Data, and we created a Fax Server with a Digiboard. Folks would call into a number, and the person who took the order would pick the make/model/style/year of the car and "instantly" fax a complete report about the vehicle. It used VB3, SQL Server 4.21 and Word 6.0 and a magical thing called "OLE Automation."

Fast forward 15 years and I sent an email to Mads Torgerson, a PM on C# that said:

I’m doing a sample for a friend where I’m simply spinning through an Automation API over a Word Doc to get and change some CustomDocumentProperties.

I’m really surprised at how current C# sucks at this. Of course, it makes sense, given all the IDispatch code in Word, but still. Dim != var as they say. Fix it!

Well, everything except "fix it!" is true. I added that just now. ;) I did a post on this a while back showing how scary the C# code was. This is/was somewhere where Visual Basic truly excels. I vowed to only use VB for Office Automation code after this fiasco.

If you want to melt your brain, check out the old code. No joke. I've collapsed the block because it's too scary. See the "ref missings"? The reflection? The Get/Sets? Scandalous!

{
ApplicationClass WordApp = new ApplicationClass();
WordApp.Visible = true;
object missing = System.Reflection.Missing.Value;
object readOnly = false;
object isVisible = true;
object fileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"..\..\..\NewTest.doc");
Microsoft.Office.Interop.Word.Document aDoc = WordApp.Documents.Open(
ref fileName,ref missing,ref readOnly,ref missing,
ref missing,ref missing,ref missing,ref missing,
ref missing,ref missing,ref missing,ref isVisible,
ref missing,ref missing,ref missing,ref missing);

aDoc.Activate();

string propertyValue = GetCustomPropertyValue(aDoc, "CustomProperty1");
SetCustomPropertyValue(aDoc, "CustomProperty1", "Hanselman");

foreach (Range r in aDoc.StoryRanges)
{
r.Fields.Update();
}
}

public string GetCustomPropertyValue(Document doc, string propertyName)
{
object oDocCustomProps = doc.CustomDocumentProperties;
Type typeDocCustomProps = oDocCustomProps.GetType();
object oCustomProp = typeDocCustomProps.InvokeMember("Item",
BindingFlags.Default |
BindingFlags.GetProperty,
null, oDocCustomProps,
new object[] { propertyName });

Type typePropertyValue = oCustomProp.GetType();
string propertyValue = typePropertyValue.InvokeMember("Value",
BindingFlags.Default |
BindingFlags.GetProperty,
null, oCustomProp,
new object[] { }).ToString();

return propertyValue;
}

public void SetCustomPropertyValue(Document doc, string propertyName, string propertyValue)
{
object oDocCustomProps = doc.CustomDocumentProperties;
Type typeDocCustomProps = oDocCustomProps.GetType();
typeDocCustomProps.InvokeMember("Item",
BindingFlags.Default |
BindingFlags.SetProperty,
null, oDocCustomProps,
new object[] { propertyName, propertyValue });
}

Fast forward to C# under .NET 4.

var WordApp = new ApplicationClass();
WordApp.Visible = true;
string fileName = @"NewTest.doc";
Document aDoc = WordApp.Documents.Open(fileName, ReadOnly: true, Visible: true);
aDoc.Activate();

string propertyValue = aDoc.CustomDocumentProperties["FISHORTNAME"].Value;
aDoc.CustomDocumentProperties["FISHORTNAME"].Value = "HanselBank";
string newPropertyValue = aDoc.CustomDocumentProperties["FISHORTNAME"].Value

foreach (Range r in aDoc.StoryRanges)
{
foreach (Field b in r.Fields)
{
b.Update();
}
}

See how all the crap that was originally reflection gets dispatched dynamically? But that's not even that great an example.

Word and Excel Automation with C# 4

That's just an example from Jonathan Carter and Jason Olson that gets running processes (using LINQ, woot) then makes a chart in Excel, then puts the chart in Word.

using System;
using System.Diagnostics;
using System.Linq;
using Excel = Microsoft.Office.Interop.Excel;
using Word = Microsoft.Office.Interop.Word;

namespace One.SimplifyingYourCodeWithCSharp
{
class Program
{
static void Main(string[] args)
{
GenerateChart(copyToWord: true);
}

static void GenerateChart(bool copyToWord = false)
{
var excel = new Excel.Application();
excel.Visible = true;
excel.Workbooks.Add();

excel.get_Range("A1").Value2 = "Process Name";
excel.get_Range("B1").Value2 = "Memory Usage";

var processes = Process.GetProcesses()
.OrderByDescending(p => p.WorkingSet64)
.Take(10);
int i = 2;
foreach (var p in processes)
{
excel.get_Range("A" + i).Value2 = p.ProcessName;
excel.get_Range("B" + i).Value2 = p.WorkingSet64;
i++;
}

Excel.Range range = excel.get_Range("A1");
Excel.Chart chart = (Excel.Chart)excel.ActiveWorkbook.Charts.Add(
After: excel.ActiveSheet);

chart.ChartWizard(Source: range.CurrentRegion,
Title: "Memory Usage in " + Environment.MachineName);

chart.ChartStyle = 45;
chart.CopyPicture(Excel.XlPictureAppearance.xlScreen,
Excel.XlCopyPictureFormat.xlBitmap,
Excel.XlPictureAppearance.xlScreen);

if (copyToWord)
{
var word = new Word.Application();
word.Visible = true;
word.Documents.Add();

word.Selection.Paste();
}
}
}
}

Notice the named parameters in C#, like 'Title: "whatever"' and "copyToWord: true"?

PIAs no long stand for Pain in the *ss - Type Equivalence and Embedded Interop Assemblies

Primary Interop Assemblies are .NET assemblies that bridge the gap between a .NET app and a COM server. They are also a PIA, ahem. When there's articles about your technology on the web called "Common Pitfalls With ______" you know there's trouble.

Typically you reference these Interop assemblies in Visual Studio and they show up, predictably, as references in your assembly. Here's a screenshot with the project on the right, and the assembly under Reflector on the left.

image

This means that those PIAs better be deployed on the end user's machine. Some PIAs can be as large as 20 megs or more! That's because for each COM interface, struct, enum, etc, there is a managed equivalent for marshalling data. That sucks, especially if I just want to make a chart and I'm not using any other types. It's great that Office has thousands of types, but don't make me carry them all around.

However, now I can click on Properties for these references and click Embed Interop Types = true. Now, check the screenshot. The references are gone and new types have appeared.

 image

Just the types I was using are now embedded within my application. However, since the types are equivalent, the runtime handles this fact and we don't have to do anything.

This all adds up to Office and COM Interop that is actually fun to do. Ok, maybe not fun, but better than a really bad paper cut. Huh, Jeremiah? ;)

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by ORCS Web
Page 1 of 1 in the BCL category

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.