Scott Hanselman

Xml and the Nametable

March 7, '06 Comments [6] Posted in XmlSerializer

I got a number (~dozen) of emails about my use of the NameTable in my recent XmlReader post. Charles Cook tried it out and noticed about a 10% speedup. I also received a number of poo-poo emails that said "use XPath" or "don't bother" and "the performance is good enough."

Sure, if that works for you, that's great. Of course, always measure before you make broad statements. That said, here's a broad statement. Using an XmlReader will always be faster than the DOM and/or XmlSerializer. Always.

Why? Because what do you think is underneath the DOM and inside of XmlSerialization? An XmlReader of course.

For documents larger than about 50k, you're looking at a speedup of at least one order of magnitude when plucking a single value out. When grabbing dozens, the gap widens.

Moshe is correct in pointing out that a nice middle ground, perf-wise, is the XPathReader (for a certain subset of XPath). There are a number of nice XmlReader implementations that fill the space between XmlTextReader and XPathDocument by providing more-than-XmlReader functionality.

BTW, I would also point out that an XmlReader is what I call a "cursor-based pull implementation." While it's similar to the SAX parsers in that it exposes the infoset rather than the angle brackets, it's not SAX.

Now, all that said, what was the deal with my Nametable usage? Charles explains it well, but I will expand. You can do this if you like:

XmlTextReader tr =
   new XmlTextReader("http://feeds.feedburner.com/ScottHanselman");
while (tr.Read())
{
    if (tr.NodeType == XmlNodeType.Element && tr.LocalName == "enclosure")
    {
        while (tr.MoveToNextAttribute())
        {
            Console.WriteLine(String.Format("{0}:{1}",
               tr.LocalName, tr.Value));
        }
    }
}

The tr.LocalName == "enclosure" check does a string compare as you look at each element. Not a big deal, but it adds up over hundreds or thousands of executions when spinning through a large document.

The NameTable is used by XmlDocument, XmlReader(s), XPathNavigator, and XmlSchemaCollection. It's a table that maps a string to an object reference. This is called "atomization": each distinct name is stored once, as a single small atom. If these classes see "enclosure" more than once, they reuse the one object reference rather than keeping n copies of the "enclosure" string internally.

It's not exactly like a Hashtable, as the NameTable will return the object reference if the string has already been atomized.

XmlTextReader tr =
   new XmlTextReader("http://feeds.feedburner.com/ScottHanselman");
object enclosure = tr.NameTable.Add("enclosure");
while (tr.Read())
{
    if (tr.NodeType == XmlNodeType.Element &&
        Object.ReferenceEquals(tr.LocalName, enclosure))
    {
        while (tr.MoveToNextAttribute())
        {
            Console.WriteLine(String.Format("{0}:{1}",
               tr.LocalName, tr.Value));
        }
    }
}
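
If you want to see atomization on its own, outside a reader loop, here is a minimal sketch. The class and method names (AtomizationDemo, SameAtom) are made up for illustration; the only System.Xml calls used are the NameTable constructor, Add and Get. The strings are built at run time so the C# compiler's literal interning can't make the result look better than it is.

```csharp
using System;
using System.Xml;

public static class AtomizationDemo
{
    // Hypothetical helper: true when two separately built, equal strings
    // atomize to one and the same object reference.
    public static bool SameAtom()
    {
        NameTable table = new NameTable();

        // Two distinct string objects with identical contents.
        string first = new string("enclosure".ToCharArray());
        string second = new string("enclosure".ToCharArray());
        bool distinctBefore = !Object.ReferenceEquals(first, second);

        // The first Add stores the name; the second Add finds the existing
        // entry and hands back the same reference.
        string atomA = table.Add(first);
        string atomB = table.Add(second);

        return distinctBefore && Object.ReferenceEquals(atomA, atomB);
    }

    // Get only looks a name up; it returns null for a name never added.
    public static bool UnknownNameIsNull()
    {
        NameTable table = new NameTable();
        return table.Get("never-added") == null;
    }

    public static void Main()
    {
        Console.WriteLine(SameAtom());
        Console.WriteLine(UnknownNameIsNull());
    }
}
```

That "hand back the existing reference" behavior of Add is what makes the reference comparison in the reader loop safe.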

The easiest way, IMHO, to think about it is this:

  • If you know that you're going to look for an element or attribute with a specific name within any System.Xml class that has an XmlNameTable, preload or warn the parser that you'll be watching for these names.
  • When you do a comparison between the current element or attribute and your target, use Object.ReferenceEquals. Instead of a string comparison, you'll just be asking "are these the same object" - which is about the fastest thing that the CLR can do.
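    • Yes, you can use == rather than Object.ReferenceEquals (as long as one side is typed as object, as enclosure is above; with two string operands, == compares the characters), but the latter makes it totally clear what your intent is, while the former is more vague.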
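
To make the == point concrete, here is a small sketch of how the three comparisons behave on two equal-but-distinct strings. EqualityDemo and Compare are hypothetical names; nothing here is specific to System.Xml.

```csharp
using System;

public static class EqualityDemo
{
    // Returns { string ==, object ==, Object.ReferenceEquals } for two
    // strings that have the same characters but are different objects.
    public static bool[] Compare()
    {
        string a = new string("enclosure".ToCharArray());
        string b = new string("enclosure".ToCharArray());

        // Both operands are string: the overloaded operator compares characters.
        bool valueEqual = (a == b);

        // Both operands are object: == falls back to reference identity.
        bool referenceEqualOp = ((object)a == (object)b);

        // Same check, with the intent spelled out.
        bool referenceEqualCall = Object.ReferenceEquals(a, b);

        return new bool[] { valueEqual, referenceEqualOp, referenceEqualCall };
    }

    public static void Main()
    {
        foreach (bool result in Compare())
        {
            Console.WriteLine(result);
        }
    }
}
```

Only the first comparison comes back true, which is why the NameTable pattern depends on both sides being the atomized reference and on the comparison being a reference comparison.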

This kind of optimization makes a big perf difference (~10% depending) when using an XmlReader. It makes less of one when using an XPathDocument because you are using Select(ing)Nodes in a loop.

Stealing Charles' words: "...because it involves very little extra code it is perhaps an optimization worth making prematurely."

Even the designers agree: "...using the XmlNameTable gives you enough of a performance benefit to make it worthwhile especially if your processing starts to spans multiple XML components in a piplelining scenario and the XmlNameTable is shared across them i.e. XmlTextReader->XmlDocument->XslTransform."
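
The pipelining scenario in that quote might look something like the sketch below: one NameTable handed to both an XmlTextReader and the XmlDocument it feeds. The class name, method name, and the inline XML are assumptions for illustration (a tiny stand-in for a real feed); the constructors that accept an XmlNameTable are part of System.Xml.

```csharp
using System;
using System.IO;
using System.Xml;

public static class SharedNameTableDemo
{
    // Hypothetical helper: true when the reader and the document use the
    // same table and the loaded element's name is the pre-added atom.
    public static bool NamesShareAtoms()
    {
        NameTable shared = new NameTable();
        object enclosure = shared.Add("enclosure");

        // Made-up sample document, not a real feed.
        string xml =
            "<rss><channel><item>" +
            "<enclosure url='a.mp3' length='1' type='audio/mpeg' />" +
            "</item></channel></rss>";

        // Both components are constructed over the same table.
        XmlTextReader reader = new XmlTextReader(new StringReader(xml), shared);
        XmlDocument doc = new XmlDocument(shared);
        doc.Load(reader);

        bool sameTable = Object.ReferenceEquals(reader.NameTable, doc.NameTable);

        // The element name inside the DOM should be the atom added up front.
        XmlNode node = doc.GetElementsByTagName("enclosure")[0];
        bool sameAtom = Object.ReferenceEquals(node.LocalName, enclosure);

        return sameTable && sameAtom;
    }

    public static void Main()
    {
        Console.WriteLine(NamesShareAtoms());
    }
}
```

Because the names are atomized once and reused downstream, each later stage can keep doing reference comparisons against the same preloaded atoms.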

Oleg laments: "...that something needs to be done to fix this particular usage pattern of XmlReader to not ignore great NameTable idea."

Conclusion: The NameTable is there for a reason, no matter what System.Xml solution you use. This is a correct and useful pattern and not using it is just silly. If you're going to develop a habit, why not make it a best-practice habit?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


Continuous Glucose Monitoring

March 4, '06 Comments [7] Posted in Diabetes

Sometimes my diabetes really gets to me. This is year twelve for me and I have no (known) complications and I'd like to keep it that way. I do pretty well, with blood sugars averaging around 130-160. Ideal is 100, but I'm not doing too bad. However, I had some Indian food last night and I was up until 4AM "chasing blood sugars." I even took an intra-muscular insulin shot in an attempt to bring it down. It can be very tedious.

The insulin pump is nice, but folks often forget that it's just a delivery device. It pumps insulin through a tube into me, and that's it. All the input comes from the blood sugar meter via a finger stick.

However, very soon I should be able to get a Continuous Glucose Monitoring System. This would be yet another device that'd be 'implanted/stuck' to me, but it would talk wirelessly and continuously to the pump.

This device is rolled out in seven cities; they are apparently taking it slow. I can't wait. I have no words to explain to you, dear reader, what it feels like to prick your finger 6 to 10 times a day for 365 days a year for over a decade. You get so addicted (in a necessary way) to the feedback provided by the number that is your blood sugar. Your blood sugar's current level becomes a sixth sense that is as important as any of the other five.

Every time I prick my finger it costs about 80 US cents. It gets spendy. Sometimes I get a little Mulderesque and wonder if they will ever cure diabetes as it's so profitable. Getting my blood sugar reading 10 times a day isn't enough. If you refer back to my Diabetes Airplane Analogy, you wouldn't want to check the altimeter in your airplane only ten times. You'd want to check it continuously.

This continuous meter will connect to me on the other side of my body - the opposite side than the pump - and talk to the pump wirelessly. I'd still have to make the decisions and "close the loop." NONE of this happens automatically. Insulin is never delivered without me deciding. Getting the BG (Blood Glucose) reading continuously will make my life easier.

Here's a little (I think) exclusive. I got this from a "source"...it's a PDF version of the training manual for the new Continuous Meter:

File Attachment: Paradigm Real teaching 1.pdf (999 KB)


XmlTextReader more and more

March 4, '06 Comments [9] Posted in XML | Web Services

Random thought: I like the whole XmlReader philosophy. I use it much more often than XmlDocument. I haven't made an XmlDocument in a while. Every once in a while an XmlDocument shows up when you need an XmlNode for some SOAP stuff, but for the most part, I like XmlReaders.

Someone wanted a chunk of code that grabbed RSS Enclosures from a feed. They didn't care about the content, they just wanted the enclosures' attributes. Here's what I sent them 2 mins later.

Sure, this code could have been done with XmlDocument.SelectNodes (and I'm sure one of you will show me how), but without getting too much into premature optimization, I know that using an XmlReader will always give me better performance. Always. If I use it for little one-off stuff like this, I know that when I need real performance for a real app, the usage will be fresh in my mind.

using System;
using System.Xml;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            XmlTextReader tr =
               new XmlTextReader("http://feeds.feedburner.com/ScottHanselman");
            object enclosure = tr.NameTable.Add("enclosure");
            while (tr.Read())
            {
                if (tr.NodeType == XmlNodeType.Element &&
                    Object.ReferenceEquals(tr.LocalName, enclosure))
                {
                    while (tr.MoveToNextAttribute())
                    {
                        Console.WriteLine(String.Format("{0}:{1}",
                           tr.LocalName, tr.Value));
                    }
                }
            }
        }
    }
}
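
For comparison, the XmlDocument.SelectNodes route mentioned above might look roughly like this. It's a sketch, not a benchmark: the class and method names are made up, the XML is passed in as a string rather than fetched from the feed, and the "//enclosure" XPath assumes the enclosure elements are in no namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class SelectNodesDemo
{
    // Hypothetical helper: loads the whole document into a DOM, then
    // collects "name:value" for every attribute on every enclosure element.
    public static List<string> EnclosureAttributes(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> lines = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("//enclosure"))
        {
            foreach (XmlAttribute attr in node.Attributes)
            {
                lines.Add(String.Format("{0}:{1}", attr.LocalName, attr.Value));
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Made-up sample input standing in for a real feed.
        string xml =
            "<rss><channel><item>" +
            "<enclosure url='a.mp3' length='1' />" +
            "</item></channel></rss>";

        foreach (string line in EnclosureAttributes(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

The difference from the reader version is that the whole document is parsed and held in memory before the first enclosure is looked at.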

Now playing: Rent - Rent


NUnit Expansion Templates in CodeRush

March 1, '06 Comments [3] Posted in NUnit | XML | CodeRush

I noticed via Larkware that Scott Bellware had created a series of Visual Studio 2005 Code Snippets for NUnit. Very cool, I said. However, we're not all using Visual Studio 2005 at my company. Much of our bread and butter is 2003/.NET 1.1. But many of us have CodeRush. Ah! I said, I should duplicate ScottB's work as CodeRush templates!

I rushed into the Templates section of CodeRush only to notice that they are already there! Damn you, Mark Miller and your forethought!

Anyway, I looked at Scott's list and added a few from his, and there are a few built into Mark's that aren't in Scott's, blah blah blah, union, blah blah, intersection, and here's the CodeRush templates file if you're a Rushie and want to import them. This file includes the whole NUnit folder with my few changes.
File Attachment: CSharp_NUnit.xml (100 KB)


Hanselminutes Podcast 8

March 1, '06 Comments [1] Posted in Podcast | ASP.NET | XML | Tools

My eighth Podcast is up. This episode is about a few useful VS.NET tools and some interesting websites. I'll talk more about the many other tools that are available in future shows.

We're listed in the iTunes Podcast Directory, so I encourage you to subscribe with a single click (two in Firefox) with the button below. For those of you on slower connections there are lo-fi and torrent-based versions as well.

Subscribe to my Podcast in iTunes

Our sponsors are Automated QA, PeterBlum and the .NET Dev Journal.

Do take a look at TestComplete from Automated QA. It integrates with Visual Studio 2005 and I'm going to try to get a formal review of their stuff up, probably the week after next, particularly their functional Web Testing and Recording.

As I've said before, this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple: avoid wasting the listener's time (and make the commute less boring).

  • Each show will include a number of links, and all those links will be posted along with the show on the site. There were 15 sites mentioned in this eighth episode, some planned, some not. We're still using Shrinkster.com on this show.
  • The basic MP3 feed is here, and the iPod-friendly one is here. There are a number of other ways you can get it (streaming, straight download, etc.) that are all up on the site just below the fold. I use iTunes, myself, to listen to most podcasts, but I also use FeedDemon and its built-in support.
  • Note that for now, because of bandwidth constraints, the feeds always have just the current show. If you want to get an old show (and because many Podcasting Clients aren't smart enough to not download the file more than once) you can always find them at http://www.hanselminutes.com.
  • I have, and will, also include the enclosures to this feed you're reading, so if you're already subscribed to ComputerZen and you're not interested in cluttering your life with another feed, you have the choice to get the 'cast as well.
  • If there's a topic you'd like to hear, perhaps one that is better spoken than presented on a blog, or a great tool you can't live without, contact me and I'll get it in the queue!

Enjoy. Who knows what'll happen in the next show?

Now playing: Ricky Gervais, Steve Merchant, and Karl Pilkington - Ricky Gervais Show: Season 2, Episode 1


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.