Scott Hanselman

Get namespaces from an XML Document with XPathDocument and LINQ to XML

January 17, '08 Comments [9] Posted in ASP.NET | LINQ | Programming | XML

A fellow emailed me earlier asking how to get the namespaces from an XML document, but he was having trouble because the XML had some processing instructions like <?foo?> ahead of the root element.

A System.Xml Way

XPathNavigator (which you get from an XPathDocument) has two cool methods, GetNamespace(name) and GetNamespacesInScope, but they need a current node to work with.

 // Needs: using System.Collections.Generic; using System.IO; using System.Xml; using System.Xml.XPath;
 string s = @"<?mso-infoPathSolution blah=""blah""?>
              <?mso-application progid=""InfoPath.Document"" versionProgid=""InfoPath.Document.2""?>
              <my:ICS203 xml:lang=""en-US"" xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance""
                         xmlns:my=""http://schemas.microsoft.com/office/infopath/2003/myXSD/2007-04-03T19:03:38""
                         xmlns:xd=""http://schemas.microsoft.com/office/infopath/2003""> <my:HeaderData/></my:ICS203>";
 XPathDocument x = new XPathDocument(new StringReader(s));
 XPathNavigator foo = x.CreateNavigator();
 foo.MoveToFollowing(XPathNodeType.Element); // skip the processing instructions and land on the root element
 IDictionary<string, string> whatever = foo.GetNamespacesInScope(XmlNamespaceScope.All);

Once you're on the right node, in this case the first element, you can call GetNamespacesInScope and get a nice dictionary of prefixes to namespace URIs that has what you need inside it.


I really like the System.Xml APIs, they make me happy.

A System.Xml + LINQ to XML Bridge Methods Way

How could we do this with the LINQ to XML namespace? Well, pretty much the same way with a much nicer first line (yes, this could be made smaller).

 XDocument y = XDocument.Parse(s);
 XPathNavigator poo = y.CreateNavigator(); // extension method from System.Xml.XPath
 poo.MoveToFollowing(XPathNodeType.Element);
 IDictionary<string, string> dude = poo.GetNamespacesInScope(XmlNamespaceScope.All);

Notice that the CreateNavigator hanging off of XDocument is actually an extension method that is there because we included the System.Xml.XPath namespace. There are a whole series of "bridge" methods that make moving between LINQ to XML APIs and System.Xml APIs seamless.


See the (extension) there in the tooltip? There's also a different icon for extension methods when they show up in Intellisense. See the small blue-arrow added next to CreateNavigator?


These helper methods are "spot-welded" on to existing object instances when you import a namespace that defines them. They are also called 'mixins.'

A Purely LINQ to XML Way

I also wanted to see how this could be done using LINQ to XML proper.

 Disclaimer: We are comparing Apples and Oranges here, so don't say, "wow that query is not as terse or compact as GetNamespacesInScope." We're comparing one layer of abstraction to a lower one. We could certainly make a mixin for XElements called GetNamespacesInScope and we'd be back where we started. The System.Xml method GetNamespacesInScope is hiding all the hard work.

Big thanks to Ion Vasilian for setting me straight with this LINQ to XML Query!

First we load the XML into an XDocument and ask for the attributes hanging off the root, but we just want namespace declarations.

XDocument z = XDocument.Parse(s);
var result = z.Root.Attributes().
        Where(a => a.IsNamespaceDeclaration).
        GroupBy(a => a.Name.Namespace == XNamespace.None ? String.Empty : a.Name.LocalName,
                a => XNamespace.Get(a.Value)).
        ToDictionary(g => g.Key, 
                     g => g.First());

Then we group them by prefix. Note the ternary operator ?: in the key selector: it returns "" for the default (unprefixed) namespace declaration, else the declaration's local name, which is the prefix. The element selector then turns each attribute value into an actual XNamespace.

Update: Ion wrote me and pointed out a mistake. I was calling z.Root.AncestorsAndSelf.Attributes, and I only needed to call z.Root.Attributes, or if I wanted to get all namespaces, z.Root.DescendantsAndSelf(). Thanks Ion!

Ion says: "z.Root.AncestorsAndSelf() says: from the root element of the document find all ancestors and the element itself. In other words only the root element. If you want to find the in-scope namespace declarations for a given element ‘e’, then on that element you’ll do e.AncestorsAndSelf(). In other words, starting from the given element ‘e’ walking up the ancestors path and including the element itself look for attributes that are namespace declarations and build a dictionary … Note that the question for in-scope namespace declarations is answered by walking up a path in the tree and not by doing a full traversal of the tree starting from a given point (a la e.DescendantsAndSelf())."

It can be confusing to figure out the various types of these variables like "a" and "g". In this example, "a" is an XAttribute because the call to Attributes() returns an IEnumerable<XAttribute>, and "g" is of type IGrouping<string, XNamespace>, gleaned from the expressions inside of GroupBy().

We finish it off by taking the groups and turning them into a dictionary with ToDictionary, selecting each group's key as the dictionary key and the first element of the group as the value. At runtime "g" is an instance of System.Linq.Lookup<string, XNamespace>.Grouping, which implements IGrouping and contains the namespace as its only element. That element is retrieved with a call to First() and becomes the value side of the dictionary item.


Do also note one subtle detail. The System.Xml call to GetNamespacesInScope always includes the xmlns:xml namespace, declared implicitly. The LINQ query doesn't include this implicit namespace. Note also that these were sourced from XML Attributes and that the order of attributes is undefined (another way to say this is that attributes have no order.)
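To make Ion's point about walking up the tree concrete, here is a minimal sketch of the XElement "mixin" mentioned in the disclaimer. The class name XElementNamespaceExtensions, the sample XML and its urn:example URIs are made up for illustration; this is not an API that ships with LINQ to XML, just one way it might be written. It relies on AncestorsAndSelf() returning the element itself before its ancestors, so the nearest declaration of a prefix wins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XElementNamespaceExtensions
{
    // Hypothetical helper (not part of LINQ to XML): collects the namespace
    // declarations in scope for an element by walking up its ancestors.
    public static IDictionary<string, XNamespace> GetNamespacesInScope(this XElement element)
    {
        var result = new Dictionary<string, XNamespace>();

        // The element comes first, then its parent, and so on up to the root,
        // so the first declaration seen for a prefix is the nearest one.
        foreach (XAttribute a in element.AncestorsAndSelf().Attributes())
        {
            if (!a.IsNamespaceDeclaration) continue;

            // Same key rule as the query above: "" for the default namespace.
            string prefix = a.Name.Namespace == XNamespace.None
                ? String.Empty
                : a.Name.LocalName;

            if (!result.ContainsKey(prefix))
                result.Add(prefix, XNamespace.Get(a.Value));
        }
        return result;
    }
}

public static class NamespaceDemo
{
    public static void Main()
    {
        // Made-up sample: the child redeclares the "my" prefix.
        XDocument doc = XDocument.Parse(
            @"<my:root xmlns:my=""urn:example:outer"" xmlns:xd=""urn:example:xd"">
                <my:child xmlns:my=""urn:example:inner"" />
              </my:root>");

        XElement child = doc.Root.Elements().First();
        foreach (KeyValuePair<string, XNamespace> pair in child.GetNamespacesInScope())
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
    }
}
```

For the child element, "my" should map to urn:example:inner (the nearer declaration) and "xd" to urn:example:xd. Like the query above, this sketch does not add the implicit xml prefix that the System.Xml call reports.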

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


Hanselminutes Podcast available on Zune

January 17, '08 Comments [7] Posted in ASP.NET | Microsoft | Podcast

Finally, the Podcast is available in the Zune Marketplace. There's apparently a very long queue to get approved for the marketplace, so it's nice that it's finally done. We've increased the size of the logo graphic so it'll look optimal on players that support embedded hi-res cover art as well.

Click to subscribe to Hanselminutes with your Zune.

If you've got a Zune and/or the Zune Software installed on your machine you can subscribe with One Click with this link:

If you've got iTunes, you can subscribe with this link:

And if you're using a free Podcast downloader like FeedStation (and if not, why not?) then you can subscribe with the main URL:

Enjoy.


.NET Framework Library Source Code available for viewing

January 16, '08 Comments [15] Posted in ASP.NET | Learning .NET | Microsoft | Programming

It's live and you can give it a try now! Ten minutes ago Shawn and Scott released the hounds. If you'd like to step through .NET Framework Source code, here's what you need to do.

  1. Install this QFE.
    • Note, if you're on 64-bit Windows, read the description as there is a single manual step for 64-bit folks like me.
  2. Go into Tools|Options|Debugging|General and turn off "Enable Just My Code" and turn on "Enable Source Server."
  3. Go to Symbols and add this URL http://referencesource.microsoft.com/symbols and a local cache path. Make sure "search only when symbols are loaded manually" is checked.

That's it. Crazy. You can get more detail on Shawn's post if you need it. Here's me, just now, stepping into XPathNavigator's GetNamespacesInScope method.

Do note a few things.

  • Loading source the first time will be slow. There's lots of it. It'll be faster the second time.
  • If you can't right click and select Load Symbols from the Call Stack, try Ctrl-Alt-U and right click Load Symbols for the Module you want to step into.


Fabulous. Enjoy.


Long Term Viability of AppleTV

January 16, '08 Comments [50] Posted in Musings

I just don't get AppleTV. I mean, I totally get it, I understand the intent, but I can do these things now, with either DVDs and Blockbuster Video or with my existing cable TV service. Certainly the seamlessness of the experience between iTunes, AppleTV and iPhone is a huge thing and amazing, but while iTunes and an iPod seem natural, AppleTV seems forced and stilted. I wonder if it'll really stick around for a number of years or if the studios look at it as "just another outlet."

On Demand Movies for $4 to $5

I've had the ability to rent movies on demand for YEARS on Comcast Cable, and they have had HD movies for two years. They aren't portable (see below) but certainly I can sit down and watch a movie instantly if I'm too lazy to walk to the video store.

The Xbox360 also has On Demand movies in an almost identical way to the AppleTV, and the wife has started using that more and more. We watched "Hairspray" in HiDef and she was impressed with the experience. The benefit of course is that I already have an Xbox (as do 13 million other folks) and that it's a more versatile machine. It'd be cool if you could surf the web on an Apple TV and if it included a slot loading hi-def DVD player; that might make it more useful.

We find that a "DVD Total Access" pass is the best way for us to watch movies. We pay $20 and we get as many movies as we can turn around in the mail, which is usually ~6 a month or roughly $3 each. We can watch them anywhere, anytime, they don't expire or have late fees. I take them on planes and we watch half downstairs then take it upstairs to finish the last half. In this case, molecules are more portable than electrons for my family.

Take a Movie with You

It's a legal gray area, but I could also rip the rented DVDs and watch them on my PSP or iPod, then delete them when I return the movie.

This, to me, is the #1 draw of the AppleTV. If you've got iPods and iPhones, then being able to buy a movie in one place and watch it anywhere, even stopping at home and finishing on a plane, is a big deal. I can do this with DVDs that I get in the mail from Blockbuster, though, and they are exceedingly portable.

Storage For Your Own Content

Ripping and storing your own content to the AppleTV is the second most interesting feature, I think, but that can be done with any NAS (Network Attached Storage) device and most any UPnP device, provided the codecs line up.

I kind of like having DVDs as storage, rather than the "psychic weight" of worrying about a hard drive crashing with 150 lovingly ripped DVDs sitting on it.

As the anonymous blogger at Shipping Seven says (caustically) about the lack of a DVD drive on the new MacBook Air:

Dumping the DVD drive is a risky move. Yes, they are bulky, and are not used very much. But walk around any airplane/train, and you'll see a huge number of people with laptops watching movies. Here's a hint, Apple: Not all those people are going to rent a movie off iTunes for a four-hour flight, like you cheerfully propose. I can borrow a movie from my roommate's DVD collection. For free. For more than 24 hours. People generally pick the easiest and cheapest solution available to them.

It's true, folks like cheap; I like cheap.

Watch Photos on my TV

My TV, and many TVs, have an SD slot for photo slideshows, and the Xbox has both USB for docking a camera directly and UPnP, so this is interesting, but not incredibly so. If I could plug a digital camera directly into the AppleTV, that might be cool. (It has USB, can I do this now?)

Television Shows

Why would I want to pay $2 (TWO DOLLARS!) for a TV show "the day after it airs" when I can watch it for free by visiting www.abc.com, www.cbs.com or www.nbc.com or any other Torrent site? And who wants to own a TV show? Why not 50 cents just to rent it? I'll wait until it comes out on DVD for those prices.

This is another example of where I think the Cable TV set-top boxes have an advantage (today). For example, I get Showtime and I watch my favorite show, Dexter, on Showtime, but if I miss an episode, the entire season is sitting in the On Demand Menu for free. Why pay?

Utility

I really avoid buying gadgets unless they will fit into my, and my family's, lifestyle in a seamless and utilitarian way. The WAF (Wife Acceptance Factor) is about making everything "one button easy" like we have with the Harmony 880 Remote. When we moved to the new house I swapped out some equipment and we started using the Xbox as our primary DVD player. The wife was "shielded" from this because the Watch DVD button on the remote still worked as she expected.

I can see how an AppleTV could be a central part of one's media life, but even though the Xbox is a totally different device, at least in my house it has already taken its place as the "Box that does all things well."

Do you have an AppleTV and do you like it? Is this a gadget worth having? Is it indispensable like a GPS, MP3 Player or Tivo?


Blog Stats are Confusing - GETs, Views, User-Agents, Readers, Eyeballs

January 15, '08 Comments [12] Posted in Musings

There's some discussion going on an internal MSFT mailing list about blog statistics. I don't check my web statistics more than once a month, as I'm more interested in blog comments or what's going on in the forum. If I get a lot of comments on a post I feel good. I like to get discussions going and bounce ideas back and forth.

That said, some blogs at Microsoft track their statistics and need to know if a particular post or new theme brings in more readers. One particular blog (not mine) recently saw a 16x increase in "hits" which is probably a good thing. A discussion started, and here's part of an email I wrote with my ideas that I thought you might find interesting, Dear Reader. I've made a few [edits] to make things clearer.


I think it's killer, to be clear, so in no way do I want to take away from [that blog's] most excellent work, but the web stats [in this case] specifically "smell" wrong. Possibly a bot, spammer, something, but still, a 16x increase in web traffic [in a single] month feels exceptional. It's the ratios [of GETs to projected humans] that are confusing to me.

It'd be interesting to use some heuristics to turn the RSS Feed HTTP GETs into unique users. For example, most RSS readers poll, so one individual will hit your feed (in my experience) between 8 and 16 times a day, depending on their reader and how long their computer is on. Online readers are smarter than Smart Client readers like Outlook and FeedDemon. This usually means one has fewer readers than they think, if they are looking at GETs.

Additionally, online readers [usually] only hit once (here's how that works) [and rather] "tunnel" your subscriber numbers in the HTTP User Agent like "NewsGatorOnline/2.0+(http://www.newsgator.com;+250+subscribers)". Meaning, you might get one hit or 10 hits, but regardless they are representative of 250 individuals. This usually means one has more readers than they think, if they are looking at GETs.
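As a rough sketch of that second heuristic, the snippet below pulls the tunneled subscriber count out of a User-Agent string shaped like the NewsGator example. The class name FeedUserAgent, the method ReadersRepresented, the regular expression, and the "FeedDemon/2.6" sample string are assumptions for illustration; other aggregators may format their counts differently, so treat the pattern as a starting point rather than a standard.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FeedUserAgent
{
    // Assumed pattern: a number, then '+' or whitespace, then "subscriber(s)",
    // as in "...;+250+subscribers)". Not every aggregator uses this shape.
    private static readonly Regex SubscriberPattern =
        new Regex(@"(\d+)[+\s]+subscribers?", RegexOptions.IgnoreCase);

    // Returns the subscriber count the agent reports, or 1 when it reports
    // none (treating the request as coming from a single desktop reader).
    public static int ReadersRepresented(string userAgent)
    {
        Match m = SubscriberPattern.Match(userAgent ?? String.Empty);
        return m.Success ? int.Parse(m.Groups[1].Value) : 1;
    }

    public static void Main()
    {
        // The example agent from the text reports 250 subscribers.
        Console.WriteLine(ReadersRepresented(
            "NewsGatorOnline/2.0+(http://www.newsgator.com;+250+subscribers)")); // 250

        // A made-up desktop agent with no count is treated as one reader.
        Console.WriteLine(ReadersRepresented("FeedDemon/2.6")); // 1
    }
}
```

Counting each distinct online aggregator once per day with its reported number, instead of once per GET, is what keeps ten hits from the same service from looking like ten readers or like 2,500.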

Why do I mention this? I mention it because looking at HTTP GETs isn't representative of people, but of GETs. It took me a few years to figure this out, and I've been thrilled with the analysis work done by FeedBurner (my RSS Feed is hosted there, saving me over 400 gigs of bandwidth a month) to turn GETs into Humans.

Here's a real world example. FeedBurner says I have around 22,000 regular readers [as of today...it varies based on weekday/weekend]. That's aggregated across all News Readers:


My stats package shows about 50,000 page views a day or about 1.6 million a month. This varies, confirming [an earlier] comment about folks hanging around [a site] and reading stories, which is cool. However, if I look at "hits" I see 16.5 million. Of course, that's not [a useful stat], because that includes images, css, etc. Visits, on the other hand, are one individual hanging around for a period of time and reading. For example (these stats don't include RSS anywhere, including bandwidth):

Page Views - 1,596,548
Visits - 806,251
Hits - 16,500,422
Bandwidth (KB) - 209,759,564

For me, these stats make sense, because I have a readership of about 20,000 that show up every few days and hang out, representing [roughly] 50% of my traffic. The other 50% comes from Search Engines and [incoming] links from other blogs. So it's important that one distinguishes between hits, page views, and visitors, and tries to correlate those back to readership, IMHO.

The question that we need Blog Stats to answer is that of readership. What does [a] 600,000 RSS hits number mean? 600k/30days is about 20k hits a day, so how often are these readers hitting the feed per day? Once we come up with a standard-ish formula, blogs could get a rough +/-30% idea of how many human eyeballs [are actually reading].
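One possible shape for that "standard-ish formula", using only numbers already mentioned here (600,000 feed GETs in a 30-day month, and 8 to 16 polls per reader per day), might look like the sketch below. The class and method names are hypothetical, and the formula assumes every GET comes from a polling desktop reader, which ignores online aggregators that hit once on behalf of many subscribers.

```csharp
using System;

public static class FeedReaderEstimate
{
    // Hypothetical formula: readers ~= (monthly GETs / days) / polls per reader per day.
    public static double EstimateReaders(double monthlyGets, int days, double pollsPerReaderPerDay)
    {
        return monthlyGets / days / pollsPerReaderPerDay;
    }

    public static void Main()
    {
        // 600,000 / 30 = 20,000 GETs a day.
        double low = EstimateReaders(600000, 30, 16);  // 1250 readers if everyone polls 16 times a day
        double high = EstimateReaders(600000, 30, 8);  // 2500 readers if everyone polls 8 times a day

        Console.WriteLine("Somewhere between {0} and {1} polling readers", low, high);
    }
}
```

So 600k feed hits could plausibly mean only a couple of thousand polling readers, which is why the raw GET count alone says little about readership.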

Just my two cents, thoughts?


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.