Scott Hanselman

Get namespaces from an XML Document with XPathDocument and LINQ to XML

January 17, '08 Comments [9] Posted in ASP.NET | LINQ | Programming | XML

A fellow emailed me earlier asking how to get the namespaces from an XML document, but he was having trouble because the XML had some processing instructions like <?foo?>.

A System.Xml Way

XPathNavigator, which you get from an XPathDocument, has two cool methods, GetNamespace(name) and GetNamespacesInScope, but they need a current node to work with.

 // needs: using System.Collections.Generic; using System.IO; using System.Xml; using System.Xml.XPath;
 string s = @"<?mso-infoPathSolution blah=""blah""?>
              <?mso-application progid=""InfoPath.Document"" versionProgid=""InfoPath.Document.2""?>
              <my:ICS203 xml:lang=""en-US"" xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance""
                  xmlns:my=""http://schemas.microsoft.com/office/infopath/2003/myXSD/2007-04-03T19:03:38""
                  xmlns:xd=""http://schemas.microsoft.com/office/infopath/2003""> <my:HeaderData/></my:ICS203>";
 XPathDocument x = new XPathDocument(new StringReader(s));
 XPathNavigator foo = x.CreateNavigator();
 foo.MoveToFollowing(XPathNodeType.Element); // skip the processing instructions
 IDictionary<string, string> whatever = foo.GetNamespacesInScope(XmlNamespaceScope.All);

Once you're on the right node, in this case the first element, you can call GetNamespacesInScope and get a nice dictionary of prefixes and namespace URIs that has what you need inside it.
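Here's a minimal sketch of what reading that dictionary might look like. The class and method names (NamespaceDump, FirstElementNamespaces) are made up for illustration; only the System.Xml calls come from the framework.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

// Hypothetical helper, not part of System.Xml.
public static class NamespaceDump
{
    // Returns the prefix -> namespace URI pairs in scope on the first element.
    public static IDictionary<string, string> FirstElementNamespaces(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // The navigator starts on the document root node, so move past any
        // processing instructions to the first element.
        nav.MoveToFollowing(XPathNodeType.Element);

        // The default namespace, if there is one, is stored under the empty-string key.
        return nav.GetNamespacesInScope(XmlNamespaceScope.All);
    }
}
```

Calling NamespaceDump.FirstElementNamespaces(s) and looping over the result with a foreach over KeyValuePair<string, string> gives you each prefix in Key and its URI in Value.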

namespaces

I really like the System.Xml APIs, they make me happy.

A System.Xml + LINQ to XML Bridge Methods Way

How could we do this with the LINQ to XML namespace? Well, pretty much the same way with a much nicer first line (yes, this could be made smaller).

 XDocument y = XDocument.Parse(s);
 XPathNavigator poo = y.CreateNavigator();
 poo.MoveToFollowing(XPathNodeType.Element);
 IDictionary<string, string> dude = poo.GetNamespacesInScope(XmlNamespaceScope.All);

Notice that the CreateNavigator hanging off of XDocument is actually an extension method that is there because we included the System.Xml.XPath namespace. There are a whole series of "bridge" methods that make moving between LINQ to XML APIs and System.Xml APIs seamless.

image

See the (extension) there in the tooltip? There's also a different icon for extension methods when they show up in Intellisense. See the small blue-arrow added next to CreateNavigator?

image

These helper methods are "spot-welded" on to existing object instances when you import a namespace that defines them. They are also called 'mixins.'
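CreateNavigator isn't the only bridge method. As a rough sketch, here is XPathSelectElement, another extension method from System.Xml.XPath, running an XPath query directly against an XDocument. The class and method names (BridgeSketch, FindHeaderData) are invented for this example, and it assumes the document contains a HeaderData element like the sample XML does.

```csharp
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath; // brings the bridge extension methods into scope

// Hypothetical helper for illustration only.
public static class BridgeSketch
{
    // Finds the first HeaderData element in the given namespace using XPath
    // against a LINQ to XML document.
    public static XElement FindHeaderData(XDocument doc, string namespaceUri)
    {
        // The prefix "my" only has meaning inside the XPath expression below;
        // it does not have to match the prefix used in the document.
        var resolver = new XmlNamespaceManager(new NameTable());
        resolver.AddNamespace("my", namespaceUri);

        // XPathSelectElement is an extension method on XNode.
        return doc.XPathSelectElement("//my:HeaderData", resolver);
    }
}
```

Without the using System.Xml.XPath line, XPathSelectElement won't show up on XDocument at all, which is the whole extension-method trick.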

A Purely LINQ to XML Way

I also wanted to see how this could be done using LINQ to XML proper.

 Disclaimer: We are comparing Apples and Oranges here, so don't say, "wow that query is not as terse or compact as GetNamespacesInScope." We're comparing one layer of abstraction to a lower one. We could certainly make a mixin for XElements called GetNamespacesInScope and we'd be back where we started. The System.Xml method GetNamespacesInScope is hiding all the hard work.

Big thanks to Ion Vasilian for setting me straight with this LINQ to XML Query!

First we load the XML into an XDocument and ask for the attributes hanging off the root, but we just want namespace declarations.

XDocument z = XDocument.Parse(s);
var result = z.Root.Attributes().
        Where(a => a.IsNamespaceDeclaration).
        GroupBy(a => a.Name.Namespace == XNamespace.None ? String.Empty : a.Name.LocalName,
                a => XNamespace.Get(a.Value)).
        ToDictionary(g => g.Key, 
                     g => g.First());

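Then we group them by prefix. Note the ternary operator ?: in the key selector: it returns "" for the default namespace declaration (a plain xmlns attribute, which has no namespace of its own), and otherwise the attribute's local name, which is the prefix. The second lambda then turns each attribute value into an actual XNamespace.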

Update: Ion wrote me and pointed out a mistake. I was calling z.Root.AncestorsAndSelf.Attributes, and I only needed to call z.Root.Attributes, or if I wanted to get all namespaces, z.Root.DescendantsAndSelf(). Thanks Ion!

Ion says: "z.Root.AncestorsAndSelf() says: from the root element of the document find all ancestors and the element itself. In other words only the root element. If you want to find the in-scope namespace declarations for a given element ‘e’, then on that element you’ll do e.AncestorsAndSelf(). In other words, starting from the given element ‘e’ walking up the ancestors path and including the element itself look for attributes that are namespace declarations and build a dictionary … Note that the question for in-scope namespace declarations is answered by walking up a path in the tree and not by doing a full traversal of the tree starting from a given point (a la e.DescendantsAndSelf())."
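Putting Ion's explanation together with the "mixin" idea from the disclaimer, one possible sketch of an in-scope lookup for any XElement follows. This is my own hypothetical extension method, not something that ships with LINQ to XML, and it assumes that the nearest declaration of a prefix should win over ones further up the tree.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical extension class; LINQ to XML does not define GetNamespacesInScope.
public static class XElementNamespaceExtensions
{
    public static IDictionary<string, XNamespace> GetNamespacesInScope(this XElement element)
    {
        var result = new Dictionary<string, XNamespace>();

        // AncestorsAndSelf() yields the element first, then its parent, and so
        // on up to the root, so the nearest declaration of a prefix is seen first.
        var declarations = element.AncestorsAndSelf()
                                  .Attributes()
                                  .Where(attr => attr.IsNamespaceDeclaration);

        foreach (XAttribute declaration in declarations)
        {
            // xmlns="..." has no namespace of its own; xmlns:p="..." has local name "p".
            string prefix = declaration.Name.Namespace == XNamespace.None
                ? string.Empty
                : declaration.Name.LocalName;

            // Keep only the first (nearest) declaration of each prefix.
            if (!result.ContainsKey(prefix))
            {
                result.Add(prefix, XNamespace.Get(declaration.Value));
            }
        }

        return result;
    }
}
```

With that in place, someElement.GetNamespacesInScope() walks up the ancestor path rather than traversing the whole tree, which is the distinction Ion is drawing.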

It can be confusing to figure out the types of variables like "a" and "g". In this example, "a" is an XAttribute because the call to Attributes() returns IEnumerable<XAttribute>, and "g" is of type IGrouping<string, XNamespace>, gleaned from the expressions inside of GroupBy().

We finish it off by taking each IGrouping and turning the sequence into a dictionary with ToDictionary, selecting an appropriate key and the first element of the group as the value. At runtime "g" is an instance of System.Linq.Lookup<string, XNamespace>.Grouping, which implements IGrouping and contains the namespace as its only element. That element is retrieved with a call to First() and becomes the value side of the dictionary item.

image

Do also note one subtle detail. The System.Xml call to GetNamespacesInScope always includes the xmlns:xml namespace, declared implicitly. The LINQ query doesn't include this implicit namespace. Note also that these were sourced from XML Attributes and that the order of attributes is undefined (another way to say this is that attributes have no order.)
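If you wanted the LINQ result to line up with what System.Xml reports, one option (a sketch, with made-up names ImplicitXmlNamespace and WithXmlPrefix) is to add the implicit xml prefix yourself using the built-in XNamespace.Xml:

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class ImplicitXmlNamespace
{
    // Returns a copy of the declared namespaces with the implicit "xml" prefix added.
    public static IDictionary<string, XNamespace> WithXmlPrefix(
        IDictionary<string, XNamespace> declared)
    {
        var copy = new Dictionary<string, XNamespace>(declared);

        // XNamespace.Xml is http://www.w3.org/XML/1998/namespace.
        copy["xml"] = XNamespace.Xml;
        return copy;
    }
}
```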

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

Friday, January 18, 2008 9:01:24 AM UTC
Great tip on GetNamespacesInScope method, so much cleaner than the technique I was using with MoveToFirstNamespace and MoveToNextNamespace methods. When .NET moved from 1.1 to 2.0 I didn't notice this useful new method come in.

A Minor point: I've just tried out the code, and because a Generic IDictionary is returned I had to modify the call to include the Generic part in the IDictionary declaration so I ended up with the following which seemed to work:

IDictionary<string, string> whatever = foo.GetNamespacesInScope(XmlNamespaceScope.All);

Is this right?

Also, fairly obviously, the default namespace gets an empty-string key value in the returned IDictionary.
Friday, January 18, 2008 9:58:08 AM UTC
Good catch, fixed the post!
Friday, January 18, 2008 3:13:41 PM UTC
Scott,

Thanks again! I needed to get all the namespaces loaded into a ns manager, so here's the loop code if anyone wants it:


foreach (KeyValuePair<string, string> xns in xpn.GetNamespacesInScope(XmlNamespaceScope.All))
{
    xnm.AddNamespace(xns.Key, xns.Value);
}
Sam
Friday, January 18, 2008 7:20:33 PM UTC
Sam,

Please excuse a suggested extension to your code:

To cater for the default namespace, you could also add a check within the foreach loop for the empty string in the IDictionary key (this identifies the default namespace declaration), and then include your preferred prefix (e.g. 'def') to use instead to represent the default namespace in any XPath expressions, so you would end up with the following within your foreach loop:


...
if (xns.Key == String.Empty)
{
    xnm.AddNamespace("def", xns.Value);
}
else
{
    xnm.AddNamespace(xns.Key, xns.Value);
}
...


There may be no default namespace declaration in your case of the xml instance, but this is just in case other readers come across this who are new to namespaces.
Tuesday, January 22, 2008 10:04:13 AM UTC
Then how do I delete one namespace? Say I want to delete the namespace
xmlns:xd="http://schemas.microsoft.com/office/infopath/2003"
samu
Wednesday, January 23, 2008 8:07:37 AM UTC
now there's some poo where there should be foo ;)
David
Tuesday, February 12, 2008 11:34:45 AM UTC
Thanks! Good tip! I didn't notice this addition to .NET 2.0 either.

However, I wonder: if it is so simple to get the namespaces from the DOM using an XPathNavigator, why should we have to supply a namespace manager each time we need to run an XPath query against an XML file? There could be some kind of default behavior that automatically creates this manager based on GetNamespacesInScope, no?

Is there any sense in forcing a user to supply them?
Tuesday, February 19, 2008 6:00:43 PM UTC
What a nice post!
Yeab
Friday, February 29, 2008 8:52:47 AM UTC
Great! This post is also very useful for me...

Same question as Pierre-Emmanuel: is there any sense in forcing a user to supply an XmlNamespaceManager?
Benjamin

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.