Scott Hanselman

SQL 2005 Create XML Schema Collection weirdness

January 10, '06 Comments [3] Posted in XML
Sponsored By

SQL 2005 XML CollectionsThis is weird, and while I'm an XML wonk, I haven't been a SQL wonk in a while. I created a new column in a database with the new SQL 2005 XML type. Then I went to the W3C XHTML page and copy/pasted the XSD for XHTML Strict. Since the syntax is

CREATE XML SCHEMA COLLECTION NorthwindCollection AS 'your xml schema here'

I needed to escape the quotes, so I converted all ' to '' via a selection-based replace. So then I had:

Use Northwind;
CREATE XML SCHEMA COLLECTION NorthwindCollection AS
'<?xml version="1.0" encoding="UTF-8"?>
<xs:schema version="1.0" xml:lang="en"
    xmlns:xs="
http://www.w3.org/2001/XMLSchema"
    targetNamespace="
http://www.w3.org/1999/xhtml"
    xmlns="
http://www.w3.org/1999/xhtml"
    xmlns:xml="
http://www.w3.org/XML/1998/namespace"
    elementFormDefault="qualified">
    ...etc...lots more schema here...
'

But I got this error:

Msg 2206, Level 16, State 1, Line 3
Namespace prefix 'xml' can only be associated with the URI 'http://www.w3.org/XML/1998/namespace' and this URI cannot be used with other prefixes.

Which is weird, because the prefix IS obviously associated with the correct namespace and used correctly in the xml:lang attribute on the root node. Here's where it gets weird. If you REMOVE the xmlns:xml line from the ordinarily correct schema, it works fine and is added to SQL 2005's types section as a value XML Schema Collection. (see image)

It appears that this namespace is hard-coded inside the SQL 2005 somewhere along with the xml: prefix.

UPDATE: The reserved prefix is documented, but a smidge buried for my taste. Thanks to Harry for the pointer. I'm still not clear on why the namespace itself has to be removed from the document.

Use Northwind;
CREATE XML SCHEMA COLLECTION NorthwindCollection AS
'<?xml version="1.0" encoding="UTF-8"?>
<xs:schema version="1.0" xml:lang="en"
    xmlns:xs="
http://www.w3.org/2001/XMLSchema"
    targetNamespace="
http://www.w3.org/1999/xhtml"
    xmlns="
http://www.w3.org/1999/xhtml"
    xmlns:xml="
http://www.w3.org/XML/1998/namespace"
    elementFormDefault="qualified">
    ...etc...lots more schema here...
'

I have no conclusion here (yet) or idea why this works this way, but one day someone may need to get the XHTML schema into SQL Server to store notes or something for a content management system and this post may help them. I'll update it as I get more information or as I become less ignorant.

Now playing: Goapele - Intro

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by SherWeb
Tuesday, 10 January 2006 07:43:50 UTC
Before I worked on Xbox backwards compatibility, I worked on SqlXml, XQuery, and Microsoft.Xml.Query.

This is just one of those stupid implementation details, a quirk of Microsoft's XML implementation. Feel free to bug Michael Rys about it at http://sqljunkies.com/weblog/mrys/. Not only is the XML datatype his baby, but also he's currently soliciting public feedback on it.

Personally, I think the official XML Schema for XHTML should be accepted without modification. Otherwise, how can you write automated web services using a SQL backend? Do you have to manually strip out every namespace declaration involving a pre-defined prefix? Yuck!

[I hope this doesn't post multiple times. I'm having trouble with the comment tool.]
Tuesday, 10 January 2006 19:37:47 UTC
Interesting problem...

Does anyone know if SQLServer is parsing XML with MSXML 6.0, or their own internal implementation? I think the "root" of the problem is not SQLServer, but maybe MSXML (bad XML pun intended). .NET v1.1 and MSXML throw exceptions given this code as well, SQLServer might not have any control over it.

While I would agree the behavior could be improved upon, I find nothing wrong with throwing an exception when trying to declare an "xml" namespace prefix. That xml prefix is already present implicitly, so you can think of it as you are declaring it twice. The spec is clear that it should be there implicitly, but not clear on whether you should be allowed to add it explicitly. Based on the current level of understanding of the spec by 95% of the people out there, the parser would be easier to use if it just throws an error for all "xml" prefixes declared -- it is far simpler to tell people 'you don't (and can't) declare an "xmlns:xml" namespace', rather than telling them:

"'The prefix xml is by definition bound to the namespace name http://www.w3.org/XML/1998/namespace' (http://www.w3.org/TR/REC-xml-names/). So you can't declare anything other than 'http://www.w3.org/XML/1998/namespace' and it's already declared implicitly so you don't have to declare it yourself, you get it for free."

For those (using .NET 1.1) that read this post down the road, if you have come across this situation and you are getting an XML doc from a system that does include the xml prefix, just create your own XmlReader, rather than letting .NET create one for you. You can either parse and validate yourself, or just create an XmlValidatingReader and pass that to an XmlDocument's Load() method. Use the ValidationEventHandler to check for and suppress that particular error. I'm of the mind that all this interop using XML is easy--you just have to know the quirks of your input, fix or ignore them, and go on from there. (It's the "unknown quirks" that scare me :) )

Finally, I find it interesting that in .NET 2.0, the behavior goes away (the parser ignores this error and lets you declare the xml namespace--if you are using the correct "http://www.w3.org/XML/1998/namespace"). However, if you use MSXML 6.0, you get the reserved namespace error. Conclusion? I'm guessing SQLServer 2005 relies on MSXML 6.0 for it's parsing. If you have problems with XML you can't control, use .NET 2.0 (within SQLServer). Create a function that you pass your external XML to, load it up, strip that xml namespace, and return the clean XML. Use the result with SQLServer as you would have normally. Slower sure, but at least it can be worked around. Maybe future versions of MSXML or SQLServer will follow .NET 2.0's lead and ignore this, if you use it correctly.
Tim K
Wednesday, 11 January 2006 01:18:27 UTC
Since the XML and .NET XML teams are part of SQL Server, it doesn't really matter.

I agree it's not a big deal. It's just one of those "death by a thousand stings" issues, where some number of the schemas you try to use has an inconsistencies.

Because this is mostly only relevant when creating XML datatype columns (i.e., administration) instead of inserting data, it's not exactly a hot issue.
Michael Brundage
Comments are closed.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.