Loading XmlSchema files out of Assembly Resources
I've been validating documents against an XSD lately. Validation is pretty straightforward: you take any XmlTextReader and wrap it in an XmlValidatingReader. The ValidationEventHandler will call you back if there's any trouble. You can poke around in the document while the validation happens if you like, but when I'm just validating I do a while(reader.Read()), as you'll see.
I have a PILE of .XSD files - 64 of them - that represent a single specification. I load the main schema, the one whose imports reach all the others, and that pulls in the whole spec:
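// requires: using System.Xml; using System.Xml.Schema;
XmlSchemaCollection schemas = new XmlSchemaCollection();
XmlReader reader = new XmlTextReader("TheMainSchema.xsd");
XmlSchema schema = XmlSchema.Read(reader, null);
schemas.Add(schema); // the collection has to hold the schema before it can validate with it
XmlReader readerDoc = new XmlTextReader("TheFileYouWantToValidate.xml");
XmlValidatingReader newReader = new XmlValidatingReader(readerDoc);
newReader.Schemas.Add(schemas); // hand the schemas to the validating reader
newReader.ValidationEventHandler += new ValidationEventHandler(OnValidate);
// Walk the whole document; OnValidate is called back for every problem found
while ( newReader.Read() );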
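The snippet above leans on an OnValidate handler that isn't shown. Here's a minimal sketch of what one could look like, wrapped in a helper that counts errors. The ValidationRunner class and its Validate method are names I made up for illustration, and it sticks to the same XmlValidatingReader API used above.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper: runs a document through XmlValidatingReader
// and reports whether any validation errors were raised.
public class ValidationRunner
{
    private int errorCount = 0;

    // Called back by the validating reader for each warning or error.
    private void OnValidate(object sender, ValidationEventArgs e)
    {
        if (e.Severity == XmlSeverityType.Error)
        {
            errorCount++;
        }
        Console.WriteLine("{0}: {1}", e.Severity, e.Message);
    }

    // Returns true when the document produced no validation errors.
    // XmlValidatingReader only accepts an XmlTextReader, hence the parameter type.
    public bool Validate(XmlTextReader document, XmlSchemaCollection schemas)
    {
        errorCount = 0;
        XmlValidatingReader reader = new XmlValidatingReader(document);
        reader.ValidationType = ValidationType.Schema;
        reader.Schemas.Add(schemas);
        reader.ValidationEventHandler += new ValidationEventHandler(OnValidate);
        try
        {
            // Nothing to do per node; reading is what triggers validation.
            while (reader.Read()) { }
        }
        finally
        {
            reader.Close();
        }
        return errorCount == 0;
    }
}
```

Because a handler is attached, the reader keeps going after the first error instead of throwing, so one pass reports every problem in the document.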
I wanted an assembly that was self-contained and would hold all 64 of these XSD files internally as resources, and I didn't want to put them in a temp directory.
I added all the schemas to the project, selected them, right-clicked "Properties" and set the Build Action on all of them to Embedded Resource. When you request an embedded resource you need to ask for it by the original file name prefixed with the project's default namespace (plus any folder names). Use Reflector to see what the fully qualified resource name ended up being if you have trouble.
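If you don't have Reflector handy, the assembly can also tell you its own resource names. A small sketch (the ResourceNames class is a made-up name; GetManifestResourceNames is the real reflection call):

```csharp
using System;
using System.Reflection;

// Hypothetical helper for discovering fully qualified resource names.
public class ResourceNames
{
    // Returns the name of every resource embedded in the given assembly.
    public static string[] List(Assembly assembly)
    {
        return assembly.GetManifestResourceNames();
    }

    // Writes each name to the console, one per line.
    public static void Dump(Assembly assembly)
    {
        foreach (string name in List(assembly))
        {
            Console.WriteLine(name);
        }
    }
}
```

Calling ResourceNames.Dump(Assembly.GetExecutingAssembly()) from inside the project shows exactly which string to pass when asking for a schema.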
It's easy to pull the main schema out of its resource and pass the Stream into XmlSchema.Read. It's slightly less obvious how to get that schema to resolve its imports.
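The easy half looks roughly like this. It's a sketch, not the exact code from my project: EmbeddedSchemaLoader is a made-up name, and the resource name you pass in depends on your own project's namespace.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml.Schema;

// Hypothetical helper: reads one schema out of an embedded resource.
public class EmbeddedSchemaLoader
{
    // resourceName is the fully qualified name, e.g. namespace + "." + file name.
    public static XmlSchema LoadSchema(Assembly assembly, string resourceName)
    {
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            // GetManifestResourceStream returns null rather than throwing
            // when the name doesn't match anything.
            if (stream == null)
            {
                throw new ArgumentException("No embedded resource named " + resourceName);
            }
            // Read parses only this one file; imports are not followed yet.
            return XmlSchema.Read(stream, null);
        }
    }
}
```

The schema that comes back is parsed but not compiled, so none of its imports have been touched at this point.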
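Schemas may reference other schemas with an import (or include) element, something like <xs:import namespace="..." schemaLocation="SomeOtherSchema.xsd" /> (the file name here is just a placeholder).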
In this, and most, cases schemaLocation refers to a relative file. However it could refer to a URL, or some custom scheme. Personally I find the "relative filename" style to be the most flexible. I don't like to bake too much knowledge about the outside world into my schemas. On this project, it's a (light) requirement that we use the specification schemas unchanged.
The key is the instance call to XmlSchema.Compile. By default the XmlSchema class resolves imports with an XmlUrlResolver, which goes looking on the file system and fails to find the other 63 schemas. So, I pass in a custom resolver that will find the correct schema given the URI (built from the value in the schemaLocation attribute) and return it, in this example, as a stream.
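Here's a sketch of that compile step, assuming you already have the main schema as a stream and some XmlResolver that knows where the other schemas live. SchemaCompiler is a made-up name; the Compile and Add overloads that take an XmlResolver are the real (older) .NET APIs.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper: compiles a main schema using a caller-supplied resolver.
public class SchemaCompiler
{
    // 'resolver' is whatever XmlResolver can locate the imported schemas.
    // 'onProblem' may be null, in which case schema errors throw instead.
    public static XmlSchemaCollection Compile(Stream mainSchema, XmlResolver resolver, ValidationEventHandler onProblem)
    {
        XmlSchema schema = XmlSchema.Read(mainSchema, onProblem);

        // The instance call: imports and includes are looked up through
        // 'resolver' rather than through the default resolver.
        schema.Compile(onProblem, resolver);

        // Pass the same resolver along so the collection can resolve
        // references the same way when it takes the schema in.
        XmlSchemaCollection schemas = new XmlSchemaCollection();
        schemas.Add(schema, resolver);
        return schemas;
    }
}
```

Taking the resolver as a parameter keeps this piece ignorant of where the schemas actually live, which makes it easy to swap resources for files while testing.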
In the resolver's GetEntity we just grab the relative filename out of the file:/// URI that we're passed each time a schemaLocation needs to be resolved. Works like a charm. I wrap the whole thing in a factory method and cache the compiled XmlSchemaCollection so we don't load and compile this more than once.
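A resolver and factory along those lines might look like the following. Treat it as a sketch under assumptions: ResourceXmlResolver, SchemaFactory and the prefix parameter (your project's namespace plus a trailing dot) are names I'm using for illustration, and the cache is a single static that only honors the arguments of the first call.

```csharp
using System;
using System.IO;
using System.Net;
using System.Reflection;
using System.Xml;
using System.Xml.Schema;

// Hypothetical resolver: serves schemaLocation lookups from embedded resources.
public class ResourceXmlResolver : XmlResolver
{
    private readonly Assembly assembly;
    private readonly string prefix; // assumed resource prefix, e.g. "MyProject.Schemas."

    public ResourceXmlResolver(Assembly assembly, string prefix)
    {
        this.assembly = assembly;
        this.prefix = prefix;
    }

    // Embedded resources need no credentials, so the setter does nothing.
    public override ICredentials Credentials
    {
        set { }
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // The relative schemaLocation arrives as an absolute URI;
        // only its last segment (the file name) matters here.
        string fileName = Path.GetFileName(absoluteUri.LocalPath);
        Stream stream = assembly.GetManifestResourceStream(prefix + fileName);
        if (stream == null)
        {
            throw new FileNotFoundException("No embedded schema named " + prefix + fileName);
        }
        return stream;
    }
}

// Hypothetical factory: builds the schema collection once, then reuses it.
public class SchemaFactory
{
    private static readonly object sync = new object();
    private static XmlSchemaCollection cached;

    public static XmlSchemaCollection GetSchemas(Assembly assembly, string prefix, string mainSchemaFile)
    {
        lock (sync)
        {
            if (cached == null)
            {
                XmlResolver resolver = new ResourceXmlResolver(assembly, prefix);
                XmlSchemaCollection schemas = new XmlSchemaCollection();
                using (Stream main = assembly.GetManifestResourceStream(prefix + mainSchemaFile))
                {
                    if (main == null)
                    {
                        throw new FileNotFoundException("No embedded schema named " + prefix + mainSchemaFile);
                    }
                    // Imports are resolved through the resource resolver here.
                    schemas.Add(XmlSchema.Read(main, null), resolver);
                }
                cached = schemas;
            }
            return cached;
        }
    }
}
```

Note that taking only the file name throws away any directory part of schemaLocation, so this only holds up when all the schemas sit side by side with unique names.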
There are a few ways one might want to extend this. I've seen folks build custom URI schemes like assembly:/// and embed those in the schemas, but eh, who has the time. This is simpler, IMHO, works for relative file locations, and didn't take 10 minutes to write.
Quote of the day: I'm not a control freak, I'm a control enthusiast.