I've noticed a lot of folks who still do COM development using MSXML (2, 3, or 4) and MSXML2.DOMDocument or MSXML2.FreeThreadedDOMDocument in wrongheaded ways. I wanted to make folks aware of a few tips and thoughts around these two components from experience and elsewhere around the net.
MSXML exposes DOMDocument and FreeThreadedDOMDocument. They are DIFFERENT, and using one vs. the other (depending GREATLY on how you use them) can make a 7x to 10x difference in performance. XML is very powerful, but remember that even though it's easy to use [Xml.Load("somexml.xml")], it's also easy to GREATLY slow your code down in VERY few lines of code.
The "free-threaded" DOM document exposes the same interface as the "rental" threaded document. This object can be safely shared across any thread in the same process. Free-threaded documents are generally slower than rental documents because of the extra thread-safety work they do. You use them when you want to share a document among multiple threads at the same time, avoiding the need for each of those threads to load its own copy.
If you do need to share objects between threads, you have two choices:
1. Use "FreeThreadedDOMDocument", which exposes all the same interfaces as DOMDocument, but is multi-thread safe (with a corresponding performance hit due to internal locking and synchronization). It can be safely stored in ASP Application state on IIS.
2. For C++ people: change your threads to use single-threaded apartments (COINIT_APARTMENTTHREADED) and then marshal interface pointers between your threads (see CoMarshalInterThreadInterfaceInStream).
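For script code, the first choice mostly comes down to which ProgID you create. Here is a minimal sketch of that decision. The helper names chooseDomProgId and createDom are made up for illustration, and createObject is an assumed wrapper you would supply yourself (under ASP or Windows Script Host it could simply be function (id) { return new ActiveXObject(id); }).

```javascript
// Hypothetical helper: pick the MSXML ProgID based on whether the
// document will be shared between threads.
function chooseDomProgId(sharedAcrossThreads) {
  return sharedAcrossThreads
    ? "MSXML2.FreeThreadedDOMDocument" // thread safe, slower
    : "MSXML2.DOMDocument";            // "rental" model, faster
}

// createObject is an assumption: a caller-supplied function that turns a
// ProgID into a COM object, e.g. a thin wrapper around ActiveXObject.
function createDom(sharedAcrossThreads, createObject) {
  return createObject(chooseDomProgId(sharedAcrossThreads));
}
```

The idea is to default to the rental document and only pay for the free-threaded one when the document really is going to be shared, for example when it is cached for many requests.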
Note: The fastest way to load an XML document (assuming you're loading it into a DOM) is to use the default "rental" threading model (which means the DOM document can be used by only one thread at a time) with validateOnParse, resolveExternals, and preserveWhiteSpace all disabled, like this in JavaScript (for example; note it could happily be in VBScript):
var doc = new ActiveXObject("MSXML2.DOMDocument");
doc.validateOnParse = false;
doc.resolveExternals = false;
doc.preserveWhiteSpace = false;
doc.load("somexml.xml");
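The snippet above doesn't check whether the load actually worked. Here is a slightly more defensive sketch of the same settings wrapped in a function. The name loadXmlFast and the createObject parameter are assumptions for illustration (createObject would typically wrap ActiveXObject); async, load's boolean return value, and parseError.reason are standard MSXML members. Setting async to false is my addition, so that load() blocks and its result can be checked right away.

```javascript
// Hypothetical wrapper around the fast-load settings.
// createObject is an assumed caller-supplied factory, for example:
//   function (id) { return new ActiveXObject(id); }
function loadXmlFast(createObject, path) {
  var doc = createObject("MSXML2.DOMDocument"); // default "rental" model
  doc.async = false;              // wait for the load to finish
  doc.validateOnParse = false;    // skip DTD/schema validation
  doc.resolveExternals = false;   // don't fetch external DTDs or entities
  doc.preserveWhiteSpace = false; // drop insignificant white space
  if (!doc.load(path)) {
    // load() returns false on failure; parseError says why.
    throw new Error("Could not load " + path + ": " + doc.parseError.reason);
  }
  return doc;
}
```

Under Windows Script Host this might be called as loadXmlFast(function (id) { return new ActiveXObject(id); }, "somexml.xml").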
If you have an element-heavy XML document that contains a lot of white space between elements and is stored in Unicode, it can actually be smaller in memory than on disk. Files that have a more balanced ratio of elements to text content end up at about 1.25 to 1.5 times the UCS-2 disk file size when in memory. Files that are very data-dense, such as an attribute-heavy, XML-persisted ADO recordset, can end up more than twice the disk-file size when loaded into memory.
NOTE: These tips are for the COM MSXML components and do NOT represent best practices in .NET. .NET has much richer and highly nuanced XML support.
Also note that loading XML into a DOM is fairly slow anywhere, since you're loading XML into a pre-parsed and indexed tree. An example of an inefficient operation would be to load a 3000k XML file into a DOM, then perform a SelectNodes("//somenode") in order to retrieve a single value.
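If you're stuck with the DOM and only need one value, a gentler query is at least a start. This is a rough sketch, assuming you know where the node lives: readSingleValue is a made-up helper name and the path in the usage note is hypothetical, while selectSingleNode and the text property are standard MSXML members. It doesn't avoid the cost of loading the document in the first place.

```javascript
// Hypothetical helper: fetch one value with selectSingleNode instead of
// building a node list with selectNodes("//somenode").
function readSingleValue(doc, xpath) {
  var node = doc.selectSingleNode(xpath); // null when nothing matches
  return node ? node.text : null;
}
```

A call might look like readSingleValue(doc, "/root/settings/somenode"). A rooted path like that tells the parser where to look, whereas "//somenode" asks it to search every descendant of the document.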