Scott Hanselman

Hold me...

June 24, '03 Comments [0] Posted in Web Services | XML

When I see things like this on the 'Net, I am trapped in that sad space between tears and laughter. 

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <soap:Body>
        <unsignedByte>C4</unsignedByte>
    </soap:Body>
</soap:Envelope>

Update: Someone who reads my blog messaged me, thinking that the hex contained within (0xC4) represented C4 plastic explosive. Whoa...NOT. I was just wallowing in the overhead of the SOAP Envelope, the unused namespaces, the UTF-8 encoding, not to mention HTTP, etc., all being used to transmit a single byte. That's all. No hidden message. :)
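
To put a rough number on that overhead, here is a minimal sketch that counts the bytes in the envelope above against the one byte of payload. It assumes Node.js (for its built-in Buffer.byteLength); the variable names are mine, and it ignores HTTP headers entirely, so the real cost on the wire is higher.

```javascript
// Sketch only: assumes Node.js, whose built-in Buffer.byteLength counts UTF-8 bytes.
var envelope = [
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"',
  '    xmlns:xsd="http://www.w3.org/2001/XMLSchema"',
  '    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">',
  '    <soap:Body>',
  '        <unsignedByte>C4</unsignedByte>',
  '    </soap:Body>',
  '</soap:Envelope>'
].join("\n");

var payloadBytes = 1; // the single byte (0xC4) that actually matters
var envelopeBytes = Buffer.byteLength(envelope, "utf8");
var overheadBytes = envelopeBytes - payloadBytes;

console.log("Envelope: " + envelopeBytes + " bytes for " + payloadBytes + " byte of payload");
console.log("Overhead: " + overheadBytes + " bytes, before any HTTP headers");
```

Every byte counted beyond the first is packaging.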

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


An Oldie but a Goodie: MSXML and FreeThreadedDomDocument vs. DomDocument

June 24, '03 Comments [1] Posted in Web Services | Javascript | XML

I've noticed a lot of folks who still do COM development using MSXML (2, 3, or 4) and MSXML2.DOMDocument or MSXML2.FreeThreadedDOMDocument in wrongheaded ways. I wanted to make folks aware of a few tips and thoughts around these two components from experience and elsewhere around the net.

MSXML exposes DOMDocument and FreeThreadedDOMDocument. They are DIFFERENT and using one vs. the other (depending GREATLY on how) can make a 7x to 10x difference in performance. XML is very powerful, but remember that even though it's easy to use [Xml.Load("somexml.xml")] it's also easy to GREATLY slow your code down in VERY few lines of code.

DOMDocument

These objects use what's known as the "Rental" model of threading. That means that they can be accessed from any thread, but only one thread at a time. As long as you're not trying to share DOM objects between threads, you're fine with these objects.

Best Practices for DomDocument

  • When using Single Threaded EXEs (VB6, etc.) and manipulating Xml within a single transaction, you're working on a single thread -> Use DomDocument
  • When using Classic ASP and manipulating Xml within a single page request, you're working on a single thread -> Use DomDocument

FreeThreadedDOMDocument

The "free-threaded" DOM document exposes the same interface as the "rental" threaded document. This object can be safely shared across any thread in the same process. Free-threaded documents are generally slower than rental documents because of the extra thread safety work they do. You use them when you want to share a document among multiple threads at the same time, avoiding the need for each of those threads to load its own copy.

If you do need to share objects between threads, you have two choices:

1. Use "FreeThreadedDOMDocument", which exposes all the same interfaces as DOMDocument, but is multi-thread safe (with a corresponding performance hit due to internal locking and synchronization). It can be safely stored in ASP Application state on IIS.

For C++ people:

2. Change your threads to use single-threaded apartments (COINIT_APARTMENTTHREADED) and then marshal interface pointers between your threads (see CoMarshalInterThreadInterfaceInStream).

Best Practices for FreeThreadedDomDocument

  • When using Classic ASP and storing Xml in the Application Object or Session Object (which is a questionable practice, can affect performance, and is not recommended for the inexperienced) -> Use FreeThreadedDomDocument
  • There is not any good reason that I can think of to use FreeThreadedDomDocument in a Single Threaded Exe (unless you're marshalling it off somewhere)

Note: The fastest way to load an XML document (assuming you're loading it into a DOM) is to use the default "rental" threading model (which means the DOM document can be used by only one thread at a time) with validateOnParse, resolveExternals, and preserveWhiteSpace all disabled, like this in Javascript (for example; note it could happily be in VBScript):

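// Rental-threaded document: the default model, usable by one thread at a time.
var doc = new ActiveXObject("MSXML2.DOMDocument");
doc.validateOnParse = false;     // skip DTD/schema validation while parsing
doc.resolveExternals = false;    // don't resolve external DTDs or entities
doc.preserveWhiteSpace = false;  // don't keep insignificant white space
doc.load("somexml.xml");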

If you have an element-heavy XML document that contains a lot of white space between elements and is stored in Unicode, it can actually be smaller in memory than on disk. Files that have a more balanced ratio of elements to text content end up at about 1.25 to 1.5 times the UCS-2 disk file size when in memory. Files that are very data-dense, such as an attribute-heavy, XML-persisted ADO recordset, can end up more than twice the disk-file size when loaded into memory.
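
As a back-of-the-envelope sketch of the "balanced" case, the arithmetic looks something like this. The helper name estimateDomMemory is hypothetical, it applies only the 1.25 to 1.5 rule of thumb from the paragraph above, and it assumes every character takes two bytes in UCS-2. Treat the result as a rough range, not a measurement.

```javascript
// Hypothetical helper: rough estimate only, using the 1.25x to 1.5x rule of
// thumb for documents with a balanced ratio of elements to text content.
function estimateDomMemory(charCount) {
  var ucs2Bytes = charCount * 2; // UCS-2 stores each character in 2 bytes
  return {
    ucs2Bytes: ucs2Bytes,
    low: ucs2Bytes * 1.25,
    high: ucs2Bytes * 1.5
  };
}

// Example: a document of one million characters.
var estimate = estimateDomMemory(1000000);
console.log("UCS-2 size on disk: " + estimate.ucs2Bytes + " bytes");
console.log("Estimated in memory: " + estimate.low + " to " + estimate.high + " bytes");
```

So a one-million-character document is about 2,000,000 bytes as UCS-2 on disk and would land somewhere around 2,500,000 to 3,000,000 bytes in memory under that rule of thumb.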

NOTE: These tips are for the COM MSXML Components and do NOT represent best practices in .NET. .NET has much richer and highly nuanced XML support. Also note that loading XML into a DOM is fairly slow anywhere since you're loading XML into a pre-parsed and indexed tree. An example of an inefficient operation would be to load a 3000k XML file into a DOM, then perform a SelectNodes("//somenode") in order to retrieve a single value.
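
To see why a "//somenode" query is a poor way to fetch one value, here is a toy model in plain Javascript. It is NOT MSXML and says nothing about how MSXML implements XPath internally; the tree shape, the node names (catalog, settings, item) and the helpers findAnywhere and findByPath are all made up for illustration. It just counts how many nodes a search-everywhere walk touches compared to following an explicit path and stopping at the first match.

```javascript
// Toy tree model (not MSXML): each node has a name, children, and optional text.
function node(name, children, text) {
  return { name: name, children: children || [], text: text || "" };
}

// Descendant search in the spirit of "//name": walks the whole tree.
function findAnywhere(root, name, stats) {
  var matches = [];
  var stack = [root];
  while (stack.length > 0) {
    var current = stack.pop();
    stats.visited++;
    if (current.name === name) matches.push(current);
    // Push children in reverse so they are visited in document order.
    for (var i = current.children.length - 1; i >= 0; i--) {
      stack.push(current.children[i]);
    }
  }
  return matches;
}

// Explicit path in the spirit of "/catalog/settings/somenode": at each step,
// scan only the current node's children and stop at the first match.
function findByPath(root, path, stats) {
  var current = root;
  stats.visited++;
  if (current.name !== path[0]) return null;
  for (var step = 1; step < path.length; step++) {
    var next = null;
    for (var i = 0; i < current.children.length; i++) {
      stats.visited++;
      if (current.children[i].name === path[step]) {
        next = current.children[i];
        break;
      }
    }
    if (next === null) return null;
    current = next;
  }
  return current;
}

// Build a made-up document: one settings node plus 1000 items of 3 fields each.
var children = [node("settings", [node("somenode", [], "42")])];
for (var n = 0; n < 1000; n++) {
  children.push(node("item", [node("id"), node("name"), node("price")]));
}
var root = node("catalog", children);

var anywhereStats = { visited: 0 };
var anywhere = findAnywhere(root, "somenode", anywhereStats);

var pathStats = { visited: 0 };
var direct = findByPath(root, ["catalog", "settings", "somenode"], pathStats);

console.log("//somenode visited " + anywhereStats.visited + " nodes");
console.log("explicit path visited " + pathStats.visited + " nodes");
```

In this toy tree both approaches find the same value, but the search-everywhere walk touches all 4,003 nodes while the explicit path touches 3. Real numbers will depend on the document and the parser, but the shape of the problem is the same: if you know where the value lives, say so in the query.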

[Listening to: The Verve Pipe - The Freshman (Special Version)]


Big Picture of XML or Diagram of a Big Ball of Goo?

June 23, '03 Comments [0] Posted in Web Services | XML

Which Are The Core XML Technologies That Matter? I stumbled on The Big Picture of the XML Family of Specifications, which lists a large number of technologies that are related to XML in one way, shape, or form. It seems some people take a look at the diagram and it gives them the impression that XML is too complex; after all, just look at all those specs. An interesting fallout of this has been that some fellow B0rg have posted their opinions on what they consider the core of XML. [Dare Obasanjo aka Carnage4Life]

<brainstorming>
You know, this Big Picture is mostly fabulous. It's a great reference (perhaps a potential T-Shirt?), but it really DOES look a lot more complex than XML feels. On the downside, while this giant document may be a fairly complete list of specs, when you look at it from a Tuftetian point of view, it doesn't make any useful qualitative judgements that we can infer via position. If you apply Don's "kernel of XML" thoughts to a diagram like this, it would need to take into consideration the size and position of the boxes in perhaps a family tree, or perhaps a modified Venn Diagram. It'd be nice if a new diagram took into consideration time and dependency, with sizes derived from relative importance (based on dependence).
</brainstorming>

Anyway, whether XML is complex, or not complex, or more complex than COM ever was, I care not.  XML just naturally feels right to me. (Remember to linger across the ee's in feels with the appropriate emphasis :)  Complexity doesn't always imply level of difficulty or demand that one impugn.


Brummesque?

June 20, '03 Comments [0] Posted in Web Services

More bloggers...fellow RD Jon Box is up and running, as well as Eric Gunnerson.  Welcome!  

Eric's first post was about his Robot Vacuum...I look forward to more compelling Brummesque content from this Program Manager on the C# team. ;) <g>


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.