Scott Hanselman

How to load HTML into mshtml.HTMLDocumentClass with UCOMIPersistFile and my ignorance

June 25, '04 Comments [6] Posted in PowerShell

What a weird one.  I'm looking at the source for NDoc.Document.HtmlHelp2.Compiler.HtmlHelpFile.  It uses the Microsoft.mshtml interop Assembly to load an HTML file into the HTMLDocumentClass for easy parsing.

Its code looks like this (DOESN'T WORK):

private HTMLDocumentClass GetHtmlDocument( FileInfo f )
{
  HTMLDocumentClass doc = new HTMLDocumentClass();
  UCOMIPersistFile persistFile = (UCOMIPersistFile)doc;
  persistFile.Load( f.FullName, 0 );
  int start = Environment.TickCount;
  // Spins at 100% CPU; no messages are pumped while we wait
  while( doc.body == null )
  {
    if ( Environment.TickCount - start > 10000 )
    {
      throw new Exception( string.Format( "The document {0} timed out while loading", f.Name ) );
    }
  }
  return doc;
}

I went searching as it was taking up 100% CPU for an hour and never completed.  Now I know why! :)

What's weird is this: the only way I could get it to work (as IPersistFile is loading on another Thread) was with this change (NOW IT WORKS):

private HTMLDocumentClass GetHtmlDocument( FileInfo f )
{
  HTMLDocumentClass doc = new HTMLDocumentClass();
  UCOMIPersistFile persistFile = (UCOMIPersistFile)doc;
  persistFile.Load( f.FullName, 0 );
  int start = Environment.TickCount;
  while( doc.readyState != "complete" )
  {
    // Run the message pump so the load can make progress
    System.Windows.Forms.Application.DoEvents();
    if ( Environment.TickCount - start > 10000 )
    {
      throw new Exception( string.Format( "The document {0} timed out while loading", f.Name ) );
    }
  }
  return doc;
}

When I Reflector into DoEvents() I can see that it's doing more than a Sleep(0) (yield), it's actually running the message pump.  Am I missing something?  Apparently IPersistFile needs the message pump?  Well, it works, but it's gross.
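
One way to make the workaround a little less gross is to pull the pump-and-poll loop into a helper. The sketch below is an illustration under assumptions, not NDoc's code: MessagePumpWait and PumpUntil are hypothetical names, it assumes you call it on the same thread that created the document, and the short sleep is only there to avoid pinning the CPU. It needs C# 3 or later for the lambda in the usage note.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;

static class MessagePumpWait
{
    // Hypothetical helper (not part of NDoc or mshtml).
    // Pumps the calling thread's message queue until isDone() returns true.
    // Returns false if timeoutMs elapses first.
    public static bool PumpUntil( Func<bool> isDone, int timeoutMs )
    {
        Stopwatch watch = Stopwatch.StartNew();
        while( !isDone() )
        {
            if ( watch.ElapsedMilliseconds > timeoutMs )
            {
                return false;
            }
            Application.DoEvents(); // process pending window messages
            Thread.Sleep( 10 );     // yield briefly instead of spinning at 100% CPU
        }
        return true;
    }
}
```

With that in place, the loop in GetHtmlDocument could shrink to something like: if ( !MessagePumpWait.PumpUntil( () => doc.readyState == "complete", 10000 ) ) throw new Exception( string.Format( "The document {0} timed out while loading", f.Name ) );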

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


Effective XML Document with C#, NDoc, Lutz's Documentor and the Microsoft HTML Help Workshop

June 25, '04 Comments [10] Posted in ASP.NET | Web Services | XML | Tools

A lot has been said on this topic, but here's some stuff you might not know.

We use NDoc to document our API.  NDoc consumes the XML Documentation files created as a build artifact by the C# compiler. 

Most links online that talk about XML Documentation in C# stop at <summary> and the standard <param> stuff.  Method documentation is interesting, but the real meat happens at the top of your Class declarations.  That's where most of the prose is in MSDN documentation.  Take a look at the Remarks section of the Socket Class in the MSDN Help, for example.

To achieve such a rich structure, organize your XML help thusly on the top of each class declaration:

  • <summary>
  • <remarks>
    • <note>(s)
  • <example>(s)
    • <code>
  • <seealso>(s)
The XML Comment snippet below, along with NDoc, produced lovely MSDN-style documentation.

The <summary> tag explains the “point” of the class. A sentence or two is fine here.

/// <summary>
/// Encapsulates Thingie and a Connection to Thingie in a Service wrapper.
/// Other Services will be built with this building block.
/// </summary>

The <remarks> tag is where the real prose goes. Use this tag to describe the general use of the class, as well as any notes, gotchas, or significant design or architectural issues.

Note the use of <para> to separate paragraphs. Use <see> to refer to namespaces, classes or methods. If you don’t include the fully qualified namespace, the documenter will assume the current namespace.

/// <remarks>
/// <para>The ThingieService class contains a <see cref="IConnector"/>
/// that is pulled from the named element in the config/ThingieClient.config file. The config file
/// is loaded by <see cref="I"/>.</para>
/// <para>Note that the constructor is private, as your application gets a ThingieService by calling the static <see cref="GetThingieService"/> method.
/// From there, the ThingieService hides the <see cref="IConnector"/> and
/// is the primary interface to Thingie. The ThingieService shouldn't be used directly from an ASP.NET
/// page. Instead, it should be used from either a generated or hand-written proxy.</para>
/// <note type="note">ASP.NET developers should use <see cref="Corillian.Thingie.Security.SiteSecureThing"/> to properly register a <see cref="ThingiePrincipal"/> with the system to effectively use the ThingieService.</note>
/// <para>There are two ways to call the <see cref="Execute"/> method.</para>
/// <list type="bullet">
/// <item><term>Pass in an object that implements <see cref="IRequest"/>.
/// The Thingie SessionID and UserID will be retrieved from the <see cref="ThingiePrincipal"/> on the current Thread.</term></item>
/// <item><term>Pass in an object that implements <see cref="IRequest"/> along with the Thingie SessionID and UserID as additional parameters.
/// Use this method if your Thread doesn't already have a ThingiePrincipal. One will be made for you temporarily.</term></item>
/// </list>
/// </remarks>
/// <example>
/// <code>
/// public class BankingExample
/// {
///     protected ThingieService thingie = null;
///
///     public BankingExample()
///     {
///         thingie = ThingieService.GetThingieService("BankingServiceProxy");
///     }
///
///     public virtual SignonResponse Signon(SignonRequest req, string userId, string somethingElse )
///     {
///         string sessionid = thingie.SomethingImportantToTheSession(userId);
///         string r = thingie.Execute(req, sessionid, userId);
///         SignonResponse retVal = SignonResponse.FixUpSomething(r);
///         return retVal;
///     }
/// }
/// </code>
/// </example>
/// <seealso cref="IRequest"/>
/// <seealso cref="IConnector"/>
/// <seealso cref="IConfig"/>
/// <seealso cref="ILoggingService"/>

public class ThingieService....

I also like to use Lutz Roeder's .NET Documentor to write my XML Comments in.  It has a split-pane view and a nice right-click menu that lets me see what the documentation will look like AS I TYPE.

Considering that I'd otherwise need to recompile the Application AND regenerate the MSDN-style documentation to see each change, this little tool is a big time saver.

Code, and Ninjas you can see...

June 24, '04 Comments [6] Posted in Bugs

Patrick Cauldwell and I had lunch today and were talking about how funny it is when someone stares at code for hours trying to find a bug (or a stray semicolon).  The parallel was made with the Ninjas that The Tick couldn't see.  They would hold up sticks to disguise themselves as shrubbery.  As soon as they did this, they were immediately invisible to The Tick.  As soon as they moved the sticks, well, you get the idea.

I wonder what makes folks not see the Code that's right in front of them?

BATCH FILE VOODOO: Determine if multiple (and which) versions of an MSI-installed Product are installed using UpgradeCode

June 23, '04 Comments [1] Posted in Tools | XML

We have a product that supports side-by-side installs, and I wanted to enumerate all the installed versions of that product and display each one's version/product name.  Sure, it could be done in yet-another-C# program, but why not use Batch?

Every MSI installer has an UpgradeCode GUID that can't change for the life of the product.  The ProductCode and PackageCode may change, but the UpgradeCode is what tells us if the product is already installed.  At that point the installer can decide whether to upgrade or just install alongside.  The UpgradeCodes (for all apps) are in HKCU\Software\Microsoft\Installer\UpgradeCodes\ with a SubKey for each UpgradeCode whose values name the installed Products, while the Products themselves are in HKCU\Software\Microsoft\Installer\Products\ with details for each like Version, ProductName, etc.  In this example I'm getting the ProductName, which includes the version for me, but of course you can get anything you want.

NOTE: We're using REG.EXE which is included in the PATH on 2003 and XP but is an add-on to Windows 2000.

Here's the voodoo (all one line).  Note - The delims= includes a TAB, then a space:

FOR /F "tokens=1 skip=4 delims=        " %%A IN ('REG QUERY HKCU\Software\Microsoft\Installer\UpgradeCodes\<yourUpgradeCodeGUIDHere> /s') DO FOR /F "tokens=5 delims=            " %%Z IN ('REG QUERY HKCU\Software\Microsoft\Installer\Products\%%A /v ProductName') DO ECHO %%Z

Here it is broken down:

FOR /F "tokens=1 skip=4 delims=           " %%A
  IN (
    'REG QUERY HKCU\Software\Microsoft\Installer\UpgradeCodes\<yourUpgradeCodeGUIDHere> /s'
  )
  DO
    FOR /F "tokens=5 delims=            " %%Z
      IN (
        'REG QUERY HKCU\Software\Microsoft\Installer\Products\%%A /v ProductName'
      )
      DO
        ECHO %%Z
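
For comparison, here is roughly what the yet-another-C# version of the same lookup might look like. Treat it as a sketch under assumptions: the class name InstalledVersions is made up, the UpgradeCode is passed on the command line exactly as it appears in the registry, and it reads the same HKCU keys as the batch file through Microsoft.Win32.Registry (Windows only).

```csharp
using System;
using Microsoft.Win32;

static class InstalledVersions
{
    const string UpgradeCodesPath = @"Software\Microsoft\Installer\UpgradeCodes\";
    const string ProductsPath = @"Software\Microsoft\Installer\Products\";

    static void Main( string[] args )
    {
        if ( args.Length != 1 )
        {
            Console.WriteLine( "Usage: InstalledVersions <upgradeCodeAsStoredInRegistry>" );
            return;
        }

        using ( RegistryKey upgradeKey = Registry.CurrentUser.OpenSubKey( UpgradeCodesPath + args[0] ) )
        {
            if ( upgradeKey == null )
            {
                Console.WriteLine( "No products found for that UpgradeCode." );
                return;
            }

            // Each value name under the UpgradeCode key identifies one installed product.
            foreach ( string productCode in upgradeKey.GetValueNames() )
            {
                if ( productCode.Length == 0 ) continue; // skip the key's default value

                using ( RegistryKey productKey = Registry.CurrentUser.OpenSubKey( ProductsPath + productCode ) )
                {
                    object name = ( productKey == null ) ? null : productKey.GetValue( "ProductName" );
                    if ( name != null )
                    {
                        Console.WriteLine( name );
                    }
                }
            }
        }
    }
}
```

It is a lot more typing than the one-liner, which is rather the point of doing it in Batch.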

Thanks to Rob van der Woude's great (and oft-updated) Batch File site.

UPDATE: Some interesting details on compressed GUIDs from John Walker. Thanks John!:

"Scott,

Thanks for the help with REG.exe and FOR /F. My requirement was to be
able to uninstall any installed version of a product in an automated
fashion via a batch file. Because we don't know the exact product codes
that might be installed, we start our search with the upgrade code.

Initially, I had some trouble determining the values to use for:
<yourUpgradeCodeGUIDHere>.

What I found is that MSIs record 'compressed' GUIDs in the registry for
upgrade and product codes. So if you want to search the registry for
UpgradeCodes, you will need to search for the compressed code.
If <yourUpgradeCodeGUIDHere> is:
       {abcdefgh-ijkl-mnop-qrst-uvwxyz123456}
Then the compressed version you should search for is:
       hgfedcbalkjiponmrqtsvuxwzy214365

http://www.appdeploy.com/messageboards/tm.asp?m=11996&mpage=1&#12037
explains the process for converting an upgrade code.


I also searched a different location to determine product codes at:
HKCR\Installer\UpgradeCodes\ based on information on Windows Installer
registry locations from:
http://msdn2.microsoft.com/en-US/library/aa367758.aspx because we
install per-machine.

The value(s) stored under an UpgradeCode path in the registry will be
compressed product codes. In order to run an msiexec uninstall you need
an uncompressed code in the {8-4-4-4-12} format. I ended up writing a
small console app to do the translations between compressed and
uncompressed codes. (Why does MSI compress the codes?) Here is the meat
of the translator (errorhandling and usage message code removed):

   using System;

   class GuidCompressor
   {
       /// <example>GuidCompressor.exe {abcdefgh-ijkl-mnop-qrst-uvwxyz123456} N
       /// returns: hgfedcbalkjiponmrqtsvuxwzy214365</example>
       /// <example>GuidCompressor.exe hgfedcbalkjiponmrqtsvuxwzy214365 B
       /// returns: {abcdefgh-ijkl-mnop-qrst-uvwxyz123456}</example>
       static void Main(string[] args)
       {
           Guid origGuid = new Guid(args[0]);
           //outputFormat should be N, D, B, P
           string outputFormat = args[1];

           string raw = origGuid.ToString("N");
           char[] aRaw = raw.ToCharArray();
           //compressed format reverses 11 byte sequences of the original guid
           int[] revs
               = new int[]{8, 4, 4, 2, 2, 2, 2, 2, 2, 2, 2};
           int pos = 0;
           for (int i = 0; i < revs.Length; i++)
           {
               Array.Reverse(aRaw, pos, revs[i]);
               pos += revs[i];
           }
           string n = new string(aRaw);
           Guid newGuid = new Guid(n);
           //GUID in registry are all caps.
           Console.WriteLine(newGuid.ToString(outputFormat).ToUpper());
       }
   }


Combining the GuidCompressor, UpgradeCodeGuid, and HKCR registry
location here is the batch file that I use to uninstall:
Delims are <tab><space>

SET UncompressedUpgradeCodeGUID={abcdefgh-ijkl-mnop-qrst-uvwxyz123456}
SET logfile=C:\temp\mylogfile.txt
SET binpath=C:\tools
FOR /F %%A IN ('%binpath%\GuidCompressor.exe %UncompressedUpgradeCodeGUID% N') DO (
       FOR /F "tokens=1 skip=4 delims=  " %%B IN ('REG QUERY HKCR\Installer\UpgradeCodes\%%A /s') DO (
               FOR /F %%C IN ('%binpath%\GuidCompressor.exe %%B B') DO (
                       %windir%\System32\msiexec.exe /x %%C /lvx* %logfile% /qb
               )
       )
)


This research was tested on 32bit versions of WinXP and Win2k3."

A multi-request-safe ViewState Persister

June 23, '04 Comments [7] Posted in ASP.NET | ViewState

Mark Miller has posted his code for a ViewStatePersister using the "common sense but not obvious" GUID technique that was outlined previously by Scott Mitchell and myself.

He stores a GUID in the ViewState hidden field, and sticks the bloated ViewState in a temp file on the server.  It doesn't solve the problem when running multiple web servers while using stateless balancing (meaning: NOT using sticky sessions/node affinity) but it's the most elegant and complete solution I've seen yet and should work great on a single web server. 

A few questions I have though:

  • When do the files get cleaned up and how often? Do you clean up old ones in a background thread within ASP.NET or a separate Windows Service?  Thought: I wonder if you could delete them immediately after the Load?  You wouldn't be able to RE-post data, but it'd be cleaner, no?
  • GUID generation is very expensive, and can really slow you down under load.  I wonder if it would be faster/easier to have a single long and use InterlockedIncrement or InterlockedIncrement64 to safely increase the value on each call until it overflows and you start again at 0.
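
The counter idea in the second bullet could be sketched like this. It is only an illustration, not Mark's code: ViewStateKeyGenerator and NextKey are hypothetical names. Two caveats worth keeping in mind: in .NET, Interlocked.Increment wraps from Int64.MaxValue to Int64.MinValue rather than back to 0, and unlike a GUID the sequence restarts whenever the application does, so keys are guessable and can collide with files left over from an earlier run.

```csharp
using System;
using System.Threading;

static class ViewStateKeyGenerator
{
    // Shared counter; starts over whenever the application restarts.
    private static long counter = 0;

    // Returns a 16-character hex key that is unique for the life of the process
    // (until the counter wraps after 2^64 calls).
    public static string NextKey()
    {
        long id = Interlocked.Increment( ref counter );
        return id.ToString( "x16" );
    }
}

static class Demo
{
    static void Main()
    {
        Console.WriteLine( ViewStateKeyGenerator.NextKey() ); // 0000000000000001
        Console.WriteLine( ViewStateKeyGenerator.NextKey() ); // 0000000000000002
    }
}
```

Whether this actually beats Guid.NewGuid() under load is something to measure rather than assume.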

Many thanks to Mark for sharing!

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.