Scott Hanselman

Scott Hanselman: Developer Productivity Tools Video Part 2

July 19, '06 Comments [4] Posted in Reviews | PowerShell | XmlSerializer | TechEd | Speaking | Web Services | Tools

When I was at TechEd I visited the Beantown.net INETA User Group and gave a (fairly ad-hoc) talk on Developer Productivity Tools. Jim Minatel loaned me his microphone and a copy of Camtasia and we recorded the talk. Thanks Jim!

It was a great crowd, a lot of fun. We had a number of "off the cuff" discussions about random stuff so I hope it doesn't take away from the gist of the talk.

The complete presentation was around 1 hour 45 minutes, so for online viewing Jim has split it into 4 segments. This week's segment is #2; it's available now and runs about 20 minutes. If you watch it in your browser, I recommend you double-click on Windows Media Player to make the video go full screen. You can also download the full video.

It covers:

  • 00:00 Title
  • 00:15 Scott's introduction (repeated from the first video segment)
  • 00:40 XmlSerializer
  • 8:40 Interlude: SlickRun and Google
  • 9:10 Back to XmlSerializer
  • 10:40 SlickRun
  • 12:20 Explorer2 and Launching apps with Google Desktop Search
  • 13:20 Far - A Windows application like DOS Norton Commander
  • 14:35 Why Scott has so much stuff on his desktop
  • 16:40 Junctions and reparse points
  • 19:30 Closing credits

The remaining two segments for following weeks will cover roughly:

  • Week 3: Windows PowerShell - 33 minutes
  • Week 4: Active Words, Code Rush, SOAP Scope, XML doc viewer - 23 minutes

Here are a few notes about the video quality from Jim:

1. Why can't I fast forward or skip ahead through the video while it's streaming? Answer: We're running these off of a standard IIS server, not a Windows Media Server. IIS supports streaming, but not indexed playback during streaming to allow skipping ahead. If you want to do that, just download the whole video and all of the forwarding and timeline controls will be available in Windows Media Player.

2. Why isn't the video quality better? Is Camtasia to blame? No, Camtasia rocks. The raw videos I'm getting in Camtasia format are 100% clear, as if you were looking right at the presenter's monitor. The problem I've discovered is with the Windows Media Encoder. It just isn't well suited to on-screen presentation videos like this. The blurring and color blotching seem worst in Scott Hanselman's videos and I think I know why. When I watch the raw presentation, he's flying back and forth between open windows, background tools that pop up, and his desktop. He's simply switching between very varied images faster than the encoder can keep up with. I've twiddled all the settings and got the best I can for now without doubling or tripling the file sizes. The other option would be to post an alternate version in Camtasia format and a link to download their playback codec [Scott: or a large FLV]. Because WMV is universal for my .NET developer audience, that has to be my common choice though.

There are also some other good screencasts up at Wrox. The growing list of videos is available at wrox.com.

I hope you enjoy them.


ScriptBlock and Runspace Remoting in PowerShell

July 15, '06 Comments [2] Posted in PowerShell | XmlSerializer | Web Services | Bugs

When I first grokked PowerShell I wanted to be able to issue commands remotely. There's some great stuff going on with PowerShellRemoting over at GotDotNet, but that's remoting of the User Interface.

I want to be able to issue commands to many different machines in a distributed fashion.

After some pair programming with Brian Windheim, we set up a Windows Service that would take a string of commands and return a string that was the output of those commands. I could then issue remote commands, but the result at the client was just strings. I was in PowerShell but I'd just made the equivalent of PSEXEC for PowerShell...so basically I'd gotten nowhere.

Ideally I'd like to have behavior like this (but I don't):

using (Runspace myRunSpace = RunspaceFactory.CreateRunspace("COMPUTERNAME"))
{
    myRunSpace.Open();
}

But a Runspace is local and inproc. I don't see a really obvious and straightforward way to do this, considering that there's LOTS of internal and private stuff going on within PowerShell.

I liked that the string in, string out remoting stuff worked fine, but really I want to get Objects back from the remote machine. So, I started using Reflection to poke around inside System.Management.Automation.Serializer, but that got evil quickly. Truly evil.

Then I had an epiphany and remembered the Export-CliXml cmdlet. It is the public cmdlet that uses the serializer I was trying to get to. It isn't the XmlSerializer. It produces a serialized graph of objects with a rich enough description of those objects that the client doesn't necessarily need the CLR types. If reflection had a serialization format, it might look like this format.

Now, if I take the commands I was issuing to the remote "invoker" and export the result of the pipeline in this CLI-XML format, I've just discovered my remoting server's wire format.

This RunspaceInvoker type is hosted in a Windows Service, but it could be in any Remoting hosting process. I'll likely move it inside IIS for security reasons. The app.config for my service looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns="http://schemas.microsoft.com/.NetConfiguration/v2.0">
  <system.runtime.remoting>
    <customErrors mode="Off"/>
    <application>
      <channels>
        <channel ref="http" port="8081"/>
      </channels>
      <service>
        <wellknown mode="SingleCall"
                   type="Hanselman.RemoteRunspace.RunspaceInvoker, Hanselman.RemoteRunspace"
                   objectUri="remoterunspace.rem"/>
      </service>
    </application>
  </system.runtime.remoting>
</configuration>

Note the objectUri and port. We'll need those in a second. There's an installer class that I run using installutil.exe. I set the identity of the Windows Service and it starts up with net start RemoteRunspaceService.
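The installer class and the service plumbing aren't shown here, but the hosting side can be as small as this sketch. This is my reconstruction, not the actual service code; the class name and the Main-based hosting are my own assumptions:

using System;
using System.Runtime.Remoting;
using System.ServiceProcess;

public class RemoteRunspaceService : ServiceBase
{
    public RemoteRunspaceService()
    {
        this.ServiceName = "RemoteRunspaceService";
    }

    protected override void OnStart(string[] args)
    {
        // Reads the <system.runtime.remoting> section from the app.config
        // above and registers the HTTP channel and the well-known type.
        RemotingConfiguration.Configure(
            AppDomain.CurrentDomain.SetupInformation.ConfigurationFile, false);
    }

    public static void Main()
    {
        ServiceBase.Run(new RemoteRunspaceService());
    }
}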

This is the RunspaceInvoker (not the best name):

public class RunspaceInvoker : MarshalByRefObject
{
    public RunspaceInvoker() {}

    public string InvokeScriptBlock(string scriptString)
    {
        using (Runspace myRunSpace = RunspaceFactory.CreateRunspace())
        {
            myRunSpace.Open();

            string tempFileName = System.IO.Path.GetTempFileName();
            string newCommand = scriptString +
                " | export-clixml " + "\"" + tempFileName + "\"";
            Pipeline cmd = myRunSpace.CreatePipeline(newCommand);

            Collection<PSObject> objectRetVal = cmd.Invoke();

            myRunSpace.Close();

            string retVal = System.IO.File.ReadAllText(tempFileName);
            System.IO.File.Delete(tempFileName);
            return retVal;
        }
    }
}

A command for the remote service comes in via the scriptString parameter. For example, we might pass in dir c:\temp as the string, or a whole long pipeline. We create a Runspace, open it, append "| export-clixml", and put the results in a temp file.

THOUGHT: It's a bummer I can't put the results in a variable or get it out of the Pipeline, but I think I understand why they force me to write the CLI-XML to a file. They are smuggling the information out of the system. It's the Heisenberg Uncertainty Principle of PowerShell. If you observe something, you change it. Writing the results to a file is a trapdoor that doesn't affect the output of the pipeline. I could be wrong though.

Anyway, this doesn't need to be performant. I write it to a temp file, read the file in, and delete it right away. Then I return the serialized CLI-XML to the caller. The client portion is two parts. I probably should make a custom cmdlet, but I didn't really see a need. Perhaps someone can offer me a reason why.

For simplicity I first made this RunspaceProxy. Remember, this is the class that the client uses to invoke the command remotely.

public class RunspaceProxy
{
    public RunspaceProxy()
    {
        // Only register the HTTP channel if it isn't registered already.
        HttpChannel chan = new HttpChannel();
        if (ChannelServices.GetChannel("http") == null)
        {
            ChannelServices.RegisterChannel(chan, false);
        }
    }

    public Collection<PSObject> Execute(string command, string remoteurl)
    {
        RunspaceInvoker proxy = (RunspaceInvoker)Activator.GetObject(
            typeof(RunspaceInvoker), remoteurl);
        string stringRetVal = proxy.InvokeScriptBlock(command);

        using (Runspace myRunSpace = RunspaceFactory.CreateRunspace())
        {
            myRunSpace.Open();
            string tempFileName = System.IO.Path.GetTempFileName();
            System.IO.File.WriteAllText(tempFileName, stringRetVal);
            Pipeline cmd = myRunSpace.CreatePipeline(
                "import-clixml " + "\"" + tempFileName + "\"");
            Collection<PSObject> retVal = cmd.Invoke();
            System.IO.File.Delete(tempFileName);
            return retVal;
        }
    }
}

I'm using the HTTP channel for debugging and ease of use with TcpTrace. The command to be executed comes in along with the remoteUrl. We make a RunspaceInvoker (the class we talked about a second ago) on the remote machine and it does the work via a call to InvokeScriptBlock. The CLI-XML comes back over the wire and now I have to make a temp file on the client. Then, in order to "deserialize" - a better word would be rehydrate - the collection of PSObjects, I make a local Runspace and call import-clixml and poof, a Collection<PSObject> is returned to the client. I delete the file immediately.
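For example, a client (any .NET app, not just PowerShell) might use it like this. The machine name is hypothetical; the port and the .rem suffix come from the app.config above:

// Requires references to System.Management.Automation and the proxy assembly.
RunspaceProxy runspace = new RunspaceProxy();
Collection<PSObject> processes = runspace.Execute(
    "get-process",
    "http://remotecomputer:8081/remoterunspace.rem");
foreach (PSObject process in processes)
{
    // These are rehydrated property bags, so property access
    // works just as it would if the call were local.
    Console.WriteLine(process.Properties["ProcessName"].Value);
}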

Why is returning real PSObjects so important when I had strings working? Because now I can select, sort, and where my way around these PSObjects as if they were local - because they are. They are real and substantial. This will allow us to write scripts that blur the line between the local admin and remote admin.

Now, all this has been C# so far, when does the PowerShell come in? Also, since I've worked so hard (well, not that hard) to get the return values integrated cleanly with PowerShell, what's a good way to get the remote calling of scripts integrated cleanly?

For my first try, I made a RemoteInvoke() function that took a command string. It worked, but felt tacky. Then I remembered how Jeffrey Snover said to look to Type Extensions when adding functionality, rather than functions and cmdlets.

I made a My.Types.ps1xml file in my PSConfiguration directory and put this in it:

<Types>
  <Type>
    <Name>System.Management.Automation.ScriptBlock</Name>
    <Members>
      <ScriptMethod>
        <Name>RemoteInvoke</Name>
        <Script>
          if ($GLOBAL:remoteUrl -eq $null) { throw 'Set $GLOBAL:remoteUrl first!' }

          [System.Reflection.Assembly]::LoadWithPartialName("System.Runtime.Remoting") |
              out-null

          $someDll = "C:\foo\Hanselman.RemoteRunspace.dll"
          [System.Reflection.Assembly]::LoadFrom($someDll) | out-null

          $runspace = new-object Hanselman.RemoteRunspace.RunspaceProxy

          $runspace.Execute([string]$this, $GLOBAL:remoteUrl);
        </Script>
      </ScriptMethod>
    </Members>
  </Type>
</Types>

Then I called Update-TypeData My.Types.ps1xml (actually it's in my profile, so it happens automatically). This file adds a new method to the ScriptBlock type. A ScriptBlock is literally a block of script. It's a very natural "atom" for us to use.

NOTE: I'd like to have the RemoteUrl be a parameter to the RemoteInvoke ScriptMethod, but I can't really find any documentation on this. I'll update it when I figure it out, but for now it uses a $GLOBAL variable and freaks out if it's not set.

The RemoteInvoke loads the .NET System.Runtime.Remoting assembly, then it loads our Proxy assembly. Then it calls Execute, casting the [ScriptBlock] to a [string] because the Runspace takes a string.

For example, at a PowerShell prompt if I do this:

PS[79] C:\> $remoteUrl="http://remotecomputer:8081/RemoteRunspace.rem"

PS[80] C:\> 2+2
4
4

PS[81] C:\> (2+2).GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Int32                                    System.ValueType

PS[82] C:\> {2+2}.GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     False    ScriptBlock                              System.Object

PS[83] C:\> {2+2}
4

PS[84] C:\> {2+2}.RemoteInvoke()
4

PS[85] C:\> {2+2}.RemoteInvoke().GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Int32                                    System.ValueType

Note the result of the last line. The value that comes out of RemoteInvoke is an Int32, not a string. The result of that ScriptBlock executing is a PowerShell type that I can work with elsewhere in my local script.

Here's the CLI-XML that went over the wire (just to make it clear it's not XmlSerializer XML):

<Objs Version="1.1" xmlns="http://schemas.microsoft.com/powershell/2004/04">
  <I32>4</I32>
</Objs>

This 2+2 stuff is a terse and simple example, but this technique works even with large and complex object graphs like the FileInfo and FileSystemInfo objects that are returned from dir (get-childitem).

[Screenshot: a remote get-process, sorted and filtered locally]

In this screenshot we do a get-process on the remote machine then sort and filter the results just as we would/could if the call were local.

My WishList for the Next Version of PowerShell

  • All this stuff I did, built in already with security and wonderfulness.
  • All the stuff in PowerShellRemoting, with security and wonderfulness.
  • Some kind of editor or schema installed in VS.NET for editing My.Types.ps1xml.
  • TabExpansion for all Types in the current AppDomain (this, of course, is already done by MonadBlog and MOW).

Thanks again to Brian Windheim for the pair programming today that jump-started this!


Scott Hanselman: Developer Productivity Tools Video Part 1

July 11, '06 Comments [7] Posted in TechEd | PowerShell | XmlSerializer | Speaking | Web Services | Tools

When I was at TechEd I visited the Beantown.net INETA User Group and gave a (fairly ad-hoc) talk on Developer Productivity Tools. Jim Minatel loaned me his microphone and a copy of Camtasia and we recorded the talk. Thanks Jim!

It was a great crowd, a lot of fun. We had a number of "off the cuff" discussions about random stuff so I hope it doesn't take away from the gist of the talk.

The complete presentation was around 1 hour 45 minutes, so for online, Jim has split it into 4 segments. This week's segment is available now and is about 33 minutes long. If you watch it in your browser, I recommend you double click on Windows Media Player to make the video go full screen.

It covers:

  • 00:00 Title
  • 00:15 Scott's introduction
  • 00:40 The first tool: Notepad2
  • 02:40 Little-known built-in command line tools
  • 09:55 Process Explorer and Slick Run
  • 15:15 ILDASM
  • 16:15 .NET Reflector
  • 24:45 NGEN

The remaining three segments for following weeks will cover roughly:

  • Week 2: XmlSerializer - 20 minutes
  • Week 3: Windows PowerShell - 33 minutes
  • Week 4: Active Words, Code Rush, SOAP Scope, XML doc viewer - 23 minutes

There are also some other good screencasts up at Wrox. The growing list of videos is available at wrox.com.

I hope you enjoy them.


Subtle Behaviors in the XML Serializer can kill

May 25, '06 Comments [3] Posted in XmlSerializer | Bugs

Dan Maharry is/was having a heck of a time with the XmlSerializer after upgrading an application from .NET 1.1 to .NET 2.0.

Given this XSD/schema:

<element name="epp" type="epp:eppType" />
<complexType name="eppType">
  <choice>
    <element name="hello" />
    <element name="greeting" type="epp:greetingType" />
  </choice>
</complexType>

The .NET 1.1 framework serializes a greeting element thusly (actually by incorrect and lucky behavior in the 1.x serializer):

<?xml version="1.0" encoding="utf-8"?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
  <greeting>
    <svID>Test</svID>
    <svDate>2006-05-04T11:01:58.1Z</svDate>
  </greeting>
</epp>

Although it seemed fine initially in .NET 2.0, he started getting this instead:

<?xml version="1.0" encoding="utf-8"?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
  <hello d2p1:type="greetingType" xmlns:d2p1="http://www.w3.org/2001/XMLSchema-instance">
    <SvID>Test</SvID>
    <svDate>2006-05-04T10:55:07.9Z</svDate>
  </hello>
</epp>

Dan worked with MS Support and filed a bug in the Product Feedback labs and attached an example if you'd like to download it.

Unfortunately, this isn't a bug. The ordering of the elements in the original schema causes the XmlElement attributes to stack in the same order, which results in the wrong semantics:

   [System.Xml.Serialization.XmlTypeAttribute(Namespace = "urn:ietf:params:xml:ns:epp-1.0", TypeName = "eppType")]
   [System.Xml.Serialization.XmlRootAttribute("epp", Namespace = "urn:ietf:params:xml:ns:epp-1.0", IsNullable = false)]
   public class EppType
   {
      private object item;

      [System.Xml.Serialization.XmlElementAttribute("hello", typeof(object))]
      [System.Xml.Serialization.XmlElementAttribute("greeting", typeof(GreetingType))]
      public object Item
      {
         get
         {
            return this.item;
         }
         set
         {
            this.item = value;
         }
      }
   }

The problem is that the semantics of the schema and the resulting XmlSerializer attributes say "This object can be either an object or a GreetingType." Well, a GreetingType IS an object, so the 2.0 serializer happily obliges.

Reversing those two lines in the XSD and regenerating the .cs file with XSD.EXE expresses the correct intent: "This object can be a GreetingType or any other kind of object." The expected (original) output is then achieved. If Dan can't change the original schema (which is likely wrong), he'll have to change the generated code to get the semantics he wants. Not a bad thing, actually. I did the same thing with the code generated from the OFX schemas.
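After the reorder, the regenerated property comes out with the attributes stacked the other way, something like this (trimmed to the interesting part):

[System.Xml.Serialization.XmlElementAttribute("greeting", typeof(GreetingType))]
[System.Xml.Serialization.XmlElementAttribute("hello", typeof(object))]
public object Item
{
   get { return this.item; }
   set { this.item = value; }
}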

Using a previously published tip called HOW TO: Debug into a .NET XmlSerializer Generated Assembly, I add an app.config with these lines:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <system.diagnostics>
    <switches>
      <add name="XmlSerialization.Compilation" value="1" />
    </switches>
  </system.diagnostics>
</configuration>

Then I check the contents of the temp directory by going to Start|Run, typing "%temp%", and pressing Enter. I sort by Date Modified.

[Screenshot: the contents of my temp folder]

I run the test program twice, once the original way and once with the lines reversed (my "fix"), and diff the generated .cs files in BeyondCompare.

[Screenshot: diffing the two generated serializer .cs files in BeyondCompare]

You can see from the picture above exactly where the difference is: in the middle of a series of if/elseifs that are basically asking "what kind of object is this?"
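In other words, here's my paraphrase of what the generated serializer ends up doing. This is not the actual generated code, and WriteElement is a made-up stand-in for the real write calls:

// With "hello" (typeof(object)) checked first, the object test wins every time:
if (item is object)                  // true for ANY value, including GreetingType
{
    WriteElement("hello", item);     // hypothetical helper; emits xsi:type for a GreetingType
}
else if (item is GreetingType)       // never reached
{
    WriteElement("greeting", (GreetingType)item);
}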

The XmlSerializer is glorious and wonderful until it's totally not. I know that's not going to make Dan or his team feel better, but hang in there, it gets better the more you use it.

UPDATE: Dan has an interesting update that points out that the order of the attributes generated isn't regular, nor is the order they come back via reflection. James weighs in as well. My solution worked because there were only two attributes. Nutshell - order matters, but it's not regular.

I'm not defending the XmlSerializer folks, although it may sound like I am. James says "it looks like a bug to me." Personally I think it's less a bug and more a complex and poorly documented edge case that highlights the fundamental differences between the XML type system and the CLR type system. At the edges, it's dodgy at best.

I think where we're all getting nailed here is that the XSD type system can represent things that the CLR type system can't. Full stop.

In Schema, xs:choice is a complex thing, much like unions in C. The XmlSerializer chooses to present xs:choice as an Object that you have to downcast yourself. The mapping is uncomfortable at best. However, beyond this one uncomfortable type mapping, there are structures you can express in Schema that simply have no parallel in the CLR, and the mappings won't ever be 100%. This is just what happens when translating between type systems. The same thing happened (and still happens) for years with nullable DB columns, as simple types got translated into the CLR and we leaned on IsDBNull. With the XmlSerializer they introduced a whole parallel field with a "Specified" suffix.

In this instance, if it were me using this schema and dealing with these documents, I'd switch over to implementing IXmlSerializable. IXmlSerializable provides coverage for the final few percent that the XmlSerializer doesn't. It doesn't solve the problem of mapping between type systems, but it at least puts YOU in control of the decisions being made.
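Here's a minimal sketch of that direction. This is my own illustration, not Dan's code, and the GreetingType members are assumptions:

using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Epp : IXmlSerializable
{
    public GreetingType Greeting; // null means we write a plain <hello/>

    public XmlSchema GetSchema() { return null; }

    public void WriteXml(XmlWriter writer)
    {
        // YOU pick hello vs. greeting; no attribute-ordering roulette.
        if (Greeting == null)
        {
            writer.WriteElementString("hello", null);
        }
        else
        {
            writer.WriteStartElement("greeting");
            writer.WriteElementString("svID", Greeting.SvID); // assumed member
            writer.WriteEndElement();
        }
    }

    public void ReadXml(XmlReader reader)
    {
        reader.ReadStartElement(); // the <epp> element itself
        if (reader.IsStartElement("greeting"))
        {
            Greeting = new GreetingType();
            reader.Skip(); // real code would read svID, svDate, etc.
        }
        else
        {
            reader.Skip(); // it was <hello/>
        }
        reader.ReadEndElement();
    }
}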


XmlValidatingReader problems over derived XmlReaders

March 12, '06 Comments [3] Posted in Web Services | XmlSerializer | Bugs

This whole validation ickiness deserved two posts, so I didn't mention it in the last XmlValidatingReader post.

The XML format that I'm parsing and validating isn't the savviest of formats, as it was created years ago before the XML Schema specification was complete. While the schema has a namespace and it's an official specification, the instance documents don't have a namespace. They are entirely "unqualified." So, basically, I'm trying to validate XML documents without a namespace against a schema that expects one.

Additionally, the elementFormDefault is set to "unqualified." There's a great explanation of what elementFormDefault means here.
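To make that concrete, here's a boiled-down schema of my own invention (not the real one) for the shape we're dealing with. With elementFormDefault="unqualified", only the globally-declared root element lives in the target namespace; the local elements must be unqualified:

<xs:schema targetNamespace="http://thenamespaceiwant"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="unqualified">
  <xs:element name="FOO">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="BAR" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>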

The documents come in like this:

<FOO>
  <BAR>text</BAR>
</FOO>

Before I'd looked hard at the schema, I had assumed that I could load the documents with an XmlNamespaceUpgradeReader. This is a derivation of XmlTextReader that does nothing but lie about the namespace of every element. I'm using System.Xml on .NET 1.1.

public class XmlNamespaceUpgradeReader : XmlTextReader
{
    string oldNamespaceUri;
    string newNamespaceUri;

    public XmlNamespaceUpgradeReader(TextReader reader, string oldNamespaceUri, string newNamespaceURI) : base(reader)
    {
        this.oldNamespaceUri = oldNamespaceUri;
        this.newNamespaceUri = newNamespaceURI;
    }

    public override string NamespaceURI
    {
        get
        {
            // we are assuming XmlSchemaForm.Unqualified, therefore
            // we can't switch the NS here
            if (this.NodeType != XmlNodeType.Attribute &&
                base.NamespaceURI == oldNamespaceUri)
            {
                return newNamespaceUri;
            }
            else
            {
                return base.NamespaceURI;
            }
        }
    }
}

For example, if I did this:

XmlTextReader reader = new XmlNamespaceUpgradeReader(
    File.OpenText("MyLameDocument.xml"),
    String.Empty,
    "http://thenamespaceiwant");

XmlDocument doc = new XmlDocument();
doc.Load(reader);
Console.WriteLine(doc.OuterXml);

I would end up with this resulting XML:

<FOO xmlns="http://thenamespaceiwant">
  <BAR xmlns="http://thenamespaceiwant">text</BAR>
</FOO>

Seemed like this would validate. Well, not so much. The document, as you can see, is fine; it's exactly what you'd expect. But then I remembered that the schema says elementFormDefault="unqualified", meaning that only the root node needs the namespace. So...

public class XmlRootNamespaceUpgradeReader : XmlTextReader
{
    string oldNamespaceUri;
    string newNamespaceUri;

    public XmlRootNamespaceUpgradeReader(TextReader reader, string oldNamespaceUri, string newNamespaceURI) : base(reader)
    {
        this.oldNamespaceUri = oldNamespaceUri;
        this.newNamespaceUri = newNamespaceURI;
    }

    public override string NamespaceURI
    {
        get
        {
            // we are assuming XmlSchemaForm.Unqualified, therefore
            // we can't switch the NS here
            if (Depth == 0 && this.NodeType != XmlNodeType.Attribute &&
                base.NamespaceURI == oldNamespaceUri)
            {
                return newNamespaceUri;
            }
            else
            {
                return base.NamespaceURI;
            }
        }
    }

    public override string Prefix
    {
        get
        {
            // fake an x: prefix on the root element only
            if (Depth == 0 && this.NodeType == XmlNodeType.Element)
            {
                return "x";
            }
            return base.Prefix;
        }
    }
}

...which results in a document like this:

<x:FOO xmlns:x="http://thenamespaceiwant">
  <BAR>text</BAR>
</x:FOO>

This document should now validate, and in fact it does in my test applications. When the document is loaded directly from a test file it works fine. When I run it directly through one of the extended "fake-out" XmlTextReaders, it doesn't work. It's as if my readers don't exist at all, even though their code does indeed execute.

To be clear:

Original Doc -> XmlTextReader -> XmlValidatingReader -> doesn't validate (as expected)
Original Doc -> XmlRootNamespaceUpgradeReader -> XmlValidatingReader -> doesn't validate (but it should! See the sketch after this list.)
Original Doc -> XmlNamespaceUpgradeReader -> XmlDocument -> write to file -> read from file -> XmlValidatingReader -> doesn't validate (as expected, it's "overqualified")
Original Doc -> XmlRootNamespaceUpgradeReader -> XmlDocument -> write to file -> read from file -> XmlValidatingReader -> DOES VALIDATE (as expected)
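Wiring up that second (should-work-but-doesn't) chain looks roughly like this. This is my sketch, and the schema file name is a placeholder:

XmlTextReader upgraded = new XmlRootNamespaceUpgradeReader(
    File.OpenText("MyLameDocument.xml"), String.Empty, "http://thenamespaceiwant");
XmlValidatingReader validator = new XmlValidatingReader(upgraded);
validator.ValidationType = ValidationType.Schema;
validator.Schemas.Add("http://thenamespaceiwant", "TheSchema.xsd"); // placeholder
try
{
    while (validator.Read()) { } // validates node by node as it reads
    Console.WriteLine("Valid.");
}
catch (System.Xml.Schema.XmlSchemaException ex)
{
    Console.WriteLine("Validation failed: " + ex.Message);
}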

Why don't the "fake-out" XmlTextReaders work when chained together and feeding the XmlValidatingReader directly, but they do work when there's an intermediate format?

A few things about the XmlValidatingReader in .NET 1.1 (since it's obsolete in 2.0). While its constructor takes the abstract class XmlReader, internally it insists on an XmlTextReader. This is documented, but buried IMHO. Reflector shows us:

XmlTextReader reader1 = reader as XmlTextReader;
if (reader1 == null)
{
    throw new ArgumentException(Res.GetString("Arg_ExpectingXmlTextReader"), "reader");
}

<conjecture>When a class takes an abstract base class - the one it "should" - but really requires a specific derivation/implementation internally, it's a good hint that the OO hierarchy wasn't completely thought out and/or a refactoring that was going to happen in a future version never happened.</conjecture>

Regardless, System.Xml in .NET 2.0 is much nicer; as well thought-out as System.Xml 1.x was, 2.0 is considerably more so. However, I'm talking about 1.1 here.

<suspicion>I take this little design snafu as a strong hint that the XmlValidatingReader in .NET 1.1 has carnal knowledge of XmlTextReader and is probably making some assumptions about the underlying stream and doing some caching rather than taking my fake-out XmlReader's word for it.</suspicion> 

If you're on, or were on, the System.Xml team let me know what the deal is and I'll update this post.

I know that the XmlRootNamespaceUpgradeReader works because the XML is correct when it's written out to an intermediate. However, the InfoSet that the XmlValidatingReader acts on is somehow not the same. How did we solve it? Since the XmlValidatingReader needs an XmlTextReader that is more "legit," we'll give it one:

Original Doc -> XmlRootNamespaceUpgradeReader -> XmlDocument -> CloneReader -> XmlValidatingReader -> DOES VALIDATE

This is cheesy, but if a better way is found at least it's compartmentalized and I can fix it in one place. We quickly run through the input XmlTextReader, write the Infoset out to a MemoryStream and return a "fresh" XmlTextReader and darn it if it doesn't work just fine.

/// <summary>
/// Makes an in-memory, complete, fresh COPY of an XmlReader. This is needed
/// because the XmlValidatingReader takes only XmlTextReaders and isn't fooled
/// by our XmlNamespaceUpgradeReader.
/// </summary>
/// <param name="reader"></param>
/// <returns></returns>
protected XmlTextReader CloneReader(XmlTextReader reader)
{
    MemoryStream m = new MemoryStream();
    XmlTextWriter writer = new XmlTextWriter(m, Encoding.UTF8);
    while (reader.Read())
    {
        writer.WriteNode(reader, false);
    }
    writer.Flush();
    m.Seek(0, SeekOrigin.Begin);
    XmlTextReader returnedReader = new XmlTextReader(m);
    return returnedReader;
}
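And the chain that DOES validate, end to end. This is my reconstruction: I'm running CloneReader straight over the upgrading reader, which elides the XmlDocument hop shown in the flow above, and the schema file name is again a placeholder:

XmlTextReader upgraded = new XmlRootNamespaceUpgradeReader(
    File.OpenText("MyLameDocument.xml"), String.Empty, "http://thenamespaceiwant");
XmlTextReader fresh = CloneReader(upgraded); // a "legit" XmlTextReader over a MemoryStream
XmlValidatingReader validator = new XmlValidatingReader(fresh);
validator.ValidationType = ValidationType.Schema;
validator.Schemas.Add("http://thenamespaceiwant", "TheSchema.xsd"); // placeholder
while (validator.Read()) { } // throws XmlSchemaException if the document is invalid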

Madness. Many thanks to Tomas Restrepo for his help and graciousness while debugging this issue!


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.