Scott Hanselman

Adventures in Debugging - Expensive Semicolons and Invalid GIFs

August 10, '05 Comments [13] Posted in Bugs

Ah yes, crazy bugs, they are my life. Here's today's saga. We worked on this from 9:30am until lunch, so we were able to figure it out in about two and a half hours.

One of our systems retrieves Check Images (pictures of cleared checks). The checks move through the system as Base64'ed strings, and eventually the separate front and back images are displayed in the user's browser as a single image using a dynamic compositing technique I mentioned a while back.

However, it seemed that when we took the decoded-from-Base64 schmutz and did basically this to convert the GIF to a JPEG:

using(MemoryStream m = new MemoryStream(bytes))
{
    using (System.Drawing.Image image = System.Drawing.Image.FromStream(m))
    {
        Response.ContentType = "image/jpeg";
        image.Save(Response.OutputStream,ImageFormat.Jpeg);    
    }
}

We'd get an error from the bowels of System.Drawing that there was an "invalid parameter." Reflectoring showed that Image.FromStream is managed spackle over a GDI+ method.

[DllImport("gdiplus.dll", CharSet=CharSet.Unicode, ExactSpelling=true)]
internal static extern int GdipLoadImageFromStreamICM(UnsafeNativeMethods.IStream stream, out IntPtr image);

You may remember that there was a GDI+ crackdown recently (that continues today) so I wondered aloud if the file was corrupted in some way and GDI+ was being conservative. Loading the file into Windows Picture and Fax Viewer gave me this - bupkes.

[Image: the bad GIF in Windows Picture and Fax Viewer]

I tried loading it into a number of picture viewers, most of which said nope. Surprisingly, IE didn't have a problem with it. This is odd to me because I thought the GDI+ security fixes would apply to IE, but not so.  

[Image: the same GIF displayed in IE]

To review - I've got a weird GIF that shows up in IE, but that .NET and GDI+ refuse to recognize. I could look for other image libraries that would "clean" the GIFs but that's reaching. The mainframe/host system that generates and holds these GIFs isn't likely to change, and even if it did it wouldn't be fast enough for this implementation.

We could just pass it all the way through the system unmolested as the GIF that it is. This would WORK but only until browsers like IE became more security aware and started slapping down invalid GIFs like this one.

So, these GIFs are invalid. But how? As with all things for me, I begin with Notepad2. I opened a bad example check image in Notepad2:

[Image: the bad GIF opened in Notepad2]

First I notice that it's a GIF87a. Noteworthy only like an old piece of gray paper from Kindergarten is noteworthy. Then we (Patrick and I - at this point I've drafted him) notice that the alphabet and numbers appear a hundred bytes in. We figured that's the color table as they are triplets and this is a grayscale GIF of 128 colors. But, without getting all 0xHex-y this early on, what else can I do to determine if this is a valid GIF or not? Well, I got it to display in IE before. I'll copy it (now a bitmap) to the clipboard and save it as a GIF. It'll likely save as a GIF89 because, hey, it's like 2 better, right?

[Image: the re-saved GIF opened in Notepad2]

Here's the same graphic saved again. Ya, it looks totally different, so you assume my copy/paste was an invalid thing to do (in the scientific method sense). Well, hang in there, it gets worse. It's clearly a GIF89a and it clearly has a different color table. Otherwise, nothing here jumps out when comparing them with our eyes.

At this point, it's time to bite the bullet and decode the GIF header. We figure a GIF can be corrupt in two ways: either the header is bogus or the image data is. We'll do the easy one first. Time to pull out the June 15th, 1987 GIF spec from CompuServe.

Working structure by structure we produced this little nugget of uselessness:

    using (FileStream f = File.Open(@"C:\Documents and Settings\shanselm\Desktop\bad.gif",FileMode.Open))
    //using (FileStream f = File.Open(@"C:\Documents and Settings\shanselm\Desktop\good.gif",FileMode.Open))
    {
        using(BinaryReader reader = new BinaryReader(f))
        {
            string sigversion = new string(reader.ReadChars(6));
            if (sigversion.StartsWith("GIF"))
            {
                ushort width = reader.ReadUInt16();
                ushort height = reader.ReadUInt16();
 
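                // Packed byte: bit 7 (0x80) flags a global color table, the low 3 bits give its size (2^(n+1) entries).
                byte packedfields = reader.ReadByte();
                int colortable = packedfields & 0x7;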
 
                byte bgcolor = reader.ReadByte();
                byte aspectratio = reader.ReadByte();
 
                int logicallength = (int)Math.Pow(2,colortable+1);
                int colortablelength = (int)(3 * logicallength);
 
                //Color table, yuck. RGB is a struct elsewhere in our file.
                // It's a value type, that's why we poke it back in at the bottom of the loop.
                RGB[] RGBs = new RGB[logicallength];
                for (int i = 0; i < logicallength; i++)
                {
                    RGB rgb = RGBs[i];
                    rgb.R = reader.ReadByte();
                    rgb.G = reader.ReadByte();
                    rgb.B = reader.ReadByte();
                    RGBs[i] = rgb;
                }
 
                //Image Descriptor
                byte imageseparator = reader.ReadByte();
                uint leftpos = reader.ReadUInt16();
                uint toppos = reader.ReadUInt16();
                uint widthagain = reader.ReadUInt16();
                uint heightagain = reader.ReadUInt16();
                byte localcolortableflags = reader.ReadByte();
 
                int localcolortablepresent = localcolortableflags & 0x80;
                int interlace = localcolortableflags & 0x40;
                int sortbit = localcolortableflags & 0x20;
                int localcolortable = localcolortableflags & 0x07;
 
                //We figured if the header was bad we'd mess with it in this process somewhere
                // then if we fixed it in the byte[], we'd fall through to the code below
                // that previously hadn't worked. If Image.FromStream did work, we'd have fixed the bug
                // Of course, we got all the way here and there wasn't anything wrong with the GIF header!            
            }
        }
 
        // bytes is the same GIF as a byte[] (loaded elsewhere, not shown in this listing)
        using (MemoryStream m = new MemoryStream(bytes))
        {
            using (Image image = Image.FromStream(m))
            {
                image.Save("foo.jpg",ImageFormat.Jpeg);
            }                    
        }
    }

Well, crap. We made it all this way and there didn't appear to be ANYTHING (per spec) wrong with the GIF header. We checked everything out in the Watch Window line by line. Nothing.

Ok, back to differences. How about checking them out in Beyond Compare?

[Image: both GIFs compared in Beyond Compare]

Zoom in on that baby. Look real close. Notice in the upper left corner, there aren't many differences. Remember that the old GIF87 is on the left, and the new one that I made via COPY/PASTE is on the right. The basic image data is the same, cool. So, really the only differences are the header, a byte or two in the middle, and what? What's that at the VERY BOTTOM RIGHT CORNER? A semicolon? In the valid image? WTF is that?
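Beyond Compare did the work for us, but the same check can be roughed out in code. The sketch below is not what we ran; GifDiff, FirstDifference and TailHex are names made up for illustration. It finds the first differing offset between two byte arrays and dumps the last few bytes of each, which is where the interesting difference was hiding.

```csharp
using System;

public static class GifDiff
{
    // Returns the index of the first byte that differs, or -1 if the arrays are identical.
    // When one array is a prefix of the other, the index is the shorter array's length.
    public static int FirstDifference(byte[] left, byte[] right)
    {
        int common = Math.Min(left.Length, right.Length);
        for (int i = 0; i < common; i++)
        {
            if (left[i] != right[i]) return i;
        }
        return left.Length == right.Length ? -1 : common;
    }

    // Formats the last `count` bytes as dash-separated hex so trailing differences are easy to spot.
    public static string TailHex(byte[] data, int count)
    {
        int start = Math.Max(0, data.Length - count);
        return BitConverter.ToString(data, start, data.Length - start);
    }
}
```

Calling GifDiff.TailHex on each file with a count of 4 would put the trailing bytes side by side; a file that ends properly shows 3B as its final byte.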

Hm...back to the spec. Since we've just decoded the header, perhaps there's a footer/trailer/terminator.

    June 15, 1987
    (c) CompuServe Incorporated, 1987

    GIF TERMINATOR

    In order to provide a synchronization for the termination of a GIF
    image file, a GIF decoder will process the end of GIF mode when the
    character 0x3B hex or ';' is found after an image has been processed.
    By convention the decoding software will pause and wait for an action

Lovely. Do we have an off-by-one? Are we dropping the last byte as we go through the system?
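One way to test the off-by-one theory without squinting at hex is to walk the file block by block and see whether a terminator ever turns up. What follows is a minimal sketch based on my reading of the GIF87a layout (signature, screen descriptor, optional color tables, then length-prefixed data blocks). GifWalker and ReachesTerminator are hypothetical names, and the code does not try to validate the LZW data itself.

```csharp
using System;

public static class GifWalker
{
    // Walks a GIF block by block and reports whether a 0x3B terminator is reached.
    // Returns false if the data runs out first or an unknown block type appears.
    public static bool ReachesTerminator(byte[] gif)
    {
        if (gif == null || gif.Length < 13) return false;
        if (gif[0] != (byte)'G' || gif[1] != (byte)'I' || gif[2] != (byte)'F') return false;

        int packed = gif[10];               // packed byte of the screen descriptor
        int pos = 13;                       // first byte after signature + screen descriptor
        if ((packed & 0x80) != 0)           // global color table present
            pos += 3 * (1 << ((packed & 0x07) + 1));

        while (pos < gif.Length)
        {
            byte kind = gif[pos++];
            if (kind == 0x3B) return true;  // ';' terminator
            if (kind == 0x2C)               // ',' image descriptor
            {
                if (pos + 9 > gif.Length) return false;
                int local = gif[pos + 8];   // packed byte of the image descriptor
                pos += 9;
                if ((local & 0x80) != 0)    // local color table present
                    pos += 3 * (1 << ((local & 0x07) + 1));
                pos += 1;                   // LZW minimum code size
            }
            else if (kind == 0x21)          // '!' extension block
            {
                pos += 1;                   // function code
            }
            else
            {
                return false;               // not a block type this sketch knows about
            }

            // Skip the length-prefixed data blocks; a zero length ends the run.
            while (true)
            {
                if (pos >= gif.Length) return false;
                int len = gif[pos++];
                if (len == 0) break;
                pos += len;
            }
        }
        return false;                       // ran off the end without seeing 0x3B
    }
}
```

If the image data is intact and only the final byte is gone, the walk ends exactly at the end of the array and returns false, which is a different failure from running out of bytes in the middle of a data block.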

We go back to the system that sits just ahead of the mainframe check imager and request an image. We look at the byte array returned, and notice that the LAST BYTE IS MISSING. The images are transmitted on a secure internal network using HTTP. The Content-Type is image/gif and the Content-Length HTTP Header, in this case, was 20814. That was exactly how many bytes were received.

So here's the question (that hasn't been answered):

Is it more likely that the host system has or is generating bogus/bad/invalid GIFs or that the Content-Length HTTP Header being returned by their unknown kind of Web Server is off by one and System.Net.HttpWebRequest is trusting what it's being told? I vote bogus GIFs, Patrick thinks bad Content-Length. Not sure if we'll ever know.

The fix, of course, was to check if the byte array representing this kind of GIF is terminated with 0x3b or not, and if not, append it. Once 0x3b was appended, System.Drawing and GDI+ had NO problem with the bytes.
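For the curious, the fix amounts to something like the sketch below. This is an illustration rather than our production code, and GifFixer and EnsureTerminated are names invented for this post. It assumes the only damage is the missing trailer, so it looks at nothing but the last byte.

```csharp
using System;

public static class GifFixer
{
    private const byte Terminator = 0x3B; // ';' per the GIF spec

    // Returns the input unchanged if it already ends with 0x3B,
    // otherwise returns a copy with 0x3B appended.
    public static byte[] EnsureTerminated(byte[] gif)
    {
        if (gif == null) throw new ArgumentNullException("gif");
        if (gif.Length > 0 && gif[gif.Length - 1] == Terminator) return gif;

        byte[] repaired = new byte[gif.Length + 1];
        Buffer.BlockCopy(gif, 0, repaired, 0, gif.Length);
        repaired[gif.Length] = Terminator;
        return repaired;
    }
}
```

The repaired array can then be handed to new MemoryStream(...) and Image.FromStream as in the earlier listing.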

Crisis averted. Chao continues.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

Wednesday, August 10, 2005 10:59:24 PM UTC
You, sir, are awesome
Wednesday, August 10, 2005 11:18:05 PM UTC
I think it's really kewl that the financial industry has adopted blogging! Wait till scoble finds out that checks are moved through the system as base64 encoded blogs. ;)
Thursday, August 11, 2005 12:16:46 AM UTC
Crap...fixing...you suck, Phil. :)
Thursday, August 11, 2005 1:14:03 AM UTC
So are you off by one on "Chaos" or are you talking about a Sega creature (http://images.google.com/images?q=Chao&hl=en&lr=&sa=N&tab=wi)?
Thursday, August 11, 2005 1:22:31 AM UTC
From SF:
( 2005-08-09 22:37:23 - Project CVS Service ) The CVS upgrade and sync has been completed. The CVS server is now online and completely operational.

Detritus
Thursday, August 11, 2005 2:39:59 AM UTC
That really was some excellent commentary. :)

Now for something completely off-topic: I couldn't help but notice the way you structure your using blocks, i.e. indenting for each using statement rather than something like this:

using (MemoryStream m = new MemoryStream(bytes))
using (Image image = Image.FromStream(m))
{
// Do stuff
}

Just wondering if there's a special reason for doing that, or if (rather more likely, I suspect) that just happens to be the style upon which your company standardized.

--Stuart
Stuart
Thursday, August 11, 2005 2:16:49 PM UTC
Stuart-

The using block as written above is a lot easier to read than the one you posted.

Thursday, August 11, 2005 3:05:29 PM UTC
Ben, I happen to disagree with you and agree with Stuart for nested using blocks (but ONLY if they are pure nestings without any other code following the inner block).
Thursday, August 11, 2005 4:33:08 PM UTC
I agree with both of you. I DO stack usings (look at the listing for the image compositing technique!) but didn't feel like it here as I thought this was very readable.
Thursday, August 11, 2005 8:42:08 PM UTC
Always enjoyable to read a good debugging story. Why not check what is on the wire using a packet sniffer to see if the content-length is wrong? If you're looking at the incoming TCP packets (which won't know anything about Content-Length), you should be able to see whether the terminating semi-colon is present or not.
Friday, August 12, 2005 12:15:54 AM UTC
James, I couldn't figure out how to sniff (YATT) when the link ONLY accepts SSL. Remember this is a system I have no control over...is there an easy way to do a man in the middle "sniff"? I suppose I could use ProxyTrace...I think I'll try that.
Friday, August 12, 2005 4:52:53 AM UTC
WFetch works pretty well for that kind of thing. It supports HTTPS and can save output to a file. You can also set custom headers, etc. Of course, you still have to mess with the file format, but that's the fun part.
Friday, August 12, 2005 5:28:19 PM UTC
SSL makes it tough. If it's a webservice call, you can use the latest version of SoapScope as it supports SSL. From what you've said, it's a legacy system with a custom webserver...

If you're doing it through raw urls to a webserver, you could use a SSL Tunnel to make the request. Here's the setup that I tried.
1. Run stunnel (from http://www.stunnel.org) on a port such as 8080, which redirects to DestinationServer:443.
2. Fire up Fiddler (http://www.fiddlertool.com).
3. Browse to http://localhost:8080/PathToGif/cheque.gif, where /PathToGif/cheque.gif is whatever the server path is. Note that we're using HTTP, not HTTPS here.
4. You can view the binary data in Fiddler.

This is what IE receives. Unfortunately there's no guarantee that your network stack didn't muck with it before it got to IE. Given that the traffic is unencrypted coming out of stunnel, you should be able to use a regular old sniffer such as YATT or Ethereal (http://www.ethereal.com) to grab the packets. You may have to deploy stunnel on another computer to ensure that there is network traffic. (If you host stunnel on localhost, you won't see any network traffic as localhost resolves to a faster in-memory transport from what I can tell.)

Another idea would be to intentionally mangle the content-length of gifs coming back from your own HttpListener and see what happens on the receiving end.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.