Scott Hanselman

Moving ViewState to the Bottom of the Page

October 13, '05 Comments [7] Posted in ASP.NET | ViewState

I was working on some ASP.NET hacks and wanted to move the ViewState to the bottom of the page in order to get Google to pay more attention to my page and less to the wad of Base64'ed ViewState.

First I tried this because it's the closest to the way my mind works:

static readonly Regex viewStateRegex = new Regex(
    @"(<input type=""hidden"" name=""__VIEWSTATE"" value=""[a-zA-Z0-9\+=\\/]+"" />)",
    RegexOptions.Multiline|RegexOptions.Compiled);
static readonly Regex endFormRegex = new Regex(@"</form>",
RegexOptions.Multiline|RegexOptions.Compiled);
 
protected override void Render(HtmlTextWriter writer)
{
    //Defensive coding checks removed for speed and simplicity. 
    // If these don't work out, you've likely got bigger problems.
    System.IO.StringWriter stringWriter = new System.IO.StringWriter();
    HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter);
    base.Render(htmlWriter);
    string html = stringWriter.ToString();
    Match viewStateMatch = viewStateRegex.Match(html);
    string viewStateString = viewStateMatch.Captures[0].Value;
    html = html.Remove(viewStateMatch.Index,viewStateMatch.Length);
    // This will only work if you have only one </form> on the page
    Match endFormMatch = endFormRegex.Match(html,viewStateMatch.Index);
    html = html.Insert(endFormMatch.Index,viewStateString);
    writer.Write(html);
}

However, it was taking about one thousandth of a second (~0.001230s) to do the work, and that didn't feel right. Of course, by taking over the HtmlTextWriter and spitting it out as a string I've boogered up all the benefits of buffering and the whole streaming thing, but it still felt wrong.

So, against my better judgement, I did it again like this:

protected override void Render(System.Web.UI.HtmlTextWriter writer) 
{
    System.IO.StringWriter stringWriter = new System.IO.StringWriter();
    HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter);
    base.Render(htmlWriter);
    string html = stringWriter.ToString();
    int StartPoint = html.IndexOf("<input type=\"hidden\" name=\"__VIEWSTATE\"");
    if (StartPoint >= 0) 
    {
        int EndPoint = html.IndexOf("/>", StartPoint) + 2;
        string viewstateInput = html.Substring(StartPoint, EndPoint - StartPoint);
        html = html.Remove(StartPoint, EndPoint - StartPoint);
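        // Insert directly before </form>. Subtracting 1 here would land inside
        // whatever precedes it (for example, inside a closing tag's '>').
        int FormEndStart = html.IndexOf("</form>");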
        if (FormEndStart >= 0) 
        {
            html = html.Insert(FormEndStart, viewstateInput);
        }
    }
    writer.Write(html);
}

I always assumed (mistake #1) that IndexOf was pretty expensive, particularly on larger strings. However, this method averaged out at 0.000995s. It consistently beat the Regex version, even though the Regexes were precompiled and (I think) simple.

Now, to be clear, I'm just playing here, and I know it's microperf and premature optimization. The really interesting thing would be to do a matrix of page size vs. viewstate size. You know, large page, small viewstate against small page, large viewstate and all points in between, then try it with both techniques and see which is better for these different scenarios. But, I'm tired and have other things to do, so if you like, there's some homework for you. What does this data set look like: viewstate size vs. page size vs. technique?
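
One way to start on that homework is sketched below. It is a rough console harness, not the measurement setup used for the numbers in this post: ViewStateMoveBenchmark, BuildPage, the synthetic page layout, and the chosen sizes are all made up for illustration. It also uses the ordinal IndexOf overloads, whereas the Render override above uses the default culture-sensitive ones, so absolute timings will differ.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class ViewStateMoveBenchmark
{
    // Simplified pattern: everything up to the closing quote of the value.
    static readonly Regex ViewStateRegex = new Regex(
        @"<input type=""hidden"" name=""__VIEWSTATE"" value=""[^""]+"" />",
        RegexOptions.Compiled);

    const string FieldStart = "<input type=\"hidden\" name=\"__VIEWSTATE\"";
    const string FormEnd = "</form>";

    // Builds a synthetic page (hypothetical layout): form tag, hidden field, filler body.
    public static string BuildPage(int bodyChars, int viewStateChars)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("<html><body><form method=\"post\">\n");
        sb.Append(FieldStart).Append(" value=\"")
          .Append(new string('A', viewStateChars)).Append("\" />\n");
        sb.Append(new string('x', bodyChars));
        sb.Append("\n</form></body></html>");
        return sb.ToString();
    }

    // Regex technique: match the field, remove it, reinsert before </form>.
    public static string MoveWithRegex(string html)
    {
        Match m = ViewStateRegex.Match(html);
        if (!m.Success) return html;
        string stripped = html.Remove(m.Index, m.Length);
        int formEnd = stripped.IndexOf(FormEnd, m.Index, StringComparison.Ordinal);
        return formEnd < 0 ? html : stripped.Insert(formEnd, m.Value);
    }

    // IndexOf technique: locate the field by its start and the next "/>".
    public static string MoveWithIndexOf(string html)
    {
        int start = html.IndexOf(FieldStart, StringComparison.Ordinal);
        if (start < 0) return html;
        int close = html.IndexOf("/>", start, StringComparison.Ordinal);
        if (close < 0) return html;
        int length = close + 2 - start;
        string field = html.Substring(start, length);
        string stripped = html.Remove(start, length);
        int formEnd = stripped.IndexOf(FormEnd, start, StringComparison.Ordinal);
        return formEnd < 0 ? html : stripped.Insert(formEnd, field);
    }

    // Average time per call in milliseconds, after one untimed warm-up call.
    static double AverageMs(Func<string, string> move, string html, int iterations)
    {
        move(html);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) move(html);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / iterations;
    }

    public static void Main()
    {
        int[] sizes = { 1000, 10000, 100000 };
        foreach (int body in sizes)
        {
            foreach (int viewState in sizes)
            {
                string html = BuildPage(body, viewState);
                Console.WriteLine(
                    "body={0,7} viewstate={1,7} regex={2:F4}ms indexOf={3:F4}ms",
                    body, viewState,
                    AverageMs(MoveWithRegex, html, 200),
                    AverageMs(MoveWithIndexOf, html, 200));
            }
        }
    }
}
```

Each row of output is one cell of the matrix. Real pages rendered by ASP.NET would be a better input than the filler text used here.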

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

Thursday, October 13, 2005 4:28:21 AM UTC
F**kin' viewstate. (spits on ground) If you can't turn it off altogether, it definitely belongs at the bottom of the page and not the top. Google will only index the first (n) bytes of the page, so clearly this was a bad design choice back in the day.

I assume you got the viewstate move code from here?
http://scottonwriting.net/sowblog/posts/3536.aspx

I expect a simple IndexOf to be faster than any Regex 100% of the time. However, you might try simplifying the regex a little:

<input type="hidden" name="__VIEWSTATE" value="[^"]+" />

Not sure if you need multiline here. Anyway, I think the overhead of creating the regex once per page is what's hurting you most. Can it be put in the app domain or otherwise removed from the page class?

Also, have you considered viewstate compression in addition to moving it? I can't remember the last results I saw on this..
Thursday, October 13, 2005 4:43:58 AM UTC
No, didn't get the code there, but that VB looks just like my C#.

Ya, zipping also

http://www.hanselman.com/blog/ZippingCompressingViewStateInASPNET.aspx

That RegEx is much better, doh! I was trying to hold it to allowed chars but you're right. I'll test it.

The overhead of the RegEx creation is not a problem AFAIK because the creation happens only once (static readonly); the RegEx is compiled to IL and after that it's just running. Plus, I only use the one instance, as Regex instances are threadsafe.
Thursday, October 13, 2005 4:49:12 AM UTC
I just turned off multiline and changed the regex, and the results were nearly identical, and even a tiny tiny bit slower.
Thursday, October 13, 2005 5:12:13 AM UTC
Oh well. I'm not terribly surprised. We unrolled a regex at work recently when validating GUIDs and it CLOBBERED the regex. Like 4x faster! Totally inflexible, and it's a lot of extra code.. but if absolute speed is what you want, you unroll the regex into low-level manual character matching code.

1) Just to humor me, how much worse are the results if you instantiate the regex inside the Render method? I'm curious.

2) You could try increasing the size of the strings used in the 2nd method, because it probably uses Boyer-Moore string matching and that gets faster as the string to be matched gets larger, e.g.:

"<input type=\"hidden\" name=\"__VIEWSTATE\" value=\""

"\" />"
Friday, October 14, 2005 4:36:05 AM UTC
Are you really sure this is a problem? We haven't had any issues with this in our architecture. We currently host a virtualization framework for about 5,000 automotive dealership websites. One of our big selling points is that we offer search engine friendly pages.

I am making a couple of assumptions here, but I have to think that if google can transform pdf/powerpoint files into text and index them correctly, that they have to be ignoring something like hidden form inputs (which the spider doesn't need at all).

I think you would have to have a pretty significant amount of viewstate to cause problems. Compression greatly increases the amount of bytes that the googlebots will grab as well.
Friday, October 14, 2005 3:41:43 PM UTC
At the risk of getting a "well, duh," when are you taking your measurements? The Regex will be instantiated on the initial request, so you should be measuring after the first request to rule out that overhead.
Dave
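
A small sketch of that point, assuming a console program rather than a real page request: FirstCallTiming and TimeMs are hypothetical names, and the HTML string is a made-up sample. It times the first use of a compiled Regex (construction plus first match) separately from a later match on the same instance.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class FirstCallTiming
{
    // Times a single invocation of the given action, in milliseconds.
    public static double TimeMs(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        // Made-up sample markup for illustration only.
        string html = "<form><input type=\"hidden\" name=\"__VIEWSTATE\" value=\"dDwtMTIz\" /></form>";
        Regex regex = null;

        // First use: includes constructing and compiling the regex.
        double first = TimeMs(() =>
        {
            regex = new Regex(
                @"<input type=""hidden"" name=""__VIEWSTATE"" value=""[^""]+"" />",
                RegexOptions.Compiled);
            regex.Match(html);
        });

        // Later use: reuses the instance that already exists.
        double later = TimeMs(() => regex.Match(html));

        Console.WriteLine("first use: {0:F4}ms, later use: {1:F4}ms", first, later);
    }
}
```

Comparing the two figures shows how much of a measurement would be one-time setup if the first request were included.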
Tuesday, October 18, 2005 4:33:20 PM UTC
I would recommend checking how DotNetNuke moves the viewstate to the bottom of the page.
Their implementation works very well.

I have modified their existing code to allow for storing the viewstate either in the session or the cache.
Michael

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.