Scott Hanselman

Internationalization and Classic ASP

July 24, '06 Comments [3] Posted in ASP.NET | Internationalization | XML | Bugs

If you've ever done work in non-English languages with Classic ASP (ASP3) and gotten black squares instead of the characters you expected, your checklist should be something like this.

Remember: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Classic ASP Internationalization "Don't Lie" Checklist

  • Are your ASP pages saved as UTF-8? I recommend Notepad2 (or debug.exe ;) ) as a good editor that knows what a Unicode Byte-Order-Mark looks like.
  • There are two aspects to encoding with Classic ASP: the encoding of the page (the static stuff) and the encoding of the dynamically created content.
    • Add this little-known bit-o-goodness to your pages:
      Response.CodePage = 65001
      Response.CharSet = "utf-8"
  • Make sure that the strings/content you are consuming are also in the correct encoding. A very common problem is having Unicode content in an XML file while the prolog says something else:
    <?xml version="1.0" encoding="iso-8859-1" ?>
    This mistake will go unnoticed until José shows up.
    • Make sure your XML encoding matches your actual encoding:
      <?xml version="1.0" encoding="UTF-8" ?>
  • You might also ensure your HTTP headers don't lie:
    Content-Type: text/html; charset=utf-8
  • You might also ensure your META tags don't lie:
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
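
To see why a lying declaration goes unnoticed until José shows up, here is a small sketch in Python (not ASP, just a convenient way to poke at bytes). The helper name `decode_as` is made up for this illustration. It encodes text one way and decodes it under a different declared encoding, which is roughly what a consumer does when it trusts a wrong prolog, header, or META tag.

```python
# Illustration only: read bytes under a declared encoding that may not match
# the encoding they were actually written in. decode_as is a hypothetical helper.

def decode_as(text: str, actual: str, declared: str) -> str:
    """Encode text with the actual encoding, then decode it with the declared one."""
    return text.encode(actual).decode(declared)

# Pure-ASCII content survives the lie, which is why the bug hides during testing.
print(decode_as("Scott", "utf-8", "iso-8859-1"))  # Scott

# The é in José is two bytes in UTF-8 (0xC3 0xA9).
# ISO-8859-1 reads those two bytes as two separate characters.
print(decode_as("José", "utf-8", "iso-8859-1"))   # JosÃ©

# When the declaration matches the bytes, the text round-trips intact.
print(decode_as("José", "utf-8", "utf-8"))        # José
```

ASCII bytes mean the same thing in both encodings, so an English-only test run can't tell the difference. The first accented character is where the mismatch becomes visible.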

When all these things line up, things tend to just work. Again, this is old-school stuff, so you likely don't care. Move along, nothing to see here.
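
If you do still care, here is a rough Python sketch of what "lining up" means for a single response. The function names `declared_charsets` and `bytes_are_valid` are hypothetical, and the regular expressions are a simplification that assumes a well-formed header and META tag like the ones in the checklist above. It collects what the HTTP header and the META tag each claim, notes whether a UTF-8 Byte-Order-Mark is present, and checks whether the bytes actually decode under the claimed charset.

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"

def declared_charsets(content_type_header: str, body: bytes) -> dict:
    """Collect the charset each layer claims, plus whether a UTF-8 BOM is present."""
    header = re.search(r"charset=([\w-]+)", content_type_header, re.I)
    meta = re.search(rb'<meta[^>]*charset=["\']?([\w-]+)', body, re.I)
    return {
        "header": header.group(1).lower() if header else None,
        "meta": meta.group(1).decode("ascii").lower() if meta else None,
        "bom": body.startswith(UTF8_BOM),
    }

def bytes_are_valid(body: bytes, charset: str) -> bool:
    """Return True if the body decodes cleanly under the given charset."""
    try:
        body.decode(charset)
        return True
    except (UnicodeDecodeError, LookupError):
        return False

html = ('<html><head><meta http-equiv="Content-Type" '
        'content="text/html; charset=UTF-8" /></head>'
        '<body>José</body></html>')

# Honest page: the bytes really are UTF-8, and both declarations say so.
honest = html.encode("utf-8")
claims = declared_charsets("text/html; charset=utf-8", honest)
print(claims["header"] == claims["meta"], bytes_are_valid(honest, claims["header"]))

# Lying page: same declarations, but the bytes were saved as ISO-8859-1.
lying = html.encode("iso-8859-1")
print(bytes_are_valid(lying, "utf-8"))
```

One caveat: this check only works in one direction. Any byte sequence decodes "successfully" as ISO-8859-1, so it can catch non-UTF-8 bytes that claim to be UTF-8, but not UTF-8 bytes that claim to be ISO-8859-1.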

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

Tuesday, July 25, 2006 12:43:49 PM UTC
That's a very opportune reminder in these AJAX days. So many people are implementing their own "ajax response generator framework", and the character encoding is often forgotten. Until, like you said, they put it in production and somebody fetches data with Latin or Eastern characters and the squares and question marks show up.
I get asked about those character mishandlings a lot, mostly from non-ASP.NET developers. Apparently this is easy to forget in frameworks like PHP and Java Struts.
Sergio Pereira
Thursday, July 27, 2006 11:36:37 AM UTC
Great checklist Scott - thank you!

Anyone using classic ASP with ADO and SQL Server should also be using nchar, nvarchar and ntext data types for text columns and stored-procedure parameters; otherwise you will be the victim of your machine's default code page.
Thursday, July 27, 2006 2:43:43 PM UTC
One other random little tip: make sure the meta tag appears before any fancy non-ASCII content in your web page. This is really important if your users are saving pages to disk. When they go to reopen them, IE no longer has the HTTP header to help it decide what to do. If IE runs into your fancy UTF-8 encoded code point before seeing the meta tag, it sometimes tries to guess at what encoding the document is in, often incorrectly. Make sure the meta tag precedes the non-ASCII text, and you're good to go.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.