UPDATE: There's more on Internationalized RegExs in this StackOverflow question.
I was trying to make a regular expression for use in client-side JavaScript (using a PeterBlum Validator) that allowed a series of special characters:
-'.,&#@:?!()$\/
Plus letters and numbers and whitespace:
\w\d\s
However, I mistakenly assumed that \w meant truly "word characters." It doesn't. In JavaScript it means [A-Za-z0-9_], so it already covers \d but not a single accented letter.
That sucks. What about José, when he wants to put his First Name into a form?
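Here's a quick sketch of the problem that you can paste into a browser console or Node. The variable name asciiWord is mine, just for illustration.

```javascript
// In JavaScript, \w is ASCII-only: letters, digits, and underscore.
const asciiWord = /^\w+$/;

console.log(asciiWord.test("Jose"));        // true
console.log(asciiWord.test("snake_case1")); // true
console.log(asciiWord.test("José"));        // false, because é is outside [A-Za-z0-9_]
```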
Well, I could do a RegEx that denies specific characters and allows all others, but I really just wanted to support Spanish, French, English, German, and any language that uses the general Latin Character Set.
So, here's what I have.
^[ ÀÈÌÒÙ àèìòù ÁÉÍÓÚ Ý áéíóúý ÂÊÎÔÛ âêîôû ÃÑÕ ãñõ ÄËÏÖÜŸ äëïöüÿ ¡¿çÇŒœ ßØø Åå Ææ Þþ Ðð ""\w\d\s-'.,&#@:?!()$\/]+$
Did I miss anything? (Ignore the whitespace for the purposes of this post's RegEx)
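For anyone who wants to try it, here is a minimal sketch of that allow-list built in plain JavaScript with the whitespace stripped out. The names latinLetters, specials, and allowed are my own for illustration. I'm assuming the letter list is meant to include lowercase ÿ and uppercase Ø, and that the doubled quote is just an escaped double quote. The file needs to be saved as UTF-8.

```javascript
// Accented letters and punctuation from the list above (assumes ÿ and Ø are intended).
const latinLetters =
  "ÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüÿ¡¿çÇŒœßØøÅåÆæÞþÐð";

// Special characters; the hyphen is escaped so it is never read as a range.
const specials = "\\-'.,&#@:?!()$/\"";

// \w already includes digits, so \d is not needed here.
const allowed = new RegExp("^[" + latinLetters + "\\w\\s" + specials + "]+$");

console.log(allowed.test("José Müller-O'Neil")); // true
console.log(allowed.test("Straße #5, c/o Ærø")); // true
console.log(allowed.test("Łukasz"));             // false, Ł is not in the list
console.log(allowed.test("<script>"));           // false
```

The Łukasz case shows the weakness of a hand-typed list: anything I forgot to type is rejected.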
It's lame that \w doesn't work on the client-side based on your browser's locale. This makes it difficult for your RegExes to have parity between the client and server.
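One alternative, which is not what I used above, is a Unicode property escape. This is only a sketch and it assumes a JavaScript engine with ES2018 support for \p{...} and the u flag (current browsers and Node 10 or later). The name latinText is made up for the example. I added the underscore and ¡¿ by hand to stay close to the original list.

```javascript
// \p{Script=Latin} matches any letter in the Latin script, accented or not.
// The u flag is required for \p{...} to be recognized.
const latinText = /^[\p{Script=Latin}\d\s_¡¿\-'.,&#@:?!()$\/"]+$/u;

console.log(latinText.test("José"));   // true
console.log(latinText.test("Łukasz")); // true, no hand-typed list needed
console.log(latinText.test("¿Qué?"));  // true
console.log(latinText.test("Иван"));   // false, Cyrillic is not Latin script
```

This doesn't depend on the browser's locale either, so whatever runs on the server would still need to be written to match it.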