I have traveled up and down the JavaScript form validation road quite a few times. If it were a toll road, I’d be broke.
A few weeks ago I came across a new bump in that road. I was asked if I could validate a form to ensure only Latin characters were being used. Apparently the processing script on the server side wasn’t playing nice with Chinese characters – or any double-byte characters for that matter.
The solution is pretty simple. The Basic Latin and Latin-1 Supplement characters all have Unicode values that fit in a single byte (0 to 255), so you can write JavaScript that loops through all of the characters in the input and checks that each character's Unicode value is no greater than 255.
As this chart shows, a Unicode value of up to 255 falls within either the Basic Latin or the Latin-1 Supplement block.
string.charCodeAt(n); // returns the Unicode value (UTF-16 code unit) of the character at index n within string
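To make the range check concrete, here is a small sketch that wraps the comparison in a helper. The name `isLatin1` is made up for this illustration and is not part of any library. It assumes "Latin" means only the Basic Latin and Latin-1 Supplement blocks (values 0 through 255).

```javascript
// Hypothetical helper; the name isLatin1 is illustrative only.
// Returns true when every UTF-16 code unit in `text` is 255 or lower.
function isLatin1(text) {
  for (var i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 255) {
      return false; // first character outside the 0-255 range; stop early
    }
  }
  return true;
}

console.log(isLatin1("caf\u00e9"));     // true: é is U+00E9 (233)
console.log(isLatin1("\u6f22\u5b57"));  // false: 漢 is U+6F22 (28450)
```

Keeping the check in a function that takes a plain string means it can be tried out in a console without a form on the page.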
Here is a quick and dirty implementation.
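function testDB() {
  var doubleByteFound = false;
  // Read the value once from the "str" field of the first form on the page.
  var value = document.forms[0].elements["str"].value;
  for (var i = 0; i < value.length; i++) {
    // Any value above 255 lies outside Basic Latin and Latin-1 Supplement.
    if (value.charCodeAt(i) > 255) {
      doubleByteFound = true;
      break; // one match is enough
    }
  }
  alert("Double Byte found: " + doubleByteFound);
}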
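As an aside, the same test can probably be written more compactly with a regular expression. This is only a sketch of an alternative to the loop, and the function name `hasNonLatin1` is invented for the example.

```javascript
// Hypothetical alternative to the loop; the name hasNonLatin1 is illustrative.
// The character class matches any UTF-16 code unit outside the range 0-255.
function hasNonLatin1(text) {
  return /[^\u0000-\u00FF]/.test(text);
}

console.log(hasNonLatin1("plain text"));   // false
console.log(hasNonLatin1("text \u4e2d"));  // true: 中 is U+4E2D
```

The loop version is easier to extend if you later want to report which character failed, while the regular expression keeps a simple yes/no check to one line.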
HTML: