Thursday, March 22, 2012

String replace the better way.

This last week I have been working on making my team's project at work run better on IE 7.  Usually that would mean spending all my time fixing styles, but there are better people than me at that.  So for me the project was getting rid of the "script is running too long" errors.  The app I work on is very JavaScript heavy.  One of the worst offending functions I found was the following:

function clean(word) {
    if (word) {
        return word.toLowerCase().
            replace(/\s/gi, "").
            replace(/[àáâãäå]/gi, "a").
            replace(/æ/gi, "ae").
            replace(/ç/gi, "c").
            replace(/[èéêë]/gi, "e").
            replace(/[ìíîï]/gi, "i").
            replace(/ñ/gi, "n").
            replace(/[òóôõö]/gi, "o").
            replace(/œ/gi, "oe").
            replace(/[ùúûü]/gi, "u").
            replace(/[ýÿ]/gi, "y").
            replace(/\W/gi, "");
    } else {
        return word;
    }
}

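This is a very brute-force way of cleaning up strings for searching: every call makes twelve separate regular expression passes over the string.  It was the most expensive call in our system.  With a little work it was changed to the following, which translates all of the accented characters in a single pass using a lookup table: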

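var removeAccents = (function () {
    // One regex that matches every accented character we translate.
    var translateReg = /[àáâãäåæçèéêëìíîïñòóôõöœùúûüýÿ]/g;
    // Lookup table: accented character -> plain replacement.
    var translate = {
        "à": "a", "á": "a", "â": "a", "ã": "a", "ä": "a",
        "å": "a", "æ": "ae", "ç": "c", "è": "e", "é": "e",
        "ê": "e", "ë": "e", "ì": "i", "í": "i", "î": "i",
        "ï": "i", "ñ": "n", "ò": "o", "ó": "o", "ô": "o",
        "õ": "o", "ö": "o", "œ": "oe", "ù": "u", "ú": "u",
        "û": "u", "ü": "u", "ý": "y", "ÿ": "y"
    };
    return function (s) {
        // Single pass: each match is looked up in the table.
        return s.replace(translateReg, function (match) {
            return translate[match];
        });
    };
})();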
 
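function clean(word) {
    if (word) {
        // Translate accents before stripping \W: in JavaScript \W also
        // matches accented letters, so it has to run last.
        return removeAccents(word.toLowerCase().replace(/\s/gi, "")).replace(/\W/gi, "");
    } else {
        return word;
    }
}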

The result was a massive boost in performance.
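
One way to sanity-check a change like this is to run both versions over the same inputs and compare the results and some rough timings.  What follows is only a minimal sketch: the sample strings are made up, and the names cleanChained, cleanTable and time are hypothetical, not the ones in our code base.  Note that cleanTable runs the \W cleanup after the accent translation, because \W also matches accented letters.

```javascript
// Chained approach: one replace() pass per accent group.
function cleanChained(word) {
    if (!word) { return word; }
    return word.toLowerCase().
        replace(/\s/g, "").
        replace(/[àáâãäå]/g, "a").
        replace(/æ/g, "ae").
        replace(/ç/g, "c").
        replace(/[èéêë]/g, "e").
        replace(/[ìíîï]/g, "i").
        replace(/ñ/g, "n").
        replace(/[òóôõö]/g, "o").
        replace(/œ/g, "oe").
        replace(/[ùúûü]/g, "u").
        replace(/[ýÿ]/g, "y").
        replace(/\W/g, "");
}

// Table approach: a single replace() pass with a lookup table.
var removeAccents = (function () {
    var translateReg = /[àáâãäåæçèéêëìíîïñòóôõöœùúûüýÿ]/g;
    var translate = {
        "à": "a", "á": "a", "â": "a", "ã": "a", "ä": "a",
        "å": "a", "æ": "ae", "ç": "c", "è": "e", "é": "e",
        "ê": "e", "ë": "e", "ì": "i", "í": "i", "î": "i",
        "ï": "i", "ñ": "n", "ò": "o", "ó": "o", "ô": "o",
        "õ": "o", "ö": "o", "œ": "oe", "ù": "u", "ú": "u",
        "û": "u", "ü": "u", "ý": "y", "ÿ": "y"
    };
    return function (s) {
        return s.replace(translateReg, function (match) {
            return translate[match];
        });
    };
})();

function cleanTable(word) {
    if (!word) { return word; }
    // \W runs last so it cannot strip accented letters before translation.
    return removeAccents(word.toLowerCase()).replace(/\W/g, "");
}

// Made-up sample inputs, for illustration only.
var samples = ["Crème Brûlée", "Señor Muñoz", "Œuvre d'Æsop", "façade", ""];

// Collect every sample where the two versions disagree.
var mismatches = [];
for (var i = 0; i < samples.length; i++) {
    if (cleanChained(samples[i]) !== cleanTable(samples[i])) {
        mismatches.push(samples[i]);
    }
}
console.log("mismatches: " + mismatches.length);

// Very rough wall-clock timing; numbers vary a lot between browsers.
function time(fn, rounds) {
    var start = new Date().getTime();
    for (var r = 0; r < rounds; r++) {
        for (var j = 0; j < samples.length; j++) {
            fn(samples[j]);
        }
    }
    return new Date().getTime() - start;
}
console.log("chained: " + time(cleanChained, 20000) + " ms");
console.log("table:   " + time(cleanTable, 20000) + " ms");
```

The timings are only a rough guide, so it is worth running this in the browser you actually care about rather than trusting numbers from another engine.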




4 comments:

Ishpeck said...

Wait, you got a massive speed boost _and_ a performance bump at the same time? Together? Simultaneously? NO WAY!

Erin said...

Thanks for pointing out my error. Naomi pointed out an actual bug.

Unknown said...

Love your change to use the module pattern. Could have made it more "objecty" looking like this:


var AccentRemover = (function() {
// translation tables go here
...
function removeAccents(s) {
return s.replace ... // your anonymous function body goes here
}
return {
removeAccents: removeAccents
};
})();


Then you would call it like this:


var result = AccentRemover.removeAccents('some string');


I've become quite fond of the module pattern in general, and this naming convention in particular, in my recent work. I like the "AccentRemover" naming because it communicates intent: this is a module, possibly with its own state and its own scope.

Sorry for the lack of indenting. Stupid blogger.com.

Also, have you looked into qunit for unit testing code like this? I have liked it.

Erin said...

It's actually an internal function in a trie object, so that would be overkill.