I wanted to split each line of displayed code into a series of words with any spacing preserved, so I started off by splitting each line on any non-word character, “\W”.
var words = line.split(/\W/g);
This worked fine in Firefox, but when it came to IE the results were not quite what I was expecting. All the words were matched, but all white space had been completely ignored, so I turned instead to using the word boundary “\b” as the delimiter.
var words = line.split(/\b/g);
This took me a step closer to what was required, but it was still not the desired result, as I didn’t want any non-word characters to be grouped with alphanumeric values, such as “123”.
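To make the difference between the two delimiters concrete, here is a small sketch of what each split returns. The sample line is made up for illustration, and the results in the comments are what a current standards-compliant engine produces; older versions of IE may well differ, as described above.

```javascript
// Hypothetical sample line, used only for illustration.
var line = "var x = 10;";

// Splitting on \W throws the non-word characters away and leaves
// empty strings wherever two delimiters sit next to each other.
var onNonWord = line.split(/\W/);
// ["var", "x", "", "", "10", ""]

// Splitting on \b keeps every character, but runs of non-word
// characters (here " = ") come back grouped as a single item.
var onBoundary = line.split(/\b/);
// ["var", " ", "x", " = ", "10", ";"]
```

Joining the second result back together gives the original line, which is why “\b” felt closer to what was needed.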
I spent a little (in fact, too much) time browsing the web for some pointers and came across an article on the SitePoint blog outlining the inconsistencies of the String.prototype.split method across different browsers. It seemed to explain the problem I had run into earlier, but unfortunately offered no resolution.
Now to try and find a solution to all this.
Because splitting on “\W” was almost correct, I reverted to using that. Knowing what I had read over at SitePoint, I somehow needed to prevent any non-word characters from being ignored, but without affecting the output in any way.
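As an aside, one way to sidestep split altogether is to collect the pieces with String.prototype.match. This is only a sketch of an alternative technique, not the approach this post goes on to take, and the function name tokenize is made up for the example.

```javascript
// Hypothetical helper: returns each run of word characters as one item
// and each non-word character (spaces, punctuation) as its own item.
function tokenize(line) {
  // match() returns null when nothing matches (e.g. an empty string),
  // so fall back to an empty array.
  return line.match(/\w+|\W/g) || [];
}

var tokens = tokenize("var x = 10;");
// ["var", " ", "x", " ", "=", " ", "10", ";"]
```

Because nothing is discarded, tokens.join("") gives back the original line, so the spacing is preserved without relying on how a particular browser implements split.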
One possibility that arose was to use the special character of null “