Regular Expressions: Simple string operations


Because Regular Expressions can do a lot, it is tempting to use them for the simplest operations. But using a regex engine has a cost in memory and processor usage: you need to compile the expression, store the automaton in memory, initialize it and then feed it with the string to run it.

And in many cases a regex is simply not necessary! Whatever your language of choice, its standard library almost certainly provides basic string manipulation tools. So, as a rule, when the standard library has a tool for the job, use that tool, not a regex:

  • Split a string?

For example, the following snippet works in Python, Ruby and JavaScript:

'foo.bar'.split('.')

This is easier to read and understand, and much more efficient, than a (roughly) equivalent regular expression such as:

(.+)\.(.+)
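To make the comparison concrete, here is a minimal Python sketch that puts the plain string method next to a regex doing roughly the same job. The sample strings and the pattern `(.+)\.(.+)` are illustrative assumptions, not the only way to write either version.

```python
import re

text = 'foo.bar'

# Plain string method: nothing to compile, and the intent is obvious.
parts_plain = text.split('.')

# Regex alternative (illustrative pattern): the dot has to be escaped,
# and the two capture groups only ever yield two parts.
match = re.match(r'(.+)\.(.+)', text)
parts_regex = [match.group(1), match.group(2)]

# With more than one separator the two approaches diverge:
# split() returns every part, while the greedy first group swallows 'a.b'.
many_plain = 'a.b.c'.split('.')
many_regex = re.match(r'(.+)\.(.+)', 'a.b.c').groups()

print(parts_plain)  # ['foo', 'bar']
print(parts_regex)  # ['foo', 'bar']
print(many_plain)   # ['a', 'b', 'c']
print(many_regex)   # ('a.b', 'c')
```

The last two lines show why the regex is only "roughly" equivalent: as soon as the input has a third part, the pattern needs rewriting while `split` keeps working.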

  • Strip trailing spaces?

The same applies to trailing spaces!

'foobar     '.rstrip() # Python or Ruby
'foobar     '.trimEnd() // JavaScript

This is equivalent to the following expression:

([^\n]*?)\s*$ # keeping \1 in the substitution; the lazy *? keeps the trailing whitespace out of \1
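The same comparison can be sketched in Python for trailing whitespace. The variable names and the extra patterns below are assumptions chosen for illustration; the point is that the regex versions are longer and easier to get subtly wrong than the built-in method.

```python
import re

text = 'foobar     '

# Built-in method: removes trailing whitespace only.
stripped_plain = text.rstrip()

# Simplest regex form: replace the trailing whitespace run with nothing.
stripped_regex = re.sub(r'\s+$', '', text)

# Capture-group form with a lazy quantifier, keeping \1.
stripped_lazy = re.sub(r'([^\n]*?)\s*$', r'\1', text)

# Pitfall: with a greedy group, [^\n]* consumes the trailing spaces itself,
# so \1 still contains them and nothing is actually stripped.
stripped_greedy = re.sub(r'([^\n]*)\s*$', r'\1', text)

print(repr(stripped_plain))   # 'foobar'
print(repr(stripped_regex))   # 'foobar'
print(repr(stripped_lazy))    # 'foobar'
print(repr(stripped_greedy))  # 'foobar     '
```

The greedy variant runs without error and silently returns the input unchanged, which is exactly the kind of mistake the built-in string method avoids.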