Regular Expressions: Word boundaries


The \b metacharacter

To make it easier to find whole words, we can use the metacharacter \b. It marks the beginning and the end of an alphanumeric sequence*, that is, any position where a word character sits next to a non-word character or next to the start or end of the input. Since it only marks these locations, it matches no characters on its own.

*: It is common to call an alphanumeric sequence a word, since we can match its characters with \w (the word character class). This can be misleading, though, since \w also includes digits and, in most flavors, the underscore.
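
As a minimal sketch of both points, the snippet below uses Python's re module. Python is an assumption made for illustration, since the text above is flavor-agnostic, and the sample strings and variable names are hypothetical. It shows that \b consumes no characters and that digits and the underscore count as word characters.

```python
import re

# \b is zero-width: inserting a marker at every boundary adds characters
# but never replaces any of the original text.
marked = re.sub(r"\b", "|", "foo bar")
print(marked)  # |foo| |bar|

# The match for \bstack\b spans only the five letters of "stack".
m = re.search(r"\bstack\b", "foo stack bar")
print(m.span())  # (4, 9)

# The underscore and digits are word characters, so there is no
# boundary on either side of "bar" inside "foo_bar2".
inner = re.search(r"\bbar\b", "foo_bar2")
print(inner)  # None
```

The last case is why "whole word" searches can miss identifiers such as foo_bar2: from the point of view of \b, the entire identifier is a single word.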


Regex | Input | Matches?
\bstack\b | stackoverflow | No, since there's no occurrence of the whole word stack
\bstack\b | foo stack bar | Yes, since stack has a space, which is not a word character, on each side
\bstack\b | stack!overflow | Yes: there's nothing before stack and ! is not a word character
\bstack | stackoverflow | Yes, since there's nothing before stack
overflow\b | stackoverflow | Yes, since there's nothing after overflow
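
The rows above can be checked mechanically. The following is a small sketch in Python's re module (an assumed flavor; the cases list and results name are illustrative and not part of the original text) that runs each pattern against its input and reports whether a match was found.

```python
import re

# (pattern, input, expected match?) taken from the examples above
cases = [
    (r"\bstack\b", "stackoverflow", False),
    (r"\bstack\b", "foo stack bar", True),
    (r"\bstack\b", "stack!overflow", True),
    (r"\bstack", "stackoverflow", True),
    (r"overflow\b", "stackoverflow", True),
]

# re.search scans the whole input, so a match anywhere counts.
results = [bool(re.search(pattern, text)) for pattern, text, _ in cases]

for (pattern, text, expected), found in zip(cases, results):
    print(f"{pattern!r:16} {text!r:18} match={found} expected={expected}")
```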

The \B metacharacter

This is the opposite of \b: it matches at every position that is not a word boundary. Like \b, it matches positions only, so it matches no characters on its own. It is useful for finding text that is not a whole word, such as a fragment inside a longer word.


Regex | Input | Matches?
\Bb\B | abc | Yes, since b has no word boundary on either side.
\Ba\B | abc | No, since a has a word boundary on its left side.
a\B | abc | Yes, since a does not have a word boundary on its right side.
\B,\B | a,,,b | Yes, it matches the second comma, because \B also matches the position between two non-word characters. The first and third commas do not match: there is a word boundary to the left of the first comma and to the right of the third.
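
To see exactly which characters each \B pattern picks out, the sketch below lists the match spans using Python's re module. The flavor is an assumption, and match_spans is a hypothetical helper written for this illustration.

```python
import re

def match_spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

print(match_spans(r"\Bb\B", "abc"))    # [(1, 2)]
print(match_spans(r"\Ba\B", "abc"))    # []
print(match_spans(r"a\B", "abc"))      # [(0, 1)]
print(match_spans(r"\B,\B", "a,,,b"))  # [(2, 3)], only the middle comma
```

The span (2, 3) in the last line confirms that only the middle comma qualifies: the outer commas each touch a word character, which places a boundary on one of their sides.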