\b metacharacter

To make it easier to find whole words, we can use the metacharacter \b. It marks the beginning and the end of an alphanumeric sequence*. Since it only marks these locations, it matches no character on its own.
*: It is common to call an alphanumeric sequence a word, since we can match its characters with \w (the word character class). This can be misleading, though, since \w also includes digits and, in most flavors, the underscore.
Regex | Input | Matches? |
---|---|---|
\bstack\b | stackoverflow | No, since stack is immediately followed by the word character o, so there is no occurrence of the whole word stack |
\bstack\b | foo stack bar | Yes, since stack has a space on each side, and a space is not a word character |
\bstack\b | stack!overflow | Yes: stack is at the start of the input and ! is not a word character |
\bstack | stackoverflow | Yes, since stack is at the start of the input |
overflow\b | stackoverflow | Yes, since there's nothing after overflow |
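
The rows of the table can be checked with a short script. This is a minimal sketch using Python's built-in `re` module, which is only one regex flavor among many; the helper `matches` is a hypothetical name introduced for illustration, and the last two inputs (`stack_overflow` and `stack2`) are extra cases not taken from the table, added to show that the underscore and digits count as word characters.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

# The five rows of the table, in order.
table_results = [
    matches(r"\bstack\b", "stackoverflow"),   # "o" follows "stack": no boundary there
    matches(r"\bstack\b", "foo stack bar"),   # spaces on both sides
    matches(r"\bstack\b", "stack!overflow"),  # start of input, then "!"
    matches(r"\bstack", "stackoverflow"),     # start of input
    matches(r"overflow\b", "stackoverflow"),  # end of input
]

# Extra inputs (assumptions for illustration, not from the table):
# "_" and digits are word characters, so no boundary follows "stack".
underscore_result = matches(r"\bstack\b", "stack_overflow")
digit_result = matches(r"\bstack\b", "stack2")

# \b matches a position, not a character: the match has zero width.
boundary_span = re.search(r"\b", "stack").span()

print(table_results)      # [False, True, True, True, True]
print(underscore_result)  # False
print(digit_result)       # False
print(boundary_span)      # (0, 0)
```

Other flavors may differ in what they treat as a word character, so the results for non-ASCII input in particular should be checked against the engine in use.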
\B metacharacter

This is the opposite of \b: it matches at every position that is not a word boundary. Like \b, it matches locations rather than characters, so it matches no character on its own. It is useful for finding text that is part of a longer word rather than a whole word.
Regex | Input | Matches? |
---|---|---|
\Bb\B | abc | Yes, since b is not surrounded by word boundaries. |
\Ba\B | abc | No, a has a word boundary on its left side. |
a\B | abc | Yes, a does not have a word boundary on its right side. |
\B,\B | a,,,b | Yes, it matches the second comma because \B also matches the position between two non-word characters (note that there is a word boundary to the left of the first comma and to the right of the third, so neither of those commas can match). |
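
The same kind of check works for \B. The sketch below again assumes Python's `re` module; `spans` is a hypothetical helper that returns the (start, end) index pairs of every match, which makes it visible which character was matched.

```python
import re

def spans(pattern: str, text: str) -> list[tuple[int, int]]:
    """Return the (start, end) indices of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

inner_b = spans(r"\Bb\B", "abc")        # b sits between two word characters
inner_a = spans(r"\Ba\B", "abc")        # a is at the start: boundary on its left
a_then_nonboundary = spans(r"a\B", "abc")
middle_comma = spans(r"\B,\B", "a,,,b")  # only the comma at index 2 qualifies

print(inner_b)             # [(1, 2)]
print(inner_a)             # []
print(a_then_nonboundary)  # [(0, 1)]
print(middle_comma)        # [(2, 3)]
```

In the last case the commas at indices 1 and 3 are rejected because each touches a word character (a or b), which places a word boundary on one side of them.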