Regular Expressions Tutorial => Word boundaries

Example

The `\b` metacharacter

To make it easier to find whole words, we can use the metacharacter `\b`. It marks the beginning and the end of an alphanumeric sequence*. Since it only marks these locations, it is a zero-width assertion: it matches a position and consumes no character on its own.

*: It is common to call an alphanumeric sequence a word, since we can match its characters with `\w` (the word character class). This can be misleading, though, since `\w` also includes digits and, in most flavors, the underscore.
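As a minimal sketch, assuming Python's `re` module (the text above does not name a flavor, and other engines may differ in detail), the following shows that `\b` matches positions rather than characters, and that digits and the underscore count as word characters:

```python
import re

# \b is zero-width: every match is an empty string located at a boundary.
boundary_positions = [m.start() for m in re.finditer(r"\b", "foo bar")]
print(boundary_positions)  # [0, 3, 4, 7]

# Digits and the underscore are word characters, so there is no boundary
# between "stack" and "_2", and the whole-word search fails.
whole_word = re.search(r"\bstack\b", "stack_2")
print(whole_word)  # None

# \w+ therefore keeps "stack_2" together but splits on the hyphen.
words = re.findall(r"\w+", "stack_2 over-flow")
print(words)  # ['stack_2', 'over', 'flow']
```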

Examples:

Regex	Input	Matches?
`\bstack\b`	`stackoverflow`	No, since there's no occurrence of the whole word `stack`
`\bstack\b`	`foo stack bar`	Yes, since `stack` has a space (a non-word character) on each side
`\bstack\b`	`stack!overflow`	Yes: there's nothing before `stack` and `!` is not a word character
`\bstack`	`stackoverflow`	Yes, since there's nothing before `stack`
`overflow\b`	`stackoverflow`	Yes, since there's nothing after `overflow`
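The rows of this table can be checked with a short script. This is a sketch assuming Python's `re` module; the helper name `found` is made up for illustration:

```python
import re

# (pattern, input, expected result) for each row of the \b table
cases = [
    (r"\bstack\b", "stackoverflow", False),
    (r"\bstack\b", "foo stack bar", True),
    (r"\bstack\b", "stack!overflow", True),
    (r"\bstack", "stackoverflow", True),
    (r"overflow\b", "stackoverflow", True),
]

def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

results = [found(pattern, text) for pattern, text, _ in cases]
for (pattern, text, expected), got in zip(cases, results):
    print(pattern, text, got, "(expected", str(expected) + ")")
```

`re.search` is used rather than `re.match` because the table asks whether the pattern matches anywhere in the input, not only at its start.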

The `\B` metacharacter

This is the opposite of `\b`: it matches at every position that is not a word boundary. Like `\b`, it matches a location rather than a character, so it consumes nothing on its own. It is useful for finding text that is not a whole word, such as a fragment inside a longer word.

Examples:

Regex	Input	Matches?
`\Bb\B`	`abc`	Yes, since `b` is not surrounded by word boundaries.
`\Ba\B`	`abc`	No, `a` has a word boundary on its left side.
`a\B`	`abc`	Yes, `a` does not have a word boundary on its right side.
`\B,\B`	`a,,,b`	Yes, it matches the second comma because `\B` also matches the position between two non-word characters (note that there is a word boundary to the left of the first comma and to the right of the third).
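
To see where each `\B` pattern matches, the sketch below prints the match spans. It assumes Python's `re` module, and `match_spans` is a hypothetical helper, not part of any library:

```python
import re

def match_spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

inner_b = match_spans(r"\Bb\B", "abc")      # b sits between two word characters
leading_a = match_spans(r"\Ba\B", "abc")    # boundary to the left of a, so no match
a_then_word = match_spans(r"a\B", "abc")    # no boundary between a and b
commas = match_spans(r"\B,\B", "a,,,b")     # only the middle comma qualifies

print(inner_b)      # [(1, 2)]
print(leading_a)    # []
print(a_then_word)  # [(0, 1)]
print(commas)       # [(2, 3)]
```

The span `(2, 3)` in the last case is the middle comma: the first comma has a boundary on its left (next to `a`) and the last one has a boundary on its right (next to `b`).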
