Character escaping is what allows certain characters (reserved by the regex engine for manipulating searches) to be literally searched for and found in the input string. Escaping depends on context, so this example does not cover string or delimiter escaping.
Saying that the backslash is the "escape" character is a bit misleading, because it both takes special meaning away and gives it. A backslash toggles the metacharacter vs. literal status of the character that follows it.
In order to use a literal backslash anywhere in a regex, it must be escaped by another backslash.
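The toggling behaviour can be seen in a minimal sketch. Python's re module is used here only as a convenient Perl-style flavor, and the sample strings are made up for illustration:

```python
import re

# "." is a metacharacter (any character); "\." is a literal dot.
any_char = re.findall(r".", "a.b")        # ['a', '.', 'b']
literal_dot = re.findall(r"\.", "a.b")    # ['.']

# "d" is a literal; "\d" is a metacharacter (any digit).
literal_d = re.findall(r"d", "d1")        # ['d']
any_digit = re.findall(r"\d", "d1")       # ['1']

# A literal backslash is written as two backslashes in the regex.
backslashes = re.findall(r"\\", r"C:\temp")  # one backslash found
```

Raw strings (the r prefix) are used so that Python's own string escaping does not get in the way of the regex escaping being shown.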
There are several characters that need to be escaped to be taken literally (at least outside char classes):
[]
()
{}
*, +, ?, |
^, $
., \
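As a rough illustration of escaping these characters outside character classes, the following sketch uses Python's re module. The sample text is invented, and re.escape is Python's own helper, so other flavors may offer something different or nothing at all:

```python
import re

text = "Total: $5.00 (approx.) for a+b?"

# Each metacharacter is escaped by hand to match it literally.
price = re.search(r"\$5\.00", text).group()      # '$5.00'
note = re.search(r"\(approx\.\)", text).group()  # '(approx.)'
expr = re.search(r"a\+b\?", text).group()        # 'a+b?'

# Unescaped, "." is a metacharacter, so this pattern also matches "5x00".
loose = re.search(r"5.00", "5x00") is not None

# re.escape() escapes every metacharacter in an arbitrary literal string.
literal = "[1] {x} a|b ^ $ \\"
escaped_ok = re.fullmatch(re.escape(literal), literal) is not None
```

Escaping by hand keeps short patterns readable, while a helper such as re.escape is the safer choice when the literal text comes from user input.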
In order to use a literal ^ at the start or a literal $ at the end of a regex, the character must be escaped.
Some flavors only treat ^ and $ as metacharacters when they are at the start or end of the regex respectively. In those flavors, no additional escaping is necessary. It's usually best to escape them anyway.
It's best practice to escape square brackets ([ and ]) when they appear as literals in a char class. Under certain conditions, depending on the flavor, escaping is not required, but leaving it out harms readability.
The caret, ^, is a metacharacter when it is the first character in a char class: [^aeiou]. Anywhere else in the char class, it is just a literal character.
The hyphen, -, is a metacharacter unless it's at the beginning or end of a character class. If the first character in the char class is a caret ^, then the hyphen is a literal when it is the second character in the char class.
There are also rules for escaping within the replacement, but none of the rules above apply. The only metacharacters are $ and \, at least when $ can be used to reference capture groups (like $1 for group 1). To use a literal $, escape it: \$5.00. Likewise for \: C:\\Program Files\\.
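The character-class and replacement rules above can be checked with a short sketch in Python's re module. Note that Python is a flavor in which $ does not reference capture groups: groups are written \1 (or \g<1>) in the replacement, so $ is an ordinary character there and only the backslash needs escaping. The sample strings are made up for illustration.

```python
import re

# Inside a character class
non_vowels = re.findall(r"[^aeiou]", "regex")     # leading ^ negates the class
caret_literal = re.findall(r"[a^]", "a^b")        # ^ elsewhere is a literal
hyphen_range = re.findall(r"[a-c]", "a-c")        # - between characters is a range
hyphen_literal = re.findall(r"[-ac]", "a-c")      # - at the start is a literal
hyphen_after_caret = re.findall(r"[^-a]", "a-b")  # - right after ^ is a literal
brackets = re.findall(r"[\[\]]", "x[1]")          # escaped [ and ] as literals

# Inside the replacement
# A backslash followed by a digit refers to a capture group.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@host")
# A literal backslash must be doubled.
path = re.sub(r"<dir>", r"C:\\Program Files\\", "<dir>app.exe")
# In this flavor "$" is not special in the replacement, so it needs no escape.
price = re.sub(r"PRICE", "$5.00", "Total: PRICE")
```

In flavors where $1 references a group, the last substitution would need \$5.00 instead, as described above.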
While ERE (extended regular expressions) mirrors the typical, Perl-style syntax, BRE (basic regular expressions) has significant differences when it comes to escaping:
The shorthand classes \d, \s, \w and so on are gone. Instead, BRE has its own syntax (which POSIX confusingly calls "character classes"), like [:digit:]. These constructs must be within a character class, as in [[:digit:]].
Only a few metacharacters (., *, ^, $) can be used normally. ALL of the other metacharacters must be escaped differently:
Braces {}
a{1,2} matches the literal text a{1,2}. To match either a or aa, use a\{1,2\}
Parentheses ()
(ab)\1 is invalid, since there is no capture group 1. To fix it and match abab, use \(ab\)\1
Backslash
Inside a character class, the backslash is a literal: [\d] matches either \ or d.
Other
+ and ? are literals. If the BRE engine supports them as metacharacters, they must be escaped as \? and \+.
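One way to picture the BRE rules is as a swap of escaped and unescaped meanings relative to Perl-style syntax. The sketch below is a hypothetical helper, bre_to_python, that rewrites a simple BRE pattern into Python re syntax. It is an illustration under stated assumptions, not a full translator: it ignores POSIX classes such as [:digit:] and positional rules for ^, $ and *, and it treats \+, \? and \| as metacharacters, which only some BRE engines support.

```python
import re

# Characters whose escaped/unescaped meanings are swapped between BRE and
# Perl-style syntax (assumption: the BRE engine supports \+, \? and \|).
SWAPPED = set("(){}+?|")

def bre_to_python(bre: str) -> str:
    """Hypothetical helper: rewrite a simple BRE pattern in Python re syntax."""
    out = []
    i = 0
    while i < len(bre):
        ch = bre[i]
        if ch == "\\" and i + 1 < len(bre):
            nxt = bre[i + 1]
            # BRE "\(" is a metacharacter, so it becomes a bare "(".
            # Other escapes, such as "\." or "\1", are copied unchanged.
            out.append(nxt if nxt in SWAPPED else ch + nxt)
            i += 2
        elif ch in SWAPPED:
            # A bare "(" is a literal in BRE, so it becomes "\(".
            out.append("\\" + ch)
            i += 1
        elif ch == "[":
            # Bracket expression: find the closing "]" and treat "\" as literal.
            j = i + 1
            if j < len(bre) and bre[j] == "^":
                j += 1
            if j < len(bre) and bre[j] == "]":
                j += 1  # a leading "]" is a literal member of the class
            while j < len(bre) and bre[j] != "]":
                j += 1
            body = bre[i + 1:j]
            out.append("[" + body.replace("\\", "\\\\") + "]")
            i = j + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

literal_braces = bre_to_python(r"a{1,2}")    # braces are literals in BRE
interval = bre_to_python(r"a\{1,2\}")        # escaped braces form an interval
backref = bre_to_python(r"\(ab\)\1")         # escaped parentheses form a group
class_backslash = bre_to_python(r"[\d]")     # backslash is literal in a class
```

For example, re.fullmatch(interval, "aa") succeeds, while re.fullmatch(literal_braces, "aa") does not, because the unescaped braces in the BRE pattern only match the literal text a{1,2}.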