Introduction
Regular expression patterns are often used with modifiers (also called flags) that change how the regex engine interprets the pattern. A modifier can be supplied outside the pattern (e.g. /abc/i) or inline (embedded) within it (e.g. (?i)abc). The most common modifiers are global, case-insensitive, multiline and dotall. Regex flavors differ, however, in how many modifiers they support and which ones.
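As a minimal sketch of the two styles, the Java snippet below compiles the same pattern once with a flag passed next to it (the closest Java analogue of /abc/i) and once with the inline (?i) form. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.regex.Pattern;

class ModifierStyles {
    // Flag supplied alongside the pattern: the Java analogue of /abc/i.
    static boolean matchesWithFlag(String input) {
        return Pattern.compile("abc", Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    // Inline (embedded) modifier written inside the pattern itself.
    static boolean matchesWithInline(String input) {
        return Pattern.compile("(?i)abc").matcher(input).matches();
    }

    // No modifier: matching stays case-sensitive.
    static boolean matchesPlain(String input) {
        return Pattern.compile("abc").matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWithFlag("ABC"));   // true
        System.out.println(matchesWithInline("ABC")); // true
        System.out.println(matchesPlain("ABC"));      // false
    }
}
```

Both styles behave the same here; the inline form is handy when the pattern is the only thing you can pass, for example in a configuration file.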
PCRE Modifiers
Modifier | Inline | Description |
---|---|---|
PCRE_CASELESS | (?i) | Case insensitive match |
PCRE_MULTILINE | (?m) | ^ and $ match at the start and end of every line, not only of the whole subject |
PCRE_DOTALL | (?s) | . also matches newline characters |
PCRE_ANCHORED | n/a | The pattern is forced to match only at the start of the subject |
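PCRE_EXTENDED | (?x) | Whitespace in the pattern is ignored and # starts a comment |
PCRE_DOLLAR_ENDONLY | n/a | Meta-character $ matches only at the very end of the subject, not before a trailing newline |
PCRE_EXTRA | (?X) | Strict escape parsing: a backslash followed by a letter with no special meaning is an error |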
PCRE_UTF8 | | Handles UTF-8 characters |
PCRE_UTF16 | | Handles UTF-16 characters |
PCRE_UTF32 | | Handles UTF-32 characters |
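PCRE_UNGREEDY | (?U) | Inverts greediness: quantifiers are lazy by default and become greedy when followed by ? |
PCRE_NO_AUTO_CAPTURE | n/a | Plain (...) groups do not capture, as if written (?:...); only named groups capture |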
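The inline letters i, m, s and x carry the same meaning in several flavors. The sketch below illustrates three of them using java.util.regex rather than PCRE itself, which is an assumption made so that the example stays runnable without a PCRE binding; the helper names are hypothetical. Note that other letters do not transfer: in Java, (?U) selects Unicode character classes, not ungreedy matching.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InlineModifierDemo {
    // Counts how many times the regex is found in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // True when the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String text = "one\ntwo\nthree";
        // Without (?m), ^ matches only at the start of the input.
        System.out.println(countMatches("^\\w+", text));       // 1
        // With (?m), ^ also matches right after each line break.
        System.out.println(countMatches("(?m)^\\w+", text));   // 3
        // Without (?s), . does not match a newline.
        System.out.println(fullMatch("one.two.*", text));      // false
        // With (?s), . matches newlines as well.
        System.out.println(fullMatch("(?s)one.two.*", text));  // true
        // With (?x), pattern whitespace and # comments are ignored.
        System.out.println(fullMatch("(?x) o n e  # spelled out", "one")); // true
    }
}
```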
Java Modifiers
Modifier (Pattern.###) | Value | Description |
---|---|---|
UNIX_LINES | 1 | Enables Unix lines mode. |
CASE_INSENSITIVE | 2 | Enables case-insensitive matching. |
COMMENTS | 4 | Permits whitespace and comments in a pattern. |
MULTILINE | 8 | Enables multiline mode. |
LITERAL | 16 | Enables literal parsing of the pattern. |
DOTALL | 32 | Enables dotall mode. |
UNICODE_CASE | 64 | Enables Unicode-aware case folding. |
CANON_EQ | 128 | Enables canonical equivalence. |
UNICODE_CHARACTER_CLASS | 256 | Enables the Unicode version of Predefined character classes and POSIX character classes. |
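Because each value in the table is a distinct power of two, several flags can be combined with bitwise OR and passed as the second argument to Pattern.compile. The following is a minimal sketch; the class, field and method names are hypothetical.

```java
import java.util.regex.Pattern;

class FlagCombination {
    // Each constant occupies its own bit, so OR-ing them sums the values: 2 + 8 + 32.
    static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL;

    // True when the regex, compiled with the given flags, is found anywhere in the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(FLAGS); // 42
        // ^ matches after the first newline, b matches B, . matches the second newline.
        System.out.println(found("^b.c$", FLAGS, "a\nB\nC")); // true
        // With no flags, the same pattern fails on the same input.
        System.out.println(found("^b.c$", 0, "a\nB\nC"));     // false
        // LITERAL treats the dot as an ordinary character.
        System.out.println(found("a.c", Pattern.LITERAL, "abc")); // false
        System.out.println(found("a.c", Pattern.LITERAL, "a.c")); // true
    }
}
```

Passing 0 as the flags argument is equivalent to compiling the pattern with no modifiers.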