The first argument of re.match() is the regular expression, the second is the string to match:
    import re

    pattern = r"123"
    string = "123zzb"

    re.match(pattern, string)
    # Out: <_sre.SRE_Match object; span=(0, 3), match='123'>

    match = re.match(pattern, string)
    match.group()
    # Out: '123'
You may notice that the pattern variable is a string prefixed with r, which indicates that the string is a raw string literal.
A raw string literal has a slightly different syntax than a normal string literal: a backslash \ in a raw string literal means "just a backslash", so there is no need to double up backslashes to keep them from starting escape sequences such as newlines (\n), tabs (\t), backspaces (\b), form feeds (\f), and so on. In normal string literals, each backslash must be doubled up to avoid being taken as the start of an escape sequence.
Hence, r"\n" is a string of 2 characters: \ and n. Regex patterns also use backslashes, e.g. \d refers to any digit character. We can avoid having to double escape our strings ("\\d") by using raw strings (r"\d").
For instance:
    string = "\\t123zzb"  # here the backslash is escaped, so there's no tab, just '\' and 't'

    pattern = "\\t123"    # this is the regex \t123: a tab character followed by 123
    re.match(pattern, string)              # no match: returns None, so calling .group() here would raise AttributeError
    re.match(pattern, "\t123zzb").group()  # matches '\t123'

    pattern = r"\\t123"   # this is the regex \\t123: a literal backslash, then 't', then 123
    re.match(pattern, string).group()      # matches '\\t123'
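The difference between normal and raw string literals can also be checked directly. The following is a minimal sketch, not part of the original example: the sample input "42abc" and the variable names are chosen only for illustration.

```python
import re

# A normal literal turns \n into a single newline character;
# the raw literal keeps the backslash and the 'n' as two characters.
newline_len = len("\n")
raw_len = len(r"\n")
print(newline_len)  # 1
print(raw_len)      # 2

# Doubling the backslash and using a raw string produce the same pattern text.
same_pattern = "\\d+" == r"\d+"
print(same_pattern)  # True

# Both spellings therefore match the same leading digits.
digits_escaped = re.match("\\d+", "42abc").group()
digits_raw = re.match(r"\d+", "42abc").group()
print(digits_escaped)  # 42
print(digits_raw)      # 42
```

Since both spellings compile to the same regular expression, the raw form is mainly a readability choice: the pattern in the source code looks the same as the pattern the regex engine receives.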
Matching is done from the start of the string only. If you want to match anywhere, use re.search() instead:
    match = re.match(r"(123)", "a123zzb")
    match is None
    # Out: True

    match = re.search(r"(123)", "a123zzb")
    match.group()
    # Out: '123'
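Because both re.match() and re.search() return None when nothing matches, calling .group() on the result without a check raises AttributeError. One way to guard against that is sketched below; first_match and its anywhere parameter are hypothetical names introduced only for this illustration, not part of the re module.

```python
import re

def first_match(pattern, text, anywhere=False):
    """Return the matched text, or None if the pattern does not match.

    Hypothetical helper: uses re.search() when anywhere is True,
    otherwise re.match(), which only matches at the start of text.
    """
    finder = re.search if anywhere else re.match
    m = finder(pattern, text)
    # Only call .group() once we know a match object was returned.
    return m.group() if m is not None else None

at_start = first_match(r"123", "a123zzb")
found_anywhere = first_match(r"123", "a123zzb", anywhere=True)
print(at_start)        # None
print(found_anywhere)  # 123
```

Returning None instead of raising keeps the calling code simple when a missing match is an expected outcome rather than an error.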