Python Language Regular Expressions (Regex) Matching the beginning of a string

Example

The first argument of re.match() is the regular expression; the second is the string to match against:

import re

pattern = r"123"
string = "123zzb"

re.match(pattern, string)
# Out: <re.Match object; span=(0, 3), match='123'>

match = re.match(pattern, string)

match.group()
# Out: '123'
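
Besides group(), the match object also records where the match occurred. The following is a minimal sketch that reuses the pattern and string from above and inspects the span reported in the output:

```python
import re

match = re.match(r"123", "123zzb")

# span() returns the (start, end) indices of the matched text
print(match.span())    # (0, 3)
print(match.start())   # 0
print(match.end())     # 3

# group() with no arguments is the same as group(0): the whole matched substring
print(match.group(0))  # 123
```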

You may notice that the pattern variable is a string prefixed with r, which indicates that the string is a raw string literal.

A raw string literal has slightly different syntax from a normal string literal: a backslash \ in a raw string literal means "just a backslash", so there is no need to double up backslashes to keep them from starting escape sequences such as newlines (\n), tabs (\t), backspaces (\b), form feeds (\f), and so on. In normal string literals, each backslash must be doubled to avoid being taken as the start of an escape sequence.

Hence, r"\n" is a string of 2 characters: \ and n. Regex patterns also use backslashes, e.g. \d refers to any digit character. We can avoid having to double escape our strings ("\\d") by using raw strings (r"\d").
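
A quick way to check this is to compare lengths. The sketch below uses two illustrative variable names (normal and raw) that are not part of the original example:

```python
# A normal literal turns \n into a single newline character
normal = "\n"
# A raw literal keeps the backslash and the n as two separate characters
raw = r"\n"

print(len(normal))  # 1
print(len(raw))     # 2

# Doubling the backslash in a normal literal gives the same string as the raw literal
print("\\d" == r"\d")  # True
```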

For instance:

string = "\\t123zzb"  # here the backslash is escaped, so there's no tab, just '\' and 't'
pattern = "\\t123"    # the regex engine receives \t123, i.e. a tab character followed by 123
re.match(pattern, string)               # returns None: string starts with a backslash, not a tab
re.match(pattern, "\t123zzb").group()   # matches '\t123'

pattern = r"\\t123"   # the regex engine receives \\t123, i.e. a literal backslash followed by t123
re.match(pattern, string).group()   # matches '\\t123'

re.match only matches at the start of the string. To find a match anywhere in the string, use re.search instead:

match = re.match(r"(123)", "a123zzb")

match is None
# Out: True

match = re.search(r"(123)", "a123zzb")

match.group()
# Out: '123'
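
Because re.match returns None when the pattern does not match at the start, calling .group() on the result directly raises an AttributeError in that case. The sketch below wraps the check in a hypothetical helper, match_at_start (an illustrative name, not part of the re module), and also shows that anchoring re.search with ^ restricts it to the start of the string when re.MULTILINE is not set:

```python
import re

def match_at_start(pattern, text):
    """Return the text matched at the start of `text`, or None if there is no match."""
    m = re.match(pattern, text)
    # re.match returns None on failure, so check before calling .group()
    return m.group() if m is not None else None

print(match_at_start(r"123", "123zzb"))   # 123
print(match_at_start(r"123", "a123zzb"))  # None

# Without re.MULTILINE, ^ anchors re.search to the start of the string
print(re.search(r"^123", "a123zzb"))      # None
```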

