When your input has well-defined boundaries and you expect more than one match in the string, you have two options: a lazy quantifier or a negated character class.
Consider the following:
You have a simple templating engine and want to replace substrings like $[foo], where foo can be any string. Each such substring should be replaced with a value chosen from the part between the brackets.
You can try something like \$\[(.*)\] and then use the first capture group.
The problem is that .* is greedy, so for a string like something $[foo] lalala $[bar] something else your match will be:
something $[foo] lalala $[bar] something else
          | \______CG1______/|
          \_______Match______/
The capture group is then foo] lalala $[bar, which is rarely what you want.
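The over-matching described above can be reproduced with a short sketch. It assumes Python's standard re module; the variable names are illustrative only.

```python
import re

text = "something $[foo] lalala $[bar] something else"

# Greedy: .* consumes as much as possible, so the match
# runs from the first "$[" to the LAST "]" in the string.
greedy = re.search(r"\$\[(.*)\]", text)

print(greedy.group(0))  # $[foo] lalala $[bar]
print(greedy.group(1))  # foo] lalala $[bar
```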
There are two solutions:
Using laziness: making * lazy is one way to find the right matches. Change your expression to \$\[(.*?)\].
Using a negated character class: replace . with [^\]], which matches any character except the closing bracket. Change your expression to \$\[([^\]]*)\].
In both solutions, the result will be the same:
something $[foo] lalala $[bar] something else
          | \_/|        | \_/|
          \____/        \____/
The capture groups are foo and bar, respectively.
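As a quick check, here is a minimal sketch that runs both expressions on the sample string. It assumes Python's re module; lazy and negated are illustrative names.

```python
import re

text = "something $[foo] lalala $[bar] something else"

# Lazy quantifier: .*? stops at the first "]" it can reach.
lazy = re.findall(r"\$\[(.*?)\]", text)

# Negated character class: [^\]]* can never run past a "]".
negated = re.findall(r"\$\[([^\]]*)\]", text)

print(lazy)     # ['foo', 'bar']
print(negated)  # ['foo', 'bar']
```

With a single capture group in the pattern, re.findall returns the captured text of each match, which is why both lists contain only the names.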
Using a negated character class reduces backtracking and may save a lot of CPU time on large inputs.
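For the templating use case, one possible sketch with the negated character class follows. The names render, PLACEHOLDER, and values are hypothetical, and leaving unknown placeholders untouched is an assumption rather than a requirement of the original problem.

```python
import re

# Hypothetical placeholder pattern: "$[" + any run of non-"]" characters + "]".
PLACEHOLDER = re.compile(r"\$\[([^\]]*)\]")

def render(template, values):
    """Replace each $[name] with values[name]; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        # Fall back to the original "$[name]" text when the name is unknown.
        return str(values.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)

result = render(
    "Hello $[user], you have $[count] new messages $[unknown]",
    {"user": "Ada", "count": 3},
)
print(result)  # Hello Ada, you have 3 new messages $[unknown]
```

Passing a function to sub, rather than a replacement string, means backslashes in the substituted values are not interpreted as escapes.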