Tcl: Mixing Greedy and Non-Greedy Quantifiers


Example

If the first quantifier in a regular expression (RE) is greedy, the whole RE will be greedy.

If the first quantifier is non-greedy, the whole RE will be non-greedy.

# Sample data: four device records, one per line.
set mydata {
    Device widget1: port: 156 alias: input2
    Device widget2: alias: input1
    Device widget3: port: 238 alias: processor2
    Device widget4: alias: output2
    }

# Case 1: the first quantifier (\w+) is greedy.
regexp {Device\s(\w+):\s(.*?)alias} $mydata alldata devname devdata
puts "$devname $devdata"
# First line of output: widget1 port: 156 alias: input2

# Case 2: the first quantifier (.*?) is non-greedy.
regexp {Device\s(.*?):\s(.*?)alias} $mydata alldata devname devdata
puts "$devname $devdata"
# Output: widget1 port: 156

In the first case, the first quantifier, \w+, is greedy, so the whole RE is treated as greedy. The overall match must then be as long as possible, and the .*? matches more than expected: it runs past the first alias.

In the second case, the first quantifier, .*?, is non-greedy, so the whole RE is treated as non-greedy and the second .*? stops at the first alias.
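One way to keep \w for the device name and still get the short match is to make every quantifier non-greedy, so the first one sets the preference you want. The following is a minimal sketch under that assumption, reusing the sample data from the example; the variable names re and devices are introduced here for illustration only.

```tcl
# Same sample data as in the example above.
set mydata {
    Device widget1: port: 156 alias: input2
    Device widget2: alias: input1
    Device widget3: port: 238 alias: processor2
    Device widget4: alias: output2
    }

# \w+? makes the first quantifier non-greedy, so the whole RE prefers the
# shortest match and (.*?) stops at the first "alias".
set re {Device\s(\w+?):\s(.*?)alias}

regexp $re $mydata alldata devname devdata
puts "$devname [string trim $devdata]"
# prints: widget1 port: 156

# -all -inline returns every match as a flat list:
# full match, devname, devdata, full match, devname, devdata, ...
set devices {}
foreach {all name data} [regexp -all -inline $re $mydata] {
    lappend devices $name [string trim $data]
}
puts $devices
# prints: widget1 {port: 156} widget2 {} widget3 {port: 238} widget4 {}
```

The \w+? still captures the full device name, because the colon that follows it in the RE forces it to consume every word character up to that point.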

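Other regular expression engines, such as Perl-style backtracking engines, apply greediness to each quantifier individually and so may not have this issue with mixed greedy/non-greedy quantifiers, but they can be much slower.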

Henry Spencer wrote: ... The trouble is that it is very, very hard to write a generalization of those statements which covers mixed-greediness regular expressions -- a proper, implementation-independent definition of what mixed-greediness regular expressions should match -- and makes them do "what people expect". I've tried. I'm still trying. No luck so far. ...