Java Language Regular Expressions Using capture groups

Example

To extract part of an input string, you can use regex capture groups.

For this example, we'll start with a simple phone number regex:

\d{3}-\d{3}-\d{4}

If parentheses are added to the regex, each set of parentheses is considered a capturing group. In this case, we are using what are called numbered capture groups:

(\d{3})-(\d{3})-(\d{4})
^-----^ ^-----^ ^-----^
Group 1 Group 2 Group 3

Before we can use the regex in Java, it has to be written as a String literal, which means each backslash must be escaped with a second backslash. This results in the following pattern:

"(\\d{3})-(\\d{3})-(\\d{4})"

First, we compile the regex into a Pattern, and then we create a Matcher to apply that pattern to our input string:

Pattern phonePattern = Pattern.compile("(\\d{3})-(\\d{3})-(\\d{4})");
Matcher phoneMatcher = phonePattern.matcher("abcd800-555-1234wxyz");

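Next, the Matcher needs to find the first subsequence that matches the regex. find() returns true if a match was found; calling group() without a successful match throws an IllegalStateException, so real code should check the return value: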

phoneMatcher.find();

Now, using the group method, we can extract the data from the string:

String number = phoneMatcher.group(0); //"800-555-1234" (Group 0 is everything the regex matched)
String aCode = phoneMatcher.group(1); //"800"
String threeDigit = phoneMatcher.group(2); //"555"
String fourDigit = phoneMatcher.group(3); //"1234"

Note: Matcher.group() can be used in place of Matcher.group(0).
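
Putting the steps above together, here is a minimal, self-contained sketch. The class name PhoneGroups and the helper method extract are hypothetical names chosen for illustration; only Pattern, Matcher, find(), group(int) and groupCount() come from java.util.regex. Unlike the snippets above, it checks the result of find() before reading any group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhoneGroups {
    // Same pattern as above: three numbered capture groups.
    private static final Pattern PHONE =
            Pattern.compile("(\\d{3})-(\\d{3})-(\\d{4})");

    // Hypothetical helper: returns {group 0, group 1, group 2, group 3},
    // or null when the input contains no match.
    static String[] extract(String input) {
        Matcher m = PHONE.matcher(input);
        if (!m.find()) {
            return null; // calling group() here would throw IllegalStateException
        }
        // groupCount() does not include group 0, hence the + 1.
        String[] parts = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            parts[i] = m.group(i);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = extract("abcd800-555-1234wxyz");
        System.out.println(String.join(" | ", parts));
        // prints: 800-555-1234 | 800 | 555 | 1234

        System.out.println(extract("no number here") == null);
        // prints: true
    }
}
```

Returning null for "no match" is only one possible design; an Optional or an exception would work equally well depending on the caller.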

Java SE 7

Java 7 introduced named capture groups. A named capture group behaves like a numbered one, but it is declared with the syntax (?<name>...) and can be looked up by name instead of by position, which makes the pattern and the code that reads it easier to follow.

We can alter the above regex so that the first group is named:

(?<AreaCode>\d{3})-(\d{3})-(\d{4})
^----------------^ ^-----^ ^-----^
AreaCode           Group 2 Group 3

To get the contents of "AreaCode", we can instead use:

String aCode = phoneMatcher.group("AreaCode"); //"800"
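
As a rough end-to-end sketch of the named-group version, the example below recompiles the pattern with the escaped String literal and reads the group by name. The class name NamedPhoneGroups and the helper areaCode are hypothetical. It also shows that a named group still occupies its position in the numbering, so group(1) returns the same text as group("AreaCode").

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedPhoneGroups {
    // First group is named; the other two remain plain numbered groups.
    private static final Pattern PHONE =
            Pattern.compile("(?<AreaCode>\\d{3})-(\\d{3})-(\\d{4})");

    // Hypothetical helper: returns the area code, or null if nothing matches.
    static String areaCode(String input) {
        Matcher m = PHONE.matcher(input);
        return m.find() ? m.group("AreaCode") : null;
    }

    public static void main(String[] args) {
        Matcher m = PHONE.matcher("abcd800-555-1234wxyz");
        if (m.find()) {
            System.out.println(m.group("AreaCode")); // prints: 800
            System.out.println(m.group(1));          // prints: 800 (named groups are still numbered)
            System.out.println(m.group(2));          // prints: 555
        }
    }
}
```

Asking for a name that the pattern does not define, such as group("Prefix") here, throws an IllegalArgumentException.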

