Working with text is a common task in programming, and Java provides a powerful tool for it: regular expressions (regex). Whether you need to validate an email, extract phone numbers, or clean up messy data, regex can do it efficiently. In Java, regex is supported through the java.util.regex package, making it easy to perform pattern matching and string manipulation.
In this guide, we’ll explore everything from the basic syntax of regex to advanced techniques like grouping, quantifiers, and lookaheads. You’ll also see real-world examples to understand how regex works in different scenarios.
Regular expressions are sequences of characters that define a search pattern. They are used for validating input (such as email addresses), searching for and extracting text (such as phone numbers), and replacing or cleaning up text.
Java supports regular expressions through the java.util.regex package, which includes the Pattern and Matcher classes.
Before diving into Java implementation, let's understand the basic syntax elements of regular expressions:
Character | Description | Example |
Literal characters | Match themselves exactly | a matches "a" |
. | Matches any single character except newline | a.b matches "acb", "adb", etc. |
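^ | Matches the start of the input (or the start of each line with Pattern.MULTILINE) | ^Hello matches "Hello World" but not "World Hello" |
$ | Matches the end of the input (or the end of each line with Pattern.MULTILINE) | World$ matches "Hello World" but not "World Hello" |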
\ | Escapes a special character | \. matches a literal period |
Expression | Description | Example |
[abc] | Matches any one of the characters | [abc] matches "a", "b", or "c" |
[^abc] | Matches any character except those listed | [^abc] matches "d" or "1", but not "a", "b", or "c" |
[a-z] | Matches any character in the range | [a-z] matches any lowercase letter |
\d | Matches any digit (equivalent to [0-9]) | \d\d matches "23", "45", etc. |
\w | Matches any word character (alphanumeric + underscore) | \w+ matches "Java123" |
\s | Matches any whitespace character | a\sb matches "a b" |
Quantifier | Description | Example |
* | Matches 0 or more occurrences | a* matches "", "a", "aa", etc. |
+ | Matches 1 or more occurrences | a+ matches "a", "aa", etc., but not "" |
? | Matches 0 or 1 occurrence | colou?r matches "color" and "colour" |
{n} | Matches exactly n occurrences | a{3} matches "aaa" |
{n,} | Matches n or more occurrences | a{2,} matches "aa", "aaa", etc. |
{n,m} | Matches between n and m occurrences | a{2,4} matches "aa", "aaa", "aaaa" |
Operator | Description | Example |
| | Alternation (OR) operator | cat|dog matches "cat" or "dog" |
() | Grouping expressions | (ab)+ matches "ab", "abab", etc. |
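The syntax elements in the tables above can be tried out quickly with Pattern.matches(), which reports whether an entire string matches a pattern. The following is a minimal sketch; the class name SyntaxBasicsDemo, the helper fullMatch, and the sample strings are illustrative choices rather than part of the Java API.

```java
import java.util.regex.Pattern;

class SyntaxBasicsDemo {
    // Hypothetical helper: returns true only if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("a.b", "acb"));       // true: . matches any single character
        System.out.println(fullMatch("[a-z]+", "java"));   // true: one or more lowercase letters
        System.out.println(fullMatch("\\d\\d", "23"));     // true: exactly two digits
        System.out.println(fullMatch("colou?r", "color")); // true: the u is optional
        System.out.println(fullMatch("a{2,4}", "aaaaa"));  // false: five a's exceed the maximum of four
        System.out.println(fullMatch("cat|dog", "dog"));   // true: alternation
        System.out.println(fullMatch("(ab)+", "abab"));    // true: the group repeats
    }
}
```

Note that each backslash in the regex is doubled inside the Java string literal, so the regex \d is written as "\\d".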
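Java provides two main classes for working with regular expressions: Pattern, a compiled representation of a regular expression, and Matcher, which performs match operations on an input string using a Pattern.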
Let's see how to use these classes for various regex operations.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BasicRegexExample {
    public static void main(String[] args) {
        // The text to search
        String text = "The quick brown fox jumps over the lazy dog";
        // Define the pattern to search for
        String patternString = "fox";
        // Compile the pattern
        Pattern pattern = Pattern.compile(patternString);
        // Create a matcher for the input text
        Matcher matcher = pattern.matcher(text);
        // Check if the pattern is found
        if (matcher.find()) {
            System.out.println("Pattern '" + patternString + "' found at position: " + matcher.start());
        } else {
            System.out.println("Pattern '" + patternString + "' not found");
        }
    }
}
Output:
Pattern 'fox' found at position: 16
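This example demonstrates how to compile a pattern with Pattern.compile(), create a Matcher for the input text, check for a match with find(), and read the position of the match with start().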
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultipleMatchesExample {
    public static void main(String[] args) {
        String text = "The rain in Spain falls mainly on the plain";
        // Pattern to find all words containing 'ain'
        Pattern pattern = Pattern.compile("\\w*ain\\w*");
        Matcher matcher = pattern.matcher(text);
        // Find and print all matches (end() is exclusive, so subtract 1 for the last index)
        while (matcher.find()) {
            System.out.println("Found match: " + matcher.group() +
                    " at position " + matcher.start() + "-" + (matcher.end() - 1));
        }
    }
}
Output:
Found match: rain at position 4-7
Found match: Spain at position 12-16
Found match: mainly at position 24-29
Found match: plain at position 38-42
This example shows how to find all occurrences of a pattern in a text using the find() method in a loop.
import java.util.regex.Pattern;

public class EmailValidationExample {
    // Regular expression for basic email validation
    private static final String EMAIL_REGEX =
            "^[a-zA-Z0-9_+&*-]+(?:\\.[a-zA-Z0-9_+&*-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,7}$";
    private static final Pattern EMAIL_PATTERN = Pattern.compile(EMAIL_REGEX);

    public static boolean isValidEmail(String email) {
        if (email == null) {
            return false;
        }
        return EMAIL_PATTERN.matcher(email).matches();
    }

    public static void main(String[] args) {
        // Test email addresses
        String[] emails = {
            "user@domain.com",
            "user.name@domain.com",
            "user123@domain.co.uk",
            "user@domain",   // Invalid
            "user@.com",     // Invalid
            "@domain.com"    // Invalid
        };
        for (String email : emails) {
            System.out.println(email + " is " + (isValidEmail(email) ? "valid" : "invalid"));
        }
    }
}
Output:
user@domain.com is valid
user.name@domain.com is valid
user123@domain.co.uk is valid
user@domain is invalid
user@.com is invalid
@domain.com is invalid
This example demonstrates how to create a regular expression for validating email addresses and check if various strings match the pattern.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexReplacementExample {
    public static void main(String[] args) {
        String text = "The date is 2023-05-15 and tomorrow is 2023-05-16";
        // Pattern to match dates in yyyy-MM-dd format
        Pattern pattern = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");
        Matcher matcher = pattern.matcher(text);
        // Replace all dates with "DATE"
        String result = matcher.replaceAll("DATE");
        System.out.println("Original: " + text);
        System.out.println("After replacement: " + result);
        // Using replaceFirst to replace only the first occurrence
        matcher.reset();
        String firstReplaced = matcher.replaceFirst("DATE");
        System.out.println("After replacing first occurrence: " + firstReplaced);
        // Using String.replaceAll() directly
        String directReplace = text.replaceAll("\\d{4}-\\d{2}-\\d{2}", "DATE");
        System.out.println("Using String.replaceAll(): " + directReplace);
    }
}
Output:
Original: The date is 2023-05-15 and tomorrow is 2023-05-16
After replacement: The date is DATE and tomorrow is DATE
After replacing first occurrence: The date is DATE and tomorrow is 2023-05-16
Using String.replaceAll(): The date is DATE and tomorrow is DATE
This example shows different ways to replace text using regular expressions, including Matcher.replaceAll(), Matcher.replaceFirst(), and String.replaceAll().
Capturing groups allow you to extract specific parts of the matched text. They are defined using parentheses () in the regular expression.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CapturingGroupsExample {
    public static void main(String[] args) {
        String text = "John Doe's phone number is 555-123-4567 and Jane Smith's is 555-987-6543";
        // Pattern with capturing groups for phone numbers
        Pattern pattern = Pattern.compile("(\\d{3})-(\\d{3})-(\\d{4})");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            System.out.println("Area code: " + matcher.group(1));
            System.out.println("Exchange code: " + matcher.group(2));
            System.out.println("Line number: " + matcher.group(3));
            System.out.println();
        }
        // Using named capturing groups (Java 7+)
        Pattern namedPattern = Pattern.compile("(?<areaCode>\\d{3})-(?<exchange>\\d{3})-(?<lineNum>\\d{4})");
        Matcher namedMatcher = namedPattern.matcher(text);
        System.out.println("Using named groups:");
        while (namedMatcher.find()) {
            System.out.println("Area code: " + namedMatcher.group("areaCode"));
            System.out.println("Exchange: " + namedMatcher.group("exchange"));
            System.out.println("Line number: " + namedMatcher.group("lineNum"));
            System.out.println();
        }
    }
}
Output:
Full match: 555-123-4567
Area code: 555
Exchange code: 123
Line number: 4567
Full match: 555-987-6543
Area code: 555
Exchange code: 987
Line number: 6543
Using named groups:
Area code: 555
Exchange: 123
Line number: 4567
Area code: 555
Exchange: 987
Line number: 6543
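This example demonstrates how to read numbered groups with group(int), where group 0 is the entire match, and how to read named groups with group(String).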
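Captured groups can also be referenced inside a replacement string: $1, $2, and so on for numbered groups, and ${name} for named groups. The sketch below illustrates this with String.replaceAll(); the class name GroupReplacementDemo, the method names, and the group names area and exchange are hypothetical and chosen only for this example.

```java
class GroupReplacementDemo {
    // Rewrites yyyy-MM-dd dates as dd/MM/yyyy by reordering the numbered groups.
    static String toDayFirst(String text) {
        return text.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // Keeps the named groups and masks the last four digits of a phone number.
    static String maskLineNumber(String text) {
        return text.replaceAll("(?<area>\\d{3})-(?<exchange>\\d{3})-\\d{4}",
                "${area}-${exchange}-XXXX");
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("The date is 2023-05-15"));
        // The date is 15/05/2023
        System.out.println(maskLineNumber("Call 555-123-4567"));
        // Call 555-123-XXXX
    }
}
```

Because $ and \ have special meaning in replacement strings, Matcher.quoteReplacement() is useful when the replacement text comes from user input and should be inserted literally.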
Java's Pattern class supports several flags that modify how pattern matching works:
Flag | Description |
Pattern.CASE_INSENSITIVE | Makes the pattern case-insensitive |
Pattern.MULTILINE | Changes behavior of ^ and $ to match beginning/end of any line |
Pattern.DOTALL | Makes . match any character including line terminators |
Pattern.UNICODE_CASE | Enables Unicode-aware case folding when combined with Pattern.CASE_INSENSITIVE |
Pattern.LITERAL | Treat the pattern as a literal string |
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFlagsExample {
    public static void main(String[] args) {
        String text = "Java is a programming language.\nJAVA is widely used.";
        // Case-sensitive search (default)
        Pattern defaultPattern = Pattern.compile("java");
        Matcher defaultMatcher = defaultPattern.matcher(text);
        System.out.println("Case-sensitive matches:");
        while (defaultMatcher.find()) {
            System.out.println("Found at position: " + defaultMatcher.start());
        }
        // Case-insensitive search
        Pattern caseInsensitivePattern = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
        Matcher caseInsensitiveMatcher = caseInsensitivePattern.matcher(text);
        System.out.println("\nCase-insensitive matches:");
        while (caseInsensitiveMatcher.find()) {
            System.out.println("Found at position: " + caseInsensitiveMatcher.start());
        }
        // Using multiple flags
        Pattern multiFlags = Pattern.compile("^java",
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher multiFlagsMatcher = multiFlags.matcher(text);
        System.out.println("\nMulti-flag matches (case-insensitive, multiline):");
        while (multiFlagsMatcher.find()) {
            System.out.println("Found at position: " + multiFlagsMatcher.start());
        }
    }
}
Output:
Case-sensitive matches:

Case-insensitive matches:
Found at position: 0
Found at position: 32

Multi-flag matches (case-insensitive, multiline):
Found at position: 0
Found at position: 32
This example shows how pattern flags modify the behavior of a regular expression. The case-sensitive search prints no matches because neither "Java" nor "JAVA" equals the lowercase pattern "java".
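The flags table also lists Pattern.DOTALL and Pattern.LITERAL, which the example above does not exercise. The following sketch shows one way they might be used; the class name MoreFlagsDemo, the helper methods, and the sample strings are illustrative assumptions.

```java
import java.util.regex.Pattern;

class MoreFlagsDemo {
    // Searches for "start", any characters, then "end", using the given flags.
    static boolean findsStartToEnd(String text, int flags) {
        return Pattern.compile("start.*end", flags).matcher(text).find();
    }

    // Treats the needle as plain text, so metacharacters such as . lose their meaning.
    static boolean findsLiteral(String needle, String text) {
        return Pattern.compile(needle, Pattern.LITERAL).matcher(text).find();
    }

    public static void main(String[] args) {
        String twoLines = "start\nend";
        System.out.println(findsStartToEnd(twoLines, 0));              // false: . stops at the newline
        System.out.println(findsStartToEnd(twoLines, Pattern.DOTALL)); // true: . also matches the newline
        System.out.println(findsLiteral("a.c", "abc"));                // false: the dot must match literally
        System.out.println(findsLiteral("a.c", "x a.c y"));            // true
    }
}
```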
Here are some useful regular expressions for common tasks in Java:
// US phone number (e.g., 555-123-4567 or (555) 123-4567)
String phoneRegex = "^(\\d{3}[-.]?\\d{3}[-.]?\\d{4}|\\(\\d{3}\\)\\s?\\d{3}[-.]?\\d{4})$";
// IPv4 address validation
String ipv4Regex = "^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$";
// Basic URL validation
String urlRegex = "^(https?|ftp)://[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,6}(/\\S*)?$";
// Date in yyyy-MM-dd format
String dateRegex = "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$";
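As a rough usage sketch, the date and IPv4 patterns above can be precompiled and wrapped in small helper methods. The class name CommonPatternsDemo and the method names are hypothetical. Note that the date pattern checks the format only, so a calendar-impossible date such as 2023-02-31 still passes.

```java
import java.util.regex.Pattern;

class CommonPatternsDemo {
    // Patterns copied from the list above and compiled once for reuse.
    private static final Pattern DATE = Pattern.compile(
            "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$");
    private static final Pattern IPV4 = Pattern.compile(
            "^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$");

    static boolean looksLikeDate(String s) {
        return s != null && DATE.matcher(s).matches();
    }

    static boolean looksLikeIpv4(String s) {
        return s != null && IPV4.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDate("2023-05-15"));  // true
        System.out.println(looksLikeDate("2023-13-01"));  // false: month 13 is rejected
        System.out.println(looksLikeDate("2023-02-31"));  // true: format only, not a calendar check
        System.out.println(looksLikeIpv4("192.168.1.1")); // true
        System.out.println(looksLikeIpv4("256.1.1.1"));   // false: 256 is out of range
    }
}
```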
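To write effective and maintainable regular expressions in Java, compile patterns you reuse once and keep them in constants, keep expressions as simple as possible, avoid patterns prone to catastrophic backtracking, and prefer plain string methods when they do the job.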
Regular expressions in Java provide a powerful tool for pattern matching, validation, and text processing. With the java.util.regex package and its Pattern and Matcher classes, you can perform complex text operations efficiently. While regular expressions can seem intimidating at first, understanding the basic syntax and following best practices will help you write effective and maintainable code.
From basic pattern matching to complex validations like email formats, knowledge of regular expressions is an essential skill for Java developers. By combining regex with Java's string handling capabilities, you can solve a wide range of text processing challenges in your applications.
To validate input using regular expressions, compile a Pattern with your regex and then use the matches() method of Matcher. For simple cases, you can also use the String.matches() method, but for repeated validations, pre-compile the pattern for better performance.
The matches() method checks if the entire string matches the pattern, while find() searches for a substring that matches the pattern. For example, "abc".matches("b") returns false, but a Matcher's find() method would return true because "b" is found within "abc".
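The difference can be seen in a short sketch. The class name MatchesVsFindDemo and its helper methods are illustrative only.

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    // True only if the whole input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if any substring of the input matches the regex.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("b", "abc"));      // false: "abc" is not just "b"
        System.out.println(partialMatch("b", "abc"));    // true: "b" occurs inside "abc"
        System.out.println(wholeMatch("[a-c]+", "abc")); // true: the pattern covers the whole string
    }
}
```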
You can make a regex case-insensitive by adding the (?i) inline flag at the beginning of the pattern, or by using the Pattern.CASE_INSENSITIVE flag when compiling the pattern: Pattern.compile("java", Pattern.CASE_INSENSITIVE).
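Both approaches should behave the same way, as this small sketch suggests. The class name CaseInsensitiveDemo and the method names are hypothetical.

```java
import java.util.regex.Pattern;

class CaseInsensitiveDemo {
    // Inline flag: (?i) switches on case-insensitive matching inside the pattern itself.
    private static final Pattern INLINE = Pattern.compile("(?i)java");
    // Compile-time flag: the same effect, requested through Pattern.compile.
    private static final Pattern FLAGGED = Pattern.compile("java", Pattern.CASE_INSENSITIVE);

    static boolean mentionsJavaInline(String text) {
        return INLINE.matcher(text).find();
    }

    static boolean mentionsJavaFlagged(String text) {
        return FLAGGED.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(mentionsJavaInline("I like JAVA"));  // true
        System.out.println(mentionsJavaFlagged("I like JaVa")); // true
        System.out.println(mentionsJavaFlagged("I like tea"));  // false
    }
}
```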
Use capturing groups (parentheses) in your pattern, and then access the captured groups using matcher.group(groupNumber). Group 0 represents the entire match, and groups 1 and up represent the capturing groups in the order they appear in the pattern.
Common pitfalls include: creating regex patterns that cause catastrophic backtracking, compiling the same pattern repeatedly instead of reusing it, writing overly complex expressions, and using regex for tasks better suited for simple string operations.
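Two of these pitfalls can be avoided with very little code, as the following sketch suggests: the pattern is compiled once into a constant instead of on every call, and a fixed substring check uses String.contains() instead of a regex. The class name PatternReuseDemo and its methods are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PatternReuseDemo {
    // Compiled once; Pattern instances are immutable and safe to share between threads.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Counts the inputs that consist entirely of digits, reusing the compiled pattern.
    static long countNumeric(List<String> inputs) {
        long count = 0;
        for (String s : inputs) {
            // A new Matcher per input is cheap; Matcher objects themselves are not thread-safe.
            if (DIGITS.matcher(s).matches()) {
                count++;
            }
        }
        return count;
    }

    // For a fixed substring, a plain String method is simpler than a regex.
    static boolean mentionsJava(String s) {
        return s.contains("Java");
    }

    public static void main(String[] args) {
        System.out.println(countNumeric(Arrays.asList("123", "abc", "42"))); // 2
        System.out.println(mentionsJava("Learning Java regex"));             // true
    }
}
```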
Use lookahead assertions to enforce multiple requirements without consuming characters:
// Password must be at least 8 chars, with at least one digit, one lowercase, and one uppercase letter
String passwordRegex = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=\\S+$).{8,}$";
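A short sketch of how this pattern might be applied follows; the class name PasswordCheckDemo, the method isStrongEnough, and the sample passwords are hypothetical. Each lookahead checks one requirement from the start of the string without consuming any characters, and .{8,} then enforces the minimum length.

```java
import java.util.regex.Pattern;

class PasswordCheckDemo {
    // The lookahead-based pattern from above, compiled once.
    private static final Pattern PASSWORD = Pattern.compile(
            "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=\\S+$).{8,}$");

    static boolean isStrongEnough(String candidate) {
        return candidate != null && PASSWORD.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isStrongEnough("Passw0rd1"));  // true
        System.out.println(isStrongEnough("password1"));  // false: no uppercase letter
        System.out.println(isStrongEnough("Pass w0rd1")); // false: contains whitespace
        System.out.println(isStrongEnough("Pw0rd"));      // false: shorter than 8 characters
    }
}
```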
Since a dot (.) is a special metacharacter in regex that matches any character, you need to escape it with a backslash to match a literal dot: \. In a Java string literal, the backslash itself must also be escaped, so you write it as "\\."
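The sketch below shows the escaped dot in String.split() and, as an alternative for longer literals, Pattern.quote(), which escapes an entire string. The class name LiteralDotDemo and the helper methods are illustrative.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralDotDemo {
    // Splits on literal dots; the regex \. is written "\\." in Java source.
    static String[] splitOnDot(String s) {
        return s.split("\\.");
    }

    // Pattern.quote escapes every metacharacter, so the literal is matched as plain text.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDot("a.b.c"))); // [a, b, c]
        // An unescaped dot treats every character as a delimiter, leaving nothing behind.
        System.out.println("a.b.c".split(".").length);            // 0
        System.out.println(matchesLiterally("1.5", "1.5"));       // true
        System.out.println(matchesLiterally("1.5", "1x5"));       // false
    }
}
```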
While you can use regex for simple HTML/XML parsing tasks, it's generally not recommended for parsing complex HTML/XML documents. Use proper parsers like JAXP, DOM, SAX, or libraries like Jsoup for HTML parsing.
Use a Matcher's find() method in a loop and increment a counter for each match:
String text = "The quick brown fox"; // sample input
Pattern pattern = Pattern.compile("\\w+");
Matcher matcher = pattern.matcher(text);
int count = 0;
while (matcher.find()) {
    count++;
}
Before Java 9, the standard regex API had no built-in method to get all matches at once, so you collect them in a List with a find() loop (Java 9 added Matcher.results(), which returns a Stream of MatchResult objects):
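// Needs java.util.ArrayList and java.util.List in addition to the regex imports
String text = "The quick brown fox"; // sample input
Pattern pattern = Pattern.compile("\\w+");
Matcher matcher = pattern.matcher(text);
List<String> matches = new ArrayList<>();
while (matcher.find()) {
    matches.add(matcher.group());
}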
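On Java 9 or later, both counting and collecting matches can be written with Matcher.results(), which returns a Stream<MatchResult>. This is a minimal sketch assuming a Java 9+ runtime; the class name MatchResultsDemo and the method names are hypothetical.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class MatchResultsDemo {
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Collects the text of every match into a list (requires Java 9+ for results()).
    static List<String> allMatches(String text) {
        return WORD.matcher(text)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    // Counts matches without an explicit loop.
    static long countMatches(String text) {
        return WORD.matcher(text).results().count();
    }

    public static void main(String[] args) {
        System.out.println(allMatches("The quick brown fox"));   // [The, quick, brown, fox]
        System.out.println(countMatches("The quick brown fox")); // 4
    }
}
```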
Use the String.split(String regex) method to split a string using a regular expression as the delimiter:
String text = "apple,banana;orange,grape";
String[] fruits = text.split("[,;]"); // Split by comma or semicolon
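A slightly fuller sketch of splitting follows. It also shows a delimiter pattern that absorbs surrounding whitespace and the two-argument split(regex, limit) overload, which caps the number of pieces. The class name SplitDemo and the sample strings are illustrative.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on commas or semicolons, discarding whitespace around each delimiter.
    static String[] splitItems(String text) {
        return text.split("\\s*[,;]\\s*");
    }

    public static void main(String[] args) {
        String text = "apple,banana;orange,grape";
        System.out.println(Arrays.toString(text.split("[,;]")));
        // [apple, banana, orange, grape]

        System.out.println(Arrays.toString(splitItems("apple , banana;orange")));
        // [apple, banana, orange]

        // With a limit of 2, the string is split at most once.
        System.out.println(Arrays.toString("a,b,c".split(",", 2)));
        // [a, b,c]
    }
}
```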