Today I will show you the very basics of Regular Expressions and how to use them in Java.
This won't be an extensive and detailed Regular Expressions tutorial. Regular Expressions are a huge topic on their own. I will show you only the basics, so you can quickly start to use regular expressions in your projects. No long explanations, no hassle... only a few examples and you are ready to go.
Regular Expressions Methods in String Class
The String class in Java 8 has a few methods that use regular expressions (or regex). Here is a list of these methods. For more details, see the String Javadoc.
- boolean matches(String regex) - Tells whether or not this string matches the given regular expression
- String replaceAll(String regex, String replacement) - Replaces each substring of this string that matches the given regular expression with the given replacement
- String replaceFirst(String regex, String replacement) - Replaces the first substring of this string that matches the given regular expression with the given replacement
- String[] split(String regex) - Splits this string around matches of the given regular expression
- String[] split(String regex, int limit) - Splits this string around matches of the given regular expression. For a positive limit, the pattern is applied at most limit - 1 times, so the resulting array has at most limit elements
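To make the list above more concrete, here is a small sketch that calls each of these methods once. The sample string "a1b22c333" and the class name are made up for illustration; the patterns use character classes and repetition.

```java
import java.util.Arrays;

class StringRegexMethodsDemo {
    public static void main(String[] args) {
        // Made-up sample text: letters separated by runs of digits
        String text = "a1b22c333";

        // matches: true only if the WHOLE string matches the pattern
        System.out.println(text.matches("[a-c0-9]+"));                // true

        // replaceAll: every run of digits is replaced by a dash
        System.out.println(text.replaceAll("[0-9]+", "-"));           // a-b-c-

        // replaceFirst: only the first run of digits is replaced
        System.out.println(text.replaceFirst("[0-9]+", "-"));         // a-b22c333

        // split: runs of digits act as separators
        System.out.println(Arrays.toString(text.split("[0-9]+")));    // [a, b, c]

        // split with limit 2: the pattern is applied at most once
        System.out.println(Arrays.toString(text.split("[0-9]+", 2))); // [a, b22c333]
    }
}
```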
Match a Char
Let's assume we have the following String: "Hello, it's me" and we want to split it into words at the whitespace.
String str = "Hello, it's me";
String[] words = str.split(" ");
for (String word : words) {
    System.out.println(word);
}
This will produce the following output:
Hello,
it's
me
But what if we put two spaces between "it's" and "me", like this: "Hello, it's  me"?
The output of our program will be:
Hello,
it's

me
The empty line is an empty string that split() returns for the gap between the two spaces.
So how can we avoid this? Keep reading and you will find out :)
Repetition
The + sign means: match the preceding character one or more times in a row.
Let's go back to our previous example "Hello, it's  me" with two spaces between "it's" and "me".
In our code we will change the pattern to " +". Now each run of one or more spaces is matched as a single separator:
String str = "Hello, it's  me";
String[] words = str.split(" +");
for (String word : words) {
    System.out.println(word);
}
and the output is:
Hello,
it's
me
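An empty string is easy to miss in printed output, so here is a small sketch that counts the pieces instead. The helper name countPieces is made up for this example; it only wraps split() and returns the array length.

```java
class SplitCountDemo {
    // Hypothetical helper: number of pieces produced by splitting text on regex
    static int countPieces(String text, String regex) {
        return text.split(regex).length;
    }

    public static void main(String[] args) {
        String str = "Hello, it's  me"; // two spaces before "me"

        // A single space as the pattern: the second space yields an empty string
        System.out.println(countPieces(str, " "));  // 4

        // One or more spaces as the pattern: the two spaces count as one separator
        System.out.println(countPieces(str, " +")); // 3
    }
}
```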
Concatenation
With regular expressions we can match a string, that is, several characters in a row, by giving that sequence as the regex pattern. For example, the pattern "ing" will match in:
"Earning money is easy as counting 1 2 33. Nah!"
In other words, the pattern matches wherever "ing" occurs in the string, here in "Earning" and "counting".
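One detail is worth a quick sketch: String.matches() only returns true when the whole string matches the pattern, so it is not the right tool for finding "ing" inside a longer sentence. The example below uses Pattern and Matcher from java.util.regex instead, which are not among the String methods listed earlier. The helper name findAll is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Hypothetical helper: collects every substring of text that regex matches
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {       // find() looks for the next match anywhere in the text
            found.add(matcher.group()); // group() returns the text of that match
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Earning money is easy as counting 1 2 33. Nah!";

        System.out.println(findAll("ing", text)); // [ing, ing]

        // matches() requires the entire string to be "ing", so this is false
        System.out.println(text.matches("ing"));  // false
    }
}
```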
Combine Concatenation and Repetition
Pattern: "ar+" will match
"arrows are not as fast as bullets"
It matches "a" followed by one or more "r": "arr" in "arrows" and "ar" in "are".
Match zero or more
Pattern: "it*" will match
"it's fun to learn just sitting in front of my computer"
The * (asterisk) means zero or more of the preceding character, so the pattern matches "i" followed by zero or more "t". In our example it will match "it", "itt", "i" and "i".
Alternation
Pattern: "ea|in" will match
"earning money is easy as counting 1 2 33. Nah!"
The | (vertical bar) will match either "ea" or "in". Here it matches "ea" and "in" in "earning", "ea" in "easy" and "in" in "counting".
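To see exactly what the three patterns above match, one option is to mark every match with square brackets. This is only a sketch: the helper name highlight is made up, and it relies on replaceAll(), where $0 in the replacement string stands for the text of the whole match.

```java
class HighlightDemo {
    // Hypothetical helper: wraps every match of regex in square brackets.
    // In the replacement string, $0 refers to the whole matched text.
    static String highlight(String regex, String text) {
        return text.replaceAll(regex, "[$0]");
    }

    public static void main(String[] args) {
        // + : "a" followed by one or more "r"
        System.out.println(highlight("ar+", "arrows are not as fast as bullets"));
        // prints: [arr]ows [ar]e not as fast as bullets

        // * : "i" followed by zero or more "t"
        System.out.println(highlight("it*", "it's fun to learn just sitting in front of my computer"));
        // prints: [it]'s fun to learn just s[itt][i]ng [i]n front of my computer

        // | : either "ea" or "in"
        System.out.println(highlight("ea|in", "earning money is easy as counting 1 2 33. Nah!"));
        // prints: [ea]rn[in]g money is [ea]sy as count[in]g 1 2 33. Nah!
    }
}
```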
Character Classes
Pattern: "[123]" will match
"Earning money is easy as counting 1 2 33. Nah!"
[] matches any single character in the set, here each of the digits "1", "2", "3" and "3".
Match Ranges
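Pattern: "[1-3]" will match
"Earning money is easy as counting 1 2 33. Nah!"
Pattern: "[a-f]" will match
"Earning money is easy as counting 1 2 33. Nah!"
A range such as 1-3 or a-f stands for every character between its two ends, so "[1-3]" matches the same digits as "[123]", and "[a-f]" matches each lowercase "a", "c" and "e" in the sentence.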
Excluding a Character
Pattern: "[^a-z123 ]" will match
"Earning money is easy as counting 1 2 33. Nah!"
^ at the start of a set means NOT: the pattern matches any character that is not in the set, here "E", ".", "N" and "!".
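The character class examples can be checked with replaceAll(), which makes the matched characters visible by replacing or removing them. This is a minimal sketch; the replacement characters "#" and "_" are arbitrary choices for illustration.

```java
class CharacterClassDemo {
    public static void main(String[] args) {
        String text = "Earning money is easy as counting 1 2 33. Nah!";

        // [123]: any one of the digits 1, 2 or 3
        System.out.println(text.replaceAll("[123]", "#"));
        // prints: Earning money is easy as counting # # ##. Nah!

        // [1-3] is a range that describes the same set as [123]
        System.out.println(text.replaceAll("[1-3]", "#").equals(text.replaceAll("[123]", "#")));
        // prints: true

        // [a-f]: any lowercase letter from a to f
        System.out.println(text.replaceAll("[a-f]", "_"));
        // prints: E_rning mon_y is __sy _s _ounting 1 2 33. N_h!

        // [^a-z123 ]: anything that is NOT a lowercase letter, 1, 2, 3 or a space
        System.out.println(text.replaceAll("[^a-z123 ]", ""));
        // prints: arning money is easy as counting 1 2 33 ah
    }
}
```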
Validate Email Address Regular Expression
Here is an example of how to validate an email address in Java. This is a more complex but commonly used regex pattern, written as a Java string literal (hence the doubled backslashes):
^[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$
and the implementation:
String email = "info@javatutorial.net";
// The pattern must end at $; any character after it would prevent every match
if (email.matches("^[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$")) {
    System.out.println(email + " is a valid e-mail address");
} else {
    System.out.println(email + " is an invalid e-mail address");
}
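If you validate many addresses, one possible variation is to compile the pattern once with java.util.regex.Pattern, because String.matches() compiles the regex again on every call. The sketch below assumes a made-up class and method name (EmailValidatorDemo, isValidEmail) and tries the pattern on a few sample inputs. Keep in mind that this pattern is a practical approximation and will reject some addresses that are technically allowed.

```java
import java.util.regex.Pattern;

class EmailValidatorDemo {
    // The article's e-mail pattern, compiled once and reused for every check
    private static final Pattern EMAIL = Pattern.compile(
            "^[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$");

    // Hypothetical helper: true if the whole candidate string matches the pattern
    static boolean isValidEmail(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidEmail("info@javatutorial.net")); // true
        System.out.println(isValidEmail("info@javatutorial"));     // false: no top-level domain
        System.out.println(isValidEmail("info.javatutorial.net")); // false: no @ sign
    }
}
```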