Regex Flashcards
1
Q
Square Brackets
A
- If we put several characters inside square brackets, […], this means a choice between those characters: str.matches("[Tt]rue")
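- A minimal sketch of the bracket choice in use. The class and method names (BracketDemo, isTrueWord) are made up for illustration; only String.matches() comes from the card above. Note that matches() tests the whole string, not just part of it.

```java
class BracketDemo {
    // Hypothetical helper: true only when the entire string is "True" or "true".
    static boolean isTrueWord(String str) {
        return str.matches("[Tt]rue");
    }

    public static void main(String[] args) {
        System.out.println(isTrueWord("True"));  // true
        System.out.println(isTrueWord("true"));  // true
        System.out.println(isTrueWord("TRUE"));  // false: only the first letter has a choice
        System.out.println(isTrueWord("true!")); // false: the whole string must match
    }
}
```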
2
Q
Pipe Character
A
- When we want to match alternatives for a whole string, we instead put a pipe character, |, between the alternatives: str.matches("[Tt]rue|[Yy]es")
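- A small sketch of the pipe in use, assuming a hypothetical helper named isAffirmative. Each side of the pipe is a complete alternative, so the string must be one or the other in full.

```java
class AlternationDemo {
    // Hypothetical helper: accepts "True", "true", "Yes" or "yes" and nothing else.
    static boolean isAffirmative(String str) {
        return str.matches("[Tt]rue|[Yy]es");
    }

    public static void main(String[] args) {
        System.out.println(isAffirmative("yes"));     // true
        System.out.println(isAffirmative("True"));    // true
        System.out.println(isAffirmative("no"));      // false
        System.out.println(isAffirmative("trueyes")); // false: neither alternative covers the whole string
    }
}
```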
3
Q
Character Ranges
A
- To match any lower case letter, we can write: [a-z]. Similarly, to match a digit, we can write: [0-9]
- We can combine single characters and ranges, and/or combine multiple ranges: [a-zA-Z]. This matches any single lower or upper case letter from A to Z.
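- The ranges above can be tried out with a sketch like the following. The helper names are hypothetical. Each character class matches exactly one character, so a two-letter string does not match.

```java
class RangeDemo {
    // Hypothetical helpers, one per character class from the card.
    static boolean isLowerCaseLetter(String s) { return s.matches("[a-z]"); }
    static boolean isDigit(String s)           { return s.matches("[0-9]"); }
    static boolean isLetter(String s)          { return s.matches("[a-zA-Z]"); }

    public static void main(String[] args) {
        System.out.println(isLowerCaseLetter("q")); // true
        System.out.println(isLowerCaseLetter("Q")); // false
        System.out.println(isDigit("7"));           // true
        System.out.println(isLetter("Q"));          // true
        System.out.println(isLetter("ab"));         // false: a class matches one character only
    }
}
```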
4
Q
Negation
A
- To say “not in the range…”, we put a hat symbol, ^, at the beginning of the character class expression. To say “not a digit”, we can write: [^0-9]
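- A short sketch of negation, using a hypothetical helper isNonDigit. A negated class still has to consume one character, so the empty string does not match.

```java
class NegationDemo {
    // Hypothetical helper: true when the string is exactly one character that is not 0-9.
    static boolean isNonDigit(String s) {
        return s.matches("[^0-9]");
    }

    public static void main(String[] args) {
        System.out.println(isNonDigit("a")); // true
        System.out.println(isNonDigit("5")); // false
        System.out.println(isNonDigit(""));  // false: there is no character to match
    }
}
```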
5
Q
Intersection &&
A
- Intersection means “in this class AND in this one”. It is really useful when we combine an intersection with a negation to say “in this class BUT NOT in this one”.
- [0-9&&[^5]] This matches any digit except 5.
- [a-z&&[^aeiouy]] This matches any lower case letter except those representing vowels (y is counted as a vowel here).
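- Both intersection examples above can be checked with a sketch like this one. The helper names are hypothetical; the two expressions are taken from the card unchanged.

```java
class IntersectionDemo {
    // Hypothetical helper: a single digit that is not 5.
    static boolean isDigitExceptFive(String s) {
        return s.matches("[0-9&&[^5]]");
    }

    // Hypothetical helper: a single lower case letter other than a, e, i, o, u, y.
    static boolean isLowerCaseConsonant(String s) {
        return s.matches("[a-z&&[^aeiouy]]");
    }

    public static void main(String[] args) {
        System.out.println(isDigitExceptFive("4"));    // true
        System.out.println(isDigitExceptFive("5"));    // false
        System.out.println(isLowerCaseConsonant("b")); // true
        System.out.println(isLowerCaseConsonant("y")); // false: y is excluded by this class
    }
}
```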
6
Q
Dot .
A
- The dot matches any single character (by default, any character except a line terminator). The expression [0-9]. means “a digit followed by any one character”.
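- A sketch of the dot in action, with hypothetical helper names. It also shows one caveat from the java.util.regex documentation: by default the dot does not match a line terminator such as \n, unless the pattern is compiled with the Pattern.DOTALL flag.

```java
import java.util.regex.Pattern;

class DotDemo {
    // Hypothetical helper: a digit followed by exactly one more character.
    static boolean digitThenAny(String s) {
        return s.matches("[0-9].");
    }

    // Same expression, but with DOTALL so the dot also matches line terminators.
    static boolean digitThenAnyDotAll(String s) {
        return Pattern.compile("[0-9].", Pattern.DOTALL).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(digitThenAny("7x"));         // true
        System.out.println(digitThenAny("7"));          // false: the dot needs a character
        System.out.println(digitThenAny("7\n"));        // false: dot skips line terminators by default
        System.out.println(digitThenAnyDotAll("7\n"));  // true
    }
}
```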
7
Q
Repetition
A
- Repetition operators are placed ==after== the item to be repeated. The following operators all work this way.
- *: means zero or more repetitions. For example, [0-9]* means any number of digits (including none)
- ?: means zero or one repetition. For example, [0-9]? means an optional digit
- +: means one or more repetitions. For example, [0-9]+ means one or more digits
- {x}: means exactly x instances. For example, [0-9]{10} means ten digits, and .{3} means three instances of any character
- {x,y}: means between x and y instances. For example, [0-9]{2,4} means two to four digits
- {x,}: means at least x instances. For example, .{5,} means at least 5 characters
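- The repetition operators can be compared side by side with a sketch like this. The class name is made up, and the {2,4} expression is an extra illustration of the {x,y} form rather than something from the original card.

```java
class RepetitionDemo {
    // Thin wrapper so each check reads as "does this whole string match this expression?"
    static boolean matches(String regex, String s) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matches("[0-9]*", ""));              // true: zero digits is allowed
        System.out.println(matches("[0-9]*", "12a"));           // false: 'a' is not a digit
        System.out.println(matches("[0-9]?", "77"));            // false: at most one digit
        System.out.println(matches("[0-9]+", ""));              // false: at least one digit required
        System.out.println(matches("[0-9]{10}", "0123456789")); // true: exactly ten digits
        System.out.println(matches("[0-9]{2,4}", "12345"));     // false: five digits is too many
        System.out.println(matches(".{5,}", "hello"));          // true: five characters
    }
}
```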
8
Q
Named Character Classes
A
- For certain common character choices we can put a backslash followed by a character class name.
- To match a single digit, we can write the expression \d.
- So we can now write our ‘contains ten digits’ check as follows: str.matches(".*\\d{10}.*")
- Another useful character class is \s. This matches so-called whitespace: spaces, tabs and line breaks.
- To write \s or \d inside a Java string literal, we need to double the backslash: "\\s", "\\d".
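- A sketch of the named classes with the backslashes doubled as Java requires. The helper names are hypothetical. In the source code the literal reads "\\d", but the regex engine receives \d.

```java
class NamedClassDemo {
    // Hypothetical helper: true when the string contains ten consecutive digits.
    static boolean containsTenDigits(String str) {
        return str.matches(".*\\d{10}.*");
    }

    // Hypothetical helper: true when the string is made up only of whitespace (at least one character).
    static boolean isAllWhitespace(String str) {
        return str.matches("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(containsTenDigits("call 0123456789 now")); // true
        System.out.println(containsTenDigits("call 12345 now"));      // false
        System.out.println(isAllWhitespace(" \t\n"));                 // true
        System.out.println(isAllWhitespace(" x "));                   // false
    }
}
```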
9
Q
Pattern.compile() method
A
- We call the static method Pattern.compile(), passing in the expression. This method returns a Pattern object, which holds an internal representation of the pattern in a form that makes it efficient to perform matches.
- Pattern patt = Pattern.compile(".*?[0-9]{10}.*");
10
Q
matcher() method
A
- To check whether a particular string matches the pattern, we call the matcher() method on the Pattern object, passing in the string to match. This returns a Matcher object:
- Matcher m = patt.matcher(str);
11
Q
Usage with Strings
A
- The third step is to call the matches() method on the Matcher we have just created. It returns a boolean indicating whether or not the entire string passed into the matcher() method matches the regular expression.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public boolean containsTenDigits(String str) {
    Pattern patt = Pattern.compile(".*?[0-9]{10}.*");
    Matcher m = patt.matcher(str);
    return m.matches();
}
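- One possible variation on the method above, offered as a sketch rather than the tutorial's own approach: compile the Pattern once in a static field and reuse it, and call find() instead of matches(). find() looks for the expression anywhere in the string, so the surrounding .* is not needed. The class and constant names (TenDigitChecker, TEN_DIGITS) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TenDigitChecker {
    // Compiled once and reused; Pattern instances are immutable and safe to share.
    private static final Pattern TEN_DIGITS = Pattern.compile("[0-9]{10}");

    static boolean containsTenDigits(String str) {
        Matcher m = TEN_DIGITS.matcher(str);
        // find() succeeds if any part of the string matches the expression.
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(containsTenDigits("abc0123456789xyz")); // true
        System.out.println(containsTenDigits("12345"));            // false
    }
}
```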
12
Q
Useful Links
A
- https://www.javamex.com/tutorials/regular_expressions/index.shtml