Regex Flashcards
g
Global match — find all matches rather than only the first one
m
Multi-line match - tells the engine to treat the subject string as multiple lines. ^ and $ also match right after and right before each \n, not only at the start and end of the entire string
i
Ignore case — match both lower and upper case letters
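A minimal sketch of how the three flags change a result, assuming a JavaScript engine (the flavor these cards appear to describe). The sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample: two lines, mixed case.
const flagText = "cat Cat\ncat";

// No flags: match() stops at the first hit.
const firstOnly = flagText.match(/cat/)[0];     // "cat"
// g: collect every case-sensitive hit.
const allExact = flagText.match(/cat/g);        // ["cat", "cat"]
// g + i: case is ignored, so "Cat" is included too.
const allAnyCase = flagText.match(/cat/gi);     // ["cat", "Cat", "cat"]
// Without m, ^ matches only at the start of the whole string.
const startNoM = flagText.match(/^cat/g);       // ["cat"]
// With m, ^ also matches right after the \n.
const startWithM = flagText.match(/^cat/gm);    // ["cat", "cat"]

console.log(firstOnly, allExact, allAnyCase, startNoM, startWithM);
```

Flags can be combined in any order, for example /^cat/gim.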
.
Matches any single character except a line terminator such as the newline character. Example: .at matches bat, cat, rat and also .at, 1at. Equivalent to [^\x0A\x0D\u2028\u2029]
[^…]
The negated bracket expression or negated character class matches any single character not contained within the brackets or range of characters. It works like the bracket expression […], except that the leading ^ negates it. Example: [^0-9] matches any character that's not a digit. The ^ character is special only when it's listed first; anywhere else within the brackets it's treated as a literal and doesn't need to be escaped. Example: [^] matches anything (in JavaScript, including newlines), [^^] matches anything except the ^ character.
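The dot and the negated bracket expression can be checked with a short sketch like the one below. It assumes JavaScript, where an empty negated class [^] is allowed; the sample words are arbitrary.

```javascript
// Hypothetical sample words; the last one starts with a newline.
const dotWords = ["bat", "cat", ".at", "1at", "\nat"];

// . accepts any character except a line terminator, so "\nat" is rejected.
const dotHits = dotWords.filter((w) => /^.at$/.test(w)); // ["bat", "cat", ".at", "1at"]

// [^0-9] matches every character that is not a digit.
const nonDigits = "a1b22".match(/[^0-9]/g);              // ["a", "b"]

// [^^] matches everything except a literal ^.
const notCaret = "a^b".match(/[^^]/g);                   // ["a", "b"]

// [^] matches any character at all, newline included.
const anyChar = /[^]/.test("\n");                        // true

console.log(dotHits, nonDigits, notCaret, anyChar);
```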
\w
Word character. Equivalent to [A-Za-z0-9_]
\W
Non-word character. Equivalent to [^A-Za-z0-9_]
\d
Digit character. Equivalent to [0-9]
\D
Non-digit character. Equivalent to [^0-9]
\s
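Whitespace character. Equivalent to [\f\n\r\t\v\u0020\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000\uFEFF] (\u0020 is the ordinary space, \u00A0 means “no-break space”, \u2028 means “line separator”, \u2029 means “paragraph separator”; the remaining code points are other Unicode spaces and the byte order mark)
\S
Non-whitespace character. Equivalent to [^\f\n\r\t\v\u0020\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000\uFEFF]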
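The shorthand classes \w, \W, \d, \D, \s and \S are easiest to remember by seeing what they pick out of a string. The following is a small sketch, assuming JavaScript without the u or v flag; all input strings are invented.

```javascript
// \w+ grabs runs of letters, digits and underscores.
const wordRuns = "user_42 logged in".match(/\w+/g);   // ["user_42", "logged", "in"]

// \W picks out everything that is not a word character.
const nonWord = "a-b c".match(/\W/g);                 // ["-", " "]

// \d+ grabs runs of digits; \D removes everything else.
const digitRuns = "tel: 555-0199".match(/\d+/g);      // ["555", "0199"]
const digitsOnly = "a1b2".replace(/\D/g, "");         // "12"

// \s covers space, tab and no-break space alike.
const splitOnSpace = "a b\tc\u00A0d".split(/\s+/);    // ["a", "b", "c", "d"]

// \S+ grabs runs of non-whitespace characters.
const tokens = "x=1  y=2".match(/\S+/g);              // ["x=1", "y=2"]

console.log(wordRuns, nonWord, digitRuns, digitsOnly, splitOnSpace, tokens);
```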
\b
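Word boundary: a zero-width position between a word character (\w) and a non-word character or the start or end of the string. Inside a bracket expression, [\b] matches a backspace (\x08) instead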
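The two meanings of \b can be told apart with a quick sketch, again assuming JavaScript. The test strings are arbitrary.

```javascript
// Outside brackets, \b is a zero-width word boundary.
const wholeWord = /\bcat\b/.test("a cat sat");       // true: "cat" stands alone
const insideWord = /\bcat\b/.test("concatenate");    // false: "cat" is inside a longer word

// Inside brackets, [\b] matches the backspace character U+0008.
const backspaceInClass = /[\b]/.test("\x08");        // true
// Outside brackets it does not: a lone backspace has no word boundary around it.
const backspaceOutside = /\b/.test("\x08");          // false

console.log(wholeWord, insideWord, backspaceInClass, backspaceOutside);
```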
\f
Form-feed (\x0C)
\n
Linefeed or newline (\x0A)
\r
Carriage return (\x0D)
\t
Tab (\x09)
\v
Vertical tab (\x0B)
\0
Null character (\x00)
\xhh
Character with hexadecimal code hh.
\uhhhh
Character with hexadecimal code hhhh.
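The character escapes above can be tried out as follows. This is a small sketch assuming JavaScript, with made-up input strings; the \r?\n pattern borrows the ? quantifier from a later card.

```javascript
// \xhh: two hex digits. 0x41 is "A".
const hexEscape = /\x41/.test("A");                            // true
// \uhhhh: four hex digits. U+00E9 is "é".
const unicodeEscape = /\u00E9/.test("caf\u00E9");              // true
// \t and \0 match a tab and a null character.
const tabEscape = /\t/.test("a\tb");                           // true
const nullEscape = /\0/.test("a\0b");                          // true
// \r?\n splits on both Windows (\r\n) and Unix (\n) line endings.
const splitLines = "one\r\ntwo\nthree".split(/\r?\n/);         // ["one", "two", "three"]

console.log(hexEscape, unicodeEscape, tabEscape, nullEscape, splitLines);
```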
[…]
The bracket expression specifies a character class and matches any single character contained within the brackets or range of characters. Example: [abc] matches a, b or c. A range of characters is specified using the -, for example [a-z] matches any lowercase ASCII letter from a to z. Other examples include [A-F] which matches any uppercase ASCII letter from A to F, and [4-7] which matches any digit from 4 to 7. The - character is treated as a literal character if it's listed first, last or escaped: [-] matches -, [a-] matches a or -, [a\-z] matches a, - or z. Additionally, listed characters can be mixed with ranges of characters. Example: [0-9a-fA-F] matches any digit and also letters from a to f irrespective of their case, [02468aeiouy-] matches even digits, vowels and the - character. Brackets inside bracket expressions are treated as literals if they are escaped. Example: [\[\]] matches [ or ]. The [ doesn't need to be escaped if it's listed first: [[] matches [
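A few of the bracket expressions from this card, run against invented strings. This is only a sketch and assumes JavaScript without the v flag.

```javascript
// Each bracket expression matches exactly one character per hit.
const anyOfAbc = "cab".match(/[abc]/g);               // ["c", "a", "b"]

// Ranges and listed characters can be mixed: hexadecimal digits.
const hexDigits = "0x1F".match(/[0-9a-fA-F]/g);       // ["0", "1", "F"] ("x" is not a hex digit)

// An escaped - is a literal, not a range operator.
const literalDash = "ba-z".match(/[a\-z]/g);          // ["a", "-", "z"]

// A trailing - is a literal as well.
const evenVowelDash = "2y-k".match(/[02468aeiouy-]/g); // ["2", "y", "-"]

// Escaped brackets are literals inside the class.
const brackets = "[x]".match(/[\[\]]/g);              // ["[", "]"]

console.log(anyOfAbc, hexDigits, literalDash, evenVowelDash, brackets);
```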
?
Match 0 or 1 times. Example: ab? matches a and ab
+
Match 1 or more times. Example: ab+ matches ab, abb, abbb etc.
{n}
Match exactly n times. Example: ab{2} matches abb
{n,}
Match n or more times. Example: ab{2,} matches abb, abbb, abbbb etc.
{n,m}
Match at least n times, but no more than m times. Example: ab{2,3} matches abb and abbb
??
Match 0 or 1 times, but as few times as possible. Example: ab?? against abbbbb matches a
*?
Match 0 or more times, but as few times as possible. Example: ab*? against abbbbb matches a
+?
Match 1 or more times, but as few times as possible. Example: ab+? against abbbbb matches ab
{n,}?
Match n or more times, but as few times as possible. Example: ab{2,}? against abbbbb matches abb
{n,m}?
Match at least n times, no more than m times, but as few times as possible. Example: ab{2,3}? against abbbbb matches abb
*
Match 0 or more times. Example: ab* matches a, ab, abb, abbb etc.
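Greedy and lazy quantifiers are easiest to compare side by side on the same subject string. The sketch below assumes JavaScript; firstMatch is a hypothetical helper that returns the first match of a pattern against abbbbb.

```javascript
const quantSubject = "abbbbb";
// Hypothetical helper: text of the first match against the subject.
const firstMatch = (re) => quantSubject.match(re)[0];

const quantResults = {
  optional:      firstMatch(/ab?/),       // "ab"     greedy: takes the b if it can
  optionalLazy:  firstMatch(/ab??/),      // "a"      lazy: skips the b
  star:          firstMatch(/ab*/),       // "abbbbb" greedy: takes every b
  starLazy:      firstMatch(/ab*?/),      // "a"      lazy: takes none
  plus:          firstMatch(/ab+/),       // "abbbbb"
  plusLazy:      firstMatch(/ab+?/),      // "ab"     lazy: one b is the minimum
  exactly2:      firstMatch(/ab{2}/),     // "abb"
  atLeast2:      firstMatch(/ab{2,}/),    // "abbbbb"
  atLeast2Lazy:  firstMatch(/ab{2,}?/),   // "abb"
  from2To3:      firstMatch(/ab{2,3}/),   // "abbb"
  from2To3Lazy:  firstMatch(/ab{2,3}?/),  // "abb"
};

console.log(quantResults);
```

A lazy quantifier still has to satisfy its minimum count, which is why ab+? yields ab rather than a.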
(…)
Capturing group - group subpattern and capture the match. Example: (foo)bar matches foobar and captures foo
…|…
Alternation operator - matches one of the alternative subpatterns. Example: foo|bar|baz matches either foo, bar or baz
(?:…)
Non-capturing group - group subpattern, but don’t capture the match. Example: (?:foo)bar matches foobar and doesn’t capture anything
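The difference between capturing and non-capturing groups shows up in the array that match() returns. The following sketch assumes JavaScript, and the word list is invented.

```javascript
// A capturing group adds its text as an extra entry after the full match.
const capturing = "foobar".match(/(foo)bar/);
const capturedFull = capturing[0];                 // "foobar"
const capturedGroup = capturing[1];                // "foo"

// A non-capturing group only groups; the result holds the full match alone.
const nonCapturing = "foobar".match(/(?:foo)bar/);
const nonCapturingLength = nonCapturing.length;    // 1

// Alternation inside a group, anchored so the whole word must be one option.
const candidates = ["foo", "bar", "baz", "qux"];
const accepted = candidates.filter((w) => /^(?:foo|bar|baz)$/.test(w)); // ["foo", "bar", "baz"]

console.log(capturedFull, capturedGroup, nonCapturingLength, accepted);
```

Wrapping the alternatives in (?:…) keeps ^ and $ applying to all three options instead of only the first and last.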