Special characters and character classes Flashcards
[…]
The bracket expression specifies a character class and matches any single character that is listed within the brackets or falls within a listed range. Example: [abc] matches a, b or c. A range of characters is specified using -, for example [a-z] matches any lowercase ASCII letter from a to z. Other examples include [A-F], which matches any uppercase ASCII letter from A to F, and [4-7], which matches any digit from 4 to 7. The - character is treated as a literal character if it's listed first, last or escaped: [-] matches -, [a-] matches a or -, [a\-z] matches a, - or z. Additionally, listed characters can be mixed with ranges of characters. Example: [0-9a-fA-F] matches any digit and any letter from a to f irrespective of case (that is, a hexadecimal digit), [02468aeiouy-] matches an even digit, a lowercase vowel (including y) or the - character. Brackets inside bracket expressions are treated as literals if they are escaped. Example: [\[\]] matches [ or ]. The [ doesn't need to be escaped if it's listed first: [[] matches [.
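A minimal sketch of these bracket expressions, assuming the JavaScript regex flavour (the cards do not name an engine). The variable names are illustrative only.

```javascript
// Each class matches exactly one character; ^ and $ pin the test to the whole string.
const hexDigit = /^[0-9a-fA-F]$/;
const escapedHyphen = /^[a\-z]$/;   // escaped hyphen: a, - or z (no range)
const trailingHyphen = /^[a-]$/;    // hyphen listed last: a or -
const brackets = /^[\[\]]$/;        // escaped brackets: [ or ]

console.log(hexDigit.test("f"));        // true
console.log(hexDigit.test("g"));        // false
console.log(hexDigit.test("ff"));       // false, a class consumes a single character
console.log(escapedHyphen.test("-"));   // true
console.log(escapedHyphen.test("m"));   // false, a\-z is not a range
console.log(trailingHyphen.test("-"));  // true
console.log(brackets.test("]"));        // true
```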
.
Matches any single character except a line terminator (newline, carriage return, line separator or paragraph separator). Example: .at matches bat, cat, rat and also .at, 1at
Equivalent to [^\x0A\x0D\u2028\u2029]
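As a quick check of the dot, here is a small sketch that assumes the JavaScript regex flavour and no flags; `dotAt` is just an illustrative name.

```javascript
// The dot stands for one character that is not a line terminator.
const dotAt = /^.at$/;

console.log(dotAt.test("bat"));       // true
console.log(dotAt.test(".at"));       // true, the dot also matches a literal "."
console.log(dotAt.test("1at"));       // true
console.log(dotAt.test("\nat"));      // false, newline is excluded
console.log(dotAt.test("\u2028at"));  // false, line separator is excluded
```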
[^…]
The negated bracket expression or negated character class matches any single character not contained within the brackets or range of characters. Same as the bracket expression, except that the leading ^ negates it. Example: [^0-9] matches any character that's not a digit. The ^ character is special only when it's the first character inside the brackets; anywhere else it's treated as a literal and doesn't need to be escaped. Example: [^] matches any character at all, [^^] matches anything except the ^ character.
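The negated forms can be sketched as follows, assuming the JavaScript regex flavour. The empty negated class [^] in particular is accepted by JavaScript but rejected or read differently by several other engines. Variable names are illustrative.

```javascript
const nonDigit = /^[^0-9]$/;  // any single character that is not a digit
const notCaret = /^[^^]$/;    // first ^ negates, second ^ is a literal
const anything = /^[^]$/;     // empty negated class: any character, newline included

console.log(nonDigit.test("a"));   // true
console.log(nonDigit.test("5"));   // false
console.log(notCaret.test("a"));   // true
console.log(notCaret.test("^"));   // false
console.log(anything.test("\n"));  // true
```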
\w
Word character
Equivalent to [A-Za-z0-9_]
\W
Non-word character
Equivalent to [^A-Za-z0-9_]
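A short sketch of \w and \W, assuming the JavaScript regex flavour without flags, where both are ASCII-only as the equivalences above state. Names are illustrative.

```javascript
const identifier = /^\w+$/;  // one or more of A-Z, a-z, 0-9 or _

console.log(identifier.test("snake_case_42"));  // true
console.log(identifier.test("kebab-case"));     // false, - is a non-word character
console.log(/\w/.test("\u00E9"));               // false, accented letters fall outside [A-Za-z0-9_]
console.log(/\W/.test("\u00E9"));               // true

// Splitting on runs of non-word characters leaves only the words.
const words = "a-b, c".split(/\W+/);
console.log(words);                             // [ 'a', 'b', 'c' ]
```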
\d
Digit character
Equivalent to [0-9]
\D
Non-digit character
Equivalent to [^0-9]
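For illustration, a brief sketch of \d and \D, assuming the JavaScript regex flavour (where \d covers only the ASCII digits 0 to 9). Names are illustrative.

```javascript
// Pull out every run of digits from a date-like string.
const parts = "2024-05-17".match(/\d+/g);
console.log(parts);                       // [ '2024', '05', '17' ]

// Strip everything that is not a digit.
const digitsOnly = "a1b22".replace(/\D/g, "");
console.log(digitsOnly);                  // 122

// U+0663 is ARABIC-INDIC DIGIT THREE; it is not in [0-9].
console.log(/^\d$/.test("\u0663"));       // false
```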
\s
Whitespace character
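Equivalent to [\f\n\r\t\v\u0020\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000\uFEFF] (\u0020 means “space”, \u00A0 means “no-break space”, \u2028 means “line separator”, \u2029 means “paragraph separator”; the remaining entries are other Unicode space characters and the byte order mark)
\S
Non-whitespace character
Equivalent to [^\f\n\r\t\v\u0020\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000\uFEFF]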
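A small sketch of \s and \S in use, assuming the JavaScript regex flavour, in which \s also covers the ordinary space character. Names are illustrative.

```javascript
// Split on runs of whitespace: space, tab and no-break space all count.
const tokens = "one \t two\u00A0three".split(/\s+/);
console.log(tokens);                 // [ 'one', 'two', 'three' ]

// Collect runs of non-whitespace characters.
const chunks = " hi  there ".match(/\S+/g);
console.log(chunks);                 // [ 'hi', 'there' ]

console.log(/^\s$/.test(" "));       // true
console.log(/^\s$/.test("\u2028"));  // true
console.log(/^\S$/.test("x"));       // true
```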
\b
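Word boundary (the position between a word character and a non-word character, or between a word character and the start or end of the string)
Inside a bracket expression, [\b] instead matches a backspace (\x08)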
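The two meanings of \b can be told apart with a short sketch, assuming the JavaScript regex flavour: outside brackets it is a zero-width word boundary, inside brackets it is the backspace character. Names are illustrative.

```javascript
const wholeWord = /\bcat\b/;  // "cat" with a word boundary on both sides
const backspace = /[\b]/;     // inside brackets, \b is the backspace character

console.log(wholeWord.test("a cat sat"));    // true
console.log(wholeWord.test("cat"));          // true, string start and end count as boundaries
console.log(wholeWord.test("concatenate"));  // false, "cat" sits inside a longer word
console.log(backspace.test("\x08"));         // true
console.log(backspace.test("b"));            // false
```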
\f
Form-feed (\x0C)
\n
Linefeed or newline (\x0A)
\r
Carriage return (\x0D)
\t
Tab (\x09)
\v
Vertical tab (\x0B)
\0
Null character (\x00)
\xhh
Character with hexadecimal code hh.
\uhhhh
Character with hexadecimal code hhhh (exactly four hexadecimal digits).
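The escape sequences above can be checked against their character codes with a short sketch, assuming the JavaScript regex flavour. Names are illustrative.

```javascript
const capitalA = /^\x41$/;      // \x41 is "A"
const eAcute = /^\u00E9$/;      // \u00E9 is "é"
const tabThenNul = /^\t\0$/;    // a tab followed by a null character

console.log(capitalA.test("A"));                         // true
console.log(eAcute.test(String.fromCharCode(0xE9)));     // true
console.log(tabThenNul.test("\x09\x00"));                // true
console.log(/^\n$/.test("\x0A"));                        // true
console.log(/^\r$/.test("\x0D"));                        // true
console.log(/^\f\v$/.test("\x0C\x0B"));                  // true
```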