ADV PROG REGEX Flashcards
is used to turn off (escape) the special meaning of a meta-character or to confer special meaning to other characters.
Backslash ( \ ), known as the escape character
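A minimal Python sketch of both roles of the backslash, assuming the standard `re` module; the sample strings are made up for illustration.

```python
import re

# An unescaped dot is a meta-character, so it matches any character here.
unescaped = re.findall(r"3.14", "3.14 3x14")   # ['3.14', '3x14']

# Escaping the dot turns off its special meaning: only a literal "." matches.
escaped = re.findall(r"3\.14", "3.14 3x14")    # ['3.14']

# The backslash can also confer special meaning: "d" is a plain letter,
# while "\d" stands for any digit.
digits = re.findall(r"\d", "d1 d2")            # ['1', '2']

print(unescaped, escaped, digits)
```

Raw strings (`r"..."`) are used so that Python itself does not interpret the backslash before the regex engine sees it.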
is a list of values placed inside square brackets to match a single character in data.
character class ( [ ] )
a caret ( ^ ) placed first inside the brackets negates the list
[0-9]
Any digit character
\d
[^0-9]
Any non-digit character
\D
[A-Za-z0-9_]
Any word character
\w
[^A-Za-z0-9_]
Any non-word character
\W
[\t\r\n\f ]
Any whitespace character
\s
[^\t\r\n\f ]
Any non-whitespace character
\S
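The class/shorthand pairs above can be compared side by side. This is a small sketch assuming Python's `re` module; the sample text is invented. Note that in Python 3 `\d` and `\w` also match non-ASCII digits and letters unless `re.ASCII` is set, so the flag is used here to make the shorthands line up with the explicit lists.

```python
import re

text = "Room 42_b, floor 7"

# An explicit class and its shorthand pick out the same characters.
digits_class = re.findall(r"[0-9]", text)
digits_short = re.findall(r"\d", text, flags=re.ASCII)

# A leading caret inside the brackets negates the list.
non_word_class = re.findall(r"[^A-Za-z0-9_]", text)
non_word_short = re.findall(r"\W", text, flags=re.ASCII)

print(digits_class)     # ['4', '2', '7']
print(non_word_class)   # [' ', ',', ' ', ' ']
```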
is either a single character or a group of related characters.
token
used to indicate how many times the previous token should be repeated.
quantifier ( { } )
{0,}
Zero or more of the previous token
*
{1,}
One or more of the previous token
+
{0,1}
Zero or one of the previous token
?
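To see the three shorthand quantifiers next to their brace forms, here is a short sketch in Python's `re` module. The helper `matching` and the sample strings are hypothetical, chosen only to illustrate the cards above.

```python
import re

samples = ["ac", "abc", "abbbc"]

def matching(pattern):
    """Return the samples that the pattern matches from start to end."""
    return [s for s in samples if re.fullmatch(pattern, s)]

star = matching(r"ab*c")       # zero or more "b": ['ac', 'abc', 'abbbc']
plus = matching(r"ab+c")       # one or more "b":  ['abc', 'abbbc']
optional = matching(r"ab?c")   # zero or one "b":  ['ac', 'abc']

# The brace forms behave the same as the shorthands.
same_as_star = matching(r"ab{0,}c")
same_as_plus = matching(r"ab{1,}c")
same_as_optional = matching(r"ab{0,1}c")

print(star, plus, optional)
```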
used to match the start of a string, except when part of a character class.
caret ( ^ )
used to match the end of a string.
dollar sign ( $ )
used to match a zero-length position between a word character and a non-word character (or the start or end of the string), so a pattern wrapped in \b matches whole words only, anywhere in a line of text.
boundary anchor ( \b )
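The three anchors can be contrasted on one string. A minimal sketch using Python's `re` module, with an invented sample; anchors consume no characters, they only constrain where a match may sit.

```python
import re

text = "cat concat cat"

anywhere = re.findall(r"cat", text)          # 3 hits, including inside "concat"
at_start = re.findall(r"^cat", text)         # only the "cat" at position 0
at_end = re.findall(r"cat$", text)           # only the "cat" that ends the string
whole_words = re.findall(r"\bcat\b", text)   # skips the "cat" inside "concat"

print(len(anywhere), len(at_start), len(at_end), len(whole_words))  # 3 1 1 2
```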
used to match any single character, except a newline ( \n ).
dot ( . )
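A quick check of which character the dot skips, sketched with Python's `re` module. In this flavor the dot does not match a newline by default, and the `re.DOTALL` flag lifts that restriction; other regex flavors may name the option differently.

```python
import re

text = "a\nb"

# By default the dot does not match the newline between "a" and "b".
default = re.findall(r".", text)                      # ['a', 'b']

# With re.DOTALL the dot matches every character, newline included.
with_dotall = re.findall(r".", text, flags=re.DOTALL)  # ['a', '\n', 'b']

print(default, with_dotall)
```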
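used to make a group of tokens that can be quantified as a unit or to create a list of string values from which to choose; the text captured by a group can be matched again later in the pattern with a back reference.
group ( ( ) ) and back references ( \1 … \9 )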
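Grouping and back references are easier to tell apart in a short example. This sketch assumes Python's `re` module, where parentheses form a capturing group and `\1` refers back to whatever the first group matched; the sample strings are invented.

```python
import re

# A group lets the quantifier apply to "ab" as a unit rather than to "b" alone.
grouped = re.fullmatch(r"(ab){3}", "ababab") is not None   # True
ungrouped = re.fullmatch(r"ab{3}", "ababab") is not None   # False: needs "abbb"

# \1 must repeat the exact text captured by group 1, which finds doubled words.
# With one capturing group, findall returns that group's text for each match.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")

print(grouped, ungrouped, doubled)   # True False ['is', 'test']
```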
used in a group to create a selection of strings from which to choose, matching either the expression on its left or the expression on its right.
pipe ( | )
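The effect of putting the pipe inside a group can be sketched as follows, assuming Python's `re` module and made-up sample text. Without a group, the pipe splits the whole pattern in two, which is rarely what is intended.

```python
import re

text = "gray grey groy"

# Inside a group, the choice is limited to "a" or "e" between "gr" and "y".
in_group = [m.group(0) for m in re.finditer(r"gr(a|e)y", text)]

# Without the group, the alternatives are the whole left side "gra"
# and the whole right side "ey".
no_group = re.findall(r"gra|ey", "gray grey")

print(in_group)   # ['gray', 'grey']
print(no_group)   # ['gra', 'ey']
```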