Chapter 11: Regex Flashcards
Code that works when the input data is in a particular format but breaks when the input deviates from that format. In other words, code that is easily broken.
brittle code
The behavior in which the regex + and * characters expand outward to match the LARGEST possible string.
greedy matching
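A small sketch of greedy matching, assuming a made-up sample line. The `+?` form in the second call is not covered on these cards; it is shown only as a contrast.

```python
import re

# Hypothetical sample line, chosen only for illustration.
line = "From: Using the : character"

# '.+' is greedy: it expands as far as it can while still leaving
# a ':' to finish the match, so the match runs to the LAST colon.
greedy = re.findall("^F.+:", line)
print(greedy)

# Contrast (not on the cards): '+?' stops at the FIRST colon instead.
lazy = re.findall("^F.+?:", line)
print(lazy)
```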
A command available in most Unix systems that searches through text files looking for lines that match regular expressions.
grep
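Name comes from the ed editor command g/re/p (globally search for a regular expression and print); sometimes glossed as "General Regular Expression Parser"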
A language for expressing more complex search strings. A regular expression may contain special characters that, for example, restrict a match to the beginning or end of a line, along with many similar capabilities.
regular expression
(regex)
A special character that matches any character. In regular expressions it’s the period.
wild card
regular expression module
re
import re
re function that searches text for the first match of a regular expression; returns a match object, or None if nothing matches
re.search(regex, search string)
regex that matches beginning of line
'^'
re.search('^From:', line)
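As a rough illustration of the two cards above, the sketch below filters a few invented lines with re.search and the ^ anchor. The sample lines and the `matching` name are assumptions made for the example.

```python
import re

# Invented sample lines; only the first one STARTS with 'From:'.
lines = [
    "From: alice@example.com",
    "Subject: From: is not at the start here",
    "X-From: relay",
]

# re.search returns a match object (truthy) or None (falsy),
# so it can be used directly as a filter condition.
matching = [line for line in lines if re.search("^From:", line)]
print(matching)
```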
regex that matches any character (a wildcard)
. (period/full stop)
re.search('F..m', line) matches From, Flam, F#om, etc.
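A quick check of the wildcard, assuming a handful of made-up candidate strings. Each period must be filled by exactly one character, so strings with too few or too many characters between F and m do not match.

```python
import re

# Made-up candidates: the last two have one and four characters
# between 'F' and 'm', so 'F..m' cannot match them.
candidates = ["From", "Flam", "F#om", "Fom", "Frooom"]

hits = [word for word in candidates if re.search("F..m", word)]
print(hits)
```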
regex character that applies to the immediately preceding character(s) and matches them zero or more times.
*
regex character that applies to the immediately preceding character(s) and matches them one or more times.
+
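The difference between * and + can be sketched with three invented strings: only * accepts the case where the preceding character is missing entirely.

```python
import re

# Invented samples with zero, one, and three 'b' characters.
samples = ["ac", "abc", "abbbc"]

# 'b*' allows zero or more 'b', so "ac" matches.
star = [s for s in samples if re.search("ab*c", s)]

# 'b+' requires at least one 'b', so "ac" does not match.
plus = [s for s in samples if re.search("ab+c", s)]

print(star)
print(plus)
```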
re function that returns a list of the substring(s) that match a regular expression
re.findall(regex, search string)
Called on each line inside a for loop, it returns one list per line: ['substring1'], then ['substring2']
regex that matches non-whitespace character
\S
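One common use of \S, sketched here with invented lines and example.com addresses, is pulling out email-like strings with re.findall inside a for loop. The pattern and the `per_line` name are illustrative assumptions, not a complete email matcher.

```python
import re

# Invented input lines for illustration.
lines = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "Return-Path: <bob@example.org>",
    "no address on this line",
]

per_line = []
for line in lines:
    # '\S+' is one or more non-whitespace characters, so the match
    # grows outward from '@' until it hits whitespace on each side.
    found = re.findall(r"\S+@\S+", line)
    per_line.append(found)
    print(found)
```

Note that the angle brackets in the second line are kept, because they are non-whitespace characters too.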
regex format to accept specific characters
'[]'
Set notation
re.findall('[a-zA-Z0-9]', line)
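A minimal sketch of set notation, assuming a made-up string. On its own the set matches one character at a time; adding + after it matches runs of those characters.

```python
import re

# Made-up text containing letters, digits, and punctuation.
text = "Room 42, Bldg B-7!"

# One list entry per letter or digit; spaces and punctuation are skipped.
chars = re.findall("[a-zA-Z0-9]", text)

# With '+', consecutive letters and digits are grouped into runs.
runs = re.findall("[a-zA-Z0-9]+", text)

print(chars)
print(runs)
```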
regex format to match an actual period
[.] or \.
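To see why the literal forms matter, the sketch below compares a bare period with [.] and \. on an invented string.

```python
import re

# Invented text: one real period and one 'x' in the same position.
text = "v1.5 and v1x5"

# A bare '.' is the wildcard, so it matches both.
wildcard = re.findall("1.5", text)

# '[.]' and '\.' both match only an actual period.
bracket = re.findall("1[.]5", text)
escaped = re.findall(r"1\.5", text)

print(wildcard, bracket, escaped)
```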