regex Flashcards
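What is a regular expression?
A sequence of characters that defines a search pattern. In Python, regular expressions are used through the built-in re module.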
What does re.search do?
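Scans the string for the first place where the pattern matches and returns a Match object, or None if there is no match. Because a Match object is truthy and None is falsy, the result can be tested in an if statement to print your own message.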
import re

word = input("Enter a word: ")
pattern = 'as'
# re.search returns a Match object (truthy) or None (falsy)
if re.search(pattern, word):
    print("The pattern matches the word")
else:
    print("The pattern doesn't match the word")
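To see what re.search actually hands back, here is a minimal sketch that uses hard-coded words instead of input(). The variable names are made up for this example. It shows that a successful search gives a Match object carrying the matched text and its position, and a failed search gives None.

```python
import re

pattern = "as"

# A successful search returns a Match object (truthy in an if statement).
m = re.search(pattern, "flashcards")
matched_text = m.group()   # the text that matched
matched_span = m.span()    # (start, end) indexes of the match

# A failed search returns None (falsy in an if statement).
no_match = re.search(pattern, "python")

print(matched_text, matched_span, no_match)
```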
How do you use re.search to search for more than one pattern?
Put the patterns in a list and loop over it, calling re.search on each one.
import re

pats = ['a..s', 'a*s']
word = input('Enter a word: ')
for pat in pats:
    # Test each pattern against the same word
    if re.search(pat, word):
        print("The pattern", pat, "matches")
    else:
        print("The pattern", pat, "doesn't match")
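'a..s'
What pattern would this be looking for?
An 'a', then any two characters (anything except a newline), then an 's'.
'a*s'
What pattern would this be looking for?
Zero or more 'a' characters followed by an 's'. The * repeats the character just before it, so it does not mean "any number of characters"; an 'a' followed by any number of characters and then an 's' would be 'a.*s'.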
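The difference between the dot and the star is easy to mix up, so here is a small sketch that runs both patterns, plus 'a.*s' for comparison. The test words are made up for illustration.

```python
import re

# 'a..s': an 'a', exactly two characters of any kind, then an 's'
two_between = re.search("a..s", "oasis").group()
too_short = re.search("a..s", "as")            # nothing between 'a' and 's'

# 'a*s': zero or more 'a' characters, then an 's'
many_a = re.search("a*s", "caaas").group()
zero_a = re.search("a*s", "bus").group()       # zero 'a' characters is allowed

# 'a.*s': an 'a', any number of characters, then an 's'
anything_between = re.search("a.*s", "a-b-s").group()

print(two_between, too_short, many_a, zero_a, anything_between)
```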
What does re.findall() do?
The findall() method returns a list containing all non-overlapping matches as strings.
a = "Let's find the word 'regex' using regexes!"
print(a)
re.findall(r'regex', a)
RESULT:
['regex', 'regex']
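A short sketch of what "non-overlapping" means in practice, using a made-up string: once a piece of text is part of one match, findall does not reuse it for another. It also shows that no matches gives an empty list rather than None.

```python
import re

text = "aaaa"

# The matches are 'aa' at positions 0-1 and 2-3; the overlapping
# 'aa' at positions 1-2 is not reported.
pairs = re.findall(r"aa", text)

# No match at all gives an empty list.
nothing = re.findall(r"z", text)

print(pairs, nothing)
```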
What is a raw string, and why do we use it in regex?
Raw strings in Python are string literals in which a backslash (\) is treated as a literal character rather than the start of an escape sequence. In a regular string, a backslash introduces escapes such as newline (\n) or tab (\t). Regex patterns use backslashes heavily, so writing them as raw strings means Python passes each backslash through to the re module unchanged instead of interpreting it first.
How do you differentiate between a raw string and a regular string?
Put an r immediately before the opening quote.
Regular string:
print("This\t will do a tab space and \nthis will go on a new line\n")
Raw string:
print(r"This\t won't do a tab space and \nthis won't go on a new line\n")
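Here is a minimal sketch of why the r prefix matters for patterns. It uses \d+ (one or more digits) purely as an illustration, and the sample text is made up. Without the r, every backslash in the pattern has to be doubled.

```python
import re

# The same pattern written two ways.
regular = "\\d+"   # regular string: the backslash must be doubled
raw = r"\d+"       # raw string: written exactly as the regex reads
same = regular == raw

# "\n" is one newline character; r"\n" is a backslash followed by 'n'.
len_regular_newline = len("\n")
len_raw_newline = len(r"\n")

digits = re.findall(raw, "room 12, floor 3")

print(same, len_regular_newline, len_raw_newline, digits)
```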
What is the difference in the output of re.findall vs re.search?
re.findall() scans the entire string and returns ALL occurrences of the pattern as a list of strings (an empty list if there are none).
re.search() scans the string but stops at the FIRST occurrence and returns it as a Match object, or None if there is no match.
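A side-by-side sketch of the two return types, using a made-up sample string:

```python
import re

text = "cat hat bat"

# findall: every occurrence, as a list of strings
all_hits = re.findall(r"at", text)

# search: only the first occurrence, as a Match object
first_hit = re.search(r"at", text)
first_text = first_hit.group()
first_start = first_hit.start()

# When nothing matches: empty list vs None
miss_all = re.findall(r"dog", text)
miss_first = re.search(r"dog", text)

print(all_hits, first_text, first_start, miss_all, miss_first)
```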
What does re.match() do?
Works like re.search() (returns a Match object, or None if there is no match), but only looks for a match at the BEGINNING of the string.
b = "123abc"
if re.match("abc", b):
    print("abc found with re.match")
returns:
None, so nothing is printed ("abc" is not at the beginning of the string)
if re.search("abc", b):
    print("abc found with re.search")
prints:
abc found with re.search
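For contrast, a small sketch of re.match succeeding when the pattern does sit at the start of the string. The sample strings are made up.

```python
import re

# Pattern at the beginning: re.match finds it.
start_ok = re.match("abc", "abc123")
start_text = start_ok.group()

# Pattern later in the string: re.match gives None, re.search still finds it.
start_miss = re.match("abc", "123abc")
anywhere = re.search("abc", "123abc")
anywhere_span = anywhere.span()

print(start_text, start_miss, anywhere_span)
```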
What does re.split() do?
Splits text wherever the given pattern matches and returns the pieces as a list. The example below splits on "!".
c = "This is a sentence! re.split will split it based on the pattern! re.sub will replace the pattern"
re.split(r"!", c)
returns:
['This is a sentence',
 ' re.split will split it based on the pattern',
 ' re.sub will replace the pattern']
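Because the separator is a pattern and not just a fixed string, it can also swallow the spaces that follow each "!". The sketch below uses a shorter made-up sentence and also shows the optional maxsplit argument, which limits how many splits are made.

```python
import re

text = "one! two! three"

# Split on "!" only: the leading spaces stay in the pieces.
parts = re.split(r"!", text)

# Split on "!" plus any whitespace after it (\s* means zero or more spaces).
trimmed = re.split(r"!\s*", text)

# maxsplit=1 stops after the first split.
first_only = re.split(r"!", text, maxsplit=1)

print(parts, trimmed, first_only)
```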
What does re.sub() do?
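Replaces every match of the pattern with the given replacement string and returns the new string. The original string is not changed.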
re.sub(r"!", ";", c)
returns:
'This is a sentence; re.split will split it based on the pattern; re.sub will replace the pattern'
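A short sketch of two details worth remembering about re.sub, using a made-up string: the optional count argument limits how many replacements are made, and a pattern that never matches gives back the string unchanged.

```python
import re

text = "one! two! three!"

# Default: every match is replaced.
all_replaced = re.sub(r"!", ";", text)

# count=1: only the first match is replaced.
first_replaced = re.sub(r"!", ";", text, count=1)

# No match: the result equals the original string.
unchanged = re.sub(r"\?", ";", text)

print(all_replaced, first_replaced, unchanged)
```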
What is a metacharacter in regex?
In regular expressions (regex), a metacharacter is a character that has a special meaning or function rather than representing itself literally. Metacharacters are used to construct patterns that define search criteria. They enable you to create more flexible and powerful patterns for matching strings.
[abc]
A list of characters in square brackets (a character class); it matches any ONE of them.
This example matches any single character that is 'a', 'b', or 'c'.
[^abc]
With a caret (^) as the first character inside the brackets, this matches any single character that is NOT 'a', 'b', or 'c'.
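Both bracket forms can be checked with re.findall, which lists each single character that the class matches. The sample text is made up for illustration.

```python
import re

text = "a1b2c3d4"

# [abc]: each character that is 'a', 'b', or 'c'
in_set = re.findall(r"[abc]", text)

# [^abc]: each character that is anything else
not_in_set = re.findall(r"[^abc]", text)

print(in_set, not_in_set)
```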