Regular Expressions in Python Flashcards

Question 1

Q

r"st\d\s\w\n{3,10}"

Answer

A

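r: raw-string prefix (backslashes are not treated as Python escapes)
st: the literal characters "st"
\d: digit
\s: white space
\S: non-white space
\w: word character
\W: non-word character
\n: new line
{3,10}: the preceding element repeated 3 to 10 times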
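
As a rough illustration of how the pieces of this pattern combine, the sketch below runs it against a made-up sample string (the sample text is an assumption, not part of the original card).

```python
import re

# Pattern from the card: literal "st", one digit, one whitespace character,
# one word character, then 3 to 10 newlines.
pattern = r"st\d\s\w\n{3,10}"

# Hypothetical sample text, invented for this illustration.
sample = "st1 a\n\n\n"

match = re.search(pattern, sample)
print(match is not None)  # True: three newlines satisfy {3,10}

# With only two newlines the quantifier's minimum is not met.
too_short = re.search(pattern, "st1 a\n\n")
print(too_short)  # None
```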

Question 2

Q

Search the string to see if it starts with “The” and ends with “Spain”:

Answer

A

import re

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)

Question 3

Q

Print a list of all matches with “ai”

Answer

A

import re

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

returns: ['ai', 'ai']

Question 4

Q

Search for the first white-space character in the string:

Answer

A

import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)

print("The first white-space character is located in position:", x.start())

returns: The first white-space character is located in position: 3

Question 5

Q

Split at each white-space character:

txt = "The rain in Spain"

Answer

A

import re

x = re.split(r"\s", txt)
print(x)

returns: ['The', 'rain', 'in', 'Spain']

Question 6

Q

Split the string only at the first occurrence:

txt = "The rain in Spain"

Answer

A

import re

x = re.split(r"\s", txt, maxsplit=1)
print(x)

returns: ['The', 'rain in Spain']

Question 7

Q

Replace every white-space character with the number 9:
txt = "The rain in Spain"

Answer

A

import re

x = re.sub(r"\s", "9", txt)
print(x)

returns: The9rain9in9Spain

Question 8

Q

RegEx Functions (4)

Answer

A

findall, search, split, sub

Question 9

Q

findall

Answer

A

Returns a list containing all matches

Question 10

Q

search

Answer

A

Returns a Match object if there is a match anywhere in the string
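
A minimal sketch of what this means in practice: when nothing matches, re.search returns None instead of a Match object, so it is usually worth checking before calling methods on the result. The second pattern below is an invented example.

```python
import re

txt = "The rain in Spain"

# A pattern that occurs: search returns a Match object.
found = re.search(r"ai", txt)
print(found.start(), found.group())  # 5 ai

# A pattern that does not occur: search returns None, so guard before use.
missing = re.search(r"Portugal", txt)
if missing is None:
    print("no match")
```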

Question 11

Q

split

Answer

A

Returns a list where the string has been split at each match

Question 12

Q

sub

Answer

A

Replaces one or many matches with a string
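
The "one or many" part can be sketched with the optional count argument of re.sub, which limits how many matches are replaced. This reuses the sample sentence from the earlier cards.

```python
import re

txt = "The rain in Spain"

# Replace every whitespace character.
all_replaced = re.sub(r"\s", "9", txt)

# Replace only the first two matches via the optional count argument.
first_two = re.sub(r"\s", "9", txt, count=2)

print(all_replaced)  # The9rain9in9Spain
print(first_two)     # The9rain9in Spain
```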

Question 13

Q

Extract the substring from the 12th to the 30th character from the variable movie which corresponds to the movie title. Store it in the variable movie_title.
Get the palindrome by reversing the string contained in movie_title.
Complete the code to print out the movie_title if it is a palindrome.

Answer

A

movie_title = movie[11:30]

# Obtain the palindrome

palindrome = movie_title[::-1]

# Print the word if it's a palindrome

if movie_title == palindrome:
    print(movie_title)
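
To try the steps above end to end, here is a small sketch with an assumed value for movie; the card does not show the real string, so this one is only illustrative.

```python
# Hypothetical review text; characters 12 to 30 hold the title.
movie = "oh my God! desserts I stressed was an ugly movie"

# 12th to 30th character -> zero-based slice [11:30]
movie_title = movie[11:30]

# Reverse the title with a negative step
palindrome = movie_title[::-1]

# Print the title only if it reads the same in both directions
if movie_title == palindrome:
    print(movie_title)  # desserts I stressed
```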

Question 14

Q

Convert the string in the variable movie to lowercase. Print the result.

Answer

A

movie_lower = movie.lower()
print(movie_lower)

Question 15

Q

Remove the $ that occur at the start and at the end of the string contained in movie_lower. Print the results.

Answer

A

movie_no_sign = movie_lower.strip("$")
print(movie_no_sign)

Question 16

Q

Split the string contained in movie_no_sign into as many substrings as possible. Print the results.

Answer

A

movie_split = movie_no_sign.split()
print(movie_split)

Question 17

Q

To get the root of the second word contained in movie_split, select all the characters except the last one.

Answer

A

word_root = movie_split[1][:-1]

Question 18

Q

Remove tag <\i> from the end of the string. Print the results.

Answer

A

movie_tag = movie.rstrip(r"<\i>")
print(movie_tag)

Question 19

Q

Split the string contained in movie_tag using the commas as a separating element. Print the results.

Answer

A

movie_no_comma = movie_tag.split(",")
print(movie_no_comma)

Question 20

Q

Join back together the list of substring contained in movie_no_comma using a space as a join element. Print the results.

Answer

A

movie_join = " ".join(movie_no_comma)
print(movie_join)
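
The split and join steps from the last two cards can be sketched together as follows. The value of movie_tag is made up for the example.

```python
# Hypothetical input string with commas as separators.
movie_tag = "the film,however,is all good"

# Split at each comma into a list of substrings.
movie_no_comma = movie_tag.split(",")
print(movie_no_comma)  # ['the film', 'however', 'is all good']

# Join the pieces back together with a single space between them.
movie_join = " ".join(movie_no_comma)
print(movie_join)  # the film however is all good
```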

Question 21

Q

Split the string file into many substrings at line boundaries.
Print out the resulting variable file_split.
Complete the for-loop to split the strings into many substrings using commas as a separator element.

Answer

A

# Split string at line boundaries
file_split = file.splitlines()

# Print file_split
print(file_split)

# Complete for-loop to split by commas

for substring in file_split:
    substring_split = substring.split(",")
    print(substring_split)
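
A small runnable sketch of the same two-stage split, using an invented multi-line string in place of the real file variable:

```python
# Hypothetical contents; the real `file` string is not shown on the card.
file = "title,year\nAlien,1979\nHeat,1995"

# Stage 1: split at line boundaries.
file_split = file.splitlines()
print(file_split)  # ['title,year', 'Alien,1979', 'Heat,1995']

# Stage 2: split each line at the commas.
rows = []
for substring in file_split:
    substring_split = substring.split(",")
    rows.append(substring_split)
    print(substring_split)
```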

Question 22

Q

Find if the substring actor occurs between the characters with index 37 and 41 inclusive. If it is not detected, print the statement Word not found.
Replace actor actor with the substring actor if actor occurs only two repeated times.
Replace actor actor actor with the substring actor if actor appears three repeated times.

Answer

A

for movie in movies:
    if movie.find("actor", 37, 42) == -1:
        print("Word not found")
    elif movie.count("actor") == 2:
        print(movie.replace("actor actor", "actor"))
    else:
        print(movie.replace("actor actor actor", "actor"))
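
To see all three branches fire, the sketch below builds a hypothetical movies list in which a 37-character prefix places "actor" exactly at indices 37 to 41. The strings are assumptions made for the demonstration.

```python
# Hypothetical data: a 37-character prefix puts "actor" at indices 37-41.
prefix = "-" * 37
movies = [
    prefix + "actor actor is great",
    prefix + "actor actor actor wins",
    "no such word here",
]

results = []
for movie in movies:
    # The end index 42 is exclusive, so the search covers indices 37 to 41.
    if movie.find("actor", 37, 42) == -1:
        results.append("Word not found")
    elif movie.count("actor") == 2:
        results.append(movie.replace("actor actor", "actor"))
    else:
        results.append(movie.replace("actor actor actor", "actor"))

for line in results:
    print(line)
```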

Question 23

Q

Find the index where money occurs between characters with index 12 and 50. If not found, the method should return -1.

Answer

A

for movie in movies:
    print(movie.find("money", 12, 51))

Question 24

Q

Find the index where money occurs between characters with index 12 and 50. If not found, it should raise an error.

Answer

A

for movie in movies:
    try:
        print(movie.index("money", 12, 51))
    except ValueError:
        print("substring not found")
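
The difference between the last two cards can be sketched side by side: find signals a miss with -1, while index raises ValueError. The review sentence is a made-up stand-in for one element of movies.

```python
# Hypothetical review text.
review = "it is said that money can't buy happiness"

# find returns the index of the first hit, or -1 when there is none.
hit = review.find("money", 12, 51)
miss = review.find("gold", 12, 51)
print(hit, miss)  # 16 -1

# index behaves like find on a hit but raises ValueError on a miss.
try:
    review.index("gold", 12, 51)
    outcome = "found"
except ValueError:
    outcome = "substring not found"
print(outcome)
```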

Question 25

Q

my_string1 = "Awesome day"
my_string2 = "for biking"

Write a concatenation that returns:
Awesome day for biking

Answer

A

print(my_string1 + " " + my_string2)

Question 26

Q

my_string = "Awesome day"

return:
Awe

Answer

A

print(my_string[0:3])

Question 27

Q

my_string = "Awesome day"

return: Aweso
return: me day

Answer

A

print(my_string[:5])
print(my_string[5:])

Question 28

Q

my_string = "Awesome day"

return: yad emosewA

Answer

A

print(my_string[::-1])

Question 29

Q

Select the first 32 characters of movie1

Answer

A

first_part = movie1[:32]

Question 30

Q

Select from 43rd character to the end of movie1

Answer

A

last_part = movie1[42:]

Question 31

Q

Select from 33rd to the 42nd character of movie2

Answer

A

middle_part = movie2[32:42]

Question 32

Q

Find out how many characters the variable movie has.

Answer

A

length_string = len(movie)

Question 33

Q

Convert the numeric variable length_string to a string representation.
Then concatenate the predefined variable statement and the variable to_string, adding a space between them. Print out the result.

Answer

A

to_string = str(length_string)

# Predefined variable

statement = "Number of characters in this review:"

# Concatenate strings and print result

print(statement + " " + to_string)

Question 34

Q

Select the first 32 characters of the variable movie1 and assign it to the variable first_part.

Answer

A

first_part = movie1[:32]

Question 35

Q

Select the substring going from the 43rd character to the end of movie1. Assign it to the variable last_part.

Answer

A

last_part = movie1[42:]

Question 36

Q

Select the substring going from the 33rd to the 42nd character of movie2. Assign it to the variable middle_part.

Answer

A

middle_part = movie2[32:42]

Question 37

Q

Print the concatenation of the variables first_part, middle_part and last_part in that order.

Answer

A

print(first_part+middle_part+last_part)
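
The slicing cards above all rely on one conversion: the nth through mth character (counting from 1) is the zero-based slice [n-1:m]. A short sketch on an invented ten-letter string, rather than the real movie1 and movie2, shows the same pattern at a smaller scale.

```python
# Hypothetical short string so the positions are easy to check by eye.
text = "abcdefghij"

first_part = text[:4]     # first 4 characters
middle_part = text[4:6]   # 5th to 6th character -> [5-1:6]
last_part = text[6:]      # 7th character to the end -> [7-1:]

print(first_part, middle_part, last_part)       # abcd ef ghij
print(first_part + middle_part + last_part)     # abcdefghij
```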

Question 39

Q

Convert the string in the variable movie to lowercase. Print the result.

Answer

A

movie_lower = movie.lower()
print(movie_lower)

Question 40

Q

find all matches of a pattern

Answer

A

re.findall(r"regex", string)

Question 41

Q

Remove the $ that occur at the start and at the end of the string contained in movie_lower. Print the results.

Answer

A

movie_no_sign = movie_lower.strip("$")
print(movie_no_sign)

Question 42

Q

Split the string contained in movie_no_sign into as many substrings as possible. Print the results.

Answer

A

movie_split = movie_no_sign.split()
print(movie_split)

Question 43

Q

To get the root of the second word contained in movie_split, select all the characters except the last one.

Answer

A

word_root = movie_split[1][:-1]
print(word_root)

Question 44

Q

Remove tag <\i> from the end of the string, movie. Print the results.

Answer

A

movie_tag = movie.rstrip(r"<\i>")
print(movie_tag)

Question 45

Q

What does rstrip() do?

Answer

A

Removes trailing characters: white space by default, or any characters in the set passed as the argument
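
One detail worth sketching: the argument to rstrip is treated as a set of characters, not as a suffix, so it can remove more than the tag. The sample strings are invented, and str.removesuffix needs Python 3.9 or later.

```python
# Hypothetical strings ending in the tag from the earlier card.
plain = r"a charming film<\i>"
tricky = r"sci-fi<\i>"

# rstrip removes any trailing run of the characters <, \, i and >.
stripped_plain = plain.rstrip(r"<\i>")
stripped_tricky = tricky.rstrip(r"<\i>")
print(stripped_plain)   # a charming film
print(stripped_tricky)  # sci-f  (the final "i" of "fi" is also in the set)

# removesuffix (Python 3.9+) removes the exact suffix only.
exact = tricky.removesuffix(r"<\i>")
print(exact)            # sci-fi
```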

Question 46

Q

Join back together the list of substring contained in movie_no_comma using a space as a join element. Print the results.

Answer

A

movie_join = " ".join(movie_no_comma)
print(movie_join)

Question 47

Q

Find if the substring actor occurs between the characters with index 37 and 41 inclusive. If it is not detected, print the statement Word not found.

Answer

A

for movie in movies:
    if movie.find("actor", 37, 42) == -1:
        print("Word not found")

Question 49

Q

Add an elif to the for loop to replace actor actor with the substring actor if actor occurs only twice.

for movie in movies:
    if movie.find("actor", 37, 42) == -1:
        print("Word not found")

Answer

A

    elif movie.count("actor") == 2:
        print(movie.replace("actor actor", "actor"))

Question 50

Q

Find the index where money occurs between characters with index 12 and 50. If not found, the method should return -1.

Answer

A

for movie in movies:
    print(movie.find("money", 12, 51))

Question 51

Q

Complete a for-loop to split the strings into many substrings using commas as a separator element.

Answer

A

for substring in file_split:
    substring_split = substring.split(",")

Question 52

Q

Split the string file into many substrings at line boundaries.

Answer

A

file_split = file.splitlines()

Question 53

Q

Import the re module.
Write a regex that matches the user mentions that starts with @ and follows the pattern, e.g. @robot3!.
Find all the matches of the pattern in the sentiment_analysis variable.

Answer

A

# Import the re module
import re

# Write the regex
regex = r"@robot\d\W"

# Find all matches of regex
print(re.findall(regex, sentiment_analysis))
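
To check the pattern without the course data, the sketch below uses an invented stand-in for sentiment_analysis; the real variable's contents are not shown on the card.

```python
import re

# Hypothetical text with three user mentions of the form @robot<digit><symbol>.
sentiment_analysis = "@robot9! @robot4& I have a good feeling, @robot7# thanks"

# "@robot", one digit, then one non-word character.
regex = r"@robot\d\W"

mentions = re.findall(regex, sentiment_analysis)
print(mentions)  # ['@robot9!', '@robot4&', '@robot7#']
```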

Question 54

Q

Write a regex that matches the number of user mentions given as, for example, User_mentions:9 in sentiment_analysis.

Answer

A

print(re.findall(r"User_mentions:\d", sentiment_analysis))

Question 55

Q

Write a regex that matches the number of retweets given as, for example, number of retweets: 4 in sentiment_analysis.

Answer

A

print(re.findall(r"number\sof\sretweets:\s\d", sentiment_analysis))
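
A caveat on the last two patterns, sketched with made-up data: a bare \d matches exactly one digit, so counts of 10 or more get truncated. If multi-digit counts are possible, \d+ is one way to capture the whole number. The sample text is an assumption.

```python
import re

# Hypothetical text containing single-digit and multi-digit counts.
sentiment_analysis = (
    "User_mentions:9, number of retweets: 4. "
    "User_mentions:12, number of retweets: 37"
)

# \d matches a single digit, so "12" is cut to "1".
one_digit = re.findall(r"User_mentions:\d", sentiment_analysis)
print(one_digit)  # ['User_mentions:9', 'User_mentions:1']

# \d+ matches one or more digits.
mentions = re.findall(r"User_mentions:\d+", sentiment_analysis)
retweets = re.findall(r"number\sof\sretweets:\s\d+", sentiment_analysis)
print(mentions)  # ['User_mentions:9', 'User_mentions:12']
print(retweets)  # ['number of retweets: 4', 'number of retweets: 37']
```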
