Regular Expressions in Python Flashcards

1
Q

r'st\d\s\w\n{3,10}'

A

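r prefix: raw string (backslashes are not treated as Python escapes)
st: the literal characters "s" and "t"
\d: digit
\s: whitespace
\S: non-whitespace
\w: word character
\W: non-word character
\n: newline
{3,10}: the preceding element repeated 3 to 10 times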
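As a rough illustration of how the pieces of this pattern combine, the sketch below tests it against two made-up strings (both are assumptions for illustration, not part of the card).

```python
import re

pattern = r"st\d\s\w\n{3,10}"

# "st", one digit, one whitespace, one word character, then three newlines
matching = "st1 a\n\n\n"
# Same text with only two newlines: too few for {3,10}
too_short = "st1 a\n\n"

is_match = re.fullmatch(pattern, matching) is not None
is_short_match = re.fullmatch(pattern, too_short) is not None
print(is_match, is_short_match)
```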

2
Q

Search the string to see if it starts with “The” and ends with “Spain”:

A

import re

txt = "The rain in Spain"
x = re.search(r"^The.*Spain$", txt)

3
Q

Print a list of all matches of “ai”:

A

import re

txt = "The rain in Spain"
x = re.findall(r"ai", txt)
print(x)

4
Q

Search for the first white-space character in the string:

A

import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)

print("The first white-space character is located in position:", x.start())

returns: The first white-space character is located in position: 3

5
Q

Split at each white-space character:

txt = "The rain in Spain"

A

x = re.split(r"\s", txt)
print(x)

returns: ['The', 'rain', 'in', 'Spain']

6
Q

Split the string only at the first occurrence:

txt = "The rain in Spain"

A

x = re.split(r"\s", txt, maxsplit=1)
print(x)

returns: ['The', 'rain in Spain']

7
Q

Replace every white-space character with the number 9:
txt = "The rain in Spain"

A

x = re.sub(r"\s", "9", txt)
print(x)

returns: The9rain9in9Spain

8
Q

Four commonly used RegEx functions in the re module

A

findall, search, split, sub

9
Q

findall

A

Returns a list containing all matches

10
Q

search

A

Returns a Match object if there is a match anywhere in the string
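A minimal sketch of what the Match object offers, and of what happens when nothing matches. The search terms are chosen for illustration only.

```python
import re

txt = "The rain in Spain"

# A successful search returns a Match object with position helpers
hit = re.search(r"ai", txt)
print(hit.group(), hit.start(), hit.end(), hit.span())

# An unsuccessful search returns None, so test the result before using it
miss = re.search(r"Portugal", txt)
if miss is None:
    print("no match")
```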

11
Q

split

A

Returns a list where the string has been split at each match

12
Q

sub

A

Replaces one or many matches with a string
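For comparison, the four functions can be run side by side on the same sample string used in the earlier cards. This is only a sketch; the `count=1` argument is added here to show that sub can replace just the first match.

```python
import re

txt = "The rain in Spain"

all_matches = re.findall(r"ai", txt)             # every non-overlapping match
first_space = re.search(r"\s", txt)              # Match object for the first match
pieces = re.split(r"\s", txt)                    # split at every match
replaced_once = re.sub(r"\s", "9", txt, count=1) # replace only the first match

print(all_matches)
print(first_space.start())
print(pieces)
print(replaced_once)
```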

13
Q

Extract the substring from the 12th to the 30th character from the variable movie which corresponds to the movie title. Store it in the variable movie_title.
Get the palindrome by reversing the string contained in movie_title.
Complete the code to print out the movie_title if it is a palindrome.

A

movie_title = movie[11:30]

# Obtain the palindrome

palindrome = movie_title[::-1]

# Print the word if it's a palindrome

if movie_title == palindrome:
    print(movie_title)
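The card does not show the contents of movie, so the sketch below assumes a sample string in which characters 12 to 30 happen to form a palindrome. It is there to show how the slice bounds map to character positions: index 11 is the 12th character, and the end index 30 is excluded.

```python
# Assumed sample value; the card does not show the real `movie` string
movie = "oh my God! desserts I stressed was an ugly movie"

# Characters 12 through 30 are indices 11 through 29
movie_title = movie[11:30]

# A step of -1 walks the string backwards
palindrome = movie_title[::-1]

if movie_title == palindrome:
    print(movie_title)
```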

14
Q

Convert the string in the variable movie to lowercase. Print the result.

A

movie_lower = movie.lower()
print(movie_lower)

15
Q

Remove the $ signs that occur at the start and at the end of the string contained in movie_lower. Print the results.

A

movie_no_sign = movie_lower.strip("$")
print(movie_no_sign)

16
Q

Split the string contained in movie_no_sign into as many substrings as possible. Print the results.

A

movie_split = movie_no_sign.split()
print(movie_split)

17
Q

To get the root of the second word contained in movie_split, select all the characters except the last one.

A

word_root = movie_split[1][:-1]
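The last few cards form a small pipeline (lowercase, strip, split, slice). A minimal end-to-end sketch follows, assuming a sample value for movie since the cards do not show one.

```python
# Assumed sample value; the cards do not show the real `movie` string
movie = "$I supposed that coming from MTV Films I should expect no less$"

movie_lower = movie.lower()              # all characters to lowercase
movie_no_sign = movie_lower.strip("$")   # drop leading and trailing "$"
movie_split = movie_no_sign.split()      # split on runs of whitespace
word_root = movie_split[1][:-1]          # second word without its last character

print(movie_split)
print(word_root)
```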

18
Q

Remove the tag <\i> from the end of the string. Print the results.

A

movie_tag = movie.rstrip("<\\i>")
print(movie_tag)

19
Q

Split the string contained in movie_tag using the commas as a separating element. Print the results.

A

movie_no_comma = movie_tag.split(",")
print(movie_no_comma)

20
Q

Join back together the list of substrings contained in movie_no_comma using a space as a join element. Print the results.

A

movie_join = " ".join(movie_no_comma)
print(movie_join)
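A short round-trip sketch of split and join, using an assumed sample string in place of the real movie_tag value, which the cards do not show.

```python
# Assumed sample value standing in for the real `movie_tag`
movie_tag = "the film,however,is all good"

# split(",") cuts at every comma and drops the commas
movie_no_comma = movie_tag.split(",")

# " ".join(...) glues the pieces back together with single spaces
movie_join = " ".join(movie_no_comma)

print(movie_no_comma)
print(movie_join)
```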

21
Q

Split the string file into many substrings at line boundaries.
Print out the resulting variable file_split.
Complete the for-loop to split the strings into many substrings using commas as a separator element.

A

# Split string at line boundaries
file_split = file.splitlines()

# Print file_split
print(file_split)

# Complete for-loop to split by commas

for substring in file_split:
    substring_split = substring.split(",")
    print(substring_split)
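To make the two-stage split concrete, here is a sketch with an assumed two-line value for file (the card does not show the real one). Note that split(",") keeps any space that follows a comma.

```python
# Assumed sample value; the card does not show the real `file` string
file = "mtv films election, a high school comedy\nthe wedding singer, a romance"

# Stage 1: one list item per line, newline characters removed
file_split = file.splitlines()

# Stage 2: split each line at its commas
rows = []
for substring in file_split:
    rows.append(substring.split(","))

print(file_split)
print(rows)
```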

22
Q

Check whether the substring actor occurs between the characters with index 37 and 41 inclusive. If it is not found, print the statement Word not found.
Replace actor actor with the substring actor if actor occurs exactly twice.
Replace actor actor actor with the substring actor if actor occurs three times.

A

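for movie in movies:
    # Look for "actor" between index 37 and 41 inclusive (end index 42 is excluded)
    if movie.find("actor", 37, 42) == -1:
        print("Word not found")
    # Collapse "actor actor" when the word occurs exactly twice
    elif movie.count("actor") == 2:
        print(movie.replace("actor actor", "actor"))
    # Otherwise collapse "actor actor actor"
    else:
        print(movie.replace("actor actor actor", "actor"))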
23
Q

Find the index where money occurs between characters with index 12 and 50. If not found, the method should return -1.

A

for movie in movies:
    print(movie.find("money", 12, 51))

24
Q

Find the index where money occurs between characters with index 12 and 50. If not found, it should raise an error.

A

for movie in movies:
    try:
        # index() raises ValueError when the substring is missing
        print(movie.index("money", 12, 51))
    except ValueError:
        print("substring not found")
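The difference between find and index is easiest to see on one string. The sketch below uses an assumed sample sentence, not the real movies data.

```python
# Assumed sample value for illustration
review = "show me the money"

# Both methods return the index when the substring is present
found_at = review.find("money", 12, 51)

# find() signals "missing" with -1 ...
missing = review.find("cash", 12, 51)

# ... while index() raises ValueError instead
try:
    review.index("cash", 12, 51)
    raised = False
except ValueError:
    raised = True

print(found_at, missing, raised)
```

An end index beyond the length of the string, as here, is accepted and simply clamped to the end.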

25
Q

my_string1 = "Awesome day"
my_string2 = "for biking"

Write a concatenation that returns:
Awesome day for biking

A

print(my_string1 + " " + my_string2)

26
Q

my_string = "Awesome day"

return:
Awe

A

print(my_string[0:3])

27
Q

my_string = "Awesome day"

return: Aweso
return: me day

A

print(my_string[:5])
print(my_string[5:])

28
Q

my_string = "Awesome day"

return: yad emosewA

A

print(my_string[::-1])
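The slicing cards above all follow the same start:stop:step form. A compact sketch that gathers them in one place:

```python
my_string = "Awesome day"

first_three = my_string[0:3]      # indices 0, 1, 2
head = my_string[:5]              # omitted start means "from the beginning"
tail = my_string[5:]              # omitted stop means "to the end"
reversed_string = my_string[::-1] # step of -1 walks backwards

print(first_three)
print(head)
print(tail)
print(reversed_string)
```

Because the stop index is excluded, `my_string[:5] + my_string[5:]` gives back the original string.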

29
Q

Select the first 32 characters of movie1

A

first_part = movie1[:32]

30
Q

Select from the 43rd character to the end of movie1

A

last_part = movie1[42:]

31
Q

Select from the 33rd to the 42nd character of movie2

A

middle_part = movie2[32:42]

32
Q

Find out how many characters the variable movie has.

A

length_string = len(movie)

33
Q

Convert the numeric variable length_string to a string representation.
Then concatenate the predefined variable statement and the variable to_string, adding a space between them. Print out the result.

A

to_string = str(length_string)

# Predefined variable

statement = "Number of characters in this review:"

# Concatenate strings and print result

print(statement + " " + to_string)

34
Q

Select the first 32 characters of the variable movie1 and assign it to the variable first_part.

A

first_part = movie1[:32]

35
Q

Select the substring going from the 43rd character to the end of movie1. Assign it to the variable last_part.

A

last_part = movie1[42:]

36
Q

Select the substring going from the 33rd to the 42nd character of movie2. Assign it to the variable middle_part.

A

middle_part = movie2[32:42]

37
Q

Print the concatenation of the variables first_part, middle_part and last_part in that order.

A

print(first_part+middle_part+last_part)

38
Q
A
39
Q

Convert the string in the variable movie to lowercase. Print the result.

A

movie_lower = movie.lower()
print(movie_lower)

40
Q

Find all matches of a pattern

A

re.findall(r"regex", string)

41
Q

Remove the $ signs that occur at the start and at the end of the string contained in movie_lower. Print the results.

A

movie_no_sign = movie_lower.strip("$")
print(movie_no_sign)

42
Q

Split the string contained in movie_no_sign into as many substrings as possible. Print the results.

A

movie_split = movie_no_sign.split()
print(movie_split)

43
Q

To get the root of the second word contained in movie_split, select all the characters except the last one.

A

word_root = movie_split[1][:-1]
print(word_root)

44
Q

Remove the tag <\i> from the end of the string movie. Print the results.

A

movie_tag = movie.rstrip("<\\i>")
print(movie_tag)

45
Q

What does rstrip() do?

A

Removes trailing characters: whitespace by default, or any characters in the set passed as an argument
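One point worth checking when using rstrip to remove a tag: its argument is treated as a set of characters, not as a suffix. The sketch below uses an assumed sample string that ends in "i" just before the tag to show the effect. For comparison it also uses str.removesuffix, which requires Python 3.9 or later and is not part of the original cards.

```python
tag = "<\\i>"  # the four characters <, \, i, >

# Assumed sample value for illustration
movie = "I love sci-fi" + tag

# rstrip removes any trailing run of the characters <, \, i, >
stripped = movie.rstrip(tag)

# removesuffix (Python 3.9+) removes the exact suffix only
exact = movie.removesuffix(tag)

print(stripped)
print(exact)
```

Here rstrip also eats the final "i" of "sci-fi", so it is only safe when the text before the tag does not end in one of the tag's characters.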

46
Q

Join back together the list of substrings contained in movie_no_comma using a space as a join element. Print the results.

A

movie_join = " ".join(movie_no_comma)
print(movie_join)

47
Q

Find if the substring actor occurs between the characters with index 37 and 41 inclusive. If it is not detected, print the statement Word not found.

A

for movie in movies:
    if movie.find("actor", 37, 42) == -1:
        print("Word not found")
48
Q
A
49
Q

Add an elif branch to the loop below that replaces actor actor with the substring actor if actor occurs exactly twice.

for movie in movies:
    if movie.find("actor", 37, 42) == -1:
        print("Word not found")

A

    elif movie.count("actor") == 2:
        print(movie.replace("actor actor", "actor"))

50
Q

Find the index where money occurs between characters with index 12 and 50. If not found, the method should return -1.

A

for movie in movies:
    print(movie.find("money", 12, 51))

51
Q

Complete a for-loop to split the strings into many substrings using commas as a separator element.

A

for substring in file_split:
    substring_split = substring.split(",")

52
Q

Split the string file into many substrings at line boundaries.

A

file_split = file.splitlines()

53
Q

Import the re module.
Write a regex that matches user mentions that start with @ and follow the pattern shown in the example @robot3!.
Find all the matches of the pattern in the sentiment_analysis variable.

A

# Import the re module
import re

# Write the regex
regex = r"@robot\d\W"

# Find all matches of regex
print(re.findall(regex, sentiment_analysis))
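The card does not show the contents of sentiment_analysis, so the sketch below assumes a short sample tweet to show what the pattern picks up: the literal @robot, one digit, then one non-word character.

```python
import re

# Assumed sample value; the card does not show the real variable
sentiment_analysis = (
    "@robot9! @robot4& I have a good feeling the show "
    "is going to be amazing! @robot9$ @robot7%"
)

regex = r"@robot\d\W"
mentions = re.findall(regex, sentiment_analysis)
print(mentions)
```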

54
Q

Write a regex that matches the number of user mentions given as, for example, User_mentions:9 in sentiment_analysis.

A

print(re.findall(r"User_mentions:\d", sentiment_analysis))

55
Q

Write a regex that matches the number of retweets given as, for example, number of retweets: 4 in sentiment_analysis.

A

print(re.findall(r"number\sof\sretweets:\s\d", sentiment_analysis))
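Note that \d on its own matches exactly one digit, which fits the single-digit examples on these cards. If the counts could have several digits (an assumption, since the real data is not shown), \d+ would capture the whole number. A sketch with an assumed sample string:

```python
import re

# Assumed sample value with two-digit counts
sentiment_analysis = "User_mentions:12, likes: 5, number of retweets: 40"

# \d stops after the first digit
one_digit = re.findall(r"User_mentions:\d", sentiment_analysis)

# \d+ takes one or more digits
all_digits = re.findall(r"User_mentions:\d+", sentiment_analysis)
retweets = re.findall(r"number\sof\sretweets:\s\d+", sentiment_analysis)

print(one_digit)
print(all_digits)
print(retweets)
```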

56
Q
A