RegEx and nltk Flashcards

Python Regular Expressions and Natural Language Toolkit

1
Q

Import the Python Regular Expression Library

A

import re

2
Q

What method is used in Python to find and remove characters?

A

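The string .replace() method.
e.g. text_string.replace('.', '') removes every period; text_string.replace('.', ' ') swaps each period for a space instead.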
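
A minimal sketch of the difference between replacing a character with a space and removing it outright. The sample sentence is made up for illustration.

```python
text_string = "Dr. Smith arrived. He left."

# Swap every period for a space (the string keeps its original length).
spaced = text_string.replace(".", " ")

# Swap every period for an empty string, which removes it entirely.
removed = text_string.replace(".", "")

print(repr(spaced))
print(repr(removed))
```

Strings are immutable, so .replace() returns a new string and leaves text_string unchanged.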

3
Q

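What does the .join() method do?

A

.join() is a built-in string method, not part of RegEx. It takes the string it is called on and inserts it as a separator between the items of an iterable, returning one combined string.
Syntax: separator.join(iterable)
The iterable is usually a list of strings, but a plain string also works because Python iterates over it one character at a time.
e.g. letters = "string"
" ".join(letters) returns "s t r i n g"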
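
A short sketch of .join() with a list, with a plain string, and with non-string items. The variable names here are illustrative only.

```python
# Joining a list of strings with a space as the separator.
words = ["natural", "language", "toolkit"]
joined_words = " ".join(words)

# A plain string is iterated character by character.
spaced_letters = " ".join("string")

# Every item must already be a string, so convert numbers first.
numbers = [1, 2, 3]
joined_numbers = ",".join(str(n) for n in numbers)

print(joined_words)
print(spaced_letters)
print(joined_numbers)
```

Avoid naming a variable list, since that shadows Python's built-in list type.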

4
Q

What is the ‘r’ prefix and why is it important?

A

The 'r' prefix on a string literal (e.g. text = r'text') creates a "raw" string: Python leaves backslashes as literal characters instead of interpreting them as escape sequences. This matters for Regular Expressions because patterns such as \d or \b depend on the backslash reaching the regex engine intact.
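
A small sketch of why the prefix matters, using the word-boundary token \b as the example. The sample text is invented for illustration.

```python
import re

# Without the prefix, Python turns "\b" into a single backspace character.
normal_pattern = "\bcat\b"
# With the prefix, the backslash and the "b" both survive as written.
raw_pattern = r"\bcat\b"

sample = "cat concat cat."

# The raw pattern matches "cat" only where it stands as a whole word.
raw_matches = re.findall(raw_pattern, sample)
# The non-raw pattern looks for literal backspace characters and finds none.
normal_matches = re.findall(normal_pattern, sample)

print(len(normal_pattern), len(raw_pattern))
print(raw_matches, normal_matches)
```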

5
Q

What RegEx method can find a pattern and replace it with a defined string?

A

re.sub(pattern, replace_string, string_var)
e.g. re.sub(r'[a-z]', ' ', "Mike") returns "M   " (each of the three lowercase letters becomes a space)
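
A brief sketch of re.sub() in three common forms: replacing each match, collapsing a run of matches, and limiting the number of replacements with the count argument. The input strings are illustrative.

```python
import re

# Each lowercase letter is replaced individually by one space.
masked = re.sub(r"[a-z]", " ", "Mike")

# The + quantifier treats a run of whitespace as one match.
collapsed = re.sub(r"\s+", " ", "too    many   spaces")

# count=1 stops after the first replacement.
first_only = re.sub(r"[a-z]", "*", "Mike", count=1)

print(repr(masked))
print(collapsed)
print(first_only)
```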

6
Q

How do you negate or use NOT with Regular Expressions?

A

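Put ^ first inside a character class: [^...] matches any character NOT listed.
e.g. text = 'Test123'
pattern = r'[^1-9]'
result = re.sub(pattern, ' ', text)
print(result)  # Outputs "    123" (four spaces, one per replaced letter)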
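
A short sketch of negated character classes. It uses [^0-9] rather than [^1-9] so that the digit 0 is also kept; that swap is a choice made here for illustration. Note that ^ only negates when it is the first character inside the brackets; outside them it anchors the match to the start of the string.

```python
import re

text = "Test123"

# Remove everything that is not a digit.
digits_only = re.sub(r"[^0-9]", "", text)

# Remove everything that is not a letter.
letters_only = re.sub(r"[^A-Za-z]", "", text)

# Outside brackets, ^ is a start-of-string anchor, not a negation.
anchored = re.sub(r"^T", "", text)

print(digits_only, letters_only, anchored)
```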

7
Q

Format a regular expression to identify the hexadecimal codes in a string.

A

pattern = r'[^a-fA-F0-9]+'
re.sub(pattern, '', org_string)  # strips every character that is not a hexadecimal digit
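
The pattern above keeps hexadecimal digits by stripping everything else. As a hedged alternative, the sketch below extracts whole codes with re.findall(), assuming the codes look like CSS-style colors: a # followed by exactly six hex digits. That format and the sample strings are assumptions, not something the card specifies.

```python
import re

org_string = "Primary #1A2B3C, accent #ff00aa, invalid #12G45Z."

# Assumed format: "#" plus exactly six hexadecimal digits, then a word boundary.
hex_codes = re.findall(r"#[0-9a-fA-F]{6}\b", org_string)

# The card's approach: delete every character that is not a hex digit.
stripped = re.sub(r"[^a-fA-F0-9]+", "", "0x1F-3a")

print(hex_codes)
print(stripped)
```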

8
Q

What is the function and syntax for making a string lower case?

A

string.lower()

9
Q

What is the Pandas syntax for applying the .lower() method to a whole column?

A

data['col_name'].str.lower()

10
Q

Identify the Pandas format for removing extra whitespace.

A

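data['col_name'].str.strip()
Note: .str.strip() removes leading and trailing whitespace only; repeated spaces inside the text are left in place.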
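
A minimal sketch of cleaning a text column, assuming pandas is installed. The DataFrame contents are made up; the interior-whitespace step with .str.replace() is an addition beyond what the card covers.

```python
import pandas as pd

# Hypothetical sample data with leading, trailing, and repeated spaces.
data = pd.DataFrame({"col_name": ["  Hello World ", "NLTK  ", "  Reg  Ex"]})

# Trim whitespace from both ends of every value.
stripped = data["col_name"].str.strip()

# Chain the string methods: trim, collapse interior runs, then lower-case.
cleaned = (
    data["col_name"]
    .str.strip()
    .str.replace(r"\s+", " ", regex=True)
    .str.lower()
)

print(stripped.tolist())
print(cleaned.tolist())
```

Each .str call returns a new Series, so assign the result back (for example data['col_name'] = cleaned) to keep the change.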
