Python Flashcards
In Python, data = [6, 8, 10, 12]
Put data into an np array?
import numpy as np
MyArray = np.array(data)
In Python, add dict1 to dict2?
dict2.update(dict1)
In Python, create a dict that pairs keys and values? (keys and values are variables)
dict(zip(keys, values))
In Python and numpy:
arr_slice = arr[5:8]
arr_slice[1] = 12345
What is the result?
arr[6] equals 12345 (in the original object)
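A minimal sketch of the view behaviour described above, assuming arr holds the integers 0 to 9 (the card does not say what arr contains). It also shows .copy() as the way to get an independent array.

```python
import numpy as np

arr = np.arange(10)          # assumed contents: 0, 1, ..., 9
arr_slice = arr[5:8]         # a view onto elements 5, 6 and 7
arr_slice[1] = 12345         # writes through to arr[6]

arr_copy = arr[5:8].copy()   # an independent copy of the same slice
arr_copy[0] = -1             # does not touch arr

print(arr[6])   # 12345
print(arr[5])   # 5
```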
What is a Python list comprehension to find all words in TEXT that are longer than 5 characters and appear at least 5 times?
fdist = FreqDist(TEXT)
[w for w in set(TEXT) if len(w) > 5 and fdist[w] >= 5]
With NLTK, when tagging, what is a way to address the trade-off between accuracy and coverage?
Start with a tagger that is more accurate and back off to a tagger that has greater coverage.
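In Python, how can you reverse a dictionary and why do it?
How:
nltk.Index((value, key) for (key, value) in oldDict.items())
Why:
- Reverse lookup (from value to key) is faster than scanning the dictionary
- A plain dict() keeps only one key per value, so it can't hold multiple keys that share a value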
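The same idea can be sketched without NLTK. This version swaps in collections.defaultdict in place of nltk.Index, and the dictionary contents are made up for illustration.

```python
from collections import defaultdict

# Made-up data: two keys share the value "fruit".
old_dict = {"apple": "fruit", "pear": "fruit", "kale": "vegetable"}

# Map each value to the list of keys that had it.
reversed_index = defaultdict(list)
for key, value in old_dict.items():
    reversed_index[value].append(key)

# A plain dict keeps only the last key seen for each value.
naive = dict((value, key) for key, value in old_dict.items())

print(reversed_index["fruit"])  # ['apple', 'pear']
print(naive["fruit"])           # pear
```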
Using NLTK and the brown corpus, how do you create a conditional frequency distribution based on genre?
ConditionalFreqDist((genre, word)
for genre in brown.categories()
for word in brown.words(categories = genre))
arr = np.array([[1, 2, 3], [4, 5, 6]])
What does arr[0,2] return?
The same as arr[0][2], which is 3.
arr = np.array([[1, 2, 3], [4, 5, 6]])
What does arr[0][2] return?
The same as arr[0,2], which is 3.
In IPython, 2 quick ways to time a function?
1) %timeit
2) %time
In Python, return an object’s truth value (True or False)?
bool(object)
In Python, return the value of 'a' from myDict while deleting the key-value pair?
myDict.pop('a')
In Python, load numpy?
import numpy as np
In IPython, look up magic commands?
%lsmagic, or type % and press [TAB]
In Python’s interpreter, return a list of attributes & methods of myObject?
myObject.[TAB] (in IPython), or dir(myObject)
In IPython, what does this return?
np.*load*?
All names in the numpy namespace that have load as part of the name.
In Python, s = 'hello'
s[::-1]?
'olleh'
In Python, what does the following return if x = 30?
'Yes' if x > 10 else 'No'
'Yes'
In Python, s = 'hello'
What does the following print?
for i, value in enumerate(s):
    print(value)
Each letter on its own line: h, e, l, l, o
In Python, return and remove the 3rd item in list T?
T.pop(2)
In Python, s = 'hello'
What does the following return?
s[2:4]
'll'
s[start:stop] runs from start up to, but not including, stop
In Python and np, how do you transpose (swap the rows and columns of) an array Dat?
Dat.T
In Python and np:
names = array of 7 names; data = 7 by 4 ndarray
What does the following return?
data[(names == 'Bob') | (names == 'Will')]
The rows of data at the positions where names equals 'Bob' or 'Will'
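A small sketch of boolean-mask row selection, with made-up names and data. Each comparison is wrapped in parentheses because | binds more tightly than ==.

```python
import numpy as np

# Made-up example: 7 names and a 7 by 4 array of data.
names = np.array(["Bob", "Joe", "Will", "Bob", "Will", "Joe", "Joe"])
data = np.arange(28).reshape(7, 4)

# One True/False per row; True where the name is Bob or Will.
mask = (names == "Bob") | (names == "Will")
selected = data[mask]

print(mask)
print(selected.shape)  # (4, 4)
```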
In lexical resources, what is the word itself called?
the headword or lemma
In lexical resources, what is the classification as a noun, verb, etc., called?
The part-of-speech or lexical category
In lexical resources, what is the meaning referred to as?
Sense definition or gloss
In Python, tup = (4, 5, (6, 7))
Unpack tup?
a, b, (c, d) = tup
In Python, test if string S has title capitalization?
S.istitle()
In Python, what module is useful to add items into a sorted list?
bisect
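A short sketch of how bisect keeps a list sorted on insert. The list values are made up.

```python
import bisect

scores = [10, 20, 30, 40]     # must already be sorted

# insort finds the right position and inserts there.
bisect.insort(scores, 25)
print(scores)                 # [10, 20, 25, 30, 40]

# bisect_left only reports the position, without inserting.
position = bisect.bisect_left(scores, 30)
print(position)               # 3
```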
Assuming data is an np ndarray, how many rows & columns?
data.shape
In Python and np, how do you copy an array?
array.copy()
In Python, rewrite the following as a list comprehension?
flattened = []
for tup in some_tuples:
for x in tup:
flattened.append(x)
flattened = [x for tup in some_tuples for x in tup]
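A quick check that the loop and the comprehension agree, using made-up tuples. The for clauses in the comprehension appear in the same order as the nested loops.

```python
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]   # made-up data

# Nested-loop version.
flattened_loop = []
for tup in some_tuples:
    for x in tup:
        flattened_loop.append(x)

# Comprehension version: outer loop first, inner loop second.
flattened = [x for tup in some_tuples for x in tup]

print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```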
In Python, how do you create a list of 1 - 10?
list(range(1, 11))
In Python, how do you create a list of 10 - 20?
list(range(10, 21))
In Python, how do you create a list of 10, 12, 14, 16, 18, 20?
list(range(10, 21, 2))
In nltk, view a graph of Conditional Frequency Distribution called cfd?
cfd.plot()
With pd, dat has columns year and state.
Create a column called myNum that equals 16 for all records?
dat['myNum'] = 16
With pd, dat is a list object of multiple dictionaries.
Create a data frame?
DataFrame(dat)
In Python, define a feature extractor for documents that searches a document for a list of words and returns a feature set?
def myFunc(list_of_words, a_document):
    doc_words = set(a_document)
    to_return = {}
    for words in list_of_words:
        to_return['contains(%s)' % words] = words in doc_words
    return to_return
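A usage sketch for this kind of feature extractor. The function name contains_features and the sample document are made up for illustration.

```python
def contains_features(list_of_words, a_document):
    """Return {'contains(word)': True/False} for each word of interest."""
    doc_words = set(a_document)          # set membership tests are fast
    features = {}
    for word in list_of_words:
        features['contains(%s)' % word] = word in doc_words
    return features

document = ["the", "plot", "was", "dull"]   # a tokenized document
features = contains_features(["plot", "great"], document)
print(features)  # {'contains(plot)': True, 'contains(great)': False}
```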
With pd, what are 2 ways to return a column from a DataFrame?
Attr: dat.name
Dict: dat['name']
With pd, assign -5 to Series object dat at index 't'?
dat['t'] = -5
With pd, create a series from 4, 7, 2, 1?
Series([4, 7, 2, 1])
Load pandas?
import pandas as pd
from pandas import Series, DataFrame
Assuming this_dat is an np ndarray, what is the data type of its elements?
this_dat.dtype
With pd, return just Series obj’s values?
obj.values
In pd, 2 ways to use .isnull and .notnull?
Either as a method of an object or a pd function applied to an object.
With pd, return the index of Series object obj?
obj.index
In Python and np, return set difference of x vs. y?
np.setdiff1d(x, y)
In Python, view the conditions associated with a Conditional Frequency Distribution called CFD?
CFD.conditions()
With pd, assign column order of 'state' then 'year' to dat DataFrame?
DataFrame(dat, columns = ['state', 'year'])
In Python and np, return the intersection of x and y?
np.intersect1d(x, y)
With pd, what happens when you build a DataFrame out of a nested dictionary with 2 levels?
Outer dict keys become the columns and the inner keys are the rows.
In Python and np, dat is an array, return another array of 2s and -2s depending on whether dat’s value is positive?
np.where(dat > 0, 2, -2)
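A small sketch with made-up values. Note that 0 is not positive, so it maps to -2.

```python
import numpy as np

dat = np.array([1.5, -0.3, 0.0, 4.0, -2.2])   # made-up values

# Where the condition is True take 2, otherwise take -2.
result = np.where(dat > 0, 2, -2)

print(result)  # [ 2 -2 -2  2 -2]
```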
In NLTK, return words that commonly appear around "car" and "bus" in T?
T.common_contexts(['car', 'bus'])
In NLTK, create a plot showing where "car", "bus", and "stew" appear in T?
T.dispersion_plot(['car', 'bus', 'stew'])
In Python and np, return unique items in x?
np.unique(x)
With pd, dat is a dictionary. What does Series(dat) do?
Creates a Series object with dat’s keys as the index. Older pandas versions sort the keys; newer versions keep the dict’s insertion order.
In NLTK, after creating 2 lists of words, TEXT1 and TEXT2, how do you find the words that appear only in TEXT1?
set(TEXT1).difference(TEXT2)
or
set(TEXT1) - set(TEXT2)
With NLTK, create a concordance with text object T?
T.concordance('myword')
In Python, what is a for loop to count how many times each letter appears in the string variable STRING?
a_dict = {}
for L in 'abcdefghijklmnopqrstuvwxyz':
    a_dict[L] = STRING.lower().count(L)
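A runnable variant of the letter-count loop. It takes the alphabet from string.ascii_lowercase instead of typing it out, and the sample STRING is made up.

```python
import string

STRING = "Hello World"   # made-up input

a_dict = {}
for letter in string.ascii_lowercase:        # 'a' through 'z'
    a_dict[letter] = STRING.lower().count(letter)

print(a_dict["l"])  # 3
print(a_dict["z"])  # 0
```

A count of 0 means the letter does not occur, so the same dictionary answers both "is it there?" and "how often?".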
What are ways to get a list of keys from a Python dictionary, dict?
Iterating over a dictionary yields its keys, so list(dict) works, as does list(dict.keys()).
In Python, iterate over unique items in S?
for item in set(S):
In Python, iterate over a unique set of items in S that are not in T?
for item in set(S).difference(T):
In Python, iterate over the items in S in random order?
random.shuffle(S)  # shuffles the list in place and returns None
for item in S:
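A sketch of two ways to visit a list in random order, using a made-up list. It also shows why random.shuffle cannot sit directly in the for statement: it returns None.

```python
import random

S = ["a", "b", "c", "d"]   # made-up items

# random.sample returns a new list in random order and leaves S untouched.
shuffled = random.sample(S, len(S))
for item in shuffled:
    print(item)

# random.shuffle reorders S in place and returns None.
result = random.shuffle(S)
print(result)  # None
```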
With pd, delete column State from dat?
del dat['State']
In Python, how do you write to a file?
with open("file.txt", "w") as f:
    f.write("here is some text")
With NLTK, how do I create a stop-words object that contains a list of English stop-words?
nltk.corpus.stopwords.words('english')
In Python and np, return union of x and y?
np.union1d(x, y)
In NLTK, return words that appear in contexts similar to "apple" in T?
T.similar('apple')
In Python, add [4, 'foo', 3, 'red'] to list T?
T.extend([4, 'foo', 3, 'red'])
In NLTK, generally describe in code how to create a conditional frequency distribution?
nltk.ConditionalFreqDist(list of tuples of the form (condition, word))
In Python, use code to confirm x is an integer, returning True or False?
isinstance(x, int)
In Python, add 'word' at index 3 to list T, sliding all other values to the right?
T.insert(3, 'word')
With NLTK, tag all words in TEXT with 'n'?
my_tagger = nltk.DefaultTagger('n')
my_tagger.tag(TEXT)
With NLTK, tag all words with tuples of the form (regex, "tag") stored in PATTERNS?
nltk.RegexpTagger(PATTERNS)
With NLTK, tag all words based on a lookup table stored in TAGS, while backing off to a tagger that tags "Unk" for any word not in the lookup?
nltk.UnigramTagger(model = TAGS, backoff = nltk.DefaultTagger("Unk"))
In Python, add 'word' to the end of list T?
T.append('word')
In Python and np, return a boolean for each item in x indicating whether it is in y?
np.isin(x, y) (np.in1d(x, y) in older NumPy versions)
Create a Python function for creating a measure for the lexical diversity of TEXT?
def lexDef(TEXT):
    return len(TEXT) / len(set(TEXT))
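A worked example of this ratio on a made-up token list, assuming Python 3 division. The function name lexical_diversity is illustrative. With this definition the value is the average number of times each distinct word is used.

```python
def lexical_diversity(tokens):
    """Total tokens divided by distinct tokens."""
    return len(tokens) / len(set(tokens))

TEXT = "the cat sat on the mat the end".split()   # 8 tokens, 6 distinct

diversity = lexical_diversity(TEXT)
print(round(diversity, 3))  # 1.333
```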
In Python, how do you continue onto the next line outside parentheses?
"\" (a backslash at the end of the line)
In Python, return the intersection of S1 and S2?
S1.intersection(S2)
or
S1 & S2
In Python, create a table of a conditional frequency distribution called CFD?
CFD.tabulate()
In Python, why are some function names prefixed by "_"?
They are private helpers meant to be used only within a module; by convention, names beginning with an underscore are not imported by from module import *.
In NLTK, how do you create an NLTK corpus from a set of text files?
from nltk.corpus import PlaintextCorpusReader
corpus = PlaintextCorpusReader(file_path, REGEX_FOR_FILENAMES)
WordNet: What is a hypernym?
Words that are up the hierarchy (corvette -> car)
WordNet: What is a hyponym?
Words that are down the hierarchy (car -> corvette)
WordNet: What is a meronym?
Components of a word (tree is made up of branch, root, and leaves)
WordNet: What is a holonym?
What a word is contained in (forest includes trees, a forest is a holonym of tree)
WordNet: What does entail mean?
Specific steps in a verb (walking -> stepping)
In Python, after importing os, how do you see your working dir?
os.getcwd()
In NLTK, n-gram functions?
nltk.util.bigrams(TEXT)
nltk.util.trigrams(TEXT)
nltk.util.ngrams(TEXT, n)
In Python:
def search1(subst, words):
    for word in words:
        if subst in word:
            yield word
What and why?
This is a generator and is usually more efficient than building a list to store.
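A usage sketch for the generator, with a made-up word list. Values are produced one at a time, only when asked for.

```python
def search1(subst, words):
    # Yield each word that contains the substring.
    for word in words:
        if subst in word:
            yield word

words = ["zebra", "amazing", "apple", "fizz"]   # made-up data

matches = search1("z", words)   # nothing has been searched yet
first = next(matches)           # runs until the first yield
rest = list(matches)            # consumes the remaining matches

print(first)  # zebra
print(rest)   # ['amazing', 'fizz']
```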
In regex, search for any letter from a to m?
[a-m]
In regex, search for any letter in the word chasm?
[chasm]
In regex, search for the word chasm or bank?
chasm|bank
In Python, how do I slice the portion of STRING that is after the word "START" and up to the word "END"?
STRING[STRING.find('START') + len('START'):STRING.find('END')]
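A worked example on a made-up string. Adding len("START") moves the slice past the marker itself; without it, the result would begin with the word START. This sketch assumes both markers are present, since find returns -1 when a marker is missing.

```python
STRING = "header START the body text END footer"   # made-up input

start = STRING.find("START") + len("START")   # first index after the marker
end = STRING.find("END")                      # index where END begins
middle = STRING[start:end]

print(repr(middle))  # ' the body text '
```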
In Python, return the union of SET1 and SET2?
SET1 | SET2
or
SET1.union(SET2)
In Python, test if string S is all numbers?
S.isdigit()
In Python and np, create an array with 0-14?
np.arange(15)
In NLTK, assuming I have a RAW string file, how do I tokenize it?
nltk.word_tokenize(RAW)
With a frequency distribution (fdist), get a word’s relative frequency (its count divided by the total)?
fdist.freq('word')
With a freq. dist. (fdist), get N, the total number of samples counted?
fdist.N()
With a freq. dist. (fdist), get a plot?
fdist.plot()
With a freq. dist. (fdist), get a cumulative plot?
fdist.plot(cumulative = True)
In Python, test if string S’s last letter is "L"?
S.endswith("L")
In Python, test if string S is all non-capitalized letters?
S.islower()
In Python, test if string S is composed of letters and numbers?
S.isalnum()
In Python, test if string S is all letters?
S.isalpha()
In Python, test if string S is all capital letters?
S.isupper()
In Python and np, return items in x or y, but not both?
np.setxor1d(x, y)
In regex, what is 'a*'?
0 or more of 'a'
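The regex cards in this deck can be tried out with Python's re module. A small sketch with made-up input strings:

```python
import re

# [a-m] matches any single letter from a to m.
in_range = re.findall(r"[a-m]", "chasm")
print(in_range)   # ['c', 'h', 'a', 'm']

# chasm|bank matches either whole word.
either = re.findall(r"chasm|bank", "the bank by the chasm")
print(either)     # ['bank', 'chasm']

# a* means zero or more 'a', so "ba*" matches 'b' followed by any number of 'a'.
repeats = re.findall(r"ba*", "b ba baaa")
print(repeats)    # ['b', 'ba', 'baaa']
```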