NLTK Python Flashcards
find occurrence of a word in context in text1
text1.concordance("word")
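To make the idea of a concordance concrete, here is a minimal plain-Python sketch of a keyword-in-context search. It is not NLTK's implementation: the function name `simple_concordance`, the `width` parameter, the case-insensitive matching, and the sample sentence are all assumptions made for illustration.

```python
def simple_concordance(tokens, word, width=2):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    hits = []
    for i, tok in enumerate(tokens):
        # Case-insensitive match (a choice made for this sketch)
        if tok.lower() == word.lower():
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append(" ".join(left + [tok] + right))
    return hits

# Made-up sample text, split on whitespace
tokens = "the whale swam and the great whale dived below".split()
lines = simple_concordance(tokens, "whale")
for line in lines:
    print(line)
```

Each printed line shows one occurrence of the word together with its neighbours, which is the kind of view a concordance gives.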
find words that occur in a similar context to a word (spits out a list of words)
text1.similar("word")
explore contexts that are shared by 2 or more words in text1
text1.common_contexts(["worda", "wordb"])
determine the location of a word or set of words in text1
text1.dispersion_plot(["worda", "wordb", "wordc"])
generate text in the style of text5
text5.generate()  (note: nothing is needed between the parentheses)
find the length of text5
len(text5)
find and display distinct words in a text
set(text)
find and display a sorted list of the distinct items in text1 (punctuation, words, and case variants such as capitalized forms)
sorted(set(text1))
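The length, set, and sorted cards can be tried without loading a corpus. In this small sketch a plain list of tokens stands in for text1 (the sample tokens are made up). It assumes only that the text behaves like a sequence of strings, which is how `len`, `set`, and `sorted` see an NLTK text.

```python
# Made-up token list standing in for an NLTK text
tokens = ["The", "cat", "sat", "on", "the", "mat", ".", "The", "end", "."]

total = len(tokens)       # every token counts, including repeats and punctuation
distinct = set(tokens)    # duplicates collapse; "The" and "the" stay separate
ordered = sorted(distinct)

print(total)
print(len(distinct))
print(ordered)
```

In the sorted list, punctuation comes first, then capitalized words, then lowercase words, because Python sorts strings by character code.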
calculate the lexical richness of text3 (how many times each word is used on average)
>>> from __future__ import division  # Python 2 only; Python 3 already uses true division
>>> len(text3) / len(set(text3))
see how often a word occurs in text3
text3.count("word")
see what percentage of text5 is taken up by a specific word
100 * text5.count("word") / len(text5)
write a function for lexical diversity
>>> def lexical_diversity(text):
...     return len(text) / len(set(text))
write a function to find the percentage of text made up by a word
>>> def percentage(count, total):
...     return 100 * count / total
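As a quick check of the two helper functions, the sketch below runs them on a small made-up token list instead of an NLTK text. It assumes Python 3, where `/` is true division, so no `__future__` import is needed.

```python
def lexical_diversity(tokens):
    # Average number of times each distinct token is used
    return len(tokens) / len(set(tokens))

def percentage(count, total):
    # Share of `total` taken up by `count`, as a percentage
    return 100 * count / total

# Made-up sample: 8 tokens, 4 distinct, "a" appears 4 times
tokens = ["a", "b", "a", "c", "a", "b", "a", "d"]

diversity = lexical_diversity(tokens)
share = percentage(tokens.count("a"), len(tokens))

print(diversity)  # 2.0
print(share)      # 50.0
```

With an NLTK text, the same calls would look like `lexical_diversity(text3)` and `percentage(text5.count("word"), len(text5))`.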
count the occurrences of a word in text1
text1.count("word")
find the index where a word first occurs in text4
text4.index('awaken')
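Counting and indexing work the same way on an ordinary Python list, so the last two cards can be tried on a small made-up example. The token list below is an assumption for illustration, not taken from text4.

```python
# Made-up token list standing in for an NLTK text
tokens = ["we", "sleep", "then", "awaken", "and", "awaken", "again"]

occurrences = tokens.count("awaken")   # how many times the word appears
first_pos = tokens.index("awaken")     # zero-based position of the first occurrence

print(occurrences)  # 2
print(first_pos)    # 3

# On a list, index() raises ValueError when the word is absent, so guard for that
try:
    tokens.index("dream")
    found = True
except ValueError:
    found = False
print(found)  # False
```

Positions are counted from 0, so an index of 3 means the word is the fourth token.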