Class 10 Flashcards
grammar
defines the syntax of legal sentences
language model
probability distribution over strings, describing the likelihood of any string – no two people have exactly the same language model
tokenization
process of dividing a text into a sequence of words
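A minimal sketch of one simple tokenization approach, using a regular expression that keeps punctuation as separate tokens (real tokenizers handle many more cases):

```python
import re

def tokenize(text):
    # \w+ matches a run of word characters; [^\w\s] matches a single
    # punctuation character, so punctuation becomes its own token.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Dr. Smith arrived.")
# -> ['Dr', '.', 'Smith', 'arrived', '.']
```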
n-gram model
Markov chain model that considers only the dependence between n adjacent words; works well for spam detection, sentiment analysis, and similar tasks
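A minimal bigram (n = 2) sketch: estimate P(w2 | w1) from counts in a toy corpus (the corpus here is illustrative):

```python
from collections import Counter

def bigram_probs(tokens):
    # P(w2 | w1) estimated as count(w1, w2) / count(w1).
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

corpus = "the cat sat on the mat".split()
probs = bigram_probs(corpus)
# probs[("the", "cat")] == 0.5: "the" occurs twice, once followed by "cat"
```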
character level model
alternative to n-gram model, probability of each character determined by n-1 previous characters
skip-gram model
alternative to the n-gram model that counts words that are near each other but skips one or more words between them
smoothing
process of reserving some probability for never-before-seen n-grams
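A minimal sketch of add-one (Laplace) smoothing, one simple way to reserve probability mass for unseen n-grams; the toy corpus and vocabulary are illustrative:

```python
from collections import Counter

def laplace_prob(bigram, bigram_counts, unigram_counts, vocab_size):
    # Add one to every count, so unseen bigrams get a small nonzero probability.
    w1, w2 = bigram
    return (bigram_counts[(w1, w2)] + 1) / (unigram_counts[w1] + vocab_size)

tokens = "the cat sat on the mat".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
V = len(unigrams)  # vocabulary size

p_seen = laplace_prob(("the", "cat"), bigrams, unigrams, V)
p_unseen = laplace_prob(("the", "dog"), bigrams, unigrams, V)  # never seen, still > 0
```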
backoff model
estimates n-gram counts, but for low or zero counts we back off to (n-1)-grams
linear interpolation smoothing
backoff model that combines trigram, bigram, and unigram models by linear interpolation
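A minimal sketch of the interpolation step itself; the lambda weights must sum to 1, and the values here are illustrative, not tuned:

```python
def interpolated_prob(p_tri, p_bi, p_uni, lambdas=(0.7, 0.2, 0.1)):
    # Weighted combination of trigram, bigram, and unigram estimates.
    l3, l2, l1 = lambdas
    return l3 * p_tri + l2 * p_bi + l1 * p_uni

# Even if the trigram was never seen (p_tri = 0), the bigram and unigram
# estimates still contribute:
p = interpolated_prob(0.0, 0.3, 0.1)  # 0.2 * 0.3 + 0.1 * 0.1 = 0.07
```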
wordnet
open-source, hand-curated dictionary in machine-readable format that has proven useful for many natural language applications
penn treebank
corpus of over 3M words of text annotated with part-of-speech (POS) tags
beam search
compromise between fast greedy search and the slower but more accurate Viterbi algorithm
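A minimal beam-search sketch for tagging: keep only the best few partial tag sequences at each step instead of all of them (Viterbi) or just one (greedy). The scoring function and its numbers are made up for illustration:

```python
def beam_search_tags(words, states, score, beam_width=2):
    # Each beam entry is (total log-score, tag sequence so far).
    beams = [(0.0, [])]
    for w in words:
        candidates = []
        for logp, tags in beams:
            prev = tags[-1] if tags else None
            for s in states:
                candidates.append((logp + score(prev, s, w), tags + [s]))
        # Prune: keep only the beam_width highest-scoring partial sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams[0][1]

def toy_score(prev, tag, word):
    # Made-up log-scores for illustration only.
    emit = {("N", "dogs"): -0.5, ("V", "dogs"): -2.0,
            ("N", "bark"): -2.0, ("V", "bark"): -0.5}
    return emit.get((tag, word), -5.0)

# beam_search_tags(["dogs", "bark"], ["N", "V"], toy_score) -> ["N", "V"]
```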
hidden markov model
common model for part-of-speech (POS) tagging – combined with the Viterbi algorithm it can achieve accuracy of around 97%
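A minimal Viterbi sketch over a toy two-tag HMM; all probabilities below are made up for illustration:

```python
def viterbi(words, states, start_p, trans_p, emit_p):
    # best[t][s] is the probability of the best tag sequence ending in
    # state s at position t; back[t][s] remembers the previous state.
    best = [{s: start_p[s] * emit_p[s].get(words[0], 1e-8) for s in states}]
    back = [{}]
    for t in range(1, len(words)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (best[t - 1][p] * trans_p[p][s] * emit_p[s].get(words[t], 1e-8), p)
                for p in states)
            best[t][s] = prob
            back[t][s] = prev
    # Trace back the most likely tag sequence from the best final state.
    last = max(best[-1], key=best[-1].get)
    tags = [last]
    for t in range(len(words) - 1, 0, -1):
        last = back[t][last]
        tags.append(last)
    return tags[::-1]

states = ["N", "V"]
start = {"N": 0.8, "V": 0.2}
trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"dogs": 0.6, "bark": 0.1}, "V": {"dogs": 0.1, "bark": 0.7}}
# viterbi(["dogs", "bark"], states, start, trans, emit) -> ["N", "V"]
```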
discriminative model
learns a conditional probability distribution P(C|W), meaning it can assign categories given a sequence of words but can’t generate random sentences – ex: logistic regression
language
set of sentences that follow the rules laid out by a grammar
syntactic categories
help to constrain the probable words at each point within a sentence – ex: noun phrase or verb phrase
phrase structure
provides framework for meaning or semantics of the sentence
overgenerate
when a grammar produces sentences that are not grammatical
undergenerate
when a grammar rejects valid sentences
lexicon
list of allowable words
parsing
process of analyzing a string of words to uncover its phrase structure according to the rules of grammar
cyk algorithm
chart parser that requires the grammar to be in Chomsky Normal Form
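A minimal CYK recognizer sketch over a toy CNF grammar (the grammar and lexicon here are illustrative):

```python
def cyk_recognize(words, lexicon, rules, start="S"):
    # lexicon: word -> nonterminals that can produce it;
    # rules: pair (B, C) -> nonterminals A with a rule A -> B C.
    n = len(words)
    # table[i][length] holds the nonterminals spanning words[i : i + length].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][1] = set(lexicon.get(w, ()))
    for length in range(2, n + 1):            # span length
        for i in range(n - length + 1):       # span start
            for k in range(1, length):        # split point
                for B in table[i][k]:
                    for C in table[i + k][length - k]:
                        table[i][length] |= rules.get((B, C), set())
    return start in table[0][n]

# Toy CNF grammar: S -> NP VP, NP -> Det N, VP -> V NP
lexicon = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}
rules = {("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}
# cyk_recognize("the dog chased the cat".split(), lexicon, rules) -> True
```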
shift reduce parsing
popular deterministic approach: move through the sentence word by word, at each point choosing either to shift the word onto a stack of constituents or to reduce the top constituent(s) on the stack according to a grammar rule
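A minimal greedy shift-reduce sketch over a toy grammar (the grammar and lexicon are illustrative; real parsers use a learned policy to decide between shift and reduce):

```python
def shift_reduce(words, lexicon, rules):
    # rules: tuple of right-hand-side categories -> left-hand-side category.
    stack = []
    for w in words:
        stack.append(lexicon[w])      # shift the word's category onto the stack
        reduced = True
        while reduced:                # reduce greedily while a rule matches the top
            reduced = False
            for rhs, lhs in rules.items():
                k = len(rhs)
                if tuple(stack[-k:]) == rhs:
                    stack[-k:] = [lhs]
                    reduced = True
                    break
    return stack

lexicon = {"the": "Det", "dog": "N", "barked": "V"}
rules = {("Det", "N"): "NP", ("NP", "V"): "S"}
# shift_reduce(["the", "dog", "barked"], lexicon, rules) -> ["S"]
```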
dependency grammar
assumes that syntactic structure is formed by binary relations between lexical items, without need for syntactic constituents
unsupervised parsing
approach which learns a new grammar or improves an existing grammar using a corpus of sentences without trees
inside outside algorithm
algorithm that learns to estimate the probabilities in a probabilistic context-free grammar (PCFG) from example sentences without trees
semisupervised learning
type of learning that starts with a small number of trees as data to build an initial grammar and then adds a large number of unparsed sentences to improve the grammar
curriculum learning
type of learning that starts with short (2-word) unambiguous sentences and works its way up to 3-, 4-, and 5-word sentences
semantics
the meaning of words, phrases, and sentences
lexicalized pcfg
type of augmented grammar that allows us to assign probabilities based on properties of the words in a phrase other than just the syntactic categories
indexicals
phrases that refer directly to the current situation
lexical ambiguity
when a word has more than one meaning
syntactic ambiguity
refers to a phrase that has multiple parses
semantic ambiguity
when a sentence or phrase can be interpreted in multiple ways, even when its words and syntax are unambiguous
metonymy
figure of speech in which a word or phrase is replaced by another word or phrase that has a close association or relationship with the original
disambiguation
process of resolving ambiguity or uncertainty in language