class 9 Flashcards
what are 3 reasons for computers to do NLP?
- to communicate with humans
- to learn (from the knowledge humans have written down)
- to have a better scientific understanding of language and language use
what is a language model?
a probability distribution describing the likelihood of any string
what is a grammar's purpose?
to define the syntax of legal sentences
what is the purpose of semantic rules?
to define the meaning of the legal sentences
what is the bag-of-words model?
the application of Naive Bayes to a string of words, treating the text as an unordered collection of words
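example (a minimal sketch, not from the class notes): a Naive Bayes classifier over word counts with add-one smoothing. the toy training data, the labels "spam"/"ham", and the function names are made up for illustration.

```python
import math
from collections import Counter

# hypothetical toy training data: (text, label)
TRAIN = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting tomorrow", "ham"),
]

def train_nb(examples):
    """Count documents per class and words per class; word order is ignored."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        words = text.split()
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def classify(text, model):
    """Return the class with the highest log P(class) + sum of log P(word | class)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        n_words = sum(word_counts[label].values())
        for w in text.split():
            # add-one smoothing so unseen words do not zero out the product
            score += math.log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_nb(TRAIN)
print(classify("free money", model))
print(classify("money free", model))  # same words, different order, same class
```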
what is tokenization?
the process of dividing a text into a sequence of words
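example (a rough sketch, assuming English text and a simple regex rule; real tokenizers handle many more cases, and the function name is made up):

```python
import re

def tokenize(text):
    # lowercase the text, then pull out runs of letters, digits, or apostrophes;
    # punctuation and whitespace act as separators
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = tokenize("The cat sat on the mat, didn't it?")
print(tokens)
```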
what is an n-gram model?
a Markov chain model that considers the dependence between n adjacent words
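example (a minimal sketch for n = 2, a bigram model, with unsmoothed counts; the toy sentence and function names are made up for illustration):

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def bigram_prob(counts, prev, nxt):
    """Estimate P(nxt | prev) from raw counts (no smoothing)."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

tokens = "the cat sat on the mat".split()
counts = train_bigram(tokens)
print(bigram_prob(counts, "the", "cat"))  # "the" is followed by "cat" once and "mat" once
```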
in what cases would you use n-gram models?
in spam detection, author attribution, and sentiment analysis
what are some alternatives to word-level n-gram models?
character-level models or skip-gram models
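example (a small sketch of what each alternative counts; the helper names are made up): character-level models work on runs of n characters, and skip-gram models pair words that are near each other while skipping one or more words in between.

```python
def char_ngrams(text, n):
    """All runs of n adjacent characters in the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def skip_bigrams(tokens, skip=1):
    """Pairs of words with exactly `skip` words skipped between them."""
    step = skip + 1
    return [(tokens[i], tokens[i + step]) for i in range(len(tokens) - step)]

print(char_ngrams("hello", 3))
print(skip_bigrams(["je", "ne", "comprends", "pas"]))
```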
what is a structured model that is usually constructed through manual labor?
a dictionary
what’s a common model for part-of-speech (POS) tagging?
the hidden Markov model (HMM)
an HMM combined with what algorithm can reach a tagging accuracy of ~97%?
Viterbi algorithm
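example (a toy sketch of Viterbi decoding over an HMM; the two tags and all probabilities below are invented for illustration, so it says nothing about the ~97% figure, which needs a model trained on a real corpus):

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most likely tag sequence for words under the given HMM."""
    # best[i][t]: probability of the best tag sequence for words[:i+1] ending in tag t
    best = [{t: start_p[t] * emit_p[t].get(words[0], 0.0) for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best sequence
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, p = max(
                ((s, best[i - 1][s] * trans_p[s][t]) for s in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = p * emit_p[t].get(words[i], 0.0)
            back[i][t] = prev
    # follow the back-pointers from the best final tag
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# hypothetical toy parameters
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.7, "VERB": 0.3}
TRANS = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.8, "VERB": 0.2}}
EMIT = {
    "NOUN": {"dogs": 0.5, "chase": 0.1, "cats": 0.4},
    "VERB": {"dogs": 0.1, "chase": 0.8, "cats": 0.1},
}

print(viterbi(["dogs", "chase", "cats"], TAGS, START, TRANS, EMIT))
```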
what is the task of assigning a part of speech to each word in a sentence?
part-of-speech (POS) tagging
what is the corpus of over 3M words of text annotated with POS tags?
the Penn Treebank
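what are some approaches to POS tagging?
logistic regression with greedy search: fast, but commits to one tag per word and never revisits it
Viterbi algorithm: finds the most likely tag sequence, but is slower
beam search: a compromise between greedy search and Viterbi; keeps only the most likely partial taggings at each step and drops the less likely ones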
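example (a rough sketch of beam search for tagging; the scoring tables are invented toy numbers, and `score` stands in for whatever model, such as logistic regression, rates a tag given the previous tag and the word). with b = 1 it behaves like greedy search; larger b keeps more candidate taggings alive.

```python
# hypothetical toy tables used by the scoring function
START = {"NOUN": 0.7, "VERB": 0.3}
TRANS = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.8, "VERB": 0.2}}
EMIT = {
    "NOUN": {"dogs": 0.5, "chase": 0.1, "cats": 0.4},
    "VERB": {"dogs": 0.1, "chase": 0.8, "cats": 0.1},
}

def score(prev_tag, tag, word):
    """Score for giving `word` the tag `tag` right after `prev_tag` (None at the start)."""
    move = START[tag] if prev_tag is None else TRANS[prev_tag][tag]
    return move * EMIT[tag].get(word, 0.0)

def beam_search_tag(words, tags, b=2):
    """Tag words left to right, keeping only the b best partial taggings."""
    beam = [(1.0, [])]  # each entry: (score so far, tags chosen so far)
    for w in words:
        candidates = [
            (p * score(seq[-1] if seq else None, t, w), seq + [t])
            for p, seq in beam
            for t in tags
        ]
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:b]  # drop the less likely taggings
    return beam[0][1]

print(beam_search_tag(["dogs", "chase", "cats"], ["NOUN", "VERB"], b=2))
```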