5. POS Tagging Flashcards
What is POS Tagging?
A Part-Of-Speech Tagger reads text in some language and assigns a part of speech, such as noun, verb or adjective, to each word (and other tokens, e.g. punctuation).
e.g. I want a ticket: I -> pronoun, want -> verb, a -> det, ticket -> noun
What are Universal POS Tags? Give a few examples of tags.
Universal POS tags are a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
Some tags:
- ADJ (adjective), ADV (adverb), NOUN (noun), VERB (verb), PUNCT (punctuation) and others.
What is the naive approach to perform POS Tagging? How does it perform?
Naive Approach:
- Assigning each word its most frequent POS tag, and assigning all unknown words the tag NOUN.
Performs surprisingly well, with around 90% accuracy, but it fails on ambiguous words whose correct tag depends on context (e.g. "race" as a noun or a verb), because each word always gets the same tag.
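The naive approach above can be sketched in a few lines of Python. This is only an illustration: the function names, the lowercasing of words and the one-sentence training set are assumptions made for the example, not something taken from the lecture.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            # Lowercasing is an assumption of this sketch.
            counts[word.lower()][tag] += 1
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def tag_baseline(words, most_frequent, unknown_tag="NOUN"):
    """Tag known words with their most frequent tag, unknown words with NOUN."""
    return [(w, most_frequent.get(w.lower(), unknown_tag)) for w in words]


# Tiny illustrative training set (the example sentence from the first card).
train = [[("I", "PRON"), ("want", "VERB"), ("a", "DET"), ("ticket", "NOUN")]]
model = train_baseline(train)

# "flight" was never seen in training, so it falls back to NOUN.
print(tag_baseline(["I", "want", "a", "flight"], model))
```

Because the lookup ignores context, a word such as "race" would get the same tag in every sentence, which is exactly where this baseline loses accuracy.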
Create the tables for tag given previous tag (transition probabilities) and word given tag (emission probabilities) for POS Tagging, given the following corpus:
- John/PROPN is/VERB expected/VERB to/PART race/VERB
- This/DET is/VERB the/DET race/NOUN I/PRON wanted/VERB
- Bring/VERB this/DET to/PART the/DET race/NOUN
Check Lecture Notes 5. POS Tagging slides 17 and 18.
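If the slides are not at hand, the two tables can also be estimated by counting. The sketch below makes some assumptions that may differ from the slides: it uses a hypothetical start symbol `<s>` for the first tag of each sentence, no end-of-sentence symbol, no smoothing, and it lowercases words so that "This" and "this" are counted together.

```python
from collections import Counter, defaultdict

CORPUS = [
    "John/PROPN is/VERB expected/VERB to/PART race/VERB",
    "This/DET is/VERB the/DET race/NOUN I/PRON wanted/VERB",
    "Bring/VERB this/DET to/PART the/DET race/NOUN",
]
START = "<s>"  # assumed start-of-sentence symbol


def parse(line):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = [token.rsplit("/", 1) for token in line.split()]
    return [(word.lower(), tag) for word, tag in pairs]


def normalize(table):
    """Convert each row of counts into relative frequencies."""
    return {
        context: {key: count / sum(row.values()) for key, count in row.items()}
        for context, row in table.items()
    }


def estimate_tables(corpus):
    transition_counts = defaultdict(Counter)  # previous tag -> tag -> count
    emission_counts = defaultdict(Counter)    # tag -> word -> count
    for line in corpus:
        previous = START
        for word, tag in parse(line):
            transition_counts[previous][tag] += 1
            emission_counts[tag][word] += 1
            previous = tag
    return normalize(transition_counts), normalize(emission_counts)


transitions, emissions = estimate_tables(CORPUS)
print(transitions["DET"])          # P(tag | previous tag = DET)
print(emissions["VERB"]["race"])   # P(race | VERB)
print(emissions["NOUN"]["race"])   # P(race | NOUN)
```

Under these assumptions, "race" is the only noun in the corpus, so P(race | NOUN) is 1, while P(race | VERB) is 1/6 because only one of the six verb tokens is "race".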
What is the Viterbi algorithm? What’s its complexity?
The Viterbi algorithm is a dynamic programming algorithm for finding the most likely sequence of hidden states (called the Viterbi path) that results in a sequence of observed events, especially in the context of Markov information sources and hidden Markov models. In POS tagging, the hidden states are the tags and the observed events are the words.
Complexity: O(SN^2) where S is the length of the input and N is the number of states in the model.
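A minimal Viterbi sketch for tagging is shown below. The probability tables are small hand-picked illustrative values, not estimates from the lecture, and the function signature is an assumption of this example. The three nested loops (positions, tags, previous tags) are where the O(SN^2) cost comes from.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most likely tag sequence for `words` and its probability."""
    if not words:
        return [], 1.0
    # best[i][t]: probability of the best tag path that ends in tag t at position i
    best = [{t: start_p.get(t, 0.0) * emit_p[t].get(words[0], 0.0) for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):      # S positions
        best.append({})
        back.append({})
        for t in tags:                  # N current tags
            # N previous tags: pick the best predecessor for tag t
            prev, score = max(
                ((p, best[i - 1][p] * trans_p[p].get(t, 0.0)) for p in tags),
                key=lambda item: item[1],
            )
            best[i][t] = score * emit_p[t].get(words[i], 0.0)
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Illustrative (made-up) model in which "race" can be a NOUN or a VERB.
TAGS = ["PART", "DET", "NOUN", "VERB"]
START_P = {"PART": 0.2, "DET": 0.4, "NOUN": 0.2, "VERB": 0.2}
TRANS_P = {
    "PART": {"VERB": 0.5, "DET": 0.5},
    "DET": {"NOUN": 0.5, "VERB": 0.25, "PART": 0.25},
    "NOUN": {"VERB": 1.0},
    "VERB": {"DET": 0.5, "VERB": 0.25, "PART": 0.25},
}
EMIT_P = {
    "PART": {"to": 1.0},
    "DET": {"the": 0.5, "this": 0.5},
    "NOUN": {"race": 1.0},
    "VERB": {"race": 0.2, "is": 0.4, "wanted": 0.4},
}

print(viterbi(["to", "race"], TAGS, START_P, TRANS_P, EMIT_P)[0])   # ['PART', 'VERB']
print(viterbi(["the", "race"], TAGS, START_P, TRANS_P, EMIT_P)[0])  # ['DET', 'NOUN']
```

Unlike the most-frequent-tag baseline, the same word "race" receives different tags here because the previous tag changes which path is most likely. A practical implementation would usually work with log probabilities to avoid underflow on long sentences.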
What is an alternative to Hidden Markov Models (HMM)?
An alternative is the Maximum Entropy Markov Model (MEMM), a sequence version of the logistic regression (maximum entropy) classifier. It is a discriminative model that directly estimates the posterior probability of the tags given the words.
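To make "directly estimates the posterior" concrete, a first-order MEMM can be written as below. The notation is an assumption of this sketch rather than the lecture's: $W = w_1 \dots w_n$ are the words, $T = t_1 \dots t_n$ the tags, $f_j$ are feature functions and $\theta_j$ their learned weights.

```latex
\hat{T} = \arg\max_{T} P(T \mid W)
        = \arg\max_{T} \prod_{i=1}^{n} P(t_i \mid t_{i-1}, w_i)

P(t_i \mid t_{i-1}, w_i) =
  \frac{\exp\left(\sum_j \theta_j f_j(t_i, t_{i-1}, w_i)\right)}
       {\sum_{t'} \exp\left(\sum_j \theta_j f_j(t', t_{i-1}, w_i)\right)}
```

Each local factor is a logistic regression (softmax) over the possible tags at position $i$, and the best sequence can still be found with the Viterbi algorithm.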
Compare HMM and MEMM.
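In short: an HMM is a generative model that estimates transition probabilities P(tag | previous tag) and emission probabilities P(word | tag) and combines them with Bayes' rule, while an MEMM is a discriminative model that estimates P(tag | previous tag, word) directly and can use arbitrary features of the input (suffixes, capitalization, neighbouring words).
For a fuller comparison, maybe read this; Luscia’s explanation is really bad: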
https://medium.com/@Alibaba_Cloud/hmm-memm-and-crf-a-comparative-analysis-of-statistical-modeling-methods-49fc32a73586