POS Tagging and HMMs Flashcards

1
Q

POS Tagging

Tree Representation

A
  • Language has hierarchical structure, so it can be represented with trees
  • Syntax requires the use of trees (sentences with the same shallow analysis can differ in structure and meaning)
2
Q

POS Tagging

POS Tagging

A
  • The same word can have multiple possible POS tags
  • Different tag choices produce different sentence meanings
3
Q

POS Tagging

What governs correct POS tag choice?

A
  • Word + context
  • Word identity: most words have <= 2 tags; many have just one
  • Context: nouns start sentences, nouns follow verbs
4
Q

POS Tagging

What is POS tagging good for?

A
  • Text-to-speech: pronunciation depends on the tag (e.g., record, lead)
  • Preprocessing step for syntactic parsers or other tasks
  • Very shallow information extraction (can't recover full meaning)
5
Q

HMMs

Hidden Markov Models (HMMs)

A
  • Model the sequence of tags y over words x as a Markov process
  • Uses Markov property
6
Q

HMMs

Markov Property

A
  • Future is conditionally independent of the past given the present
  • The next tag depends only on the current tag, not on anything before it
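
Stated as a formula (a standard way to write the property, using the same y notation as the later cards):

P(y_{i+1} \mid y_1, \ldots, y_i) = P(y_{i+1} \mid y_i)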
7
Q

HMMs

Notations

A
  • Input x = (x1, …, xn), the sequence of words
  • xi ∈ V = vocabulary of words
  • Output y = (y1, …, yn), the sequence of POS tags
  • yi ∈ T = set of possible tags (including STOP)
8
Q

HMMs

Equation

A
P(x, y) = P(y_1) \prod_{i=2}^{n} P(y_i \mid y_{i-1}) \prod_{i=1}^{n} P(x_i \mid y_i)

Initial Dist: P(y_1)
Transition Probs: \prod_{i=2}^{n} P(y_i \mid y_{i-1})
Emission Probs: \prod_{i=1}^{n} P(x_i \mid y_i)
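
As an illustrative sketch (the mini tag set, sentence, and probability values below are made up, not from the source), the factorization multiplies out like this:

# Toy HMM tables; the values are made up just to illustrate the factorization.
initial = {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2}                     # P(y1)
transition = {("DET", "NOUN"): 0.8, ("NOUN", "VERB"): 0.6,
              ("DET", "VERB"): 0.1, ("NOUN", "NOUN"): 0.2}           # P(yi | yi-1)
emission = {("DET", "the"): 0.4, ("NOUN", "dog"): 0.05,
            ("VERB", "runs"): 0.03}                                  # P(xi | yi)

def joint_prob(words, tags):
    """P(x, y) = P(y1) * prod_i P(yi | yi-1) * prod_i P(xi | yi)."""
    p = initial[tags[0]]
    for i in range(1, len(tags)):
        p *= transition[(tags[i - 1], tags[i])]
    for word, tag in zip(words, tags):
        p *= emission[(tag, word)]
    return p

print(joint_prob(["the", "dog", "runs"], ["DET", "NOUN", "VERB"]))
# 0.5 * 0.8 * 0.6 * 0.4 * 0.05 * 0.03 = 0.000144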

9
Q

HMMs

Parameters

A

Initial Dist: |T| x 1 vector (distribution over initial states)
Emission Dist: |T| x |V| matrix (distribution over words per tag)
Transition Dist: |T| x |T| matrix (distribution over next tags per tag)
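
A minimal sketch of these shapes (the tag/vocab sizes and uniform values below are placeholders, not from the source):

import numpy as np

T, V = 45, 10000                        # hypothetical sizes: 45 tags, 10k-word vocabulary
initial = np.full(T, 1.0 / T)           # |T| x 1 vector: distribution over the first tag
transition = np.full((T, T), 1.0 / T)   # |T| x |T| matrix: row s = distribution over next tag given tag s
emission = np.full((T, V), 1.0 / V)     # |T| x |V| matrix: row t = distribution over words given tag t
# each row (and the initial vector itself) sums to 1, since each is a probability distribution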

10
Q

Training HMMs

Maximum Likelihood Estimation

A
  • Read POS tag counts off the tagged training data
  • Normalize the counts to get distributions
11
Q

Training HMMs

Transitions

A
  • Count up all pairs (yi, yi+1) in the training data
  • Count up occurrences of what each tag T can transition to
  • Normalize to get the distribution P(next tag | T)
  • Distribution must be smoothed
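
A rough sketch of this count-and-normalize step (tag_sequences is a hypothetical list of per-sentence tag lists; add-alpha smoothing is just one simple way to smooth):

from collections import Counter

def estimate_transitions(tag_sequences, tag_set, alpha=1.0):
    """Estimate P(next tag | tag) by counting and normalizing, with add-alpha smoothing."""
    pair_counts = Counter()
    for tags in tag_sequences:
        for prev, nxt in zip(tags, tags[1:]):
            pair_counts[(prev, nxt)] += 1
    probs = {}
    for prev in tag_set:
        total = sum(pair_counts[(prev, nxt)] for nxt in tag_set)
        for nxt in tag_set:
            # add-alpha smoothing keeps unseen transitions at nonzero probability
            probs[(prev, nxt)] = (pair_counts[(prev, nxt)] + alpha) / (total + alpha * len(tag_set))
    return probs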
12
Q

Training HMMs

Emissions

A

Similar to transitions, but smoothing is harder (many rare and unseen words)

13
Q

Training HMMs

Initials

A
  • Count up occurrences of tags in the first position
  • Perform smoothing
14
Q

HMM Inference

Viterbi Algorithm

A
  • Used for inference in HMMs
  • At test time, for a given sentence, what POS tag sequence should be predicted?
  • “Think about” all possible immediate prior state values. Everything before that has already been accounted for by earlier stages
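
A compact sketch of Viterbi in log space (not from the source; sent, tags, word_index, and the log_* tables are assumed inputs whose shapes match the Parameters card):

import numpy as np

def viterbi(sent, tags, log_init, log_trans, log_emit, word_index):
    """Most likely tag sequence under the HMM, working in log space.

    log_init[t]     = log P(y1 = tag t)
    log_trans[s, t] = log P(yi = tag t | yi-1 = tag s)
    log_emit[t, w]  = log P(xi = word w | yi = tag t)
    word_index maps a word to its column in log_emit.
    """
    n, T = len(sent), len(tags)
    score = np.full((n, T), -np.inf)    # best log prob of any sequence ending in tag t at position i
    back = np.zeros((n, T), dtype=int)  # backpointers for recovering the argmax sequence
    score[0] = log_init + log_emit[:, word_index[sent[0]]]
    for i in range(1, n):
        for t in range(T):
            # consider every possible previous tag; earlier positions are already folded into score[i-1]
            cand = score[i - 1] + log_trans[:, t]
            back[i, t] = int(np.argmax(cand))
            score[i, t] = cand[back[i, t]] + log_emit[t, word_index[sent[i]]]
    # follow backpointers from the best final tag
    best = [int(np.argmax(score[-1]))]
    for i in range(n - 1, 0, -1):
        best.append(int(back[i, best[-1]]))
    return [tags[t] for t in reversed(best)]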
15
Q

HMM Inference

Inference Problem

A
argmax_y P(y|x) = argmax_y P(y,x)

(because P(y|x) = P(y,x) / P(x), and P(x) does not depend on y)
There are exponentially many possible sequences y

What is the solution? …

16
Q

HMM Inference

Inference Problem Solution

A

Dynamic programming (possible because of Markov structure)

17
Q

HMM POS Tagging

Baseline POS Tagging

A

Assign each word its most frequent tag (about 90% accuracy)
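
A sketch of this baseline (tagged_words is a hypothetical list of (word, tag) training pairs; falling back to NOUN for unseen words is just one common default):

from collections import Counter, defaultdict

def train_baseline(tagged_words):
    """Map each word to the tag it occurs with most often in training."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        counts[word][tag] += 1
    return {word: tag_counts.most_common(1)[0][0] for word, tag_counts in counts.items()}

def tag_baseline(words, most_frequent_tag, default="NOUN"):
    # unseen words fall back to a default tag (NOUN is one common choice)
    return [most_frequent_tag.get(w, default) for w in words]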

18
Q

HMM POS Tagging

HMM-Based POS Taggers

A

Higher accuracy on both known and unknown/unseen words