w3 part 2: information extraction, Viterbi algorithm Flashcards
what is the output independence assumption
the output independence assumption states that the probability of an output observation depends only on the state that produced it, not on any other states or any past or future observations
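In symbols (using o_i for the observation at step i and q_i for the hidden state that emitted it; this notation is an assumption, the card does not fix one):

P(o_i | q_1 … q_T, o_1 … o_T) = P(o_i | q_i)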
the decoding problem stated in the paper: determine the sequence of hidden variables corresponding to the sequence of observations
i.e. find the most probable sequence of states
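Stated as a formula (with w_1 … w_n for the observed words and t_1 … t_n for the hidden tags; notation assumed here):

best tag sequence = argmax_{t_1 … t_n} P(t_1 … t_n | w_1 … w_n)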
derive this
prove that argmax_{t_1…t_n} P(t_1 … t_n | w_1 … w_n) = argmax_{t_1…t_n} ∏_i P(w_i | t_i) P(t_i | t_{i-1})
practice deriving this from the paper
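A sketch of the standard derivation, for checking your own attempt against the paper (T = t_1 … t_n is the tag sequence, W = w_1 … w_n the word sequence):

argmax_T P(T | W)
  = argmax_T P(W | T) P(T) / P(W)                          (Bayes' theorem)
  = argmax_T P(W | T) P(T)                                 (P(W) does not depend on T, so it can be dropped from the argmax)
  ≈ argmax_T [∏_i P(w_i | t_i)] [∏_i P(t_i | t_{i-1})]     (output independence for P(W | T); bigram/Markov assumption for P(T))
  = argmax_T ∏_i P(w_i | t_i) P(t_i | t_{i-1})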
what is bayes theorem
P(A|B) = P(B|A)*P(A)/ P(B)
suppose the tag DET occurs 1000 times and in 850 of those cases it is followed by NOUN; what is P(NOUN | DET)
P(NOUN | DET) = count(DET, NOUN) / count(DET) = 850/1000 = 0.85
suppose the tag NOUN occurs 1000 times and in 50 of those cases it corresponds to the word ‘bill’; what is P(‘bill’ | NOUN)
P(‘bill’ | NOUN) = count(NOUN, ‘bill’)/ count(NOUN) = 50/1000 = 0.05
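A minimal sketch of how these maximum-likelihood estimates could be computed from a tagged corpus (the toy corpus and variable names here are assumptions, not from the cards):

```python
from collections import Counter

# toy tagged corpus: list of sentences of (word, tag) pairs (assumed example data)
corpus = [
    [("the", "DET"), ("bill", "NOUN"), ("passed", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("barked", "VERB")],
]

tag_counts = Counter()         # count(tag), for emission estimates
prev_tag_counts = Counter()    # count(prev_tag), for transition estimates
transition_counts = Counter()  # count(prev_tag, tag)
emission_counts = Counter()    # count(tag, word)

for sentence in corpus:
    prev_tag = "<s>"           # sentence-start pseudo-tag
    for word, tag in sentence:
        tag_counts[tag] += 1
        prev_tag_counts[prev_tag] += 1
        transition_counts[(prev_tag, tag)] += 1
        emission_counts[(tag, word)] += 1
        prev_tag = tag

def p_transition(tag, prev_tag):
    # P(tag | prev_tag) = count(prev_tag, tag) / count(prev_tag)
    return transition_counts[(prev_tag, tag)] / prev_tag_counts[prev_tag]

def p_emission(word, tag):
    # P(word | tag) = count(tag, word) / count(tag)
    return emission_counts[(tag, word)] / tag_counts[tag]

print(p_transition("NOUN", "DET"))  # the 850/1000-style estimate; 1.0 on this toy corpus
print(p_emission("bill", "NOUN"))   # 0.5 on this toy corpus
```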
can you brute force pos tagging?
yes, but it quickly gets expensive
e.g. with 3 observations and 6 states you already have 6^3 = 216 possible tag sequences
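A minimal brute-force sketch (the tag set and the scoring function are assumed placeholders for illustration); it enumerates all 6^3 = 216 tag sequences, which is exactly what becomes intractable for longer sentences:

```python
from itertools import product

TAGS = ["DET", "NOUN", "VERB", "ADJ", "ADP", "PRON"]  # 6 states (assumed tag set)
words = ["the", "old", "bill"]                        # 3 observations

def score(tags, words):
    # placeholder scoring function; a real one would multiply the
    # transition and emission probabilities of the whole sequence
    return sum(1 for w, t in zip(words, tags) if (w == "the") == (t == "DET"))

candidates = list(product(TAGS, repeat=len(words)))   # 6**3 = 216 sequences
best = max(candidates, key=lambda tags: score(tags, words))
print(len(candidates), best)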
what is the time complexity of the viterbi algorithm vs brute force
brute force = O(P^L)
viterbi = O(L x P^2)
where P is the number of possible POS tags (states)
and L is the length of the observation sequence
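A minimal Viterbi sketch over log probabilities (the toy HMM numbers are assumptions for illustration, not from the cards); the two nested loops over tags inside the loop over positions are where the O(L x P^2) cost comes from:

```python
import math

# toy HMM parameters (assumed numbers, for illustration only)
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET":  {"DET": 0.05, "NOUN": 0.85, "VERB": 0.10},
    "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
    "VERB": {"DET": 0.40, "NOUN": 0.40, "VERB": 0.20},
}
emit_p = {
    "DET":  {"the": 0.7, "bill": 0.0, "passed": 0.0},
    "NOUN": {"the": 0.0, "bill": 0.05, "passed": 0.01},
    "VERB": {"the": 0.0, "bill": 0.01, "passed": 0.05},
}

def log(p):
    return math.log(p) if p > 0 else float("-inf")

def viterbi(words):
    # V[i][t] = best log-probability of any tag sequence for words[:i+1] ending in tag t
    V = [{t: log(start_p[t]) + log(emit_p[t][words[0]]) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):          # L positions
        V.append({})
        back.append({})
        for t in tags:                      # P current tags
            # inner max over P previous tags -> O(P^2) work per position
            best_prev = max(tags, key=lambda s: V[i - 1][s] + log(trans_p[s][t]))
            V[i][t] = V[i - 1][best_prev] + log(trans_p[best_prev][t]) + log(emit_p[t][words[i]])
            back[i][t] = best_prev
    # follow backpointers from the best final tag
    last = max(tags, key=lambda t: V[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

print(viterbi(["the", "bill", "passed"]))   # ['DET', 'NOUN', 'VERB'] with these toy numbers
```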