w3 part 2 information extraction Viterbi algorithm Flashcards

1
Q

what is the output independence assumption?

A

the output independence assumption states that the probability of an output observation depends only on the state that produced it, not on any other states or on past or future observations

2
Q

problem on that one paper: determine the sequence of hidden variables (tags) corresponding to the sequence of observations (words)

find the most probable sequence of states

derive this

show that argmax P(t1…tn | w1…wn) = argmax Π_i P(wi | ti) P(ti | ti-1), with both argmax taken over t1…tn

A

practice deriving this from the paper
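
A rough outline of the standard derivation, for checking your own attempt. It assumes the first-order Markov assumption and the output independence assumption, and it introduces t_0 as a start-of-sentence tag; the paper's own notation and steps may differ.

```latex
\begin{align*}
\hat{t}_{1:n}
  &= \arg\max_{t_{1:n}} P(t_{1:n} \mid w_{1:n}) \\
  &= \arg\max_{t_{1:n}} \frac{P(w_{1:n} \mid t_{1:n})\, P(t_{1:n})}{P(w_{1:n})}
     && \text{Bayes' theorem} \\
  &= \arg\max_{t_{1:n}} P(w_{1:n} \mid t_{1:n})\, P(t_{1:n})
     && P(w_{1:n}) \text{ does not depend on } t_{1:n} \\
  &\approx \arg\max_{t_{1:n}} \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1})
     && \text{output independence and Markov assumptions}
\end{align*}
```

The last step is an approximation, not an identity: it holds only to the extent that the two assumptions hold.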

3
Q

what is Bayes' theorem?

A

P(A|B) = P(B|A) * P(A) / P(B)

4
Q

suppose the tag DET occurs 1000 times and 850 of those times it is followed by NOUN. what is P(NOUN | DET)?

A

P(NOUN | DET) = count(DET, NOUN) / count(DET) = 850/1000 = 0.85
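
The same count ratio can be sketched in Python. The function name `transition_prob` and the toy corpus are made up for illustration; the corpus is built only to reproduce the 850-out-of-1000 figure from the card.

```python
from collections import Counter

def transition_prob(tag_sequences, prev_tag, next_tag):
    """Estimate P(next_tag | prev_tag) as count(prev, next) / count(prev)."""
    prev_counts = Counter()
    bigram_counts = Counter()
    for tags in tag_sequences:
        # Walk over adjacent tag pairs; a sequence-final tag has no successor,
        # so it is not counted in the denominator here.
        for a, b in zip(tags, tags[1:]):
            prev_counts[a] += 1
            bigram_counts[(a, b)] += 1
    if prev_counts[prev_tag] == 0:
        return 0.0
    return bigram_counts[(prev_tag, next_tag)] / prev_counts[prev_tag]

# Hypothetical corpus: DET is followed by NOUN 850 times and by ADJ 150 times.
corpus = [["DET", "NOUN"]] * 850 + [["DET", "ADJ"]] * 150
p = transition_prob(corpus, "DET", "NOUN")
print(p)
```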

5
Q

suppose the tag NOUN occurs 1000 times and in 50 of those cases it is realised by the word ‘bill’. what is P(‘bill’ | NOUN)?

A

P(‘bill’ | NOUN) = count(NOUN, ‘bill’) / count(NOUN) = 50/1000 = 0.05
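
A minimal sketch of the same emission estimate in Python. The helper `emission_prob` and the toy data are hypothetical, chosen only to match the 50-out-of-1000 figure on the card.

```python
from collections import Counter

def emission_prob(tagged_words, word, tag):
    """Estimate P(word | tag) as count(tag, word) / count(tag)."""
    tag_counts = Counter(t for _, t in tagged_words)
    pair_counts = Counter(tagged_words)  # counts (word, tag) pairs
    if tag_counts[tag] == 0:
        return 0.0
    return pair_counts[(word, tag)] / tag_counts[tag]

# Hypothetical data: NOUN occurs 1000 times, 50 of them as the word 'bill'.
tagged = [("bill", "NOUN")] * 50 + [("dog", "NOUN")] * 950
p = emission_prob(tagged, "bill", "NOUN")
print(p)
```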

6
Q

can you brute-force POS tagging?

A

yes, but it quickly gets expensive

e.g. with 3 observations and 6 states you already have 6^3 = 216 candidate tag sequences
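
To see where the number of candidates comes from, here is a small sketch that enumerates every tag sequence for 3 observations and 6 states. The six tag names are placeholders; only their number matters.

```python
from itertools import product

# Six placeholder tags (assumed for illustration).
states = ["DET", "NOUN", "VERB", "ADJ", "ADV", "PRON"]
n_observations = 3

# Brute force has to score every possible assignment of a tag to each position.
candidates = list(product(states, repeat=n_observations))
print(len(candidates))  # 6 ** 3
```

Each extra observation multiplies the number of candidates by the number of states, which is why the count grows exponentially with sentence length.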

7
Q

what is the time complexity of the Viterbi algorithm vs brute force?

A

brute force = O(P^L)

Viterbi = O(L × P^2)

where P is the number of POS tags (states)
and L is the length of the sentence (number of observations)
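
The O(L × P^2) cost shows up directly in the loop structure of a basic Viterbi implementation. The sketch below is one possible version, assuming the HMM is given as nested dicts; the function signature and all probabilities in the toy model are invented for illustration, apart from the 0.85 and 0.05 values reused from the earlier cards.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_prob, best_path) for an HMM given as nested dicts."""
    # best[i][s]: probability of the best tag path ending in state s at position i
    best = [{s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}]
    back = [{}]
    for i in range(1, len(observations)):   # L positions
        best.append({})
        back.append({})
        for s in states:                    # P current states
            # Scan P previous states, giving O(L * P^2) work overall.
            prev, score = max(
                ((r, best[i - 1][r] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            best[i][s] = score * emit_p[s].get(observations[i], 0.0)
            back[i][s] = prev
    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for i in range(len(observations) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return best[-1][last], path

# Toy two-tag model (made-up numbers).
states = ["DET", "NOUN"]
start_p = {"DET": 0.8, "NOUN": 0.2}
trans_p = {"DET": {"DET": 0.15, "NOUN": 0.85},
           "NOUN": {"DET": 0.4, "NOUN": 0.6}}
emit_p = {"DET": {"the": 0.9}, "NOUN": {"bill": 0.05}}

prob, path = viterbi(["the", "bill"], states, start_p, trans_p, emit_p)
print(path, prob)
```

For long sentences a practical version would usually work with log probabilities to avoid underflow; that is left out here to keep the loop structure visible.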
