Algorithms for HMM Flashcards
What is the formal definition of the goal of an HMM?
Find the best tag sequence T for a given untagged sentence S: argmax_T P(T|S)
That is, given a sequence of words, find the sequence of tags most likely to have produced it.
How an HMM relates to an FSM
States: the tags
State transition probabilities: p(t_i | t_{i-1})
An output alphabet: the words
Emission probabilities: p(w_i | t_i)
Initial state: the start of the sentence (a minimal sketch of this view follows)
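A minimal sketch of this FSM view in Python; the two-tag tag set, the words, and the probabilities are toy values invented for illustration, and "<s>" is an assumed start-of-sentence state:

    # Toy HMM as plain dictionaries.
    transitions = {                      # p(t_i | t_{i-1}): moving between states
        ("<s>", "DET"): 0.6, ("<s>", "NOUN"): 0.4,
        ("DET", "DET"): 0.1, ("DET", "NOUN"): 0.9,
        ("NOUN", "DET"): 0.7, ("NOUN", "NOUN"): 0.3,
    }
    emissions = {                        # p(w_i | t_i): each state emits a word
        ("DET", "the"): 0.7, ("DET", "a"): 0.3,
        ("NOUN", "dog"): 0.5, ("NOUN", "park"): 0.5,
    }
    tagset = ["DET", "NOUN"]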
What does P(S|T)P(T) represent and why are we trying to find it?
By Bayes' rule, P(T|S) = P(S|T)P(T) / P(S).
P(S|T)P(T) is the joint probability P(S, T): the probability of the tag sequence T occurring and producing the words S.
Since P(S) is the same for every candidate T, the T that maximises P(S|T)P(T) also maximises P(T|S), so it is the most probable tag sequence.
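A sketch of scoring one candidate tag sequence by P(S|T)P(T), reusing the toy dictionaries above; the helper name score() is an invention for illustration:

    def score(tags, words, transitions, emissions, start="<s>"):
        # P(S|T) * P(T) under the HMM independence assumptions
        p, prev = 1.0, start
        for t, w in zip(tags, words):
            # one transition factor p(t_i|t_{i-1}) and one emission factor p(w_i|t_i)
            p *= transitions.get((prev, t), 0.0) * emissions.get((t, w), 0.0)
            prev = t
        return p

For example, score(["DET", "NOUN"], ["the", "dog"], transitions, emissions) gives 0.6 · 0.7 · 0.9 · 0.5 = 0.189.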
If we compared the probabilities of all possible tag sequences to find the maximum, how many sequences would we have to compare for a sentence of n words with c possible tags?
c^n: each of the n positions can take any of the c tags. A sketch of this brute-force search follows.
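A sketch of the brute-force search (the one Viterbi avoids), reusing score() and the toy model from above:

    from itertools import product

    def brute_force(words, tagset, transitions, emissions):
        # enumerate all c**n tag sequences and keep the highest-scoring one
        best = max(product(tagset, repeat=len(words)),
                   key=lambda tags: score(tags, words, transitions, emissions))
        return best, score(best, words, transitions, emissions)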
Why do we use the Viterbi algorithm?
To avoid the exponential cost of scoring every possible tag sequence: Viterbi finds the same argmax in O(n·c²) time instead of O(c^n).
What type of algorithm is the Viterbi algorithm?
Dynamic Programming.
Explain the Viterbi algorithm to someone
Work through the sentence left to right, keeping for each tag the probability of the best path that ends in that tag at the current word, plus a backpointer to the previous tag on that path. Each column is computed from the previous column's values rather than by re-enumerating whole sequences. At the end, take the highest-scoring final tag and follow the backpointers to recover the best tag sequence. A sketch follows.
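A minimal Viterbi sketch over the toy dictionary HMM above (no log probabilities or smoothing, so it suits only short illustrative inputs):

    def viterbi(words, tagset, transitions, emissions, start="<s>"):
        # best[t]: probability of the best path ending in tag t at the current word
        best = {t: transitions.get((start, t), 0.0) *
                   emissions.get((t, words[0]), 0.0) for t in tagset}
        backpointers = []
        for w in words[1:]:
            column, back = {}, {}
            for t in tagset:
                # choose the previous tag that maximises the path probability into t
                prev = max(tagset, key=lambda s: best[s] * transitions.get((s, t), 0.0))
                column[t] = (best[prev] * transitions.get((prev, t), 0.0) *
                             emissions.get((t, w), 0.0))
                back[t] = prev
            best = column
            backpointers.append(back)
        # backtrack from the highest-scoring final tag
        tag = max(tagset, key=best.get)
        path = [tag]
        for back in reversed(backpointers):
            tag = back[tag]
            path.append(tag)
        return path[::-1], max(best.values())

On the toy model, viterbi(["the", "dog"], tagset, transitions, emissions) returns (["DET", "NOUN"], 0.189), matching the brute-force search.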
Explain the Expectation-Maximisation algorithm to someone
Start from an initial guess of the parameters. E-step: with the current parameters fixed, compute expected counts of the hidden events (for an HMM, how often each transition and emission is used, obtained from the forward-backward probabilities). M-step: re-estimate the parameters from those expected counts as if they had been observed. Repeat until convergence; each iteration cannot decrease the likelihood of the data. A sketch follows.
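A compact Baum-Welch (EM for HMMs) sketch with numpy, assuming integer-coded observations; the names A (transitions), B (emissions), pi (initial distribution) and the lack of numerical scaling are simplifications for illustration:

    import numpy as np

    def baum_welch(obs, n_states, n_symbols, n_iter=20, seed=0):
        rng = np.random.default_rng(seed)
        A = rng.random((n_states, n_states)); A /= A.sum(axis=1, keepdims=True)
        B = rng.random((n_states, n_symbols)); B /= B.sum(axis=1, keepdims=True)
        pi = np.full(n_states, 1.0 / n_states)
        T = len(obs)
        for _ in range(n_iter):
            # E-step: forward (alpha) and backward (beta) probabilities
            alpha = np.zeros((T, n_states)); beta = np.zeros((T, n_states))
            alpha[0] = pi * B[:, obs[0]]
            for t in range(1, T):
                alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
            beta[-1] = 1.0
            for t in range(T - 2, -1, -1):
                beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
            gamma = alpha * beta                        # expected state occupancy
            gamma /= gamma.sum(axis=1, keepdims=True)
            xi = np.zeros((T - 1, n_states, n_states))  # expected transitions
            for t in range(T - 1):
                xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
                xi[t] /= xi[t].sum()
            # M-step: re-estimate parameters from the expected counts
            pi = gamma[0]
            A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
            for k in range(n_symbols):
                B[:, k] = gamma[obs == k].sum(axis=0)
            B /= gamma.sum(axis=0)[:, None]
        return A, B, pi

    # usage: learn a 2-state model from a toy integer-coded sequence
    A, B, pi = baum_welch(np.array([0, 1, 0, 2, 1, 0]), n_states=2, n_symbols=3)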
Besides finding the most probable sequence of tags, what other tasks can the same trellis-based dynamic programming be used for?
1) (Forward algorithm) Compute the likelihood of a sequence of outputs, P(O|λ), i.e. the probability of a sentence regardless of tags (a language model!): marginalise out the states by summing, rather than maximising, over paths (see the sketch below).
2) (Forward-backward algorithm, a.k.a. Baum-Welch) Learn the best set of parameters λ = (A, B) given only an unannotated corpus of sentences.
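A sketch of the forward algorithm over the toy dictionary HMM from earlier: identical in shape to viterbi(), but summing over predecessor states where Viterbi takes the max:

    def forward(words, tagset, transitions, emissions, start="<s>"):
        alpha = {t: transitions.get((start, t), 0.0) *
                    emissions.get((t, words[0]), 0.0) for t in tagset}
        for w in words[1:]:
            alpha = {t: sum(alpha[s] * transitions.get((s, t), 0.0) for s in tagset) *
                        emissions.get((t, w), 0.0) for t in tagset}
        return sum(alpha.values())   # P(O|λ): marginalise out the final state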
What type of training does an HMM require? (supervised, unsupervised, semi-supervised)
Supervised, in the standard case: the transition and emission probabilities are counted from a tagged corpus. The parameters can also be learned unsupervised from untagged text with the forward-backward (EM) algorithm.
What are the features of an HMM model for POS-tagging?
-States (the POS tags)
-State transition probabilities (between tags)
-An output alphabet (the words)
-Emission probabilities (from tags to words)
-Start and End states
What is the EM-algorithm guaranteed to do?
Find a local maximum of the likelihood. It does not necessarily find the global maximum, so it will probably need to be run multiple times from different initialisations.
Name a major assumption of HMMs that makes them not viable in some contexts
The strong independence assumptions: each tag depends only on the previous tag (the Markov assumption) and each word depends only on its own tag, so longer-range context is ignored. The model is also only as good as its probability estimates, and EM training finds only local maxima.
How to overcome EM-algorithm local maxima
Early stopping, better initialisation of the parameters, and random restarts (a sketch of random restarts follows).
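A sketch of random restarts around the baum_welch() sketch above: run EM from several seeds and keep the parameters with the highest likelihood, computed here with an inline forward pass:

    import numpy as np

    def log_likelihood(obs, A, B, pi):
        # forward algorithm in probability space (short sequences only)
        alpha = pi * B[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
        return np.log(alpha.sum())

    def best_of_restarts(obs, n_states, n_symbols, n_restarts=10):
        best_params, best_ll = None, -np.inf
        for seed in range(n_restarts):
            # a different seed gives a different initialisation of A and B,
            # so EM can converge to a different local maximum
            params = baum_welch(obs, n_states, n_symbols, seed=seed)
            ll = log_likelihood(obs, *params)
            if ll > best_ll:
                best_params, best_ll = params, ll
        return best_params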