2013 A2 Flashcards
In the E-M algorithm for parameter estimation of the K -means model, the maximization step
comprises minimizing the distortion. In what way can this sensibly be viewed as a (likelihood)
maximization procedure?
K-means can be read as a limiting case of fitting a mixture of identical spherical Gaussians: under that model, the likelihood of an observation falls off with its squared distance to its cluster's mean, so the closer an observation is to a cluster center, the more likely that cluster is to have generated it. Therefore, finding a configuration that minimizes the distortion (the sum of squared distances between each observation and its assigned cluster mean) is equivalent to finding the configuration that maximizes the likelihood of the data under the hard cluster assignments.
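As an illustration (not the course's code), here is a minimal sketch of one K-means E-M iteration in Python/NumPy; the function name `kmeans_step` and the toy two-blob data are made up for the example. The E-step hard-assigns each point to its nearest center, the M-step moves each center to the mean of its assigned points, and each full iteration can only decrease (or keep) the distortion:

```python
import numpy as np

def kmeans_step(X, centers):
    # E-step: assign each observation to the closest center
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    # Distortion of the current configuration (before the M-step)
    distortion = dists[np.arange(len(X)), labels].sum()
    # M-step: each center becomes the mean of its assigned points
    new_centers = np.array([
        X[labels == k].mean(axis=0) if (labels == k).any() else centers[k]
        for k in range(len(centers))
    ])
    return new_centers, labels, distortion

# Toy data: two well-separated blobs in 2-D
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])
centers = X[rng.choice(len(X), 2, replace=False)]

history = []
for _ in range(10):
    centers, labels, distortion = kmeans_step(X, centers)
    history.append(distortion)
```

Because neither step can increase the distortion, `history` is non-increasing, which is exactly the likelihood-maximization view described above.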
What are the following algorithms used for in the context of hidden Markov models:
◦the forward algorithm?
◦the Viterbi algorithm?
◦the backwards algorithm?
The forward algorithm - Used to determine the likelihood of a given sequence of observations under the model.
The Viterbi algorithm - Used to determine the single most likely state sequence for a given sequence of observations.
The backwards algorithm - Used, together with the forward pass (forward-backward / Baum-Welch), to compute "soft" posterior state probabilities, which in turn are used to train the HMM (to find the parameter configuration that yields the highest likelihood).
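The three algorithms can be sketched for a toy HMM as follows (a minimal illustration in Python/NumPy; the two-state model, its parameters, and the observation sequence are all made up for the example):

```python
import numpy as np

# Toy HMM: 2 hidden states, 2 output symbols (parameters invented)
pi = np.array([0.6, 0.4])             # initial state distribution
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])            # transition matrix A[i, j]
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])            # emission matrix B[state, symbol]
obs = [0, 0, 1]                       # observed symbol sequence

# Forward: alpha[t, i] = P(o_1..o_t, state_t = i)
alpha = np.zeros((len(obs), 2))
alpha[0] = pi * B[:, obs[0]]
for t in range(1, len(obs)):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
seq_likelihood = alpha[-1].sum()      # likelihood of the whole sequence

# Backward: beta[t, i] = P(o_{t+1}..o_T | state_t = i)
beta = np.ones((len(obs), 2))
for t in range(len(obs) - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

# "Soft" posterior state probabilities, as used in Baum-Welch training
gamma = alpha * beta / seq_likelihood

# Viterbi: single most likely ("hard") state sequence
delta = pi * B[:, obs[0]]
back = []
for t in range(1, len(obs)):
    scores = delta[:, None] * A       # scores[i, j]: best path into j via i
    back.append(scores.argmax(axis=0))
    delta = scores.max(axis=0) * B[:, obs[t]]
path = [int(delta.argmax())]
for bp in reversed(back):
    path.append(int(bp[path[-1]]))
path.reverse()
```

Note the contrast: `gamma` gives a probability distribution over states at every time step (soft), while `path` commits to exactly one state per step (hard).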
In light of your answer to the previous question, point out a weakness in the classification
approach used in the hidden Markov model assignment for this course.
The assignment used a "hard" classification approach (the Viterbi algorithm commits to a single most likely state sequence). This is limiting because we cannot express how sure we are that an observation belongs to a given state; the forward-backward posteriors would instead provide a "soft" classification with a probability for each state at each observation.