Final Test Flashcards
Fundamental HMM Assumptions (3)
- Observation independence assumption
- First-order Markov assumption
- Transitions are time-independent
Observation independence assumption
The likelihood of the t-th feature vector depends only on the current state; it is therefore unaffected by previous states and feature vectors.
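In standard HMM notation (assumed here, not taken from the notes), with hidden states q and observations o:

```latex
P(o_t \mid q_1, \dots, q_t,\; o_1, \dots, o_{t-1}) = P(o_t \mid q_t)
```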
First-order Markov Assumption
Apart from the immediately preceding state, no other previously observed states or features affect the probability of occurrence of the next state.
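In the same assumed notation:

```latex
P(q_t \mid q_1, \dots, q_{t-1},\; o_1, \dots, o_{t-1}) = P(q_t \mid q_{t-1})
```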
Time-independent transition
We assume that the transition probability between two states is constant, irrespective of the time at which the transition actually takes place.
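In the same assumed notation, with a_ij the transition probability from state i to state j:

```latex
P(q_t = j \mid q_{t-1} = i) = a_{ij} \quad \text{for every time step } t
```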
3 Steps of Viterbi Re-estimation
- Initialisation
- EM Re-estimation
- Termination
Viterbi Re-estimation: Initialisation: 2 Steps
a) For every training observation sequence, assign, in a sensible manner, a corresponding state sequence and extend it by adding the initial and termination states.
b) From this initial state sequence, generate an initial model.
Viterbi Re-estimation: EM Re-estimation: 2 Steps
Expectation step: For the current model estimate, apply the Viterbi algorithm to every training sequence to calculate its log-likelihood given S*, the optimal state sequence for that observation sequence. Accumulate the “score” to be used later to test for convergence.
Maximisation step: Use all the S*’s obtained in the E-step to update the parameters of the HMM.
Viterbi Re-estimation: Termination
Compare the total score obtained in the E-step to that from the previous E-step. If it is within an acceptable tolerance, terminate; otherwise continue with re-estimation (step 2).
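The sketch below pulls the three steps together for a discrete-observation HMM. It is an illustrative NumPy implementation, not the course’s reference code; the function names, the uniform initial-state probability, and the even initial alignment are assumptions made here.

```python
import numpy as np

def viterbi(log_A, log_B, obs):
    """Best-path log-likelihood and state path for one observation sequence."""
    N, T = log_A.shape[0], len(obs)
    delta = np.full((T, N), -np.inf)
    psi = np.zeros((T, N), dtype=int)
    delta[0] = -np.log(N) + log_B[:, obs[0]]          # uniform initial state (assumption)
    for t in range(1, T):
        for j in range(N):
            scores = delta[t - 1] + log_A[:, j]
            psi[t, j] = int(np.argmax(scores))
            delta[t, j] = scores[psi[t, j]] + log_B[j, obs[t]]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):                     # backtrack along stored predecessors
        path.append(int(psi[t, path[-1]]))
    return float(np.max(delta[-1])), path[::-1]

def reestimate(paths, sequences, N, M, eps=1e-6):
    """M-step: re-estimate transition and emission matrices from aligned paths."""
    A = np.full((N, N), eps)                          # small floor avoids log(0)
    B = np.full((N, M), eps)
    for path, obs in zip(paths, sequences):
        for t in range(len(obs)):
            B[path[t], obs[t]] += 1
            if t > 0:
                A[path[t - 1], path[t]] += 1
    return (np.log(A / A.sum(1, keepdims=True)),
            np.log(B / B.sum(1, keepdims=True)))

def viterbi_training(sequences, N, M, tol=1e-4, max_iter=50):
    # Initialisation: spread each sequence evenly over the N states and
    # build an initial model from those alignments.
    init_paths = [np.minimum(np.arange(len(obs)) * N // len(obs), N - 1)
                  for obs in sequences]
    log_A, log_B = reestimate(init_paths, sequences, N, M)
    prev_score = -np.inf
    for _ in range(max_iter):
        # E-step: Viterbi-align every training sequence and accumulate the score.
        results = [viterbi(log_A, log_B, obs) for obs in sequences]
        score = sum(ll for ll, _ in results)
        # M-step: update the model parameters from the new alignments.
        log_A, log_B = reestimate([p for _, p in results], sequences, N, M)
        # Termination: stop once the score improvement is within tolerance.
        if abs(score - prev_score) < tol:
            break
        prev_score = score
    return log_A, log_B
```

For example, `viterbi_training([[0, 1, 2, 2], [0, 2, 1, 2]], N=2, M=3)` returns log-domain estimates of the transition and emission matrices.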
Goal of dimensionality reduction
To project the data onto a lower dimensional space, while retaining some of the essential characteristics of the data.
PCA approach to dimensionality reduction
PCA finds lower dimensional subspaces that describe the essential properties of the data, by finding the directions of maximum variation in the data.
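A minimal sketch of this idea, assuming NumPy and an eigendecomposition of the sample covariance (illustrative, not the prescribed implementation):

```python
import numpy as np

def pca_project(X, k):
    """X: (n_samples, n_features) array. Returns the k-dimensional projection."""
    Xc = X - X.mean(axis=0)                           # centre the data
    cov = np.cov(Xc, rowvar=False)                    # sample covariance of the features
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]   # directions of maximum variance
    return Xc @ top
```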
LDA approach to dimensionality reduction
LDA reduces the dimension of the data values in such a way that maximum class separation is obtained in the lower dimensional space.
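A minimal Fisher-LDA sketch under the same assumptions (NumPy, illustrative names); for C classes, at most C − 1 useful directions exist:

```python
import numpy as np

def lda_project(X, y, k):
    """X: (n, d) data, y: (n,) integer class labels. Returns an (n, k) projection."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))       # within- and between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Directions maximising between-class relative to within-class scatter:
    # eigenvectors of pinv(Sw) @ Sb (pinv keeps the sketch robust if Sw is singular).
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:k]
    return X @ eigvecs[:, order].real
```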
Dynamic programming algorithm
An algorithm that uses a table to store intermediate values as it builds up the probability of the observation sequence.
What does the forward algorithm do?
Computes the observation probability by summing over the probabilities of all possible hidden state paths that could generate the observation sequence, but it does so efficiently by implicitly folding each of these paths into a single forward trellis.
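A minimal sketch of the forward recursion (NumPy, illustrative names; no scaling, so only suitable for short sequences):

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """pi: (N,) initial probs, A: (N, N) transitions, B: (N, M) emission probs."""
    alpha = pi * B[:, obs[0]]                 # initialise the trellis column
    for o_t in obs[1:]:
        alpha = (alpha @ A) * B[:, o_t]       # fold all paths into a single column
    return alpha.sum()                        # P(O | model)
```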
HMMs: Decoding task
Given an HMM and a sequence of observations as input, find the most probable sequence of hidden states.
HMMs: Learning task
Given an observation sequence O and the set of possible states in the HMM, learn the HMM parameters.
Viterbi algorithm use
To find the optimal sequence of hidden states.
Given an observation sequence and an HMM, the algorithm returns the state path through the HMM that assigns maximum likelihood to the observation sequence.
Forward-backward algorithm use
To train the parameters of an HMM, namely the transition probability matrix and the observation likelihood matrix.
Sequence Model (or Classifier)
A model whose job is to assign a label or class to each unit in a sequence, thus mapping a sequence of observations to a sequence of labels.
HMM
A probabilistic sequence model: given a sequence of units, it computes a probability distribution over possible sequences of labels and chooses the best label sequence.
Markov chain
A special case of a weighted automaton in which weights are probabilities and in which the input sequence uniquely determines which states the automaton will go through.
3 fundamental problems that characterize hidden Markov models
- Likelihood: What is the likelihood of an observation sequence, given an HMM?
- Decoding: Given an observation sequence and an HMM, what is the best hidden state sequence?
- Learning: Given an observation sequence and the set of states in the HMM, learn the HMM parameters.
Algorithms solving the 3 problems of HMMs
- Likelihood computation: The Forward Algorithm
- Decoding: The Viterbi Algorithm
- HMM Training: The Forward-Backward algorithm
Backward probability
The probability of seeing the observations from time t+1 to end, given that we are in state i at time t.
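A minimal sketch in the same style as the forward example, where beta[t, i] = P(o_{t+1}, …, o_T | state i at time t):

```python
import numpy as np

def backward_probs(A, B, obs):
    """Returns beta with beta[t, i] = P(o_{t+1}, ..., o_T | state i at time t)."""
    N, T = A.shape[0], len(obs)
    beta = np.ones((T, N))                    # beta at the final time is 1 by definition
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta
```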
Contrast discriminative and generative models (TEXTBOOK)
In the case of generative models, a model is trained for each class, totally ignoring the properties of the other classes.
Discriminative models use all the training data simultaneously to build the model, which can be used to good effect to maximise the differences between classes.
Generative Approach (TEXTBOOK)
A model is developed for every class from the observations known to belong to that class.
Once this model is known, it is in principle possible to generate observations for each class.
Discriminative Approach (TEXTBOOK)
Directly estimates the posterior from the data. This has the advantage that the data is used to best effect in order to discriminate between the classes.
The training data is thus used more effectively in distinguishing between the classes.
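An illustrative contrast, assuming scikit-learn is available (GaussianNB as the generative model, LogisticRegression as the discriminative one); the data here is synthetic:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB            # generative: one Gaussian model per class
from sklearn.linear_model import LogisticRegression   # discriminative: models P(class | x) directly

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),         # class 0 samples
               rng.normal(2.0, 1.0, (50, 2))])        # class 1 samples
y = np.array([0] * 50 + [1] * 50)

print("generative:    ", GaussianNB().fit(X, y).score(X, y))
print("discriminative:", LogisticRegression().fit(X, y).score(X, y))
```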