Machine Learning Flashcards
What is the definition of a hidden Markov model?
An HMM consists of:
- a set of states S
- a state transition probability matrix A
- a state output PDF for each state, defined over the set of possible observations
The current observation depends only on the current state; there is no dependence on previous states or observations.
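As a sketch of these assumptions, a discrete HMM can be simulated generatively: each observation is drawn from the current state's output PMF, and the next state depends only on the current one. The 2-state, 3-symbol model below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state HMM over 3 discrete observation symbols.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])       # a_ij = P(next state j | current state i)
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])  # b_i(k) = P(symbol k | state i)
pi = np.array([0.6, 0.4])        # initial state distribution

def sample(T):
    """Generate T observations: y_t depends only on the current state x_t,
    and x_t depends only on x_{t-1} (the Markov assumptions above)."""
    x = rng.choice(2, p=pi)
    obs = []
    for _ in range(T):
        obs.append(int(rng.choice(3, p=B[x])))
        x = rng.choice(2, p=A[x])
    return obs

seq = sample(5)
```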
What are the HMM assumptions?
Temporal Independence
Piecewise Stationarity
Random Variability
What is a word level HMM?
Each word is modelled by a dedicated HMM
+ good performance
+ good for small well defined vocabularies
- many examples of each word required for training
- fails to exploit regularities in spoken language
What is a sub-word level HMM?
HMMs are built for a complete set of sub-word building blocks
Word-level HMMs are constructed by concatenating sub-word level HMMs
+ can exploit regularities in speech data
+ efficient use of training data
+ flexible (models can be built for words not included in the training data)
What is a discrete HMM?
The state output PDF is defined by a list of probabilities
What are the advantages and disadvantages of a discrete HMM?
+ computational advantages
- vector quantisation may introduce non-recoverable errors
- outperformed by continuous HMMs
What is a continuous HMM?
The state output PDF must be defined for any value in the continuous observation set. Parametric continuous state output PDFs are used.
What is Viterbi Decoding?
An algorithm used to find the sequence of HMM states most likely to have produced a given observation sequence
What is the Viterbi decoding formula?
P(Y,X) = b_{x_1}(y_1) ∏_{t=2}^{T} a_{x_{t-1} x_t} b_{x_t}(y_t)
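A minimal sketch of Viterbi decoding for a discrete HMM, using dynamic programming with backpointers; the 2-state model and observation sequence are hypothetical.

```python
import numpy as np

# Hypothetical 2-state HMM over 3 discrete observation symbols.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])       # a_ij = P(next state j | current state i)
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])  # b_i(k) = P(symbol k | state i)
pi = np.array([0.6, 0.4])        # initial state distribution

def viterbi(obs, A, B, pi):
    """Return the most likely state sequence for a discrete observation sequence."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))           # best path probability ending in each state
    psi = np.zeros((T, N), dtype=int)  # backpointer to the best predecessor state
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        trans = delta[t - 1][:, None] * A   # (N, N): predecessor score * transition
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) * B[:, obs[t]]
    # backtrace from the best final state
    states = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        states.append(int(psi[t, states[-1]]))
    return states[::-1]

path = viterbi([0, 1, 2], A, B, pi)  # → [0, 0, 1] for this model
```

In practice the recursion is done with log probabilities to avoid underflow on long sequences.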
What are the steps in error back propagation for a neural network?
- Choose initial weights
- Propagate each training sample through the network to obtain an output and calculate the error
- Calculate ∂E/∂w for each weight by propagating the error back up the network
- When all training samples have been seen, change each weight by an amount proportional to −∂E/∂w (gradient descent)
- Repeat until the error falls below a threshold
What is the formula for the change in weights in a neural network?
∆w_{jk} = −η δ_k o_j, where o_j is the output of the neuron feeding the weight and η is the learning rate
What is the formula for delta in a neural network?
δ_j = (∑_k w_{jk} δ_k) o_j(1 − o_j) for a non-output neuron
δ_k = (o_k − t_k) o_k(1 − o_k) for an output neuron
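The backprop steps and delta formulas above can be sketched on a tiny network; the 2-2-1 architecture, sigmoid activations, learning rate, and training pair are all hypothetical choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 2-2-1 network, sigmoid activations, squared error.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(2, 2))   # input -> hidden weights
W2 = rng.normal(size=(2, 1))   # hidden -> output weights
eta = 0.5                      # learning rate

x = np.array([1.0, 0.0])       # single training input
t = np.array([1.0])            # target output

err0 = float((sigmoid(sigmoid(x @ W1) @ W2) - t) ** 2)  # initial error

for _ in range(200):
    # forward pass
    h = sigmoid(x @ W1)
    o = sigmoid(h @ W2)
    # output delta: (o - t) o (1 - o)
    delta_o = (o - t) * o * (1 - o)
    # hidden delta: (sum_k w_jk delta_k) h (1 - h)
    delta_h = (W2 @ delta_o) * h * (1 - h)
    # weight change: dw_jk = -eta * delta_k * o_j (outer product with layer input)
    W2 -= eta * np.outer(h, delta_o)
    W1 -= eta * np.outer(x, delta_h)

err = float((sigmoid(sigmoid(x @ W1) @ W2) - t) ** 2)  # error after training
```

After 200 gradient steps the error on the training pair should be lower than `err0`.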
What is the Baum-Welch algorithm?
A "soft" version of Viterbi re-estimation: instead of assigning each observation to a single best state, every observation contributes to every state's statistics, weighted by the state occupation probabilities γ_t(i)
What is the formula for the Baum-Welch algorithm?
μ(i) = (∑_t γ_t(i) y_t) / (∑_t γ_t(i))
where γ_t(i) = P(x_t = i | Y) is the probability of occupying state i at time t given the observation sequence Y
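A minimal sketch of this mean re-estimation step, assuming the state posteriors γ_t(i) have already been computed by the forward-backward procedure (the observation and gamma values below are hypothetical).

```python
import numpy as np

# Hypothetical 3-frame, 1-D observation sequence and 2-state posteriors.
Y = np.array([[1.0], [2.0], [4.0]])    # observations y_t
gamma = np.array([[0.9, 0.1],
                  [0.5, 0.5],
                  [0.2, 0.8]])         # gamma[t, i] = P(x_t = i | Y)

# mu(i) = sum_t gamma_t(i) y_t / sum_t gamma_t(i)
# i.e. a gamma-weighted average of the observations for each state
mu = (gamma.T @ Y) / gamma.sum(axis=0)[:, None]
```

Each row of `mu` is the re-estimated mean for one state; observations the model believes were generated by a state count towards that state's mean in proportion to γ_t(i).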