HMM Flashcards
Expectation Maximisation
1. Make an initial guess at the transition and emission probabilities
2. Use these to estimate the state posterior probabilities (Expectation)
3. Calculate new transition and emission probabilities from the state posteriors just estimated (Maximisation)
4. Repeat steps 2 and 3 until satisfied with the estimate (e.g. the likelihood stops improving)
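The steps above can be sketched for a discrete HMM trained on a single observation sequence. This is only a minimal illustration in pure Python, assuming integer-coded observations and a scaled forward-backward pass for the Expectation step. The names `baum_welch` and `normalise` are hypothetical and do not come from any particular library.

```python
import math
import random

def normalise(row):
    """Scale a list of non-negative numbers to sum to 1 (uniform if all are zero)."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]

def baum_welch(obs, n_states, n_symbols, n_iter=20, seed=0):
    """Fit transition (A), emission (B) and start (pi) probabilities with EM."""
    rng = random.Random(seed)
    S = range(n_states)
    T = len(obs)
    # Step 1: initial guess (random, kept away from zero).
    A = [normalise([rng.random() + 0.1 for _ in S]) for _ in S]
    B = [normalise([rng.random() + 0.1 for _ in range(n_symbols)]) for _ in S]
    pi = [1.0 / n_states] * n_states
    log_liks = []
    for _ in range(n_iter):
        # Step 2 (Expectation): scaled forward pass.
        alpha, scale = [], []
        for t in range(T):
            if t == 0:
                row = [pi[i] * B[i][obs[0]] for i in S]
            else:
                row = [sum(alpha[t - 1][i] * A[i][j] for i in S) * B[j][obs[t]]
                       for j in S]
            scale.append(sum(row))
            alpha.append([x / scale[t] for x in row])
        # Scaled backward pass, reusing the forward scaling factors.
        beta = [[1.0] * n_states for _ in range(T)]
        for t in range(T - 2, -1, -1):
            beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in S)
                       / scale[t + 1] for i in S]
        # State posteriors and expected transition counts.
        gamma = [[alpha[t][i] * beta[t][i] for i in S] for t in range(T)]
        xi_sum = [[sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                       / scale[t + 1] for t in range(T - 1))
                   for j in S] for i in S]
        # Log-likelihood of the data under the parameters used in this pass.
        log_liks.append(sum(math.log(c) for c in scale))
        # Step 3 (Maximisation): re-estimate from the expected counts.
        pi = gamma[0][:]
        A = [normalise(xi_sum[i]) for i in S]
        B = [normalise([sum(gamma[t][i] for t in range(T) if obs[t] == k)
                        for k in range(n_symbols)]) for i in S]
    return pi, A, B, log_liks
```

Calling `baum_welch([0, 0, 1, 0, 1, 1, 1, 0], n_states=2, n_symbols=2)` returns the fitted parameters plus the log-likelihood recorded at each iteration, which can be used as the stopping check.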
Pseudocounts
Small counts added to every transition and emission, so the HMM still allows events that are rare or absent in the training data instead of giving them zero probability
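A small sketch of the idea, assuming transition counts are stored as a list of rows (one row per source state). The function name `transition_probs` and the default pseudocount of 1 are illustrative choices, not part of the notes.

```python
def transition_probs(counts, pseudocount=1.0):
    """Turn raw transition counts into probabilities with add-k smoothing."""
    probs = []
    for row in counts:
        # Every possible transition gets `pseudocount` extra observations.
        total = sum(row) + pseudocount * len(row)
        probs.append([(c + pseudocount) / total for c in row])
    return probs

# State 0 was never seen moving to state 2; state 1 was never seen at all.
counts = [[8, 2, 0],
          [0, 0, 0]]
probs = transition_probs(counts)
```

With the pseudocount, the unseen transition from state 0 to state 2 keeps a small non-zero probability, and the row with no data falls back to a uniform distribution.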
HMM structure choice (too few states)
Fail to model significant differences
Miss details
Underfit
HMM structure choice (too many states)
Model noise
Poor generalisation
Overfit
Primary protein structure
Amino acid sequence
Secondary protein structure
alpha helices
beta sheets, coils
Tertiary protein structure
3D structure of a polypeptide
Quaternary protein structure
Protein complex (arrangement of multiple polypeptide chains)
Difficulties of using amino acid sequence to predict the secondary structure
Non-local: residues far apart in the sequence can interact
Input sequences have different lengths
Representation: need to choose how to encode the amino acids as input
Benefits of using amino acid sequence to predict the secondary structure
Good training data is now available
Can use windows over the sequence to get fixed-length input
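The window idea can be sketched as follows: each residue gets one fixed-width window centred on it, with the ends padded. The function name `sliding_windows`, the default width of 5 and the `-` padding character are assumptions made for illustration.

```python
def sliding_windows(sequence, width=5, pad="-"):
    """Return one fixed-width window per residue, centred on that residue."""
    if width % 2 == 0:
        raise ValueError("width must be odd so each window has a centre residue")
    half = width // 2
    # Pad both ends so the first and last residues also get full windows.
    padded = pad * half + sequence + pad * half
    return [padded[i:i + width] for i in range(len(sequence))]

windows = sliding_windows("MKVLA", width=3)
# windows == ["-MK", "MKV", "KVL", "VLA", "LA-"]
```

Every window has the same length regardless of the protein's length, so a predictor with a fixed input size can label the centre residue of each one.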