Hidden Markov Models Flashcards
What are the two sequences in a Hidden Markov Model (HMM)?
State sequence (hidden), also called path
Symbol sequence (observed)
What are the two matrices required to represent a Hidden Markov Model?
The emission and transition matrices
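As a concrete illustration, here is a minimal sketch of these two matrices in Python for a toy two-state DNA model; the state names, symbols, and probabilities are illustrative assumptions, not taken from the flashcards.

```python
import numpy as np

# Toy two-state DNA model ("AT-rich" vs "GC-rich"); names and numbers are illustrative only.
states = ["AT-rich", "GC-rich"]
symbols = ["A", "C", "G", "T"]

# Transition matrix: T[k, l] = P(next state = l | current state = k)
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Emission matrix: E[k, b] = P(emit symbol b | state = k)
E = np.array([[0.35, 0.15, 0.15, 0.35],   # AT-rich favours A and T
              [0.15, 0.35, 0.35, 0.15]])  # GC-rich favours C and G

# Every row of both matrices is a probability distribution.
assert np.allclose(T.sum(axis=1), 1.0) and np.allclose(E.sum(axis=1), 1.0)
```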
What are the advantages of pair HMMs over the Needleman-Wunsch algorithm?
Pair HMM transition and emission probabilities can be trained from data, whereas Needleman-Wunsch substitution scores are fixed.
What are the five states in a pair HMM?
Start
M (Match or Mismatch)
X (nucleotide in sequence X aligned against gap in Y)
Y (nucleotide in sequence Y aligned against gap in sequence X)
End
How would you calculate the probability of a sequence s, given a Hidden Markov Model?
Using the forward algorithm. The probability of observing s is the sum of the last column in the forward table, which corresponds to the sum of the probabilities of all possible paths that can generate s.
Give the formula for an element in the Forward table F_l(i), where l is the state and i is the position in the sequence s
F_l(i) = E_l(s_i) sum_k F_k(i-1) T_{kl}
Where sum_k signifies summing over all states in the previous column.
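A minimal sketch of the forward fill in Python; the toy two-state model and the initial state distribution `start` are assumptions for illustration (the flashcards only mention the transition and emission matrices).

```python
import numpy as np

def forward(seq, T, E, start):
    """Fill the forward table F[l, i] = P(s_1..s_i, state at i = l).
    `start` is an assumed initial state distribution."""
    n_states, n = T.shape[0], len(seq)
    F = np.zeros((n_states, n))
    F[:, 0] = start * E[:, seq[0]]                  # first column
    for i in range(1, n):
        for l in range(n_states):
            # F_l(i) = E_l(s_i) * sum_k F_k(i-1) * T_{kl}
            F[l, i] = E[l, seq[i]] * np.sum(F[:, i - 1] * T[:, l])
    return F

# Toy two-state model; the sequence is given as symbol indices (A=0, C=1, G=2, T=3).
T = np.array([[0.9, 0.1], [0.2, 0.8]])
E = np.array([[0.35, 0.15, 0.15, 0.35], [0.15, 0.35, 0.35, 0.15]])
F = forward([0, 2, 2, 1, 3], T, E, start=np.array([0.5, 0.5]))
print(F[:, -1].sum())   # P(s): the sum of the last column
```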
What algorithms can be used to compute the probability of state k for position i given sequence s?
The Forward and Backward algorithms can be used for this type of posterior decoding.
Give the formula for an element in the Backward table B_l(i), where l is the state and i is the position in the sequence s
B_l(i) = sum_k E_k(s_{i+1}) T_{lk}B_k(i+1)
Where sum_k signifies summing over all states in the next column (position i+1).
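A matching sketch of the backward fill, under the same toy-model assumptions as above; it also assumes the path may finish in any state (no explicit End state), so the last column is initialised to 1.

```python
import numpy as np

def backward(seq, T, E):
    """Fill the backward table B[l, i] = P(s_{i+1}..s_n | state at i = l).
    Assumes no explicit End state, so the last column is all ones."""
    n_states, n = T.shape[0], len(seq)
    B = np.ones((n_states, n))                      # B_l(n) = 1 for every state l
    for i in range(n - 2, -1, -1):
        for l in range(n_states):
            # B_l(i) = sum_k E_k(s_{i+1}) * T_{lk} * B_k(i+1)
            B[l, i] = np.sum(E[:, seq[i + 1]] * T[l, :] * B[:, i + 1])
    return B
```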
Give the formula for the probability of state k at position i given sequence s
P(\pi_i = k | s) = F_k(i) B_k(i) / P(s)
where F is the forward table
B is the backward table
and P(s) is the probability of s given the HMM
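A self-contained sketch of posterior decoding that fills both tables with vectorised versions of the recurrences above and combines them; the toy model and the initial state distribution `start` are illustrative assumptions.

```python
import numpy as np

def posterior(seq, T, E, start):
    """Posterior state probabilities P(pi_i = k | s) for every state k and position i.
    `start` is an assumed initial state distribution."""
    n_states, n = T.shape[0], len(seq)
    F = np.zeros((n_states, n))
    B = np.ones((n_states, n))
    F[:, 0] = start * E[:, seq[0]]
    for i in range(1, n):                           # forward fill
        F[:, i] = E[:, seq[i]] * (F[:, i - 1] @ T)
    for i in range(n - 2, -1, -1):                  # backward fill
        B[:, i] = T @ (E[:, seq[i + 1]] * B[:, i + 1])
    p_s = F[:, -1].sum()                            # P(s) from the forward table
    return F * B / p_s                              # column i holds P(pi_i = k | s)

T = np.array([[0.9, 0.1], [0.2, 0.8]])
E = np.array([[0.35, 0.15, 0.15, 0.35], [0.15, 0.35, 0.35, 0.15]])
post = posterior([0, 2, 2, 1, 3], T, E, start=np.array([0.5, 0.5]))
print(post.sum(axis=0))   # each position's probabilities sum to ~1
```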
What are the differences between the Viterbi algorithm and posterior decoding using the forward and backward algorithms?
Posterior decoding gives the complete probability distribution of the state at each position. Viterbi only gives the most probable path.
Why is it possible, in principle, to predict protein function from the corresponding DNA sequence?
In principle,
- DNA sequence determines the amino acid sequence
- The amino acid sequence determines the protein folding pattern and 3D structure
- The 3D structure determines protein function
Explain the idea behind Profile HMM
Proteins with similar functions (families) share similar patterns in their amino acid sequences.
We can train an HMM to match a family.
It is then possible to test new protein sequences with this HMM: a high probability score from the forward algorithm signifies that the protein might belong to the family.
List the three types of states used in Profile HMMs
Match states
Insertion states
Deletion states
What is the purpose of the Viterbi algorithm?
To find the most probable state path through a given HMM for an observed sequence.
Give the formula for an element in the Viterbi table V_l(i), where l is the state and i is the position in the sequence s
V_l(i) = E_l(s_i) max_k V_k(i-1) T_{kl}
Where max_k signifies taking the maximum over all states in the previous column.
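A minimal Viterbi sketch with traceback, under the same illustrative assumptions as the earlier sketches (toy two-state model, assumed initial state distribution `start`, no explicit End state).

```python
import numpy as np

def viterbi(seq, T, E, start):
    """Return the most probable state path for `seq` through the model.
    `start` is an assumed initial state distribution; no End state is modelled."""
    n_states, n = T.shape[0], len(seq)
    V = np.zeros((n_states, n))
    ptr = np.zeros((n_states, n), dtype=int)        # back-pointers for the traceback
    V[:, 0] = start * E[:, seq[0]]
    for i in range(1, n):
        for l in range(n_states):
            # V_l(i) = E_l(s_i) * max_k V_k(i-1) * T_{kl}
            scores = V[:, i - 1] * T[:, l]
            ptr[l, i] = int(np.argmax(scores))
            V[l, i] = E[l, seq[i]] * scores[ptr[l, i]]
    path = [int(np.argmax(V[:, -1]))]               # most probable final state
    for i in range(n - 1, 0, -1):                   # follow the back-pointers
        path.append(int(ptr[path[-1], i]))
    return list(reversed(path))

T = np.array([[0.9, 0.1], [0.2, 0.8]])
E = np.array([[0.35, 0.15, 0.15, 0.35], [0.15, 0.35, 0.35, 0.15]])
print(viterbi([0, 2, 2, 1, 3], T, E, start=np.array([0.5, 0.5])))
```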