Hidden Markov Models Flashcards

1
Q

What are the two sequences in a Hidden Markov Model (HMM)?

A

State sequence (hidden), also called the path

Symbol sequence (observed)

2
Q

What are the two matrices required to represent a Hidden Markov Model?

A

The emission and transition matrices
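
As a concrete illustration, the two matrices can be written down directly in code. Below is a minimal Python sketch; the two states, the coin-flip symbols, and every probability are invented for the example, not taken from the cards:

  # Toy two-state HMM; all names and numbers here are illustrative.
  states = ["Fair", "Biased"]
  symbols = ["H", "T"]

  # Transition matrix: T[k][l] = P(next state = l | current state = k)
  T = {
      "Fair":   {"Fair": 0.9, "Biased": 0.1},
      "Biased": {"Fair": 0.1, "Biased": 0.9},
  }

  # Emission matrix: E[k][x] = P(symbol x is emitted | state = k)
  E = {
      "Fair":   {"H": 0.5, "T": 0.5},
      "Biased": {"H": 0.8, "T": 0.2},
  }

  # Start probabilities (often folded into T via a silent start state)
  start = {"Fair": 0.5, "Biased": 0.5}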

3
Q

What are the advantages of pair HMMs over the Needleman-Wunsch algorithm?

A

Pair HMM transition and emission probabilities can be trained, whereas the Needleman-Wunsch substitution scores are fixed.

4
Q

What are the five states in a pair HMM?

A

Start
M (Match or Mismatch)
X (nucleotide in sequence X aligned against a gap in sequence Y)
Y (nucleotide in sequence Y aligned against a gap in sequence X)
End

5
Q

How would you calculate the probability of a sequence s, given a Hidden Markov Model?

A

Using the forward algorithm. The probability of observing s is the sum of the last column of the forward table. This corresponds to the sum of the probabilities of all possible paths that can generate s.

6
Q

Give the formula for an element in the Forward table F_l(i), where l is the state and i is the position in the sequence s

A

F_l(i) = E_l(s_i) sum_k F_k(i-1) T_{kl}

Where sum_k signifies summing over all states k in the previous column (position i-1). Note that the emission term uses the current state l.
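
A minimal Python sketch of this recurrence, assuming the dictionary-based states, start, T, and E tables from the earlier toy sketch (all hypothetical names). As card 5 states, P(s) falls out as the sum of the last column:

  def forward(s, states, start, T, E):
      # F[i][l] = probability of emitting s[0..i] and being in state l at i
      F = [{l: start[l] * E[l][s[0]] for l in states}]
      for i in range(1, len(s)):
          F.append({l: E[l][s[i]] * sum(F[i - 1][k] * T[k][l] for k in states)
                    for l in states})
      return F

  def prob_of_sequence(s, states, start, T, E):
      F = forward(s, states, start, T, E)
      return sum(F[-1].values())  # P(s): sum over the last column

For instance, prob_of_sequence("HHT", states, start, T, E) would return P(HHT) under the toy model above.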

7
Q

What algorithms can be used to compute the probability of state k at position i, given sequence s?

A

The Forward and Backward algorithms can be used for this type of posterior decoding.

8
Q

Give the formula for an element in the Backward table B_l(i), where l is the state and i is the position in the sequence s

A

B_l(i) = sum_k E_k(s_{i+1}) T_{lk}B_k(i+1)

Where sum_k signifies summing over all states k in the following column (position i+1).
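
A matching sketch of the backward recurrence under the same assumed dictionary representation. With no explicit end state, the last column is initialised to 1:

  def backward(s, states, T, E):
      n = len(s)
      # B[i][l] = probability of emitting s[i+1..n-1] given state l at i
      B = [{l: 0.0 for l in states} for _ in range(n)]
      for l in states:
          B[n - 1][l] = 1.0  # nothing left to emit after the last position
      for i in range(n - 2, -1, -1):
          for l in states:
              B[i][l] = sum(T[l][k] * E[k][s[i + 1]] * B[i + 1][k]
                            for k in states)
      return B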

9
Q

Give the formula for the probability of state k at position i, given sequence s

A

P(\pi_i = k | s) = F_k(i) B_k(i) / P(s)

where F is the forward table
B is the backward table
and P(s) is the probability of s given the HMM
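
Combining the forward and backward sketches from the earlier cards, posterior decoding is then one multiplication and division per table entry (again a sketch under the same assumed representation):

  def posterior(F, B, states):
      # P(pi_i = k | s) = F_k(i) * B_k(i) / P(s)
      Ps = sum(F[-1].values())  # P(s) is the sum of the last forward column
      return [{k: F[i][k] * B[i][k] / Ps for k in states}
              for i in range(len(F))]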

10
Q

What are the differences between the Viterbi algorithm and posterior decoding using the forward and backward algorithms?

A

Posterior decoding gives the complete probability distribution over states at each position, whereas Viterbi gives only the single most probable path.

11
Q

Why is it possible, in principle, to predict protein function from the corresponding DNA sequence?

A

In principle,

  • DNA sequence determines the amino acid sequence
  • The amino acid sequence determines the protein folding pattern and 3D structure
  • The 3D structure determines protein function
12
Q

Explain the idea behind profile HMMs

A

Proteins with similar functions (protein families) share similar patterns in their amino acid sequences.
We can train an HMM to match a family.
New protein sequences can then be scored against this HMM: a high probability from the forward algorithm suggests that the protein may belong to the family.

13
Q

List the three types of states used in profile HMMs

A

Match states
Insertion states
Deletion states

14
Q

What is the purpose of the Viterbi algorithm?

A

To find the most probable path through a given HMM that could have generated a given sequence.

15
Q

Give the formula for an element in the Viterbi table V_l(i), where l is the state and i is the position in the sequence s

A

V_l(i) = E_l(s_i) max_k V_k(i-1) T_{kl}

Where max_k signifies taking the maximum over all states k in the previous column (position i-1).
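
A minimal sketch of the Viterbi recurrence with traceback, using the same assumed dictionary representation as the earlier sketches:

  def viterbi(s, states, start, T, E):
      # V[i][l] = probability of the best path emitting s[0..i], ending in l
      V = [{l: start[l] * E[l][s[0]] for l in states}]
      ptr = [{}]  # ptr[i][l] = best predecessor state for l at position i
      for i in range(1, len(s)):
          V.append({})
          ptr.append({})
          for l in states:
              best = max(states, key=lambda k: V[i - 1][k] * T[k][l])
              ptr[i][l] = best
              V[i][l] = E[l][s[i]] * V[i - 1][best] * T[best][l]
      # Trace the most probable path back from the best final state
      state = max(states, key=lambda k: V[-1][k])
      path = [state]
      for i in range(len(s) - 1, 0, -1):
          state = ptr[i][state]
          path.append(state)
      return path[::-1]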
