Multiple Alignments Flashcards

1
Q

Hidden Markov Models

A

An observed sequence is produced by a number of distinct hidden states. Each state has a different probability distribution over the symbols.
Transition matrix T: probability of a transition between states. Row is the current state, column is the next state.
T(1,2): probability of moving from state 1 to state 2
Emission matrix E: probability of each symbol being observed when the sequence is generated by each state. Row is the current state, column is the outcome.
E(1,2): probability of outcome 2 when in state 1

The probability of following a given path of states through the positions is the product, for each position, of the entry of T for the current state to the next state.

The total probability of the sequence along a path is (probability of each outcome)^k × (probability of staying in or changing state)^l,
k and l being how many times each emission and transition occurred.
Also multiply by the probability of the start state: the initial probability. (See the sketch below.)
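A minimal sketch of this computation in Python, with a hypothetical two-state model (all numbers are made up for illustration, not taken from the card):

```python
import numpy as np

pi = np.array([0.6, 0.4])              # initial state probabilities (assumed)
T = np.array([[0.7, 0.3],              # T[i, j]: P(next state j | current state i)
              [0.4, 0.6]])
E = np.array([[0.9, 0.1],              # E[i, k]: P(outcome k | state i)
              [0.2, 0.8]])

path = [0, 0, 1]                       # assumed state sequence
obs = [0, 1, 1]                        # observed symbol indices

p = pi[path[0]] * E[path[0], obs[0]]   # initial probability * first emission
for t in range(1, len(obs)):
    p *= T[path[t - 1], path[t]]       # transition into the next state
    p *= E[path[t], obs[t]]            # emission of the observed symbol
print(p)                               # joint probability of path and sequence
```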

2
Q

Evaluation

A

given the HMM and an observed sequence, evaluate the probability of the sequence under the model
FORWARD ALGORITHM
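A minimal sketch of the forward algorithm, reusing the hypothetical pi, T, E from the first card; unlike the fixed-path computation above, it sums over all possible state paths:

```python
import numpy as np

def forward(obs, pi, T, E):
    alpha = pi * E[:, obs[0]]          # alpha[i] = P(first symbol, state i)
    for o in obs[1:]:
        alpha = (alpha @ T) * E[:, o]  # propagate one step, then emit
    return alpha.sum()                 # marginalize over the final state

pi = np.array([0.6, 0.4])              # hypothetical model
T = np.array([[0.7, 0.3], [0.4, 0.6]])
E = np.array([[0.9, 0.1], [0.2, 0.8]])
print(forward([0, 1, 1], pi, T, E))    # P(sequence | model)
```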

3
Q

Decoding

A

given the HMM and an observed sequence, find the sequence of states that maximizes the probability
VITERBI ALGORITHM
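A minimal Viterbi sketch under the same hypothetical model, keeping backpointers so the best state path can be traced back:

```python
import numpy as np

def viterbi(obs, pi, T, E):
    delta = pi * E[:, obs[0]]               # best-path prob ending in each state
    back = []                               # backpointers, one array per step
    for o in obs[1:]:
        scores = delta[:, None] * T         # scores[i, j]: best path into j via i
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) * E[:, o]
    path = [int(delta.argmax())]            # start from the best final state
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))      # follow backpointers
    return path[::-1], float(delta.max())

pi = np.array([0.6, 0.4])                   # hypothetical model
T = np.array([[0.7, 0.3], [0.4, 0.6]])
E = np.array([[0.9, 0.1], [0.2, 0.8]])
print(viterbi([0, 1, 1], pi, T, E))         # best path and its probability
```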

4
Q

Learning

A

given the structure of an HMM and a sequence,
but without the transition and emission matrices,
find the parameters of the matrices that maximize the probability of the sequence
EXPECTATION MAXIMISATION ALGORITHM (EM)

5
Q

HMM Sequence Analysis

A

Segment a sequence into a set of states that show different patterns, and produce a model that generates them.
Different states can be spotted by inspecting density plots (changes in nucleotide composition, e.g. AT vs GC content):
DNA segmentation

Can also find genes: better than sequence alignment, and can detect introns and exons. However, it is more computationally intensive.

Example
Different distribution patterns can show that some proteins sit in a different layer.

6
Q

Probability of outcome u when in state s

A

P(u) = P(no transition) × P(u in s) + P(transition) × P(u in new state)
In matrix terms, for a two-state model with states s and s': P(u) = T(s,s)·E(s,u) + T(s,s')·E(s',u)
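A quick numeric check of the formula, using the hypothetical two-state T and E from the first card, with current state s = 0 and outcome u = 1:

```python
import numpy as np

T = np.array([[0.7, 0.3], [0.4, 0.6]])     # hypothetical matrices
E = np.array([[0.9, 0.1], [0.2, 0.8]])
s, u = 0, 1
p = T[s, 0] * E[0, u] + T[s, 1] * E[1, u]  # stay*emit + switch*emit
print(p)                                    # 0.31, same as (T[s] * E[:, u]).sum()
```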

7
Q

Genes with HMM

A

use sub-HMMs for the different parts of the sequence: introns/exons, non-coding DNA, etc.
The model architecture encodes biological knowledge.

8
Q

VEIL

A

Viterbi Exon-Intron Locator
An HMM with submodels for the upstream/downstream non-coding regions, start/stop codons, and exons -> splice sites and introns

The exon submodel has 3 columns of states, one for each nucleotide position in a codon, with transitions between them that lead on to further codons or to a stop codon.

9
Q

Profile Hidden Markov Models

A

pHMM
MULTIPLE ALIGNMENT
model the alignment of a set of sequences using an HMM

diamonds for insertion states
squares for match/mismatch states
circles for deletion states
arrows link everything, plus a self-loop arrow on each insertion state

A column is a match state when more than half of the sequences have a letter in it.
A sequence is in an insertion state when it has a letter in a column where more than half have a gap (-).
It is in a deletion state when it has a gap (-) where the other sequences have a letter.

Compute transition probabilities by threading each given sequence through the pHMM and counting how many times each transition was used versus the other transitions out of that state:
P(M1->M2), P(M1->I2), P(M1->D2), with LAPLACE ESTIMATION so that no probability is 0
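A minimal sketch of estimating the transitions out of one match state; the successor states below are made-up example data:

```python
from collections import Counter

# Successors observed out of match state M1 when threading five
# sequences through the pHMM (hypothetical example data)
next_states = ["M2", "M2", "M2", "I2", "M2"]
outcomes = ["M2", "I2", "D2"]              # possible successors of M1

counts = Counter(next_states)
n = len(next_states)
probs = {x: (counts[x] + 1) / (n + len(outcomes)) for x in outcomes}
print(probs)   # P(M1->M2)=5/8, P(M1->I2)=2/8, P(M1->D2)=1/8
```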

10
Q

Laplace Estimation

A

P(outcome) = (number of times the outcome happened + 1) / (number of samples + number of possible outcomes)
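The formula as a one-line function, with a worked value:

```python
def laplace(times_happened: int, n_samples: int, n_outcomes: int) -> float:
    # (number of times the outcome happened + 1) / (samples + possible outcomes)
    return (times_happened + 1) / (n_samples + n_outcomes)

# An outcome never seen in 5 samples with 3 possible outcomes
# still gets a non-zero probability:
print(laplace(0, 5, 3))   # 0.125
```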

11
Q

Expectation Maximization Algorithm

A

For learning the transition and emission matrices of an HMM:
initialize the state path h and the matrices E and T
given the sequence s and the current h, estimate E and T by counting symbols and transitions
then re-estimate the state path h with the Viterbi algorithm (as in VEIL)
Repeat until a stopping criterion is met. (See the sketch below.)
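A minimal sketch of this loop as Viterbi training (the hard-assignment variant of EM): decode the best path, re-count the matrices, repeat. The model sizes, the data, the Laplace-smoothed counts, and the fixed iteration count used as the stopping criterion are all assumptions:

```python
import numpy as np

def viterbi(obs, pi, T, E):
    # Decoding step: best state path, as on the Viterbi card above
    delta = pi * E[:, obs[0]]
    back = []
    for o in obs[1:]:
        scores = delta[:, None] * T
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) * E[:, o]
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

def viterbi_train(obs, n_states, n_symbols, iters=20):
    rng = np.random.default_rng(0)
    pi = np.full(n_states, 1 / n_states)       # kept uniform for simplicity
    T = rng.dirichlet(np.ones(n_states), size=n_states)   # random init
    E = rng.dirichlet(np.ones(n_symbols), size=n_states)
    for _ in range(iters):
        path = viterbi(obs, pi, T, E)          # E-step: hard state assignment
        Tc = np.ones((n_states, n_states))     # Laplace (add-one) counts
        Ec = np.ones((n_states, n_symbols))
        for t, o in enumerate(obs):
            Ec[path[t], o] += 1
            if t:
                Tc[path[t - 1], path[t]] += 1
        T = Tc / Tc.sum(axis=1, keepdims=True) # M-step: normalize counts
        E = Ec / Ec.sum(axis=1, keepdims=True)
    return pi, T, E

obs = [0, 1, 1, 0, 1, 1, 1, 0]                 # hypothetical symbol indices
pi, T, E = viterbi_train(obs, n_states=2, n_symbols=2)
print(T.round(2), E.round(2), sep="\n")
```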
