Multiple Sequence Alignment Flashcards
what is multiple sequence alignment (MSA)?
alignment of three or more biological sequences to highlight conserved regions of structural and functional importance
what is a target sequence?
sequence with known structural and functional information used as a reference for the alignment
what is a template sequence?
homologous sequences aligned to the target sequence to identify sequence and structural conservation
what can MSA be used for?
- detection of similarities
- detection of conserved motifs
- detection of structural homology and prediction
- identification of phylogenetic relationships
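
As an illustration of detecting conserved positions, here is a minimal sketch that reports the columns of an alignment where every sequence carries the same residue. The function name `conserved_columns` and the toy sequences are assumptions made for this example. Real tools usually score conservation with substitution matrices instead of requiring exact identity.

```python
def conserved_columns(aligned):
    """Return 0-based indices of columns where all sequences share one residue.

    `aligned` is a list of equal-length strings that use '-' for gaps.
    Exact identity is a simplifying assumption for this sketch.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i, column in enumerate(zip(*aligned))
        # A column is conserved when it holds a single distinct non-gap symbol.
        if len(set(column)) == 1 and column[0] != "-"
    ]


# Hypothetical three-sequence alignment.
toy_alignment = ["MKV-LA", "MKVALA", "MRV-LA"]
print(conserved_columns(toy_alignment))
```

Columns that contain a gap in any sequence are never reported as conserved here, which is a deliberate choice for the sketch.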
what are the 2 methods for MSA?
progressive alignment and iterative refinement
what are the progressive alignment tools?
ClustalW and T-Coffee
what are the iterative refinement tools?
MUSCLE, MAFFT and DIALIGN
how does progressive alignment work?
an initial round of pairwise alignments identifies the two most similar sequences, which are aligned first
the more distant sequences are then added progressively, one by one
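
The order in which a progressive method adds sequences can be sketched as follows. This is an illustration only: it computes the order of addition, not the alignment itself. It also uses `difflib.SequenceMatcher` as a stand-in similarity measure, which is an assumption (real tools such as ClustalW derive distances from pairwise alignment scores and build a guide tree). The names `similarity` and `progressive_order` are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    # Stand-in similarity in [0, 1]; an assumption, not what real aligners use.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def progressive_order(seqs):
    """Return sequence names in the order a progressive method would add them.

    `seqs` maps names to unaligned sequences (at least two entries).
    """
    names = list(seqs)
    # Step 1: find the closest pair; it seeds the alignment.
    first, second = max(
        combinations(names, 2),
        key=lambda pair: similarity(seqs[pair[0]], seqs[pair[1]]),
    )
    order = [first, second]
    remaining = [n for n in names if n not in order]
    # Step 2: repeatedly add the sequence most similar to any already placed one.
    while remaining:
        nxt = max(
            remaining,
            key=lambda n: max(similarity(seqs[n], seqs[m]) for m in order),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical toy sequences.
toy = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "ACGTTTTT", "d": "GGGGGGGG"}
print(progressive_order(toy))
```

Because earlier placements are never revisited in this loop, the sketch also shows why an early mistake stays in the final result.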
what are the 2 major shortcomings of progressive alignment?
sequences aligned at the beginning are never realigned
early mistakes cannot be corrected
how does iterative alignment work?
first, sequences are aligned within subgroups
then the subgroups are aligned with each other
what are the advantages of iterative alignment?
corrects errors in early alignments
order of subgroups can be chosen at random or using a guide tree
what are the output formats of MSA?
FASTA and MSF
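
For the FASTA case, an aligned file looks like ordinary FASTA except that the sequences contain gap characters and all have the same length. A minimal reader might look like the sketch below. The function name `parse_aligned_fasta` and the sample records are assumptions made for illustration. MSF has a different layout and is not handled here.

```python
def parse_aligned_fasta(text):
    """Parse aligned FASTA text into a dict of {name: gapped sequence}.

    The name is taken as the first word after '>' (a common convention,
    assumed here). Sequence lines are concatenated per record.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Hypothetical aligned FASTA content.
sample = """>seq1 example protein
MKV-LA
GH
>seq2
MKVALA
GH
"""
alignment = parse_aligned_fasta(sample)
print(alignment)
# In a valid alignment every gapped sequence has the same length.
print(len({len(s) for s in alignment.values()}) == 1)
```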