Pairwise Sequence Alignment (PSA) Flashcards
The process of lining up two sequences to achieve maximal levels of identity (and conservation, in the case of amino acid sequences) for the purpose of assessing the degree of similarity and the possibility of homology.
Pairwise Sequence Alignment
the property of two sequences that share a common evolutionary ancestry
Homology
extent to which two amino acid (or nucleotide) sequences are invariant (unchanged) = exact matching
Identity
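As a minimal sketch of identity as exact matching, the function below counts identical columns in two already-aligned sequences. The function name, the "-" gap character, and the choice of the full alignment length as the denominator are assumptions for illustration; other conventions divide by the shorter sequence length or by the number of ungapped columns.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    # Assumption: the denominator is the alignment length, gap columns included.
    return 100.0 * matches / len(aligned_a)
```

For example, percent_identity("AC-T", "ACGT") counts three identical columns out of four.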
general description of how closely two sequences are related, counting conservative substitutions as well as exact matches = optimal matching
Similarity
Basis of similarity in Proteins
Hydroxylic
Tiny
Small
Acidic
Positive (Basic)
Polar
Charged
Hydrophobic
Aromatic
Sulphur-containing
Aliphatic
homologous sequences in different species that arose from a common ancestral gene during speciation
Orthologs
homologous sequences that arose by a mechanism such as gene duplication
Paralogs
Scoring Matrices
Perfect match = +1
Mismatch = 0
Gap opening = -2
Gap extension = -1
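The scores above can be applied to an existing alignment as in the sketch below. It assumes one particular reading of the scheme: the first position of a gap run costs -2, and the -1 gap score is the cost of each additional position in an already-open gap. The constant and function names are hypothetical, and "-" is assumed to be the gap character.

```python
MATCH = 1        # perfect match
MISMATCH = 0     # mismatch
GAP_OPEN = -2    # assumed: charged for the first position of a gap run
GAP_EXTEND = -1  # assumed: charged for each further position in the same run

def score_alignment(aligned_a: str, aligned_b: str, gap: str = "-") -> int:
    """Sum the column scores of two already-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    prev_gap_in = None  # which sequence had a gap in the previous column: "a", "b", or None
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            raise ValueError("a column cannot contain two gaps")
        if x == gap or y == gap:
            gap_in = "a" if x == gap else "b"
            # Continuing a gap in the same sequence is an extension; anything else opens a new gap.
            score += GAP_EXTEND if gap_in == prev_gap_in else GAP_OPEN
            prev_gap_in = gap_in
        else:
            score += MATCH if x == y else MISMATCH
            prev_gap_in = None
    return score
```

Under these assumptions, aligning GATTACA with GA--ACA gives five matches (+5), one gap opening (-2), and one gap extension (-1), for a total of 2.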
Why penalize gaps?
Penalizing gaps favors an alignment that
✓ maximizes the number of matches and
✓ minimizes the number of gaps
Protein Sequence Alignment matrices (4)
Identity matrix
Mutation data matrix
Physical properties matrix
Genetic code matrix
Protein sequence alignment matrix
Exact matches receive one score and non-exact matches a different score (1 on the diagonal, 0 everywhere else)
Identity matrix
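A minimal sketch of an identity matrix over the 20 standard amino acids, stored as a dictionary keyed by residue pairs. The names AMINO_ACIDS and identity_matrix are hypothetical, and the dictionary representation is just one convenient choice.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids

def identity_matrix(alphabet: str = AMINO_ACIDS) -> dict:
    """Return {(a, b): score} with 1 on the diagonal and 0 everywhere else."""
    return {(a, b): 1 if a == b else 0 for a in alphabet for b in alphabet}

matrix = identity_matrix()
print(matrix[("A", "A")], matrix[("A", "C")])
```

Looking up a pair such as ("L", "I") returns 0 even though leucine and isoleucine are chemically similar, which is the main limitation of this matrix compared with the other three.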
Protein sequence alignment matrix
a scoring matrix compiled from observed protein mutation rates: some substitutions are observed more often than others (e.g., PAM, BLOSUM)
Mutation data matrix
Protein sequence alignment matrix
amino acids with similar biophysical properties receive a high score
Physical properties matrix
Protein sequence alignment matrix
amino acids are scored based on similarities in their coding triplets (codons)
Genetic code matrix
Basis of Scoring Matrices
Accepted Point Mutation (PAM)
BLOcks SUbstitution Matrix (BLOSUM)