Pairwise Sequence Alignment (PSA) Flashcards
share a common evolutionary ancestry
Homology
extent to which two amino acid (or nucleotide) sequences are invariant (unchanged) = exact matching
Identity
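A minimal sketch of how identity could be computed for two sequences that are already aligned. The function name percent_identity, the "-" gap character, and the choice to divide by the full alignment length are illustrative assumptions; other conventions divide by the shorter sequence length or by the number of ungapped columns.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    # Count columns where both sequences carry the same non-gap residue.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    # Assumption: the denominator is every alignment column, gapped or not.
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGT", "ACGA"))
```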
general description of how closely two sequences are related (identities plus conservative substitutions) = optimal matching
Similarity
homologous sequences in different species that arose from a common ancestral gene during speciation
Orthologs
homologous sequences that arose by a mechanism such as gene duplication
Paralogs
Scoring Matrices
Perfect match = +1
Mismatch = 0
Gap opening = -2
Gap extension = -1
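The scheme above can be sketched as a function that scores an existing gapped alignment. This is one possible reading, not a definitive one: it assumes the first position of a gap run costs the opening penalty (-2) and each further consecutive gap position in the same sequence costs the -1 gap penalty. The names score_alignment and GAP_EXTEND are hypothetical.

```python
MATCH, MISMATCH, GAP_OPEN, GAP_EXTEND = 1, 0, -2, -1


def score_alignment(aligned_a: str, aligned_b: str, gap: str = "-") -> int:
    """Sum column scores for two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    prev_gap_in = None  # which sequence ("a" or "b") was gapped in the previous column
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            raise ValueError("a column cannot be gapped in both sequences")
        if a == gap or b == gap:
            gap_in = "a" if a == gap else "b"
            # Continuing a gap run in the same sequence is cheaper than opening one.
            score += GAP_EXTEND if gap_in == prev_gap_in else GAP_OPEN
            prev_gap_in = gap_in
        else:
            score += MATCH if a == b else MISMATCH
            prev_gap_in = None
    return score


print(score_alignment("AC--T", "ACGGT"))
```

Under these assumptions three matches (+3) are cancelled by one gap of length two (-2 - 1), so an alignment with fewer or shorter gaps scores higher.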
Why penalize gaps?
✓ maximizes the number of matches and
✓ minimizes the number of gaps
Protein sequence alignment matrices (4)
Identity matrix
Mutation data matrix
Physical properties matrix
Genetic code matrix
Protein sequence alignment matrix
exact matches receive one score and non-exact matches a different score (1 on the diagonal, 0 everywhere else)
Identity matrix
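As a small illustration, an identity matrix over the 20 standard amino acids could be built as a dictionary keyed by residue pairs. The names AMINO_ACIDS and identity_matrix are hypothetical, and the dictionary layout is just one convenient representation.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def identity_matrix(alphabet: str = AMINO_ACIDS) -> dict:
    """Return {(a, b): score} with 1 on the diagonal and 0 everywhere else."""
    return {(a, b): 1 if a == b else 0 for a in alphabet for b in alphabet}


matrix = identity_matrix()
print(matrix[("A", "A")], matrix[("A", "W")])
```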
Protein sequence alignment matrix
a scoring matrix compiled from observed protein mutation rates: some substitutions are observed more often than others (e.g., PAM, BLOSUM)
Mutation data matrix
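Matrices of this kind are commonly expressed as log-odds scores. The formula below is a general sketch rather than the exact construction of any particular matrix; the symbols q, p, and lambda are notation introduced here, not taken from the cards.

```latex
s_{ij} = \frac{1}{\lambda} \log \frac{q_{ij}}{p_i \, p_j}
```

Here q_{ij} is the observed frequency with which residues i and j are aligned in related sequences, p_i and p_j are their background frequencies, and lambda is a scaling factor. The score is positive when a substitution is observed more often than expected by chance and negative when it is observed less often.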
Protein sequence alignment matrix
amino acids with similar biophysical properties receive a high score
Physical properties matrix
Protein sequence alignment matrix
amino acids are scored based on similarities in their coding triplets (codons)
Genetic code matrix
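One way such a matrix could be derived is sketched below: for each pair of amino acids, find the minimum number of nucleotide changes needed to turn a codon for one into a codon for the other. The standard genetic code is assumed, and the names min_codon_distance and genetic_code_score, as well as the "3 minus distance" scoring, are illustrative choices rather than a fixed convention.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(aa: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, residue in CODON_TABLE.items() if residue == aa]


def min_codon_distance(aa1: str, aa2: str) -> int:
    """Fewest nucleotide differences between any codon of aa1 and any codon of aa2."""
    pairs = product(codons_for(aa1), codons_for(aa2))
    return min(sum(x != y for x, y in zip(c1, c2)) for c1, c2 in pairs)


def genetic_code_score(aa1: str, aa2: str) -> int:
    """Assumed scoring: 3 for identical residues down to 0 for three base changes."""
    return 3 - min_codon_distance(aa1, aa2)


print(min_codon_distance("F", "L"), genetic_code_score("D", "E"))
```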
Basis of Scoring Matrices
Accepted Point Mutation (PAM)
Blocks Substitution Matrix (BLOSUM)
Basis of Scoring Matrices
a replacement of one amino acid in a protein by another residue that has been accepted by natural selection
Accepted Point Mutation (PAM)
Basis of Scoring Matrices
Developed by Henikoff and Henikoff (1992, 1996)
They focused on conserved regions (blocks) of distantly related proteins
Blocks Substitution Matrix (BLOSUM)