Bioinformatics Flashcards
Homologs
Proteins derived from a common ancestor
Orthologs
Homologs in different species, arising from a speciation event
Usually similar function
Paralogs
Homologs within the same organism, arising from gene duplication
May have similar or different functions
Sliding scales
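Two protein sequences are slid past each other, and at each position the number of amino acid matches is counted
Problems: insertions and deletions (from mutations or splicing differences) shift the sequences out of register, so no single sliding position lines up every matching region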
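The sliding comparison can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequences are made up for this card, and no gaps are allowed.

```python
def count_matches(seq_a, seq_b, offset):
    """Count identical residues when seq_b is shifted by `offset` along seq_a."""
    matches = 0
    for i, residue in enumerate(seq_b):
        j = i + offset  # position in seq_a that lines up with seq_b[i]
        if 0 <= j < len(seq_a) and seq_a[j] == residue:
            matches += 1
    return matches


def best_offset(seq_a, seq_b):
    """Try every offset that overlaps by at least one residue.

    Returns (offset, matches) for the first offset with the most matches.
    """
    offsets = range(-(len(seq_b) - 1), len(seq_a))
    return max(
        ((off, count_matches(seq_a, seq_b, off)) for off in offsets),
        key=lambda pair: pair[1],
    )


# Hypothetical sequences: the short one matches the long one starting at index 2.
print(best_offset("MKVLAAGIVG", "VLAAG"))  # → (2, 5)
```

A single offset is all this approach can express, which is why an insertion or deletion in the middle of one sequence breaks it.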
Introducing gaps when comparing homologous proteins
Allows for more matches in sliding scale
Problem: if too many gaps are introduced, the sequences are broken into many small segments and the number of matches becomes artificially high
Scoring sliding scales
Add points for each identity (match) and subtract a penalty for each gap introduced
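A minimal sketch of this scoring rule, assuming the alignment is already written out with "-" for gap positions. The point values (+10 per identity, 25 off per gap) and the choice to count each run of "-" as one gap are assumptions made for the example, not values given on this card.

```python
import re


def score_alignment(aligned_a, aligned_b, identity_points=10, gap_penalty=25):
    """Score a gapped alignment: reward identities, penalize each gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # An identity is a column where both residues are equal and not a gap.
    identities = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    # Assumption: a run of consecutive '-' characters counts as a single gap.
    gaps = len(re.findall(r"-+", aligned_a)) + len(re.findall(r"-+", aligned_b))
    return identities * identity_points - gaps * gap_penalty


# Hypothetical alignment: 6 identities and 1 gap.
print(score_alignment("MKV--LAAG", "MKVTRLASG"))  # → 35
```

Because every gap costs points, adding gaps only pays off when each one recovers enough extra matches.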
Testing for significance
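Scramble (shuffle) one sequence many times and realign each scrambled version to the other sequence
If the actual alignment scores well above the range of scores from scrambled sequences (“noise”), the match is unlikely to be due to chance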
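The scrambling test can be sketched as below. This is a rough illustration under stated assumptions: the score is just the best ungapped match count, the number of shuffles and the seed are arbitrary, and the function names are invented for this card.

```python
import random


def ungapped_score(seq_a, seq_b):
    """Best number of identities over all ungapped offsets."""
    best = 0
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        matches = sum(
            1
            for i, residue in enumerate(seq_b)
            if 0 <= i + offset < len(seq_a) and seq_a[i + offset] == residue
        )
        best = max(best, matches)
    return best


def shuffle_test(seq_a, seq_b, n_shuffles=200, seed=0):
    """Compare the real score with scores of shuffled copies of seq_b.

    Returns (real_score, noise_scores, empirical_p_value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    real = ungapped_score(seq_a, seq_b)
    residues = list(seq_b)
    noise = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # same composition, scrambled order
        noise.append(ungapped_score(seq_a, "".join(residues)))
    at_least_as_good = sum(1 for s in noise if s >= real)
    # Add-one correction so the estimate is never exactly zero.
    p_value = (at_least_as_good + 1) / (n_shuffles + 1)
    return real, noise, p_value


real, noise, p = shuffle_test("MKVLAAGIVGLSTREWQDNHPFYC", "VLAAGIVGLST")
print(real, max(noise), p)
```

Shuffling keeps the amino acid composition the same, so any drop in score reflects the loss of residue order rather than a change in which residues are present.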
BLOSUM62 (Blosum-62) matrix
Method of scoring substitutions in protein sequences
Positive points are given when the substituted amino acid is chemically similar to the original (highest points when there is no substitution)
Points are taken away when the substituted amino acid is unlike the original
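To make the scoring concrete, here is a small lookup sketch. Only a handful of entries are included, copied from the standard BLOSUM62 table for illustration; check them against a full copy of the matrix before relying on them. The dictionary and function names are made up for this card.

```python
# A few BLOSUM62 entries (illustrative subset, not the full 20 x 20 matrix).
BLOSUM62_SUBSET = {
    ("L", "L"): 4,    # identity
    ("D", "D"): 6,    # identity
    ("W", "W"): 11,   # identity; rare residues score highest when conserved
    ("D", "E"): 2,    # both acidic
    ("K", "R"): 2,    # both basic
    ("I", "L"): 2,    # both hydrophobic
    ("I", "V"): 3,    # both hydrophobic
    ("F", "Y"): 3,    # both aromatic
    ("D", "L"): -4,   # acidic vs hydrophobic
    ("G", "W"): -2,   # very different size
}


def substitution_score(a, b, matrix=BLOSUM62_SUBSET):
    """Look up a symmetric substitution score.

    Raises KeyError if the pair is not in the subset.
    """
    key = (a, b) if (a, b) in matrix else (b, a)
    return matrix[key]


print(substitution_score("L", "I"))  # → 2   (similar residue)
print(substitution_score("L", "D"))  # → -4  (dissimilar residue)
print(substitution_score("W", "W"))  # → 11  (no substitution)
```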
Positives versus identities
Positives: aligned amino acids that are identical or similar to each other (every aligned pair with a positive substitution score)
Identities: identical amino acids
The number of positives is always greater than or equal to the number of identities, because every identity also counts as a positive
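Counting both quantities for one alignment might look like the sketch below. It assumes a positive is any aligned pair with a substitution score above zero, and it uses only the few BLOSUM62 entries needed for the made-up example alignment.

```python
def identities_and_positives(aligned_a, aligned_b, matrix):
    """Return (identities, positives) for two aligned sequences."""
    identities = positives = 0
    for a, b in zip(aligned_a, aligned_b):
        if "-" in (a, b):
            continue  # gap columns count toward neither total
        if a == b:
            identities += 1
        # The matrix is symmetric, so try both orderings of the pair.
        score = matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]
        if score > 0:
            positives += 1  # identities land here too (diagonal scores are positive)
    return identities, positives


# Illustrative BLOSUM62 entries for the pairs that occur below.
scores = {
    ("L", "L"): 4,
    ("D", "E"): 2,
    ("K", "R"): 2,
    ("I", "V"): 3,
    ("G", "W"): -2,
}

# Hypothetical alignment: 1 identity, 3 similar pairs, 1 dissimilar pair.
print(identities_and_positives("LDKIG", "LERVW", scores))  # → (1, 4)
```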
How to find harder-to-see relationships between proteins
Use an alignment scoring system that credits similar amino acids (positives) as well as identities: for truly related proteins, the alignment score should be clearly higher than noise
Most important area of comparison between proteins
Three-dimensional structure (the protein fold) is most important: amino acid sequence doesn’t always reveal structural similarity, because many individual amino acid substitutions leave the structure largely unchanged