Lecture 9 Flashcards
Define Bioinformatics (include the fields it encompasses)
Bioinformatics: an interdisciplinary field that uses computational tools for understanding biological data.
Bioinformatics combines biology, computer science, information engineering, math, and statistics to interpret biological data
Define Proteomics and Proteome
Proteomics: The study of proteins
Proteome: the entire set of proteins produced by an organism
State the 2 functions of Ang (angiogenin)
hydrolyzes RNAs
interacts with DNA causing a promoter-like increase in the expression of rRNA
Describe on a molecular level, how Angiogenin interacts with DNA to enhance the expression of rRNA
it enhances rRNA transcription by binding to the CT-rich angiogenin binding element (ABE) within the upstream intergenic region of rDNA
Define homologs (homologous proteins)
Homologs: 2 molecules that are descended from a common ancestor
Compare Paralogs and Orthologs
Paralogs: homologs present WITHIN ONE SPECIES that have a common origin (a duplication event) but may have evolved different functions
(so paralogs may have similar structure but different functions)
Orthologs: homologs that are present in DIFFERENT species and that have similar functions
(more like identical twins)
Describe the process of sequence alignment and state why it is useful
Sequence Alignment: a process that systematically aligns sequences in order to search for similarities
Sequence comparisons (conducted via sequence alignment) help determine whether the similarities between sequences are more than would be expected by chance
True or False:
Sequence identities can be established by sliding one sequence past the other and counting the number of matches. Explain.
True
while there are now more efficient ways to find these similarities, this method can also find similarities between sequences
(Myoglobin and Alpha-hemoglobin are 25.9% identical and many of these similarities were identified via the “sliding method”)
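The sliding method can be sketched in a few lines of Python. This is only an illustration: the function name and the short example sequences are made up for this card, and no gaps are allowed.

```python
def best_ungapped_offset(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Slide seq_b past seq_a; return (offset, matches) for the best offset.

    An offset of k means seq_b[0] is lined up with seq_a[k].
    """
    best_offset, best_matches = 0, -1
    # Try every offset at which the two sequences overlap by at least one residue.
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        matches = 0
        for j, residue in enumerate(seq_b):
            i = offset + j
            if 0 <= i < len(seq_a) and seq_a[i] == residue:
                matches += 1
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


# Hypothetical toy sequences (not real protein sequences).
offset, matches = best_ungapped_offset("GHLKVAD", "LKVA")
print(offset, matches)
```

Percent identity at the best offset would then be the match count divided by the number of aligned positions.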
Introducing “gaps” into one of the sequences has been found to create better alignments between the sequences. What is a common issue with the gap introduction method? How do “scoring systems” account for this issue?
the use of gaps may generate artificial similarities (with enough gaps, almost any two sequences can be made to line up)
scoring systems penalize gaps: for example, 10 points are awarded for each match between the aligned sequences and 25 points are DEDUCTED for each gap
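A rough sketch of this scoring scheme, using the +10 and -25 values from the card. It assumes the alignment is already given with "-" marking gap positions, and it assumes a run of consecutive gap characters in one sequence counts as a single gap; that second point is an assumption of this sketch, so adjust it if your course penalizes each gap position.

```python
MATCH_SCORE = 10   # points awarded per identical aligned pair (from the card)
GAP_PENALTY = 25   # points deducted per gap (from the card)


def score_alignment(aligned_a: str, aligned_b: str) -> int:
    """Score two pre-aligned sequences of equal length ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    gap_side = None  # which sequence the current gap run is in, if any
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            # Assumption: penalize once per contiguous gap run, not per position.
            if side != gap_side:
                score -= GAP_PENALTY
                gap_side = side
        else:
            gap_side = None
            if a == b:
                score += MATCH_SCORE
    return score


# Hypothetical toy alignment: 5 matches and 1 gap -> 5 * 10 - 25
print(score_alignment("ACD-FG", "ACDEFG"))
```

Because a gap costs more than two matches earn, adding a gap only pays off when it brings several additional residues into register.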
Describe how the statistical significance of alignments between sequences can be estimated by shuffling.
Basically, you randomly shuffle one of the sequences (same amino acid composition, random order), realign, and record the score; repeating this many times gives the range of scores expected by chance. Comparing the original alignment score to these shuffled scores tells you whether the alignment was due to chance or is actually significant
(if the original score is not sufficiently different from the randomized score, the original alignment could be a result of chance)
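A minimal sketch of the shuffling idea. To keep it short it scores sequences position by position without realigning each shuffled sequence, which is a simplification; the function names, the trial count, and the fixed random seed are all choices made for this example.

```python
import random


def identity_score(seq_a: str, seq_b: str) -> int:
    """Count identical residues at matching positions (no gaps, no sliding)."""
    return sum(x == y for x, y in zip(seq_a, seq_b))


def shuffle_test(seq_a, seq_b, score_fn=identity_score, trials=1000, seed=0):
    """Return (observed score, fraction of shuffles scoring at least as high)."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    observed = score_fn(seq_a, seq_b)
    residues = list(seq_b)
    at_least_as_good = 0
    for _ in range(trials):
        rng.shuffle(residues)          # same composition, random order
        if score_fn(seq_a, "".join(residues)) >= observed:
            at_least_as_good += 1
    return observed, at_least_as_good / trials


# Hypothetical toy input: a sequence compared with itself.
observed, fraction = shuffle_test("ACDEFGHIKLMNPQRSTVWY", "ACDEFGHIKLMNPQRSTVWY")
print(observed, fraction)
```

A small fraction means shuffled sequences rarely score as well as the original, so the original alignment is unlikely to be a result of chance.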
Describe how distant evolutionary relationships can be detected through the use of the following substitution matrices
More sensitive scoring system:
Conservative substitution:
Nonconservative substitution:
More sensitive scoring system: takes into account the degree of similarity of AAs (not just whether they are identical)
Conservative substitution: replaces one AA with a similar one (similar size or chemical properties)
Nonconservative substitution: replaces an AA with another AA with different chemical properties
AA substitutions can also be classified by what?
AA substitutions can be classified by the fewest number of nucleotide changes to achieve the AA substitution
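As an illustration, the fewest nucleotide changes between two amino acids can be computed from the standard genetic code by comparing every codon of one AA with every codon of the other. The helper name below is made up for this sketch, and DNA codons (T rather than U) are used.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def min_nucleotide_changes(aa_from: str, aa_to: str) -> int:
    """Fewest single-base changes needed to turn a codon for aa_from into one for aa_to."""
    codons_from = [c for c, aa in CODON_TABLE.items() if aa == aa_from]
    codons_to = [c for c, aa in CODON_TABLE.items() if aa == aa_to]
    if not codons_from or not codons_to:
        raise ValueError("unknown one-letter amino acid code")
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_from
        for c2 in codons_to
    )


print(min_nucleotide_changes("D", "E"))  # Asp -> Glu
print(min_nucleotide_changes("F", "K"))  # Phe -> Lys
```

Substitutions that need only one base change can arise from a single point mutation, while those needing two or three changes require more mutational steps.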
Describe the scoring system of a substitution matrix (such as BLOSUM-62)
BLOSUM-62 is a scoring system that awards points for substitutions that are commonly found in nature and subtracts points for substitutions that rarely occur
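A small sketch of how a substitution matrix is used to score an ungapped alignment. The scores below are illustrative values for four amino acids only and should be treated as a toy matrix, not as the actual BLOSUM-62 table; the function names are also made up for this example.

```python
# Toy substitution scores (illustrative only, NOT a real BLOSUM-62 table).
# Identities and conservative swaps score positive; dissimilar swaps score negative.
TOY_SCORES = {
    ("L", "L"): 4, ("I", "I"): 4, ("D", "D"): 6, ("E", "E"): 5,
    ("L", "I"): 2, ("D", "E"): 2,                    # conservative substitutions
    ("L", "D"): -4, ("L", "E"): -3,                  # nonconservative substitutions
    ("I", "D"): -3, ("I", "E"): -3,
}


def pair_score(a: str, b: str) -> int:
    """Look up a symmetric score for one aligned pair of residues."""
    if (a, b) in TOY_SCORES:
        return TOY_SCORES[(a, b)]
    return TOY_SCORES[(b, a)]  # raises KeyError for residues outside the toy table


def matrix_score(seq_a: str, seq_b: str) -> int:
    """Sum substitution scores over an ungapped alignment of equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq_a, seq_b))


# Both pairs below share exactly one identity with each other,
# but the first pair differs only by conservative substitutions.
print(matrix_score("LDE", "IEE"))
print(matrix_score("LDE", "DLE"))
```

Identity counting alone would rate these two toy pairs the same, while the matrix-based score separates the conservative case from the nonconservative one.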
What does the substitution matrix reveal about alpha-hemoglobin and myoglobin?
The substitution matrix reveals that many of the differences between alpha-hemoglobin and myoglobin are conservative
True or False:
Substitution matrices can reveal homologies that are not identified by sequence alignments only. Explain.
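True
scoring with a substitution matrix gives credit for conservative substitutions as well as identities, so it can detect distant evolutionary relationships that identity-only scoring would miss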