Lecture 2 - RH Flashcards
What is BLAST used for?
To search a sequence database for sequences similar to a query sequence, which is pasted into the search box
What is the output of a BLAST search?
A list of database hits, each shown as a pairwise alignment with the query
What is found using BLAST searches?
Relatives of the query sequence (related genes or proteins) can be found using BLAST.
BLAST can check whether a query sequence encodes:
- A known protein with a known function
- A known protein with an unknown function
- An unknown protein with a known function (familiar functional domains)
- An unknown protein with an unknown function
What are some uses for BLAST?
Study evolution
Discover function
Find crucial features (motif finding)
Identify causes of disease (detect variable sites in alignment)
What is homology?
Genes or proteins are homologous if they share a common ancestor
What are the types of “ologies” that can exist between proteins or nucleotides?
Homology (common ancestor)
Orthology (descent from a speciation event)
Paralogy (separate evolution after a duplication event)
Xenology (horizontal transfer event)
When are proteins said to be homologous?
> 25% identical amino acids: homology is likely
18 - 25% identical amino acids: twilight zone (needs further investigation)
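Worked example (a minimal sketch, not from the lecture): the thresholds above can be applied to two sequences that are already aligned. The function names `percent_identity` and `classify` are hypothetical. Counting identity only over columns without gaps is an assumption, since other conventions exist, and the label for values below 18% is also an assumption because the card does not name that range.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither row has a gap.

    Assumption: gap columns ('-') are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def classify(identity: float) -> str:
    """Apply the > 25% and 18 - 25% thresholds from the card."""
    if identity > 25:
        return "homology likely"
    if identity >= 18:
        return "twilight zone"
    # Assumption: the card does not name this range.
    return "below the twilight zone"


if __name__ == "__main__":
    pid = percent_identity("ACDE", "ACDF")
    print(pid, classify(pid))
```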
How is a pairwise alignment interpreted?
Choose 2 sequences
Select an algorithm that generates a score
Allow gaps (insertions/deletions)
Score reflects similarity
Alignments can be global or local
Probability that the alignment occurred by chance is estimated
Which is more informative: protein or DNA sequences?
Protein:
The protein alphabet has 20 characters rather than 4
The genetic code is degenerate, so different codons can encode the same amino acid
Protein sequences offer a longer “look-back” time
DNA sequences can be converted into protein sequences and used in pairwise alignments
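Worked example (a minimal sketch, not from the lecture): converting DNA into protein with the standard genetic code shows the degeneracy mentioned above, because two DNA sequences that differ at one base can give the same protein. The names `CODON_TABLE` and `translate` are hypothetical. The sketch assumes the reading frame starts at the first base and that the input contains only A, C, G and T.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"   # codons starting with T
               "LLLLPPPPHHQQRRRR"   # codons starting with C
               "IIIMTTTTNNKKSSRR"   # codons starting with A
               "VVVVAAAADDEEGGGG")  # codons starting with G
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon.

    Trailing bases that do not fill a codon are ignored.
    Raises KeyError for characters other than A, C, G, T.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # GCT and GCC both encode alanine, so the proteins are identical.
    print(translate("ATGGCTTAA"), translate("ATGGCCTAA"))
```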
Why are DNA alignments used if protein alignments are more accurate?
To identify cDNA, non-coding regions of DNA, and DNA polymorphisms
What are the types of alignments?
Uninformative (in exercise book)
Ungapped alignment (in exercise book)
Gapped alignment (in exercise book)
Why are gapped alignments used?
They are more flexible because gaps account for insertions and deletions, including frameshift mutations
What BLAST tools are available on NCBI?
Nucleotide BLAST
Protein BLAST
Translated BLAST
Genome BLAST
SmartBLAST (can be used to create a phylogenetic tree)
What is the difference between global and local alignments?
Global alignment: Aligns two sequences over their entire length and finds overall similarity
Local alignment: Looks for regions of similarity within the sequences (used in 99% of cases)
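Worked example (a minimal sketch, not from the lecture): the difference shows up when two sequences share only one region. The function `alignment_score` is hypothetical and uses the textbook global (Needleman-Wunsch style) and local (Smith-Waterman style) scoring recurrences. The scores of +1 for a match, -1 for a mismatch and -1 per gap position are assumed toy values, not BLAST defaults.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Best alignment score of a and b (score only, no traceback)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global: leading gaps are penalized, so the borders are negative.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(score[i - 1][j - 1] + sub,
                       score[i - 1][j] + gap,
                       score[i][j - 1] + gap)
            if local:
                # Local: a region may start anywhere, so never go below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    # Global reads the final cell; local reads the best cell anywhere.
    return best if local else score[-1][-1]


if __name__ == "__main__":
    a, b = "GGGGACGT", "ACGTCCCC"   # only the ACGT region is shared
    print("local: ", alignment_score(a, b, local=True))
    print("global:", alignment_score(a, b))
```

The local score rewards the shared ACGT region on its own, while the global score is pulled down because the unrelated ends must also be aligned.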
What is dynamic programming?
Dynamic programming (DP) is a technique for solving problems that can be broken into overlapping subproblems: each subproblem is solved once, its result is stored, and the stored results are combined into the overall solution. Once the problem is formulated this way, the coding is usually straightforward.
Why is dynamic programming used in alignments?
Finding optimal alignments is a computationally difficult task, so the problem is divided into states and a decision is made for each state
What are the 3 steps of dynamic programming for sequence alignments?
Initialization
Scoring the matrix
Traceback
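Worked example (a minimal sketch, not from the lecture): the three steps can be seen in a small global alignment in the style of Needleman-Wunsch. The function name `needleman_wunsch` and the scores (+1 match, -1 mismatch, -1 per gap position) are assumptions chosen for illustration. Real tools use a substitution matrix and separate gap opening and extension penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (aligned_a, aligned_b, score) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1

    # Step 1: initialization. First row and column hold leading gaps.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Step 2: scoring the matrix. Each cell takes the best of three moves.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1], b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a

    # Step 3: traceback. Walk from the last cell back to the origin,
    # following a move that produced each cell's score.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("GATTACA", "GATACA")
    print(top)
    print(bottom)
    print("score:", total)
```

When several alignments share the best score, this sketch returns only one of them, determined by the order in which the traceback checks the moves.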
What happens to similarity score when a gap is introduced or extended?
A penalty is applied to the score; opening a new gap costs more than extending an existing one
What is a PAM?
Point Accepted Mutation
Number of accepted changes per 100 amino acids
Which matrices are used by BLAST?
BLOSUM matrices
When is BLOCKS used?
For conserved motifs
What does the number after BLOSUM represent?
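The percentage identity threshold used to cluster the sequences in the observed alignments (e.g. BLOSUM62 is built from sequences clustered at 62% identity)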
What gap penalties are applied with the BLOSUM62 matrix?
Opening a new gap = -11
Extending a gap = -1
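Worked example (a minimal sketch, not from the lecture): with these two values, one long gap costs less than several short ones. The functions `gap_penalty` and `total_gap_penalty` are hypothetical. The sketch assumes the first position of a gap costs the opening penalty and each further position costs the extension penalty; some tools instead charge the extension penalty for every position, including the first, so check the convention of the tool in use.

```python
def gap_penalty(length, gap_open=-11, gap_extend=-1):
    """Penalty for one gap of the given length.

    Assumption: the first gap position costs gap_open and every
    additional position costs gap_extend.
    """
    if length <= 0:
        return 0
    return gap_open + (length - 1) * gap_extend


def total_gap_penalty(aligned_a, aligned_b, gap_open=-11, gap_extend=-1):
    """Sum the penalties of all gap runs ('-') in both aligned rows."""
    total = 0
    for row in (aligned_a, aligned_b):
        run = 0
        for ch in row + "X":  # sentinel character closes a trailing gap run
            if ch == "-":
                run += 1
            else:
                total += gap_penalty(run, gap_open, gap_extend)
                run = 0
    return total


if __name__ == "__main__":
    # One gap of length 3 versus two separate gaps of length 1.
    print(total_gap_penalty("AC---GT", "ACTTTGT"))
    print(total_gap_penalty("AC-G-T", "ACTGAT"))
```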