Shane - Lecture 2 Flashcards
Define homology
Shared ancestry
What are the two ways by which DNA sequences can have shared ancestry?
A speciation event
A duplication event
Define orthologs
(2)
Shared ancestry as a result of a speciation
Similar sequences often retain the same functions over the course of evolution
Define paralogs
(3)
Shared ancestry as a result of a duplication
Genes produced via gene duplication within a genome
Typically evolve new functions or become pseudogenes
What are pseudogenes?
Sequences that have evolved so that they no longer produce a functional product
What is our query in protein BLAST?
A protein
What are we searching for in BLASTp?
A similar protein
When is PSI-BLAST used?
To search for a distantly related protein
It is a Position-Specific Iterated BLAST
Comment on a result with an E value of 0.005
This is very close to zero
This is a meaningful result: the match is unlikely to have occurred by chance
What is the most commonly used scoring matrix?
BLOSUM62
Comment on the use of E values for proteins vs with nucleotides
The E value cut off doesn't need to be as stringent for proteins as for nucleotides
What percentage similarity does BLOSUM62 find?
Between 30 and 40 %
What does PSI stand for?
Position
Specific
Iterated
What is the E value cut off point for nucleotide sequences?
10^-6
What is the E value cut off point for protein sequences?
10^-3
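
The two cut offs above can be applied as a simple filter. This is only a minimal sketch: the function name, the dictionary and the example values are made up for illustration, and the thresholds are the ones from these cards rather than a universal rule.

```python
# Assumed cut offs taken from the flashcards above (not a universal rule).
E_VALUE_CUTOFFS = {
    "nucleotide": 1e-6,
    "protein": 1e-3,
}

def passes_cutoff(e_value, sequence_type):
    """Return True if a hit's E value is at or below the cut off for its type."""
    if sequence_type not in E_VALUE_CUTOFFS:
        raise ValueError(f"unknown sequence type: {sequence_type!r}")
    return e_value <= E_VALUE_CUTOFFS[sequence_type]

# The same E value can pass for a protein search but fail for a nucleotide search.
protein_ok = passes_cutoff(1e-5, "protein")
nucleotide_ok = passes_cutoff(1e-5, "nucleotide")
```

Here the same hit passes the protein cut off but not the more stringent nucleotide one.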
How is PSI-BLAST different from BLASTp?
(4)
In PSI-BLAST you first carry out your BLAST as normal
Then you repeat the search having selected only the sequences you want to carry forward
This creates your own custom matrix: a Position Specific Scoring Matrix (PSSM)
This repeats until no new matches are found
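
The iterate-and-stop idea can be sketched as a loop that ends once a round adds nothing new. This is a toy illustration, not the real PSI-BLAST: the `search` callable, the `links` data and the sequence names are all hypothetical, and every hit is carried forward automatically instead of being selected by hand.

```python
def iterate_until_converged(query, search, max_rounds=10):
    """Repeat a search, carrying hits forward, until a round adds no new matches.

    `search` is a hypothetical callable that takes the set of sequences
    included so far and returns the hits found with them.
    """
    included = {query}
    for round_number in range(1, max_rounds + 1):
        hits = set(search(included))
        new_hits = hits - included
        if not new_hits:
            # Converged: this round found nothing that was not already included.
            return included, round_number
        included |= new_hits
    return included, max_rounds

# Toy "database": each sequence is only detectably similar to its neighbours.
links = {"query": {"A"}, "A": {"B"}, "B": set()}

def toy_search(included):
    found = set()
    for name in included:
        found |= links[name]
    return found

family, rounds = iterate_until_converged("query", toy_search)
```

In this toy data, "B" is never found directly from the query. It only appears once "A" has been carried forward, which is why iterating helps with distantly related sequences.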
How does using a position specific scoring matrix work?
It uses the characteristics shared across your selected sequences to find new, related sequences
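
A rough sketch of how a position specific scoring matrix might be built from selected sequences is given below. It rests on several simplifying assumptions: the sequences are made up and already aligned to the same length, the background frequency is uniform (1/20 per amino acid), and a simple pseudocount is added. The real PSI-BLAST weighting is more elaborate.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(aligned_seqs, pseudocount=1.0):
    """Return one dict of log-odds scores per alignment column."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm

def score_sequence(pssm, candidate):
    """Sum the per-position scores for a candidate of the same length."""
    return sum(column[aa] for column, aa in zip(pssm, candidate))

# Made-up selected sequences, for illustration only.
selected = ["MKVLA", "MKILA", "MRVLA", "MKVLG"]
pssm = build_pssm(selected)
related_score = score_sequence(pssm, "MKVLA")
unrelated_score = score_sequence(pssm, "WWWWW")
```

Residues that are common at a position in the selected sequences score positively there and rare ones score negatively, so a candidate that shares the group's characteristics scores higher than an unrelated one.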