BLAST Flashcards
What does BLAST stand for?
Basic Local Alignment Search Tool
Who developed BLAST?
Altschul and colleagues
algorithm for rapid searching of nucleotide and protein databases
BLAST
employs a heuristic method of pairwise comparison
BLAST
algorithm that estimates the best solution without considering every possible outcome
heuristic method
does not guarantee finding the best solution but finds good solutions
heuristic method
has high speed and is time efficient
heuristic method
homologous sequences are likely to contain a short, high-scoring similarity region
BLAST Strategy
short, high-scoring similarity region
word or hit (W)
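The word idea can be illustrated with a minimal sketch. It assumes exact word matches only and a hypothetical helper name `find_word_hits`; it is a simplification for study purposes, not BLAST's actual implementation, which also scores and extends each hit.

```python
from collections import defaultdict

def find_word_hits(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    # Index every length-w word of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)

    # Scan the subject and report every word that also occurs in the query.
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits

# Toy sequences, made up for illustration.
print(find_word_hits("GATTACA", "CCATTAGG", w=3))  # [(1, 2), (2, 3)]
```

Each hit is only a seed: a candidate position from which a local alignment could be extended.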
designed to find local regions of similarity, and can be expected to run about two orders of magnitude faster than the Smith-Waterman algorithm
BLAST
sequence that is subject to comparison
Query
sequences showing similarity with the query sequence
subject (or target)
What are the different BLAST programs?
blastn
blastp
blastx
tblastn
tblastx
searches a nucleotide database using a nucleotide query
blastn
searches a protein database using a protein query
blastp
searches a protein database using a translated nucleotide query
blastx
searches a translated nucleotide database using a protein query
tblastn
searches a translated nucleotide database using a translated nucleotide query
tblastx
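As a memory aid, the five programs can be written as a lookup from (query type, database type) to program name. This is a sketch only; the dictionary and the function name `choose_program` are hypothetical and not part of any BLAST toolkit.

```python
# (query type, database type) -> BLAST program, following the cards above.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("translated nucleotide", "protein"): "blastx",
    ("protein", "translated nucleotide"): "tblastn",
    ("translated nucleotide", "translated nucleotide"): "tblastx",
}

def choose_program(query_type, database_type):
    """Look up which BLAST program fits a query/database combination."""
    return BLAST_PROGRAMS[(query_type, database_type)]

print(choose_program("protein", "translated nucleotide"))  # tblastn
```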
if BLAST finds a similar sequence, we must have an idea of:
whether alignment is good
whether it portrays possible biological relationship
whether the similarity observed is attributable to chance alone
uses statistical theory to produce a bit score and e-value
BLAST
gives an indication of how good the alignment is
Bit Score
For the bit score, the higher the score, _
the better
key element in the calculation of bit score
substitution matrix
bit score for the aligned region with the highest score
max score
adds the bit scores for all aligned regions
Total score
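The difference between max score and total score can be shown with a short sketch. The bit scores below are made-up values for one subject sequence with three aligned regions, and `max_and_total_score` is a hypothetical helper name.

```python
def max_and_total_score(bit_scores):
    """Return (max score, total score) for the aligned regions of one subject."""
    # Max score: the single best aligned region.
    # Total score: the sum over all aligned regions.
    return max(bit_scores), sum(bit_scores)

region_bit_scores = [50.0, 30.5, 20.0]  # invented example values
print(max_and_total_score(region_bit_scores))  # (50.0, 100.5)
```

When a subject has only one aligned region, the two scores are equal.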
gives an indication of the statistical significance of a given pairwise alignment and reflects the size of the database and the scoring system used
Expected value (E-value)
the lower the E-value, the _ the hit
more significant
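The link between bit score, database size, and E-value can be sketched with the standard Karlin-Altschul relation E = m × n × 2^(−S'), where S' is the bit score, m the query length, and n the database length. The function name and the lengths below are assumptions for illustration; real BLAST output uses corrected (effective) lengths, so its reported values will differ.

```python
def e_value(bit_score, query_len, db_len):
    """Approximate E-value: E = m * n * 2**(-S'), using raw (uncorrected) lengths."""
    return query_len * db_len * 2.0 ** (-bit_score)

# Hypothetical query of 1,024 residues against a database of 1,024 residues.
print(e_value(10, 1024, 1024))  # 1024.0
# A higher bit score gives a lower E-value; a larger database gives a higher one.
print(e_value(40, 1024, 1024) < e_value(30, 1024, 1024))  # True
```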
determining the beginning and end positions of genes in a genome
gene prediction problem
determines the sequence of amino acids in a protein
sequence of codons
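How a sequence of codons determines a sequence of amino acids can be sketched as follows. The table is a small subset of the standard genetic code chosen for the example, the DNA string is made up, and `translate` is a hypothetical helper, not a library function.

```python
# Subset of the standard genetic code (DNA codons); "*" marks a stop codon.
CODON_TABLE = {"ATG": "M", "GCC": "A", "AAA": "K", "TGG": "W", "TAA": "*"}

def translate(dna):
    """Translate DNA codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # "X": codon not in the subset
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGGCCAAATGGTAA"))  # MAKW
```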
What is the genome paradox?
Genome size of many eukaryotes does not correspond to genetic complexity
When were split genes discovered through experiments with the mRNA of hexon, a viral protein?
1977
When was it discovered that the sequence of codons in a gene determines the sequence of amino acids in a protein?
1960s
True or False: the genomes of many eukaryotes contain relatively few genes
True
Problems in Gene Prediction
genome has relatively few genes
many false splice sites
short exons
long introns
alternative splicing