Basic Local Alignment Search Tool (BLAST) Flashcards
main NCBI tool for comparing a protein or DNA sequence (the query) against sequences in a database (the targets) to reveal related sequences
finds regions of local similarity between sequences using a heuristic local alignment method that balances speed and sensitivity
Basic Local Alignment Search Tool (BLAST)
BLAST algorithm
- For the query, compile the list of high-scoring words of length w.
- Compare the word list to the database and identify exact matches.
- For each word match, extend the alignment in both directions to find alignments that score greater than the score threshold S.
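The three steps above can be sketched as a toy seed-and-extend search. This is a minimal illustration under stated assumptions, not NCBI's implementation: it works on DNA only, treats every exact word of length w in the query as a seed (no neighborhood scoring), extends without gaps, and uses made-up scoring values. The function names and the `x_drop` cutoff are hypothetical choices for this sketch.

```python
def query_words(query, w):
    """Step 1: map each length-w word in the query to its start positions."""
    words = {}
    for i in range(len(query) - w + 1):
        words.setdefault(query[i:i + w], []).append(i)
    return words


def find_seeds(words, target, w):
    """Step 2: exact matches of query words in the target, as (q_pos, t_pos)."""
    seeds = []
    for j in range(len(target) - w + 1):
        for i in words.get(target[j:j + w], []):
            seeds.append((i, j))
    return seeds


def extend(query, target, i, j, w, match=1, mismatch=-1, x_drop=2):
    """Step 3: ungapped extension in both directions from one seed.

    Extension in a direction stops once the running score falls more than
    x_drop below the best score seen in that direction (assumed cutoff).
    """
    def walk(qi, tj, step):
        score = best = best_len = n = 0
        while 0 <= qi < len(query) and 0 <= tj < len(target):
            score += match if query[qi] == target[tj] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score > x_drop:
                break
            qi += step
            tj += step
        return best, best_len

    right_score, right_len = walk(i + w, j + w, 1)
    left_score, left_len = walk(i - 1, j - 1, -1)
    total = w * match + left_score + right_score
    # Report the score and the aligned span in query coordinates.
    return total, i - left_len, i + w + right_len


def toy_blast(query, target, w=4, s=6):
    """Return distinct (score, q_start, q_end) hits scoring above s."""
    hits = set()
    for i, j in find_seeds(query_words(query, w), target, w):
        hit = extend(query, target, i, j, w)
        if hit[0] > s:
            hits.add(hit)
    return sorted(hits, reverse=True)


print(toy_blast("ACGTACGTGA", "TTACGTACGTGATT"))
```

Several seeds on the same diagonal extend to the same alignment, so the sketch collects hits in a set to report each one once.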
BLAST: Uses
- Determining what orthologs and paralogs are known for a particular protein or nucleic acid sequence.
- Determining what proteins or genes are present in a particular organism.
- Determining the identity of a DNA or protein sequence.
- Discovering new genes.
- Determining what variants have been described for a particular gene or protein.
- Investigating expressed sequence tags (ESTs) that may exhibit alternative splicing.
- Exploring amino acid residues that are important in a protein’s function and/or structure.
BLAST: 4 Key Steps
- Specifying Sequence of Interest
- Selecting BLAST Program
- Selecting a Database
- Selecting Optional Parameters: (a) Search Parameters, (b) Formatting Parameters
Which program compares a protein query to a database of proteins?
BLASTP
Which program compares both strands of a DNA query against a DNA database?
BLASTN
Which program translates a DNA query into 6 protein sequences, one for each of the 6 possible reading frames, and then compares each of these proteins to a protein database?
BLASTX
Which program translates every DNA sequence in a database into 6 potential proteins, and then compares the protein query against each of those translated proteins?
TBLASTN
Which program is the most computationally intensive BLAST algorithm? It translates both the DNA query and every DNA sequence in the database into 6 potential proteins, then performs the resulting 6 x 6 = 36 protein-protein database searches.
TBLASTX
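The "6 possible reading frames" used by BLASTX, TBLASTN, and TBLASTX can be made concrete with a short sketch. It assumes the standard genetic code and plain A/C/G/T input; the function names and the frame labels (+1 to +3, -1 to -3) are illustrative choices, and codons containing any other character are translated as X.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def six_frames(dna):
    """Translate both strands at offsets 0, 1, and 2."""
    dna = dna.upper()
    frames = {}
    for label, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{label}{offset + 1}"] = translate(strand[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

With 6 frames on the query side and 6 on the database side, a TBLASTX search compares 6 x 6 = 36 frame pairs, which is where the 36 searches come from.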
BLAST: Search Statistics
the number of different alignments with scores equivalent to or better than S (similarity score) that are expected to occur by chance in a database search; affected by the scoring system, size of the database, and size of the query
E = expect value
probability of a chance alignment occurring with the score in question or better; the closer to zero, the more significant the alignment
p-value
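The two statistics above are commonly related through the Karlin-Altschul formulas, E = K * m * n * exp(-lambda * S) and P = 1 - exp(-E), where m is the query length and n is the database length. The sketch below assumes that model; K and lambda depend on the scoring system, and the values passed in the example are placeholders chosen for illustration, not parameters of any particular matrix.

```python
import math


def expect_value(score, query_len, db_len, k, lam):
    """E = K * m * n * exp(-lambda * S): chance alignments expected at score >= S."""
    return k * query_len * db_len * math.exp(-lam * score)


def p_value(e_value):
    """P = 1 - exp(-E): probability of at least one such chance alignment."""
    return -math.expm1(-e_value)  # numerically stable form of 1 - exp(-E)


# Placeholder parameters (assumptions for illustration only).
e = expect_value(score=50, query_len=100, db_len=1_000_000, k=0.1, lam=0.3)
print(e, p_value(e))
```

Doubling the database size doubles E, while raising the score shrinks it exponentially. For small E the two statistics are nearly equal, since 1 - exp(-E) is approximately E.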