L3 Flashcards
Retrieval of biological sequences in databases is based on what?
Similarity
What does searching a biological sequence database involve?
Submitting a query sequence and performing a pairwise comparison of the query with every individual sequence in the database
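The card above can be sketched in a few lines of Python. This is only an illustration: the scoring function (counting identical letters at the best ungapped offset) and the three toy sequences are assumptions made for the example, not a scoring scheme used by any real search tool.

```python
def identity_score(query: str, subject: str) -> int:
    """Toy score: most identical positions over all ungapped offsets."""
    best = 0
    for offset in range(-(len(query) - 1), len(subject)):
        matches = 0
        for i, ch in enumerate(query):
            j = offset + i
            if 0 <= j < len(subject) and subject[j] == ch:
                matches += 1
        best = max(best, matches)
    return best


def search_database(query: str, database: dict) -> list:
    """Compare the query with every database sequence, then rank hits by score."""
    hits = [(name, identity_score(query, seq)) for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy database (names and sequences are made up)
database = {
    "seq1": "ACGTACGTGA",
    "seq2": "TTTTGGGGCC",
    "seq3": "GGACGTAC",
}
ranked = search_database("ACGTAC", database)
for name, score in ranked:
    print(name, score)
```

Every sequence in the database is scored, which is why the cost of a search grows with database size.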
Requirements for implementing algorithms for sequence database searching include
- sensitivity
- selectivity
- speed
Sensitivity
The ability to find as many correct hits as possible. Correct hits are considered “true positives.”
Selectivity
Also called specificity; the ability to exclude incorrect hits. These incorrect hits are considered “false positives.”
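One common way to put numbers on sensitivity and selectivity uses the standard true/false positive/negative counts. The formulas below are the usual textbook ratios, not something stated on these cards, and the counts in the example are invented for illustration.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of the correct hits that the search actually found."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of the incorrect sequences that the search correctly excluded."""
    return true_neg / (true_neg + false_pos)


# Hypothetical search result: 50 true homologs and 950 unrelated sequences
tp, fn = 45, 5      # homologs found / homologs missed
tn, fp = 900, 50    # unrelated sequences excluded / wrongly reported

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```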
Speed
The time it takes to get results from a database search
An increase in sensitivity often leads to
A decrease in selectivity
An increase in speed often leads to
A decrease in sensitivity and selectivity
What are the types of algorithms used in database searching?
- exhaustive
- heuristic
Exhaustive algorithm
Uses a rigorous algorithm to find the best or exact solution to a particular problem by examining all mathematical combinations
Heuristic algorithm
A computational strategy for finding a near-optimal solution
How do heuristic algorithms take shortcuts?
By reducing the search space according to some criteria
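As a rough sketch of such a shortcut, the filter below keeps only database sequences that share at least one exact word of length k with the query, so that a slower, rigorous comparison would only need to run on the survivors. The criterion, the word length k = 3, and the toy sequences are all assumptions chosen for illustration.

```python
def kmers(seq: str, k: int) -> set:
    """All overlapping words of length k in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def prefilter(query: str, database: dict, k: int = 3) -> list:
    """Names of sequences that share at least one k-letter word with the query."""
    query_words = kmers(query, k)
    return [name for name, seq in database.items() if query_words & kmers(seq, k)]


# Hypothetical toy database (names and sequences are made up)
database = {
    "seq1": "ACGTACGTGA",
    "seq2": "TTTTGGGGCC",
    "seq3": "GGACGTAC",
}
candidates = prefilter("ACGTAC", database)
print(candidates)  # seq2 shares no 3-letter word with the query and is skipped
```

The trade-off is that a true match with no exact shared word would be discarded, which is how a gain in speed can cost sensitivity.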
What are the methods used to infer sequence similarity?
Global and local alignment
Local alignment
Finds domains and short regions of similarity between a pair of sequences, e.g.
- looking for domains within proteins
- looking for regions of genomic DNA that contain introns
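A minimal sketch of local alignment scoring, using the Smith-Waterman dynamic programming recurrence with a linear gap penalty. The cards do not name a specific method, so the choice of algorithm and the scoring values (match +2, mismatch -1, gap -2) are assumptions for this example.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment restart anywhere,
            # which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared region "ACGT" is found even though the flanks are unrelated
print(local_alignment_score("GGGGACGTCCCC", "TTACGTAA"))
```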
Global alignment
Finds the optimal alignment over the entire length of the two sequences under comparison, e.g.
- aligning genes whose sequences are of comparable length
- cases where the entire gene is homologous
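A minimal sketch of global alignment scoring, using the Needleman-Wunsch dynamic programming recurrence with a linear gap penalty. As with any example here, the choice of algorithm and the scoring values (match +1, mismatch -1, gap -1) are assumptions, not something specified on the cards.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal end-to-end alignment score (Needleman-Wunsch, linear gap penalty)."""
    # First row: aligning a prefix of b against nothing costs one gap per letter.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    # The score is read from the last cell, so both sequences are used in full.
    return prev[-1]


print(global_alignment_score("GATTACA", "GATACA"))  # six matches and one gap
```

Unlike the local case, scores are never reset to zero and unaligned ends are penalized, which is why this approach suits sequences of comparable length.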