L3 Flashcards by Adenike Adebayo

Retrieval of biological sequences in databases is based on what?

Similarity

What does searching a biological sequence database involve?

Submitting a query sequence and performing a pairwise comparison of the query with every individual sequence in the database

Requirements for implementing algorithms for sequence database searching include

sensitivity
selectivity
speed

Sensitivity

Refers to the ability to find as many correct hits as possible. The correct hits are considered true positives

Selectivity

Also called specificity. Refers to the ability to exclude incorrect hits. These incorrect hits are considered "false positives."
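
As a rough illustration of the two measures, the sketch below scores one search result against a known answer set. The function name, the sequence IDs, and the use of the usual formulas TP / (TP + FN) and TN / (TN + FP) are assumptions made for this example and do not come from the cards.

```python
def sensitivity_selectivity(hits, true_members, database):
    """Return (sensitivity, selectivity) for a set of reported hits.

    hits:         IDs the search reported
    true_members: IDs that are genuinely related to the query
    database:     all IDs that were searched
    """
    hits, true_members, database = set(hits), set(true_members), set(database)
    tp = len(hits & true_members)             # correct hits found
    fn = len(true_members - hits)             # correct hits missed
    fp = len(hits - true_members)             # incorrect hits reported
    tn = len(database - hits - true_members)  # incorrect hits excluded
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    selectivity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, selectivity


# Hypothetical database of ten sequences, four of which are truly related.
database = [f"s{i}" for i in range(1, 11)]
true_members = ["s1", "s2", "s3", "s4"]
hits = ["s1", "s2", "s3", "s9"]  # one true member missed, one false positive

sens, sel = sensitivity_selectivity(hits, true_members, database)
print(f"sensitivity={sens:.2f} selectivity={sel:.2f}")
```

Reporting more hits tends to raise the first number and lower the second, which is the trade-off the later cards describe.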

Speed

The time it takes to get results from database searches

An increase in sensitivity leads to

a decrease in selectivity

An increase in speed leads to

a decrease in sensitivity and selectivity

What are the types of algorithms used in database searching?

exhaustive
heuristic

Exhaustive algorithm

makes use of a rigorous algorithm to find the best or exact solution for a particular problem by examining all mathematical combinations

Heuristic algorithm

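A computational strategy that finds a near-optimal solution using rules of thumb. It is faster than an exhaustive search but is not guaranteed to find the best solution.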

How do heuristic algorithms take shortcuts?

By reducing the search space according to some criteria
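
The difference between the two approaches can be sketched with a toy search. The scoring function, the shared-word filter, and all sequence names below are assumptions chosen for illustration; they are not how any particular program is implemented.

```python
def shared_positions(a, b):
    """Toy similarity: number of identical characters at the same position."""
    return sum(x == y for x, y in zip(a, b))


def kmers(seq, k):
    """Set of all overlapping words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def exhaustive_search(query, database):
    """Score the query against every database sequence."""
    return {name: shared_positions(query, seq) for name, seq in database.items()}


def heuristic_search(query, database, k=3):
    """Score only the sequences that share at least one k-letter word with the query."""
    query_words = kmers(query, k)
    candidates = {name: seq for name, seq in database.items()
                  if query_words & kmers(seq, k)}
    return {name: shared_positions(query, seq) for name, seq in candidates.items()}


database = {"a": "ACGTACGT", "b": "ACGAACGA", "c": "TTTTTTTT", "d": "GGCCGGCC"}
query = "ACGTACGA"

full = exhaustive_search(query, database)  # every sequence is scored
fast = heuristic_search(query, database)   # only word-sharing sequences are scored
print(full)
print(fast)
```

The filter skips sequences "c" and "d" entirely. That saves work, but any weak similarity that shares no word with the query would be missed, which is the sensitivity cost of the shortcut.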

What are the methods used to infer sequence similarity?

Global and local alignment

Local alignment

Finds domains and short regions of similarity between a pair of sequences, e.g.
- looking for domains within proteins
- looking for regions of genomic DNA that contain introns

Global alignment

Finds the optimal alignment over the entire length of the two sequences under comparison, e.g.
- aligning genes whose sequences are of comparable length
- cases where the entire gene is homologous
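
To make the contrast between local and global alignment concrete, here is a minimal dynamic-programming sketch that returns only the best score in each mode. It follows the general Needleman-Wunsch (global) and Smith-Waterman (local) recurrences with a linear gap penalty; the scoring values and example sequences are assumptions for illustration.

```python
def alignment_score(a, b, local, match=1, mismatch=-1, gap=-2):
    """Best alignment score by dynamic programming.

    local=False: global alignment, both sequences aligned end to end.
    local=True:  local alignment, best-scoring pair of regions only.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # A global alignment pays for gaps at the start of either sequence.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Two sequences that share the region ACGTACGT but differ in their flanks.
seq_a = "GGGGACGTACGT"
seq_b = "ACGTACGTCCCC"

local_score = alignment_score(seq_a, seq_b, local=True)
global_score = alignment_score(seq_a, seq_b, local=False)
print(local_score, global_score)
```

The local score rewards the shared region alone, while the global score is pulled down by the unrelated flanks that it is forced to include.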

What does BLAST stand for?

Basic Local Alignment Search Tool

How does BLAST work

It uses heuristics to align a query sequence with all sequences in a database. Its objective is to find high-scoring segments among related sequences.

How does BLAST perform sequence alignment

Read in the query sequence.
Create a list of words from the query sequence (seeding): 3 residues for protein, 11 for DNA sequences.
Search the sequence database for occurrences of these words.
Score the word matches with a given substitution matrix.
Extend the matched words in both directions into a pairwise alignment.
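
The seeding and word-search steps above can be sketched as follows. This is a simplified illustration, not BLAST's actual implementation: it looks for exact word matches only and leaves out the substitution-matrix scoring of similar words. The function names and example sequences are hypothetical.

```python
from collections import defaultdict


def make_words(query, word_size):
    """List every overlapping word of length word_size with its query offset."""
    return [(query[i:i + word_size], i) for i in range(len(query) - word_size + 1)]


def find_word_hits(query, database, word_size):
    """Return (sequence name, query offset, subject offset) for exact word matches."""
    index = defaultdict(list)
    for word, q_pos in make_words(query, word_size):
        index[word].append(q_pos)
    hits = []
    for name, subject in database.items():
        for s_pos in range(len(subject) - word_size + 1):
            for q_pos in index.get(subject[s_pos:s_pos + word_size], []):
                hits.append((name, q_pos, s_pos))
    return hits


query = "MKVLAT"
database = {"p1": "GGMKVLGG", "p2": "AAAAAA"}  # hypothetical protein sequences

words = make_words(query, word_size=3)  # word size 3, as for protein queries
hits = find_word_hits(query, database, word_size=3)
print(words)
print(hits)
```

Each hit records where the word sits in both sequences, which is the starting point for the extension step.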

What is the resulting contiguous aligned segment pair without gaps called?

high-scoring segment pair
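
A gap-free extension of a single word hit might look like the sketch below. It assumes simple match/mismatch scores instead of a substitution matrix, and a hypothetical `x_drop` cutoff that stops the extension once the score falls too far below its best value. The function name, parameters, and sequences are all illustrative.

```python
def extend_hit(query, subject, q_pos, s_pos, word_size,
               match=1, mismatch=-1, x_drop=2):
    """Extend an exact word hit without gaps; return (score, q_start, q_end).

    q_end is exclusive. Extension in each direction stops once the running
    score falls more than x_drop below the best score seen so far.
    """
    def walk(q, s, step):
        best = running = 0
        best_len = length = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            running += match if query[q] == subject[s] else mismatch
            length += 1
            if running > best:
                best, best_len = running, length
            elif best - running > x_drop:
                break
            q += step
            s += step
        return best, best_len

    seed_score = word_size * match  # the seed is an exact match in this sketch
    right, right_len = walk(q_pos + word_size, s_pos + word_size, +1)
    left, left_len = walk(q_pos - 1, s_pos - 1, -1)
    return seed_score + left + right, q_pos - left_len, q_pos + word_size + right_len


query = "AAMKVLATCC"
subject = "GGMKVLATGG"

# The word "KVL" occurs at offset 3 in both sequences.
score, q_start, q_end = extend_hit(query, subject, q_pos=3, s_pos=3, word_size=3)
print(score, query[q_start:q_end])
```

The extension keeps only the stretch that raised the score, so the mismatching flanks are trimmed from the reported segment.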

Database search programs such as BLAST use

scoring/substitution matrices

Scoring matrices are what

empirical weighting schemes

Possible identities and substitutions are assigned a score based on the?

observed frequencies of such occurrences in alignments of related proteins
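
One common way to turn observed frequencies into a score is a log-odds ratio: how often a pair is seen in alignments of related proteins, divided by how often it would be expected by chance. The sketch below assumes that form, a scale factor of 2, and made-up frequencies; none of the numbers come from a real matrix.

```python
import math


def log_odds_score(observed_pair_freq, freq_a, freq_b, scale=2):
    """Rounded, scaled log2 ratio of observed to chance-expected pair frequency."""
    expected = freq_a * freq_b  # chance of the pair if residues were independent
    return round(scale * math.log2(observed_pair_freq / expected))


# Hypothetical frequencies: both residues each make up 5% of all residues.
favoured = log_odds_score(0.02, 0.05, 0.05)         # seen more often than chance
disfavoured = log_odds_score(0.000625, 0.05, 0.05)  # seen less often than chance
print(favoured, disfavoured)
```

Pairs seen more often than chance get positive scores and pairs seen less often get negative scores.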

What does BLASTN do

Uses nucleotide sequences as queries to search against a nucleotide sequence database

How does BLASTP work

Uses protein sequences as queries to search against a protein sequence database. Default word size is 3.

How does BLASTX work

Translates nucleotide query sequences in all six reading frames and uses the translations to search a protein sequence database.

How does TBLASTN work

Uses protein sequences as queries to search against a nucleotide sequence database whose sequences are translated in all six reading frames.

How does TBLASTX work

Uses nucleotide sequences as queries, translated in all six reading frames, to search against a nucleotide sequence database whose sequences are also translated in all six reading frames.
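
The translated searches rely on reading a nucleotide sequence in six frames: three offsets on each strand. A minimal sketch using the standard genetic code is shown below. The function names and the example sequence are hypothetical, and real tools handle ambiguity codes and alternative genetic codes that this sketch ignores.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unrecognised codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

A BLASTX-style search would compare each of these six translations against protein sequences, while TBLASTN and TBLASTX apply the same translation to the database side.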

What is BLAST used for?

- To detect similarity between sequences of interest.
- To determine whether there are other plausible alignments between query and target sequences.

What is the BLAST E-value

It indicates how likely it is that a given sequence match arose purely by chance: it is the number of matches with an equal or better score expected by chance in a database of that size. The lower the E-value, the less likely the database match is a result of random chance.

BLAST determines the significance of HSPs using the Karlin-Altschul equation

$E = k m N e^{-\lambda S}$

E stands for

the expectation value

What are k and lambda (λ)?

Karlin-Altschul constants

m stands for

the number of letters (amino acids/nucleotides) in the query

N stands for

the total number of letters (aa/nuc) in the database
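
With the symbols defined, the equation can be evaluated directly. In the sketch below, S is taken to be the alignment score of the HSP, which the cards do not define explicitly. The values of k and lambda are placeholders picked for illustration: in practice they depend on the scoring scheme and are supplied by the search program.

```python
import math


def expect_value(score, query_length, database_length, k, lam):
    """Karlin-Altschul expectation value: E = k * m * N * exp(-lambda * S)."""
    return k * query_length * database_length * math.exp(-lam * score)


# Placeholder constants (assumptions for this example, not taken from the cards).
K_ASSUMED = 0.134
LAMBDA_ASSUMED = 0.3176

m = 250        # letters in the query
N = 1_000_000  # total letters in the database

e_weak = expect_value(40, m, N, K_ASSUMED, LAMBDA_ASSUMED)
e_strong = expect_value(80, m, N, K_ASSUMED, LAMBDA_ASSUMED)
e_bigger_db = expect_value(40, m, 2 * N, K_ASSUMED, LAMBDA_ASSUMED)

print(f"score 40: E = {e_weak:.3g}")
print(f"score 80: E = {e_strong:.3g}")
print(f"score 40, database twice as large: E = {e_bigger_db:.3g}")
```

E falls exponentially as the score rises and grows in proportion to the database size, so the same alignment is less significant in a larger database.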

If E < 1e-50 (or 1 × 10^-50),

there should be an extremely high confidence that the database match is a result of homologous relationships.

If E is between 1e-50 and 0.01,

the match can be considered a result of homology

If E is between 0.01 and 10,

the match is considered not significant, but may hint at a tentative remote homology relationship.

If E > 10,

the sequences under consideration are either unrelated or related by extremely distant relationships that fall below the limit of detection with the current method.
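
The four E-value ranges can be collected into one lookup, sketched below. The cards do not say which range the exact boundary values belong to, so the handling of 1e-50, 0.01, and 10 is an assumption, as is the function name.

```python
def interpret_e_value(e_value):
    """Map an E-value to the interpretation given for its range."""
    if e_value < 1e-50:
        return "extremely high confidence of homology"
    if e_value < 0.01:
        return "match can be considered a result of homology"
    if e_value <= 10:
        return "not significant, possible remote homology"
    return "unrelated, or too distant to detect with this method"


for e in (1e-80, 1e-5, 0.5, 50):
    print(e, "->", interpret_e_value(e))
```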

L3 Flashcards

(38 cards)