Sequence Similarity Searching Flashcards

Question 1

Q

Sequence similarity

Answer

A

similar physicochemical properties - common ancestry - common function

Question 2

Q

Homology (and similarity)

Answer

A

homologous sequences share common ancestry (typically inferred from high similarity, e.g. >80%)

Question 3

Q

Homologous (2 types and meaning)

Answer

A

Orthologs - arise from a speciation event (similar functions)
Paralogs - arise from a duplication event (different functions)

Question 4

Q

Sequence alignment and algorithms

Answer

A

maximises similarity between sequences to infer the most likely evolutionary path

Question 5

Q

Dynamic Programming Algorithms (allow what, 2 types, negative)

Answer

A

allow exhaustive identification of optimal alignments
negative: too slow for large databases
global - aligns whole sequences (of similar length)
local - aligns local regions (biological relevance, finds conserved patterns)
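
A minimal sketch of the two dynamic programming types, assuming simple illustrative scores (match +1, mismatch -1, linear gap -2) that are not from the cards. The global version follows the Needleman-Wunsch scheme and the local version the Smith-Waterman scheme; both fill every cell of the matrix, which is why they are exhaustive but slow.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best score for aligning the whole of a against the whole of b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalised in a global alignment.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best score over all pairs of local regions (cells never drop below 0)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best
```

Both functions take time and memory proportional to len(a) × len(b), which illustrates why this approach does not scale to whole-database searches.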

Question 6

Q

Scoring alignment

Answer

A

quantification of similarity (what's real vs. chance) - scoring matrix

Question 7

Q

What do gaps and mismatches represent?

Answer

A

gaps - indel events relative to the ancestor
mismatches - substitutions (mutations during replication)

Question 8

Q

3 types of gap penalty (formulae)

Answer

A

constant: -a
proportional to length: -(a × l)
affine: -(a + b × l), with a » b; a = gap-opening penalty, b = gap-extension penalty (proportional to gap length l) - most relevant
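
The three formulas above can be sketched as small functions. The default values of a and b are illustrative assumptions only, chosen so that a is much larger than b in the affine case.

```python
def constant_gap_penalty(length, a=5):
    """Same penalty whatever the gap length: -a."""
    return -a


def proportional_gap_penalty(length, a=2):
    """Penalty grows linearly with gap length l: -(a * l)."""
    return -(a * length)


def affine_gap_penalty(length, a=10, b=1):
    """Opening cost a plus extension cost b per position: -(a + b * l)."""
    return -(a + b * length)
```

With these assumed values, one gap of length 4 costs -14 under the affine scheme, while four separate gaps of length 1 cost 4 × -11 = -44, so a single long indel is preferred over many scattered ones.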

Question 9

Q

Formula for the percentage of identically aligned residues

Answer

A

(number of matches / alignment length) × 100
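
A short sketch of this formula, assuming the two sequences are already aligned (equal length, with "-" marking gaps) and that "length" means the full alignment length including gap columns. Other conventions for the denominator exist.

```python
def percent_identity(aligned_a, aligned_b, gap_char="-"):
    """Percentage of alignment columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap_char
    )
    return matches / len(aligned_a) * 100
```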

Question 10

Q

Why are amino acid (aa) substitutions in protein alignments not all equal?

Answer

A

protein sequences are under stabilising selection for structure and function
substitutions depend on chemical similarity - similar aa substitute more easily
e.g. LEU>ILE (conservative) is more likely than PRO>TRP (non-conservative)

Question 11

Q

BLOSUM62 substitution matrix

Answer

A

derived from gap-free alignments of short protein motifs (BLOCKS)
higher score - chemically similar, conservative substitution (higher probability of homology)
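
As an illustration, the sketch below hard-codes a handful of BLOSUM62 entries rather than loading the full 20 × 20 matrix. The dictionary and function names are hypothetical, and any pair outside the excerpt raises a KeyError.

```python
# Small excerpt of BLOSUM62 scores (the matrix is symmetric).
BLOSUM62_EXCERPT = {
    ("L", "L"): 4,
    ("I", "I"): 4,
    ("L", "I"): 2,    # Leu <-> Ile: chemically similar, positive score
    ("P", "P"): 7,
    ("W", "W"): 11,
    ("P", "W"): -4,   # Pro <-> Trp: dissimilar, negative score
}


def substitution_score(x, y):
    """Look up a pair in either order; pairs outside the excerpt raise KeyError."""
    key = (x, y) if (x, y) in BLOSUM62_EXCERPT else (y, x)
    return BLOSUM62_EXCERPT[key]
```

Here a Leu/Ile substitution scores 2 while a Pro/Trp substitution scores -4, reflecting how much more often the former is seen in conserved blocks.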

Question 12

Q

Heuristic Algorithms (vs DPAs) - example

Answer

A

look for short, high-scoring regions of exact match (break the query into short words, look for matching words, then see if each match can be extended)
faster than DPAs
example: BLAST
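
A simplified sketch of the word-and-extend idea, assuming exact word matches only and extension while residues stay identical. Real BLAST is more elaborate (it scores neighbouring words and stops extension on a score drop-off), so this is an illustration and not BLAST itself. The function names and the word size of 3 are assumptions.

```python
def find_word_hits(query, subject, word_size=3):
    """Return (query_pos, subject_pos) for every exact shared word."""
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    hits = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            hits.append((i, j))
    return hits


def extend_hit(query, subject, i, j, word_size=3):
    """Extend a word hit in both directions, without gaps, while residues match."""
    start_q, start_s = i, j
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = i + word_size, j + word_size
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    return query[start_q:end_q]
```

Only positions that share a word are ever examined, which is where the speed advantage over filling a full dynamic programming matrix comes from.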

Question 13

Q

BLAST

Answer

A

all word matches scoring above a threshold are extended without introducing gaps
these give high-scoring segment pairs (HSPs)
blastn - nt vs nt (gene)
blastp - protein vs protein (protein)
blastx - translated nt vs protein (does a DNA sequence code for a protein?)
tblastn - protein vs translated nt (which DNA sequence encodes the protein?)

Question 14

Q

p-values
p threshold for a significant match

Answer

A

probability of observing an alignment scoring at least as high between 2 unrelated sequences of similar length and composition
significant match - p < 0.05

Question 15

Q

Expect values (E)

Answer

A

how often a match at a given p-value would be expected to occur in the database by chance (biologically unrelated) - threshold for significance
E = pX (should be ≤ 0.01)
X = total length of all sequences in database / length of aligned sequence
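
A worked sketch of E = pX as stated on this card. The function name and the example numbers are assumptions for illustration. BLAST itself computes E-values from alignment scores using its own statistics, so this only mirrors the simplified formula above.

```python
def expect_value(p, database_length, aligned_length):
    """E = p * X, where X = total database length / aligned sequence length."""
    x = database_length / aligned_length
    return p * x


# Hypothetical example: p = 1e-6, database of 1,000,000 residues,
# aligned region of 100 residues, so X = 10,000 and E is about 0.01.
example_e = expect_value(1e-6, 1_000_000, 100)
```

The same p-value becomes less convincing as the database grows, because X scales with the total amount of sequence searched.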

Question 16

Q

DNA searches vs protein searches (when to use each)

Answer

A

DNA - evolutionarily close sequences
protein - evolutionarily distant sequences
