Sequence Similarity Searching Flashcards

1
Q

Sequence similarity

A

similar physicochemical properties - common ancestry - common function

2
Q

Homology (and similarity)

A

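share common ancestry (inferred from high similarity, e.g. >80% similar); homology is either present or absent, whereas similarity is a matter of degree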

3
Q

Homologous (2 types and meaning)

A

Orthologs - arise from a speciation event (similar functions)
Paralogs - arise from a duplication event (different functions)

4
Q

Sequence alignment and algorithms

A

enable maximisation of similarity between sequences, reflecting the most likely evolutionary path

5
Q

Dynamic Programming Algorithms (allow what, 2 types, negative)

A

allow exhaustive identification of optimal alignments
negative: too slow for large databases
global - aligns whole sequences (sequences of similar length)
local - aligns local regions (biologically relevant, finds conserved patterns)
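The global/local distinction can be sketched with a small dynamic programming routine. This is a minimal illustration, not a production aligner: the function name `align_score`, the match/mismatch/gap values, and the simple per-position gap penalty are all assumptions chosen for the example.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the optimal alignment score of a and b by dynamic programming.

    local=False: global alignment (whole sequences, end to end).
    local=True:  local alignment (best-scoring pair of subregions).
    Scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment pays for gaps at the sequence ends.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local alignment may restart anywhere, so scores never go below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]
```

Every cell of the table is filled, which is why the result is optimal and also why the approach scales poorly: the work grows with the product of the two sequence lengths.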

6
Q

Scoring alignment

A

quantification of similarity (separating real similarity from chance) - scoring matrix

7
Q

What do gaps and mismatches represent?

A

gaps - indel (insertion/deletion) events relative to the ancestor; mismatches - substitutions (mutations during replication)

8
Q

3 types of gap penalty (formulas)

A

constant: -a
proportional to length: -(a x l)
affine gap: -(a + bl), with a >> b; a = gap-opening penalty, b = extension penalty applied per unit of gap length l - more biologically relevant
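The three formulas can be compared side by side in a short sketch. The function names and the default values of a and b are assumptions chosen only to make the contrast visible; l is the gap length.

```python
def constant_gap(length, a=5):
    """Constant penalty: -a for any gap, regardless of its length."""
    return -a if length > 0 else 0


def proportional_gap(length, a=2):
    """Penalty proportional to gap length: -(a x l)."""
    return -(a * length)


def affine_gap(length, a=10, b=1):
    """Affine penalty: -(a + b*l), with opening cost a much larger than extension cost b."""
    return -(a + b * length) if length > 0 else 0


if __name__ == "__main__":
    for l in (1, 3, 10):
        print(l, constant_gap(l), proportional_gap(l), affine_gap(l))
```

With a >> b, one long gap costs less than several short ones of the same total length, which matches the idea that a single indel event can insert or delete many residues at once.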

9
Q

Formula for the percentage of identically aligned residues

A

(number of matches / length) x 100
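A minimal sketch of this calculation on two already-aligned sequences. The function name `percent_identity` is hypothetical, and taking "length" to mean the full alignment length (gap columns included) is an assumption; other conventions divide by the shorter sequence or by ungapped columns only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns where both sequences carry the same residue.

    Assumes both inputs are rows of one alignment (equal length, gaps as '-').
    The denominator is the full alignment length, which is one of several conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return matches / len(aligned_a) * 100
```

For example, `percent_identity("ACGT-A", "ACCTGA")` counts 4 matching columns out of 6.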

10
Q

Why are amino acid (aa) substitutions in protein alignments not all equal?

A

protein sequences are under stabilising selection for structure and function
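substitution likelihood depends on chemical similarity - similar aa substitute more easily
e.g. Leu > Ile (chemically similar, substitutes easily) vs Pro > Trp (chemically dissimilar, substitutes rarely)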

11
Q

BLOSUM62 substitution matrix

A

built from gap-free alignments of short protein motifs (BLOCKS)
higher score - chemically similar, conservative substitution (higher probability of homology)

12
Q

Heuristic Algorithms (vs DPAs) - example

A

search for short, high-scoring exact matches (break the query into short words, look for matches, then see if they can be extended)
faster than DPAs, but not guaranteed to find the optimal alignment
example: BLAST
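The word-matching idea can be sketched as a seed-and-extend routine. This is a simplified illustration only: the names `find_seeds` and `extend_seed` and the word size of 3 are assumptions, and extension here stops at the first mismatch, whereas BLAST itself scores words and extensions with a scoring matrix and score thresholds.

```python
def find_seeds(query, subject, word_size=3):
    """Return (query_pos, subject_pos) pairs where a query word matches the subject exactly."""
    index = {}
    for i in range(len(query) - word_size + 1):
        index.setdefault(query[i:i + word_size], []).append(i)

    seeds = []
    for j in range(len(subject) - word_size + 1):
        for i in index.get(subject[j:j + word_size], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, i, j, word_size=3):
    """Extend a seed left and right, without gaps, while residues keep matching.

    Returns the matched stretch of the query.
    """
    start_q, start_s = i, j
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1

    end_q, end_s = i + word_size, j + word_size
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1

    return query[start_q:end_q]
```

Only positions that share a word are ever examined, which is where the speed comes from; regions of similarity that contain no exact word match are missed.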

13
Q

BLAST

A

all word matches above the threshold are extended until gaps would have to be introduced
the extended matches are high-scoring segment pairs (HSPs)
blastn - nucleotide vs nucleotide (gene)
blastp - protein vs protein (protein)
blastx - translated nucleotide vs protein (what protein does a DNA sequence encode)
tblastn - protein vs translated nucleotide (what DNA sequence encodes a protein)

14
Q

p values
significant match p

A

probability of observing as high scoring an alignment between 2 unrelated sequences of similar length and composition
significant match - p < 0.05

15
Q

Expect values (E)

A

how often a match at a given p value would be expected to occur in the database by chance (biologically unrelated) - threshold for significance
E = pX (should be <= 0.01)
X = total length of all sequences in database / length of aligned sequence
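A worked instance of E = pX, following the simplified formula on this card rather than the statistics BLAST itself reports. The function name `expect_value` and the example numbers are assumptions made for illustration.

```python
def expect_value(p, database_length, aligned_length):
    """E = p * X, where X = total database length / length of the aligned sequence."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    x = database_length / aligned_length
    return p * x


if __name__ == "__main__":
    # Hypothetical numbers: p = 1e-6, database of 1,000,000 residues, 100-residue alignment.
    e = expect_value(1e-6, 1_000_000, 100)
    print(f"E = {e:.4g}")  # X = 10,000, so E is about 0.01
```

The same p value becomes less convincing as the database grows, because X (the number of chances to hit a match by accident) grows with it.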

16
Q

DNA searches
Protein searches

A

DNA - evolutionarily close sequences
protein - evolutionarily distant sequences