4. Databases and comparisons Flashcards

1
Q

Databases with protein data

A

UniProt
InterPro
PRIDE
PDB

2
Q

Databases for genomes and genes

A

Ensembl

3
Q

Databases of sequence information

A

UniProtKB/Swiss-Prot
UniProtKB/TrEMBL
EMBL (nucleotide sequences)
GenBank

4
Q

Swiss-Prot

A
  • high level of annotation, e.g. function, domains, post-translational modifications (PTMs)
  • minimal level of redundancy
  • high quality of annotation (entries are manually reviewed)
5
Q

EMBL (EMBL-EBI)

A
  • links to tools and resources, e.g. EMBOSS Needle and InterPro
  • sequences submitted directly by scientists
  • sequences also taken from the literature and patents
  • little error checking
6
Q

Ensembl

A
  • useful when analysing genomes
  • e.g. start from a gene and find its mammalian homologues, or all of its known SNPs

7
Q

Why compare sequences?

A
  • identification of proteins
  • search for homologues
  • evolution
8
Q

Methods for sequence comparisons

A
  • Diagonal plot (dot plot)
  • BLAST
  • FASTA
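
As a rough illustration of the diagonal plot idea, the sketch below marks every position where two sequences carry the same residue; similar regions then show up as diagonal runs. The function names `dot_plot` and `render` and the two example sequences are hypothetical, chosen only for this example.

```python
def dot_plot(seq_a, seq_b):
    """Return a grid where grid[i][j] is True when seq_a[i] == seq_b[j]."""
    return [[a == b for b in seq_b] for a in seq_a]


def render(grid):
    """Draw the grid as text: '*' for a match, '.' for a mismatch."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in grid
    )


# Two made-up sequences differing at one position.
grid = dot_plot("GATTACA", "GATCACA")
print(render(grid))
```

Real dot plot tools usually compare windows of several residues rather than single positions, which suppresses the scattered off-diagonal matches this simple version produces.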
9
Q

FASTA

A
  • uses scoring matrices
  • uses k-tuples
    • looks at more than one residue at a time
  • hashing
    • builds a dictionary/table of k-tuples
  • finds clusters of k-tuples
    • then applies dynamic programming around them
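
The hashing and clustering steps can be sketched as follows. This is a minimal illustration of the k-tuple idea, not the FASTA program itself: the helper names `index_ktuples` and `diagonal_hits` and the example sequences are assumptions, and the scoring-matrix and dynamic-programming stages are left out.

```python
from collections import defaultdict


def index_ktuples(seq, k):
    """Hash table mapping each k-tuple to its start positions in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def diagonal_hits(query, target, k=2):
    """Count identical k-tuple hits per diagonal.

    A diagonal is the offset (target position - query position); many hits
    on one diagonal suggest a region worth aligning in detail.
    """
    table = index_ktuples(target, k)
    diagonals = defaultdict(int)
    for q in range(len(query) - k + 1):
        for t in table.get(query[q:q + k], []):
            diagonals[t - q] += 1
    return dict(diagonals)


# Made-up sequences: the query occurs in the target at offset 2.
hits = diagonal_hits("ACGTAC", "TTACGTACGG", k=2)
best = max(hits, key=hits.get)
print(best, hits[best])
```

The diagonal with the most hits marks the cluster of k-tuples where a dynamic-programming alignment would then be run.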
10
Q

BLAST

A
  • faster than FASTA, and nowadays also more sensitive
  • uses “words” instead of k-tuples
    • instead of requiring identical hits, it applies a score threshold
  • generally it looks at longer sequences
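
To make the score-threshold idea concrete, the sketch below lists every word that scores at least a threshold against a query word, so a non-identical but similar word still counts as a hit. Everything here is a toy assumption: the four-letter `ALPHABET`, the scores in `score` (with D and E treated as similar), and the threshold are invented for illustration, whereas a real search would use a full substitution matrix.

```python
from itertools import product

ALPHABET = "ACDE"  # toy alphabet, not the full set of 20 amino acids


def score(a, b):
    """Toy substitution score: identical = 5, D/E treated as similar = 2, else -1."""
    if a == b:
        return 5
    if {a, b} == {"D", "E"}:
        return 2
    return -1


def neighbourhood(word, threshold):
    """All words of the same length scoring >= threshold against word."""
    hits = []
    for cand in product(ALPHABET, repeat=len(word)):
        total = sum(score(x, y) for x, y in zip(word, cand))
        if total >= threshold:
            hits.append("".join(cand))
    return hits


words = neighbourhood("ACD", threshold=11)
print(words)
```

With these toy scores, "ACE" passes the threshold alongside the identical word "ACD", which an exact k-tuple lookup would miss.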
11
Q

E-value

A

A parameter that describes the number of hits one can “expect” to see by chance when searching a database of a particular size. Essentially, the E-value describes the random background noise. The lower the E-value, or the closer it is to zero, the more “significant” the match is.
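
The dependence on score and database size can be sketched with the Karlin-Altschul relation E = K·m·n·e^(−λS), where m is the query length, n the database length and S the raw alignment score. The values used below for λ (`lam`) and K (`k`) are placeholders picked for illustration, since in practice they depend on the scoring system; the function names are likewise assumptions.

```python
import math


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score, lam, k):
    """Normalised score S' = (lambda * S - ln K) / ln 2, so E = m * n * 2**(-S')."""
    return (lam * raw_score - math.log(k)) / math.log(2)


LAM, K = 0.318, 0.134  # illustrative placeholder parameters (assumption)

small_db = e_value(60, query_len=250, db_len=1_000_000, lam=LAM, k=K)
large_db = e_value(60, query_len=250, db_len=2_000_000, lam=LAM, k=K)
better_hit = e_value(90, query_len=250, db_len=1_000_000, lam=LAM, k=K)
print(small_db, large_db, better_hit)
```

Under this relation, doubling the database size doubles the E-value for the same score, while a higher score drives the E-value towards zero.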
