Databases Flashcards

1
Q

UniProt

A

protein

2
Q

EMBL

A

nucleotide (gene) sequences

3
Q

ENSEMBL

A

genome

4
Q

NCBI

A

bacteria (bacterial genomes); NCBI also hosts GenBank and other sequence databases

5
Q

methods for sequence comparison

A
  1. Diagonal plots
  2. FASTA
  3. BLAST
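
A diagonal (dot) plot can be sketched in a few lines. This is a minimal illustration, not how any particular tool implements it; the function names `dot_plot` and `render` and the `window` parameter are assumptions made for this example.

```python
def dot_plot(seq_a, seq_b, window=1):
    """Return a matrix (list of rows) that is True wherever a window of
    seq_a exactly matches a window of seq_b."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [seq_a[i:i + window] == seq_b[j:j + window] for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' for a mismatch."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


if __name__ == "__main__":
    # Similar regions show up as diagonal runs of '*'.
    print(render(dot_plot("GATTACA", "GATTTACA", window=2)))
```

A larger `window` suppresses isolated chance matches, so longer diagonals stand out more clearly.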
6
Q

FASTA

A

heuristic algorithm that speeds up alignment with hash tables: the sequences are broken into K-tuples (short words of length K), and matching K-tuples (K-tuple hits) are used to find matching sequence patterns
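
The K-tuple lookup idea can be illustrated as follows. This is a simplified sketch of the hashing step only, not the FASTA program itself; the names `index_ktuples`, `ktuple_hits` and `best_diagonal` are hypothetical.

```python
from collections import defaultdict


def index_ktuples(seq, k):
    """Hash table: each k-tuple -> list of positions where it starts in seq."""
    table = defaultdict(list)
    for pos in range(len(seq) - k + 1):
        table[seq[pos:pos + k]].append(pos)
    return table


def ktuple_hits(query, target, k=2):
    """Return (query_pos, target_pos) for every k-tuple shared by both."""
    table = index_ktuples(target, k)
    hits = []
    for q in range(len(query) - k + 1):
        for t in table.get(query[q:q + k], []):
            hits.append((q, t))
    return hits


def best_diagonal(hits):
    """Return (diagonal offset, hit count) for the diagonal with most hits."""
    counts = defaultdict(int)
    for q, t in hits:
        counts[t - q] += 1
    return max(counts.items(), key=lambda kv: kv[1]) if counts else None
```

Hits that share the same offset (`target_pos - query_pos`) lie on one diagonal of the dot plot, so a diagonal with many hits marks a candidate region for a more careful alignment.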

7
Q

BLAST

A

an algorithm for comparing primary biological sequence information, optimized for speed.

8
Q

blastn

A

compares your query nucleotide sequence with a database of nucleotide sequences

9
Q

blastp

A

compares your query protein sequence (e.g. one derived from a cDNA of interest) with a database of protein sequences

10
Q

blastx

A

first translates your nucleotide query sequence into amino acids in all 6 reading frames, then compares the resulting protein sequences with protein databases
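
Six-frame translation can be sketched as below, assuming the standard genetic code and an unambiguous DNA alphabet (A, C, G, T). The function names and the `+1`..`-3` frame labels are choices made for this example, not BLAST output conventions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return translations for 3 forward (+) and 3 reverse (-) frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frames("ATGGCCTAA")["+1"]` gives `"MA*"`: ATG (M), GCC (A), TAA (stop).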

11
Q

tblastn

A

compares your query protein sequence with a nucleotide database after translating each nucleotide sequence into protein in all 6 reading frames

12
Q

tblastx

A

translates both the query nucleotide sequence & the database sequences in all 6 reading frames & then compares the protein sequences. Looks for protein-coding regions. Good choice: less noise

13
Q

PROSITE

A

protein database. Its uses include identifying possible functions of newly discovered proteins and analysing known proteins for previously undetermined activity

14
Q

what is PSI-BLAST

A

(Position-Specific Iterated BLAST): an iterative search using the protein BLAST algorithm.

15
Q

how is PSI-BLAST used

A
  1. a list of all closely related proteins is created
  2. these proteins are combined into a general “profile” sequence, which summarizes the significant features present in their sequences
  3. a query against the protein database is then run using this profile; a larger group of proteins is found
  4. this larger group is used to construct another profile, and the process is repeated
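
The loop in the steps above can be sketched with a toy profile. This is a strongly simplified illustration, not the PSI-BLAST algorithm: it assumes ungapped sequences of equal length, uses plain column frequencies instead of a log-odds scoring matrix, and replaces E-values with an arbitrary `threshold`. All function names are hypothetical.

```python
from collections import Counter


def build_profile(seqs):
    """Column-wise residue frequencies for equal-length, ungapped sequences."""
    return [
        {res: n / len(seqs) for res, n in Counter(col).items()}
        for col in zip(*seqs)
    ]


def profile_score(profile, seq):
    """Mean frequency of seq's residues in the matching columns (0 to 1)."""
    return sum(col.get(res, 0.0) for col, res in zip(profile, seq)) / len(profile)


def iterative_search(query, database, threshold=0.65, max_rounds=5):
    """Search, rebuild the profile from the hits, and repeat until stable."""
    members = [query]
    for _ in range(max_rounds):
        profile = build_profile(members)
        found = [s for s in database if profile_score(profile, s) >= threshold]
        new_members = [query] + [s for s in found if s != query]
        if set(new_members) == set(members):
            break  # converged: no change in the set of hits
        members = new_members
    return members
```

With the query `"ACDEF"` and the database `["ACDEY", "ACDWY", "ACVWY", "KLMNP"]`, the first round finds only `"ACDEY"`; the profile built from both sequences then also picks up `"ACDWY"`, which the query alone would have missed.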
16
Q

HMMER

A

software for working with sequence HMMs (hidden Markov models = generalized protein sequence profiles).

17
Q

Pfam

A

protein family database. Defines protein domains & protein families using HMMs

18
Q

MAFFT & Clustal Omega

A

MSA (multiple sequence alignment) programs for amino acid or nucleotide sequences

19
Q

programs for phylogenetic tree constructions

A
  1. ClustalW: distance method
  2. PHYLIP/protpars: parsimony
  3. Tree, PaPa: progressive alignment, followed by parsimony
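
Distance methods start from a matrix of pairwise distances between aligned sequences. The sketch below computes the simple p-distance (fraction of differing sites); it is an illustration of the input to a distance method, not the procedure any of the listed programs actually uses, and the function names are assumptions.

```python
def p_distance(a, b):
    """Fraction of aligned, gap-free positions at which a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free positions to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned):
    """Pairwise p-distances for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {(m, n): p_distance(aligned[m], aligned[n]) for m in names for n in names}
```

A tree-building step (for example neighbor joining) would then cluster the sequences using only these distances.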
20
Q

PDB (protein data bank)

A

a repository for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids

21
Q

PDB contents

A

x-ray structures
NMR structures
EM models
Models from predictions/modelling

22
Q

3D structure (protein) databases: hierarchical fold classification

A

SCOP (annotated)
CATH (automated)
= both have class, fold, superfamily, family

23
Q

PROSITE & Pfam

A

sequence based classification of protein domains, families

24
Q

ENZYME

A

enzyme nomenclature

25
Q

BRENDA

A

enzyme information: nomenclature, isolation & purification, stability, etc.

26
Q

KEGG

A

resource for understanding high-level functions & utilities of biological systems

27
Q

Programs predicting secondary structure (proteins)

A

PSIPRED

28
Q

Predicting simple sequence features (proteins)

A
SignalP (signal peptide)
TargetP (cellular localisation)
29
Q

Ab initio homology modelling

A

Rosetta: generates the model by assembling fragments

30
Q

software/tools for systems biology

A
  1. obtaining data sets from databases (TCGA, cBioPortal)
  2. first analysis of data sets based on gene expression: MultiExp
  3. Networks: STRING, IMP
  4. Pathway analysis: PANTHER, DAVID, KEGG
  5. MetaCore
31
Q

HADDOCK

A

high ambiguity driven protein-protein docking

-> use of biochemical and/or biophysical interaction data

32
Q

Swiss-Prot

A

manually reviewed. High-quality, manually annotated & non-redundant protein sequence database

33
Q

TrEMBL

A

unreviewed. Contains protein sequences with computationally generated annotation & large-scale functional characterization