Lecture 5 Flashcards

1
Q

The SCOP database

A

Structural Classification of Proteins Database developed in Cambridge

  • Sorts proteins hierarchically: structural class -> fold -> superfamily -> family.

For example: Small proteins -> Cysteine-knot cytokines -> Cysteine-knot cytokines -> Transforming growth factor beta (more of a functional than a structural classification)

α-helical

β-sheet

α+β - α-helices and β-sheets in different parts of the protein, no β-α-β motifs

α/β - Helices and sheets assembled from β-α-β motifs

α/β-linear - Line through centres of strands of sheet roughly linear

α/β-barrels - Line through centres of strands of sheet roughly circular

Proteins with little/no secondary structure e.g. soft proteins

2
Q

What does SCOP struggle with?

A

Domains

Many proteins are multi-domain and SCOP assumes they are single-domain

For example, clotting factors such as factor XII:
Fn2-EGF-Fn1-EGF-Kr-SerPr

Factor IX:
Gla-EGF-EGF-SerPr

Domain shuffling makes evolutionary origins difficult to classify.

3
Q

How do we define domains?

A

The Gō plot method

Defines domains from the distances between amino acids: residues within a domain cluster close together, whereas residues in different domains lie further apart than the radius of the protein.

Constructing a Gō plot:
1. Calculate the radius of a sphere with the same volume as the protein.
2. Calculate the distance from the α-carbon of each amino acid to all the others.
3. If the distance is greater than the spherical radius, score +.

Lines are drawn to form triangles on the plot, which identify the protein domains.

Real proteins never give completely clean triangles.
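
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the radius is passed in as a number rather than derived from the protein's volume, the coordinates are a made-up toy "protein" of two compact clusters, and the names go_plot and render are hypothetical.

```python
import math

def go_plot(ca_coords, radius):
    """Score each pair of alpha-carbons: True ('+') if further apart than radius."""
    n = len(ca_coords)
    return [
        [math.dist(ca_coords[i], ca_coords[j]) > radius for j in range(n)]
        for i in range(n)
    ]

def render(plot):
    """Draw the plot as text, '+' for distant pairs and '.' for close pairs."""
    return "\n".join(
        "".join("+" if far else "." for far in row) for row in plot
    )

# Toy coordinates (assumed, in angstroms): two clusters of three residues each.
toy_coords = [
    (0, 0, 0), (1, 0, 0), (0, 1, 0),      # first cluster
    (20, 0, 0), (21, 0, 0), (20, 1, 0),   # second cluster
]

# Assumed radius; a real analysis would derive it from the protein's volume.
toy_plot = go_plot(toy_coords, radius=5.0)
print(render(toy_plot))
```

In this toy case the '.' blocks along the diagonal correspond to the two clusters, and the '+' blocks off the diagonal mark pairs of residues in different clusters. Real structures give much noisier patterns.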

4
Q

What are the disadvantages of Gō plots?

A
  • Requires solved structures
  • Domain boundaries not always clear
  • Gō method now superseded by sequence-based algorithms
5
Q

Explain modular evolution of proteins

A
  • Domain boundaries are usually at exon boundaries
  • Not all exon boundaries are domain boundaries
  • Genome rearrangements are important in evolution of new domain combinations
6
Q

How does Pfam build domains?

A
  • Start with a high-quality protein structure (X-ray crystallography at good resolution, i.e. a low value in ångströms).
  • BLAST it against the PDB to find related protein structures.
  • Align these - maximise structural homology (adjust alignment so secondary structural element boundaries match).
  • Build a statistical profile (Hidden Markov Model - HMM) of ‘seed’ alignment.
7
Q

Gaining a protein from PDB

A

Input sequence and protein sequence databank -> BLAST search

BLAST search -> Filter results (E<threshold) -> Multiple sequence alignment -> Position-specific scoring matrix
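
The last step of this pipeline, turning a multiple sequence alignment into a position-specific scoring matrix, can be sketched as below. This is a simplified illustration, not the scoring scheme of any particular BLAST program: it assumes a uniform background frequency and a flat pseudocount, and the toy alignment and the function names build_pssm and score are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log-odds score."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    matrix = []
    for pos in range(len(alignment[0])):
        # Count residues in this column, skipping gaps and unknown symbols.
        counts = Counter(seq[pos] for seq in alignment if seq[pos] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        matrix.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix

def score(matrix, seq):
    """Sum the per-position scores of an ungapped sequence of the same length."""
    return sum(column[aa] for column, aa in zip(matrix, seq))

# Toy alignment (assumed): the second column is fully conserved.
toy_alignment = ["ACD", "ACE", "ACD", "GCD"]
toy_pssm = build_pssm(toy_alignment)
print(round(score(toy_pssm, "ACD"), 2), round(score(toy_pssm, "WWW"), 2))
```

Residues that are common in a column score above zero and rare ones score below zero, so a sequence resembling the alignment scores higher than an unrelated one.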

8
Q

Hidden Markov modelling

A

Sequence

9
Q

Showing patterns using sequence logos

A

The size of each DNA base or amino acid letter illustrates how often it occurs at that position across a series of aligned sequences.
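
One common way to set the letter sizes is to scale each letter's frequency by the information content of its column, measured in bits. The sketch below assumes that convention, ignores small-sample corrections, and uses a made-up DNA alignment; the function name logo_heights is hypothetical.

```python
import math
from collections import Counter

def logo_heights(alignment, alphabet="ACGT"):
    """Return, per column, a dict of letter -> height in bits."""
    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    heights = []
    for pos in range(len(alignment[0])):
        counts = Counter(seq[pos] for seq in alignment)
        total = sum(counts[ch] for ch in alphabet)
        if total == 0:
            heights.append({})  # column holds only gaps or unknown symbols
            continue
        freqs = {ch: counts[ch] / total for ch in alphabet if counts[ch]}
        entropy = -sum(p * math.log2(p) for p in freqs.values())
        information = max_bits - entropy
        # Each letter's height is its share of the column's information.
        heights.append({ch: p * information for ch, p in freqs.items()})
    return heights

# Toy alignment (assumed): first and last columns are fully conserved.
toy_sites = ["TATA", "TATA", "TACA", "TGTA"]
for column in logo_heights(toy_sites):
    print({ch: round(h, 2) for ch, h in column.items()})
```

A fully conserved column gives a single tall letter, while a variable column gives several short letters.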

10
Q

What software shows DNA/amino acid patterns using sequence logos?

A

Pfam on a larger scale

11
Q

How does Pfam build domains? (continued)

A

Use the HMM to query GenPept – hmmsearch
Align the new hits to the HMM – hmmalign
Rebuild the HMM to include the new hits – hmmbuild
Repeat as desired, or until there are no new hits
“Structure, structure, structure” (Alex Bateman, founder of Pfam)
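
One round of this cycle can be written out as HMMER command lines. The sketch below only builds the commands and prints them; it does not run anything. It assumes HMMER 3 is installed, all file names are hypothetical, and the step that extracts the hit sequences from the search results into a FASTA file is left out.

```python
def pfam_round_commands(hmm, seqdb, hits_fasta, new_alignment, table="hits.tbl"):
    """Return the three HMMER commands for one search/align/rebuild round."""
    return [
        # 1. Query the sequence database with the current HMM.
        ["hmmsearch", "--tblout", table, hmm, seqdb],
        # 2. Align the hit sequences to the HMM. Assumption: hits_fasta has
        #    already been extracted from the search results by another tool.
        ["hmmalign", "-o", new_alignment, hmm, hits_fasta],
        # 3. Rebuild the HMM from the enlarged alignment.
        ["hmmbuild", hmm, new_alignment],
    ]

# Hypothetical file names, for illustration only.
commands = pfam_round_commands(
    hmm="family.hmm",
    seqdb="genpept.fasta",
    hits_fasta="hits.fasta",
    new_alignment="family_round.sto",
)
for command in commands:
    print(" ".join(command))
```

Each list could be passed to subprocess.run to execute the round, repeating until the search returns no new hits.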

12
Q

Disadvantages of Pfam?

A
  • Domains are defined by an HMM, and an HMM is only as good as the ‘seed’ alignment used to construct it.
  • The HMM building process is iterative, so errors can be magnified.
  • Curation is uneven because of the large number of domains in Pfam
  • Viruses under-represented
13
Q

HOMSTRAD

A

Homologous Structure Alignment Database

Used to collect good seed alignments

Used in construction of globin molecules

14
Q

Not enough structures

A
  • Structure determination is far harder than sequencing
  • Illumina and MinION sequencing have made sequencing ultra-high-throughput
  • No equivalent technological leap forward for structural biology

The Structural Genomics Consortium
