Structure Validation, Terminology, Protein Data Bank (PDB) Flashcards

1
Q

What are the quality criteria for structure validation?

A
  1. Resolution
  2. R-factor & R-free
  3. Geometry
  4. B-factors
  5. Other experimental data: does the model agree with biochemical and other data (mutagenesis, kinetics, spectroscopy, etc.)?
2
Q

Who checks the X-ray crystallographer?

A
  1. The reviewers and researchers
  2. The Protein Data Bank (PDB)
  3. Competing groups working on similar structures
3
Q

What is the R-factor?

A
  1. One of the key statistics for judging a structure’s quality
  2. Does the model reflect the actual experimental data?
  3. The residual: the fraction of the data that the model does not explain
  4. Can be as high as 30% for low-resolution structures
  5. Can be as low as 10% for exceptional sub-atomic-resolution structures
  6. Usually around 20-25%
  7. The differences arise mainly from crystal quality (sample purity and conformational flexibility of the molecule) and from the accuracy of the model (phasing quality and resolution)
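As a rough illustration of the "residual" idea above, the conventional definition is R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|, summed over reflections. The sketch below assumes the calculated amplitudes are already on the same scale as the observed ones; the function name and the amplitude values are hypothetical, not taken from any real data set.

```python
def r_factor(f_obs, f_calc):
    """Return R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|) as a fraction."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    # Total disagreement between observed and calculated amplitudes
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    # Normalise by the total observed amplitude
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical structure-factor amplitudes for four reflections
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]
print(f"R-factor = {r_factor(f_obs, f_calc):.1%}")  # prints "R-factor = 10.7%"
```

A perfect model would give R = 0; the 20-25% quoted above corresponds to a return value of 0.20-0.25.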
4
Q

What is R-free?

A
  1. Calculated the same way as the R-factor, but only on a fraction of the data that has never been used to refine the structure
  2. 5-10% of reflections removed randomly from the data set prior to refinement
  3. The reflections kept for refinement are called the work (or used) set
  4. The removed reflections are called the test (or free) set
  5. Unbiased measure of the success of structural refinement
  6. The refined model has never seen the omitted data so the comparisons report an unbiased evaluation of the accuracy of the model
  7. Indicator of incorrect modelling when R-free >> R-factor
  8. For good models usually no more than 5% higher than R-factor
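The bookkeeping behind R-free can be sketched as follows. This is only an illustration of setting aside a random test set and computing the same residual on each set; it does not perform refinement, and the function names, the 5% default and the synthetic amplitudes are assumptions made for the example.

```python
import random


def r_value(pairs):
    """R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|) over (f_obs, f_calc) pairs."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in pairs)
    denominator = sum(abs(o) for o, _ in pairs)
    return numerator / denominator


def split_work_test(reflections, test_fraction=0.05, seed=0):
    """Randomly set aside a test (free) set; the rest is the work set."""
    rng = random.Random(seed)
    shuffled = list(reflections)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Synthetic (f_obs, f_calc) pairs standing in for a real data set
rng = random.Random(1)
reflections = []
for _ in range(1000):
    f_obs = rng.uniform(10.0, 100.0)
    f_calc = f_obs * rng.uniform(0.8, 1.2)  # hypothetical model error
    reflections.append((f_obs, f_calc))

work, test = split_work_test(reflections, test_fraction=0.05)
print(f"work set: {len(work)} reflections, R-work = {r_value(work):.3f}")
print(f"test set: {len(test)} reflections, R-free = {r_value(test):.3f}")
```

In a real refinement only the work set would drive changes to the model, which is what makes the value computed on the test set unbiased.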
5
Q

Why does R-free give a more objective measure of the quality of the model?

A
  1. The model is not biased towards the test reflections, because they were never used in refinement
  2. Avoids model bias and overfitting of the data
6
Q

What geometry does the model need to agree with?

A
  1. Model must have reasonable bond lengths, bond angles and overall geometric agreement compared to other well-defined structures
  2. Deviations from ideal values should be <0.01 Å for bond lengths and <2° for bond angles
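One way to put a number on "deviation from ideal values" is a root-mean-square deviation over all bonds. The sketch below assumes that RMS interpretation, which the card does not state explicitly; the observed bond lengths are hypothetical and the ideal backbone values are approximate textbook figures used only for illustration.

```python
import math


def bond_length(atom_a, atom_b):
    """Distance in Å between two atoms given as (x, y, z) tuples."""
    return math.dist(atom_a, atom_b)


def rms_deviation(observed, ideal):
    """Root-mean-square deviation between observed and ideal values."""
    if len(observed) != len(ideal):
        raise ValueError("observed and ideal must have the same length")
    squared = [(o - i) ** 2 for o, i in zip(observed, ideal)]
    return math.sqrt(sum(squared) / len(squared))


# Approximate ideal backbone bond lengths in Å: N-CA, CA-C, C-N
ideal = [1.458, 1.525, 1.329]
# Hypothetical bond lengths measured from a model
observed = [1.460, 1.530, 1.330]

rmsd = rms_deviation(observed, ideal)
print(f"RMS bond-length deviation: {rmsd:.4f} Å")
print("within 0.01 Å threshold" if rmsd < 0.01 else "outside 0.01 Å threshold")
```

The same `rms_deviation` helper could be applied to bond angles in degrees and compared against the 2° threshold.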
7
Q

Describe a Ramachandran plot

A
  1. Shows whether or not the main-chain dihedral angles (φ and ψ) fall into sterically allowed conformational regions
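Each point on a Ramachandran plot is a pair of dihedral angles: φ is defined by the atoms C(i-1), N, CA, C and ψ by N, CA, C, N(i+1). A minimal sketch of the dihedral calculation for four points is given below, using the standard atan2 formulation; the helper names and the test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (x, y, z)."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal to the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal to the plane through p1, p2, p3
    b1_length = math.sqrt(_dot(b1, b1))
    y = b1_length * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen so the expected angle is easy to check
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))  # prints 90.0
```

Computing φ and ψ for every residue and plotting one against the other gives the Ramachandran plot; residues outside the allowed regions are candidates for closer inspection.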
8
Q

Why do you need an average of atom positions?

A
  1. If we could hold an atom rigidly fixed in one place, we could observe its distribution of electrons in an ideal situation
  2. Image would be dense towards the centre with the density falling off further from the nucleus
  3. In practice, the electrons usually have a wider distribution
  4. This is due to vibration of the atoms and/or differences between the many molecules in the crystal lattice
  5. The observed electron density is an average over these small motions
  6. The result is a slightly smeared image of the molecule
9
Q

What is the amount of smearing proportional to?

A
  1. The B-factor describes the degree to which the electron density is spread out for each atom
  2. The amount of ‘smearing’ is proportional to the magnitude of the B-factor
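To make the link between B-factor and smearing concrete, the standard isotropic relation B = 8π²⟨u²⟩ can be inverted to give a root-mean-square displacement for a given B-factor. This relation is not stated on the card and is added here as background; the function name is hypothetical.

```python
import math


def rms_displacement(b_factor):
    """RMS atomic displacement in Å from an isotropic B-factor in Å^2.

    Uses B = 8 * pi^2 * <u^2>, so sqrt(<u^2>) = sqrt(B / (8 * pi^2)).
    """
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8 * math.pi ** 2))


for b in (10, 20, 50):
    print(f"B = {b:>2} Å^2 -> RMS displacement ~ {rms_displacement(b):.2f} Å")
```

Because the displacement grows with the square root of B, a fivefold increase in B-factor corresponds to roughly a 2.2-fold increase in positional spread.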
10
Q

What is the B-factor an indicator of?

A
  1. An indicator of thermal vibration of atoms
  2. Indicates the true static or dynamic mobility of an atom, and also errors in model building
11
Q

How is the electron density of an atom broadened by disorder in the crystal?

A
  1. Local static disorder - Atom positions change from one unit cell to another
  2. Local dynamic disorder - Atom positions change over time during the measurement
12
Q

What are B-factors used for?

A
  1. B-factors are introduced to account for disorder in the atomic model
  2. Confidence measure for location of each atom
  3. Typically on a scale of 1-100 Å²
  4. If an atom on the surface of a protein has a high temperature factor, it is probably moving a lot and you are only observing one possible snapshot of its location
13
Q

What do different B-factor values tell you?

A
  1. Values <10 Å² give a very sharp model of the atom
  2. The atom is not moving much and is in the same position in all the molecules of the crystal
  3. Values >50 Å² indicate that the atom is moving so much that it can barely be seen
  4. Atoms in a model can be coloured by temperature factor to show this
14
Q

What colours represent different B-factors, and where in the protein are they found?

A
  1. High values (lots of motion) in red and yellow
  2. Low values in blue
  3. The protein interior (core) has low B-factors, while the surface residues have higher values
15
Q

What is the Protein Data Bank (PDB)?

A
  1. The PDB archive contains information about experimentally-determined structures of proteins, nucleic acids, and complex assemblies
  2. Structural biologists determine the location of each atom relative to each other in the molecule then deposit this information, which is then annotated and publicly released into the archive
16
Q

How many structures determined by X-ray crystallography (XRC) are in the Protein Data Bank?

A
  1. ~150,000 structures determined by XRC
17
Q

How do you identify a protein in the PDB?

A
  1. Each entry in the PDB is given a unique identification code, e.g. 1ATP, 1TOX, 3LCB
  2. PDB files
    a) Header, summary of the protein, citation information, details of structure solution, sequence
    b) List the atoms in each protein (and solvent, water, ligands), and their 3D location in space (coordinates)
    c) Typically contains coordinates of just one asymmetric unit which may or may not be the same as the biological assembly
  3. PDB offers tools for browsing, searching and analyzing structural data
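The atom list described above can be read with simple fixed-column slicing. The sketch below assumes the legacy fixed-column PDB text format for `ATOM` records; the column positions come from that format rather than from the card, and the function name and sample line are hypothetical.

```python
def parse_atom_line(line):
    """Parse one fixed-column ATOM/HETATM record into a dict."""
    return {
        "serial": int(line[6:11]),         # columns 7-11
        "name": line[12:16].strip(),       # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20
        "chain": line[21],                 # column 22
        "res_seq": int(line[22:26]),       # columns 23-26
        "x": float(line[30:38]),           # columns 31-38
        "y": float(line[38:46]),           # columns 39-46
        "z": float(line[46:54]),           # columns 47-54
        "occupancy": float(line[54:60]),   # columns 55-60
        "b_factor": float(line[60:66]),    # columns 61-66
    }


# Hypothetical record, written field by field so the column widths are explicit
SAMPLE = (
    "ATOM  " "    1" " " " N  " " " "MET" " " "A" "   1" " " "   "
    "  27.340" "  24.430" "   2.614" "  1.00" "  9.67" "          " " N"
)

atom = parse_atom_line(SAMPLE)
print(atom["name"], atom["res_name"], atom["chain"], atom["res_seq"])
print("coordinates:", (atom["x"], atom["y"], atom["z"]))
print("B-factor:", atom["b_factor"])
```

For real work an established parser is a safer choice than hand-written slicing, since actual files also contain alternate locations, insertion codes and other record types.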
18
Q

What are the limitations of X-ray structures?

A
  1. Need lots of highly pure protein (~5-10 mg), so may be limited to using recombinant proteins
  2. Sometimes it is challenging to find a condition where the protein crystallizes
  3. Proteins with floppy loops or moving domains can be problematic
  4. It might not be possible to crystallize these at all
  5. X-ray structures are static – no information about dynamics
  6. Hydrogen atoms scatter poorly and are only visible at very high resolution
19
Q

What are the advantages of X-ray structures?

A
  1. Protein crystals are typically half water so the protein’s environment is actually pretty physiological
  2. The structure also shows more than just protein (waters, metals, ions, ligands, etc.)
  3. At atomic resolution (<1.0 Å) bond lengths can be measured directly instead of assumed, and deviations from canonical geometry can be seen
  4. No lower limit on protein size as long as it is well folded
  5. No upper limit on molecule size: intact viruses and ribosomes have been solved (5 MDa)
  6. Many steps can be automated: high-resolution structures can be solved quickly after collecting data