Structure Validation, Terminology, Protein Data Bank (PDB) Flashcards
What are the quality criteria for structure validation
- Resolution
- R-factor & R-free
- Geometry
- B-factors
- Other experimental data: does the model agree with biochemical and other data (mutagenesis, kinetics, spectroscopy, etc.)?
Who checks the X-ray crystallographer?
- The reviewers and researchers
- The protein data bank (PDB)
- Competing groups working on similar structures
What is the R-factor
- One of the key statistics for judging a structure’s quality
- Does the model reflect the actual experimental data?
- The residual (or fraction) of the data that the model does not explain
- Low-resolution structures can have values as high as 30%
- Exceptional sub-atomic-resolution structures can have values as low as 10%
- The R-factor is usually around 20-25%
- The fundamental reason for the difference is the crystal quality (purity of the sample and conformational flexibility of the molecule) and accuracy of the model (phasing quality and resolution)
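As a rough illustration of the "residual" idea above, the R-factor is conventionally the sum of the absolute differences between observed and calculated structure-factor amplitudes, divided by the sum of the observed amplitudes. The sketch below assumes both lists are already on the same scale; the function name and the toy amplitudes are invented for illustration only.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), assuming a common scale."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Toy amplitudes (not real data): residuals 10 + 8 + 6 + 4 = 28 over a total of 280.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 88.0, 54.0, 44.0]
print(f"R-factor = {r_factor(f_obs, f_calc):.1%}")
```

A value of 0 would mean the model reproduces the data perfectly; larger values mean more of the data is left unexplained.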
What is R-free
- Calculated the same way as the R-factor but only looks at a fraction of the data that has never been used to refine the structure
- 5-10% of reflections removed randomly from the data set prior to refinement
- The remaining reflections, which are used for refinement, are called the work (or used) set
- The removed reflections are called the test (or free) set
- Unbiased measure of the success of structural refinement
- The refined model has never seen the omitted data so the comparisons report an unbiased evaluation of the accuracy of the model
- Indicator of incorrect modelling when R-free >> R-factor
- For good models usually no more than 5% higher than R-factor
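The bookkeeping behind R-free can be sketched as follows: flag a random 5% of reflections as the test set, then compute the same residual separately for the work and test sets. This is only a sketch with synthetic amplitudes and hypothetical function names; no refinement takes place here, so the two values illustrate the split rather than any real overfitting.

```python
import random


def split_reflections(n_reflections, test_fraction=0.05, seed=0):
    """Return one flag per reflection: True = test (free) set, False = work set."""
    rng = random.Random(seed)
    n_test = max(1, round(n_reflections * test_fraction))
    test_indices = set(rng.sample(range(n_reflections), n_test))
    return [i in test_indices for i in range(n_reflections)]


def r_value(f_obs, f_calc):
    """Same residual as the R-factor, applied to whichever subset is passed in."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def r_work_and_r_free(f_obs, f_calc, is_test):
    """Compute the residual separately over the work and the test reflections."""
    work = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if not t]
    test = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if t]
    return r_value(*zip(*work)), r_value(*zip(*test))


# Synthetic, positive amplitudes (assumption: stand-ins for measured data).
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 100.0) for _ in range(1000)]
f_calc = [o * rng.uniform(0.8, 1.2) for o in f_obs]

is_test = split_reflections(len(f_obs), test_fraction=0.05)
r_work, r_free = r_work_and_r_free(f_obs, f_calc, is_test)
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}, gap = {r_free - r_work:+.3f}")
```

In a real refinement the test flags are assigned once, before refinement starts, and kept fixed so the test reflections never influence the model.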
Why does R-free give a more objective measure of the quality of the model
- The model is not biased towards these reflections because they were never used in refinement
- Avoids model bias and overfitting of the data
What geometry does the model need to agree with
- Model must have reasonable bond lengths, bond angles and overall geometric agreement compared to other well-defined structures
- Bond lengths should deviate by <0.01 Å and bond angles by <2° from ideal values
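One way to picture the bond-length check is as a root-mean-square deviation between the bond lengths in the model and a table of ideal values. The sketch below uses approximate ideal backbone bond lengths and made-up model values, both of which are assumptions for illustration; real validation software uses much larger restraint libraries.

```python
import math

# Approximate ideal backbone bond lengths in Å (illustrative assumption).
IDEAL_BOND_LENGTHS = {"N-CA": 1.458, "CA-C": 1.525, "C-O": 1.231}


def rms_bond_deviation(observed_bonds, ideal=IDEAL_BOND_LENGTHS):
    """RMS deviation (Å) of (bond_type, length) pairs from their ideal lengths."""
    squared = [(length - ideal[bond_type]) ** 2 for bond_type, length in observed_bonds]
    return math.sqrt(sum(squared) / len(squared))


# Made-up bond lengths measured from a hypothetical model.
model_bonds = [("N-CA", 1.462), ("CA-C", 1.519), ("C-O", 1.236), ("N-CA", 1.451)]

rmsd = rms_bond_deviation(model_bonds)
print(f"RMS bond-length deviation = {rmsd:.4f} Å")
print("within the 0.01 Å guideline" if rmsd < 0.01 else "outside the 0.01 Å guideline")
```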
Describe a Ramachandran plot
- Shows whether or not the main-chain dihedral angles (phi and psi) fall into sterically allowed conformational regions
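The angles plotted on a Ramachandran plot are dihedral (torsion) angles defined by four consecutive backbone atoms: phi uses C(i-1), N, CA, C and psi uses N, CA, C, N(i+1). A minimal sketch of the dihedral calculation from four Cartesian points is given below; the helper names and test coordinates are hypothetical, and the sign follows the usual convention where a clockwise rotation viewed along the central bond is positive.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180..180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Four toy points: the last atom is rotated a quarter turn about the central bond.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying this to every residue gives one (phi, psi) pair per residue, which is what gets plotted.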
Why do you need an average of atom positions
- If we could hold an atom rigidly fixed in one place, we could observe its distribution of electrons in an ideal situation
- Image would be dense towards the centre with the density falling off further from the nucleus
- But electrons usually have a wider distribution
- Due to vibration of the atoms, and/or differences between the many different molecules in the crystal lattice
- The observed electron density will include an average of these small motions
- Slightly smeared image of the molecule
What is the amount of smearing proportional to
- Describes the degree to which the electron density is spread out for each atom
- The amount of ‘smearing’ is proportional to the magnitude of the B-factor
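To put a number on the smearing: for an isotropic B-factor the standard relation is B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement of the atom from its average position. The sketch below simply inverts that relation; the function name is hypothetical and the B values in the loop are arbitrary examples.

```python
import math


def rms_displacement(b_factor):
    """Isotropic B-factor (Å^2) -> root-mean-square displacement (Å), from B = 8*pi^2*<u^2>."""
    return math.sqrt(b_factor / (8 * math.pi ** 2))


for b in (10, 20, 50, 100):
    print(f"B = {b:>3} Å^2  ->  RMS displacement ≈ {rms_displacement(b):.2f} Å")
```

Because the displacement grows with the square root of B, quadrupling the B-factor doubles the spread.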
What is the B-factor an indicator of
- An indicator of thermal vibration of atoms
- Indicates the true static or dynamic mobility of an atom, and also errors in model building
How is the electron density of an atom broadened by disorder in the crystal
- Local static disorder - Atom positions change from one unit cell to another
- Local dynamic disorder - Atom positions change over time during the measurement
What are B-factors used for
- B-factors are introduced to account for disorder in the atomic model
- Confidence measure for location of each atom
- On a scale from 1-100 Å²
- If an atom on the surface of a protein has a high temperature factor, it is probably moving a lot and you are only observing one possible snapshot of its location
What do different B-factor values tell you
- Values <10 will create a model of the atom that is very sharp
- Atom is not moving much and is in the same position in all the molecules of the crystal
- Values >50 indicate that the atom is moving so much that it can barely be seen
- Atoms coloured by temperature factors
What colours represent different B-factors, and where are they found
- High values (lots of motion) in red and yellow
- Low values in blue
- The protein interior (core) has low B factors but the surface residues have higher values
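A colouring scheme like this can be approximated by mapping each B-factor onto a colour gradient. The sketch below uses a simple linear blue-to-red ramp clamped between 10 and 50 Å²; the limits, the two-colour ramp, and the function name are all assumptions for illustration, and molecular viewers typically offer richer spectra (for example passing through yellow).

```python
def b_factor_to_rgb(b_factor, b_min=10.0, b_max=50.0):
    """Map a B-factor to an (r, g, b) tuple: blue at or below b_min, red at or above b_max."""
    t = (b_factor - b_min) / (b_max - b_min)
    t = min(1.0, max(0.0, t))  # clamp to the 0..1 range
    return (t, 0.0, 1.0 - t)


# Hypothetical atoms: a buried core atom, an intermediate one, a mobile surface atom.
for label, b in (("core", 8.0), ("intermediate", 30.0), ("surface", 75.0)):
    print(f"{label:>12}: B = {b:5.1f} -> RGB {b_factor_to_rgb(b)}")
```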
What is the Protein Data Bank (PDB)
- The PDB archive contains information about experimentally-determined structures of proteins, nucleic acids, and complex assemblies
- Structural biologists determine the location of each atom relative to each other in the molecule then deposit this information, which is then annotated and publicly released into the archive
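In the legacy fixed-width PDB text format, each atom's position and B-factor sit in set columns of an ATOM record. The sketch below slices those columns out of one line; both function names are hypothetical, the atom values are made up, and the formatter exists only to build a correctly aligned demonstration line. Current depositions also use the mmCIF format, which this sketch does not handle.

```python
def parse_atom_record(line):
    """Slice one fixed-width ATOM line into a dict (comments give 1-based columns)."""
    return {
        "serial": int(line[6:11]),          # columns 7-11
        "atom_name": line[12:16].strip(),   # columns 13-16
        "res_name": line[17:20].strip(),    # columns 18-20
        "chain_id": line[21],               # column 22
        "res_seq": int(line[22:26]),        # columns 23-26
        "x": float(line[30:38]),            # columns 31-38
        "y": float(line[38:46]),            # columns 39-46
        "z": float(line[46:54]),            # columns 47-54
        "occupancy": float(line[54:60]),    # columns 55-60
        "b_factor": float(line[60:66]),     # columns 61-66
    }


def format_atom_record(serial, atom_name, res_name, chain_id, res_seq,
                       x, y, z, occupancy, b_factor):
    """Build a demonstration ATOM line with the same column layout."""
    return ("ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}").format(
        serial, atom_name, res_name, chain_id, res_seq,
        x, y, z, occupancy, b_factor)


# Made-up atom, not taken from any real entry.
line = format_atom_record(1, " CA", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67)
atom = parse_atom_record(line)
print(line)
print(atom["atom_name"], atom["x"], atom["y"], atom["z"], atom["b_factor"])
```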