Module 4- Structural Biology And Biomedical Research Flashcards
Methods of determining protein structure
X-ray crystallography
Cryo-electron microscopy
Nuclear magnetic resonance
How does X-ray crystallography work
Diffraction occurs when the protein crystal diffracts an X-ray beam
Diffraction pattern is recorded
Diffraction compared with optical microscopy
Crystal is grown from solution, then usually flash frozen before data collection
Crystal diffraction pattern
Spots vary in intensity, size, location and scattering angle
Can predict where spots will be in a pattern
To measure the diffraction, measure the intensity and location of spots
Diffraction spots are transformed into an electron density map, into which a model is built
Superimposed meaning
Put on top of one another
Superposed meaning
Align the structures and put them on top of one another. Can get RMSD (root mean square deviation) when done. RMSD increases with more difference in structure
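A minimal sketch of the RMSD calculation mentioned above, assuming the two structures are already superposed and their atoms are paired in the same order. The coordinates are made-up illustration values, and the function name `rmsd` is just a label chosen here.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) positions that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between the paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical C-alpha positions in angstroms (not real data)
model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]
print(round(rmsd(model_1, model_2), 3))
```

Identical structures give 0; the value grows as paired atoms move further apart.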
Advantages of protein crystallography
Can obtain high resolution structures
High structural detail possible
Good for studies on bound ligands
Has often produced Nobel Prize-winning research
Disadvantages of protein crystallography
Need pure stable protein
Need protein crystals
Need diffraction equipment
For membrane proteins- very hard to obtain crystals
Basis of cryo electron microscopy
Single particle images generated by TEM
Alignment of single particles and 2D class average to get 3D
3D classification average to get 3D refinement and final high-resolution structure (20S proteasome)
Can get atomic structures from this
Pros of cryo electron microscopy
Crystals not required
Yields atomic model
Good for membrane proteins
Cons of cryo electron microscopy
Resolution limited
Size limited
Expensive
Sample prep is hard
Pros of NMR
No crystal needed
No cryoEM required
Cons of NMR
Expensive
Isotopic labelling often needed
Structure not as accurately determined generally as X-ray
Goal of structural study
Explain a biological or biochemical question
Framework to understand important scientific question
Steps in x-ray crystallography structural determination
Purification of protein
Crystallization
Measuring diffraction
Phasing- electron density map
Building an atomic model
Refining the model
Interpreting the structure
Protein purification in crystallization
Done many ways: over-expression in cells, affinity tags, high-performance chromatography hardware
Want a soluble, unaggregated uniform pure protein
5-10mg at ~95% purity, ~10mg/mL
Stable in aqueous buffer
How is protein crystallization carried out
Also known as controlled precipitation, hoping crystal formation is energetically favourable over amorphous precipitate
Equilibration with concentrated salt or PEG in small wells (10ug) in plates
Trial and error
Crystallization is rate limiting step
Common precipitants in protein crystallization
PEG (polyethylene glycol polymer) 400, 8000, 20000
Ammonium sulfate
PEG/ lithium chloride
Phosphate
Organic solvents
Variables in protein crystallization
Protein concentration
Precipitant concentration
pH
Ligands, additives
Temperature
Saturation in crystallization (saturated, unsaturated, supersaturated)
Saturated for a substance= can no longer dissolve any more of it
Unsaturated= can dissolve more
Supersaturated= has somehow dissolved too much, not stable situation
Crystallization is the move from supersaturated state to saturated state- at the end have crystals (more ordered than amorphous precipitate) plus saturated protein solution
Measuring diffraction/ data collection in crystallography (diffraction collection)
Crystals mounted in path of x-ray beam
Rotated to yield diffracted spots of reflections
Crystals must diffract well and be stable in x-ray
Often flash frozen to make stable
Spots far from beam stop in diffraction pattern= high resolution. Close to beam stop= low resolution
Law associated with diffraction and what it means
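Bragg's law: nλ = 2d sin θ (λ= X-ray wavelength, d= spacing between lattice planes, θ= angle between beam and planes)
When the path difference 2d sin θ equals a whole number of wavelengths nλ= constructive interference
Proteins are packed in a lattice with planes separated by d in angstroms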
Only some angles give spots
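A short sketch of Bragg's law as arithmetic, assuming first-order diffraction (n = 1) by default. The function names are chosen here for illustration; 1.5418 angstroms is the usual Cu K-alpha wavelength of a home X-ray source.

```python
import math

def bragg_d_spacing(wavelength, theta_deg, n=1):
    """Plane spacing d (angstroms) from n*lambda = 2*d*sin(theta)."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

def bragg_theta_deg(wavelength, d, n=1):
    """Bragg angle theta (degrees) at which planes of spacing d diffract."""
    s = n * wavelength / (2.0 * d)
    if s > 1.0:
        raise ValueError("no diffraction possible: n*lambda exceeds 2*d")
    return math.degrees(math.asin(s))

cu_k_alpha = 1.5418  # angstroms
for theta in (5.0, 15.0, 30.0):
    print(theta, round(bragg_d_spacing(cu_k_alpha, theta), 2))
```

Larger angles correspond to smaller d, which is why spots far from the beam stop carry the fine structural detail.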
Lattice
Packing arrangement in a crystal
When crystallizing occurs protein molecules pack into lattice based on charge and shape
Packing extends along 3 axes (a b c)
Unit cell
Unique unit of volume or area which can be used to build up a complete lattice
Built up by translation along a b c axes
3D analogy: bricks in brick wall
2D analogy: tiles on floor
Protein crystal composed of unit cell of protein molecules stacked together
Asymmetric unit
Smallest part of unit cell that can be used to create the complete unit cell when all the internal symmetry elements in the crystal are applied
Space group
Combination of unit cell symmetry and crystal lattice that describes a given crystal under study
Phasing and intensity in crystallography- what it is
X-rays are electromagnetic waves with intensity and phase
Phase is where wave peaks are relative to each other. If they align, get a bigger wave; if they are fully out of step, they cancel each other out and get nothing
Need intensity and phase of diffracted X-rays for electron density- together they give the structure factor
Diffraction has intensity info
Structure factor in crystallography
F or |F|
Numerical value of structure factor proportional to square root of intensity of diffracted wave
Phase angle is 0-360 degrees
If F measured experimentally= Fo (observed)
If F calculated from model= Fc (calculated)
Structure factors (amplitude plus phase) can be summed to calculate electron density maps, which are then used to build protein models
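A toy sketch of how structure factors add up to electron density, assuming a one-dimensional "unit cell" with two point atoms. The atom positions and weights are invented, the 1/volume scale factor is left out, and the function names are labels chosen here. Real maps are 3D sums over h, k, l, but the idea is the same: each F contributes a wave, and the waves only reinforce at the atom positions when both amplitude and phase are known.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j); atoms = [(fractional x, scattering weight)]."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def electron_density(factors, n_grid):
    """rho(x) = sum_h F(h) * exp(-2*pi*i*h*x), sampled on n_grid points across the cell."""
    density = []
    for k in range(n_grid):
        x = k / n_grid
        rho = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
        density.append(rho.real)  # imaginary parts cancel because F(-h) is the conjugate of F(h)
    return density

# Hypothetical 1D cell: a lighter atom at x=0.25 and a heavier atom at x=0.70
atoms = [(0.25, 6.0), (0.70, 8.0)]
factors = structure_factors(atoms, h_max=10)
density = electron_density(factors, n_grid=100)
peak = max(range(100), key=lambda k: density[k])
print("highest density at x =", peak / 100)
```

In a real experiment only |F| is measured (from the square root of the intensity); the phase of each F is the part that has to be recovered.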
Electron density in crystallography
Electron cloud that surrounds each atom
Molecules modelled into density like hands in gloves by computer programmes
Protein structures are atomic models that have been built in an electron density map
How is phasing of diffraction pattern done
Phase of reflections lost in data collection, getting it back is phasing
Good phasing= good maps= accurate structures
Patterson or direct methods, MIR, MAD or MR
What are Patterson or direct methods for phasing a diffraction pattern
Math function solved to locate atoms in unit cell
Once located, phases calculated
Works with small number of atoms- not proteins directly
Useful to find heavy atoms within protein eg metals
What is multiple isomorphous replacement MIR for phasing
Based on soaking heavy atoms into protein crystals
Requires multiple crystals and data sets
Allows Patterson method to be used
Uses data from 2 kinds of crystals (native and heavy-atom soaked); when combined, data solved by Patterson to locate heavy atoms
Heavy atom locations then lead to locating protein atoms
What is multiwavelength anomalous dispersion MAD for phasing
Based on use of selenomethionine in place of methionine
Related to MIR- doesn't require soaking crystal in heavy atom solution
Requires collection at synchrotron
Collect 4 datasets at different wavelengths to maximise effects of selenium
MAD variant SAD uses 1 wavelength at peak
MAD/SAD have grown to dominate phasing of structures
What is molecular replacement MR for phasing
Uses already solved structures- source for starting phases
Requires structural homology
Good for mutants of known structures
Good for high identity between sequences
If initial model is wrong, it leads to structures with errors= phase bias
Most common method to solve homologous structures
What is amplitude
Amplitude is the magnitude of structure factor
Fitting crystallography to an electron density map (building the model)
Phasing yields empty electron density, hopefully following the chain of the protein
Building means inserting atomic structure into envelope of electron density
Room for error, manual checking required
Electron density map from combining different amplitudes and phases
Types of electron density maps- combining different amplitudes and phases
Fo
Fo-Fc= difference map
2Fo-Fc
Omit map 2Fo-Fc with omit phases
What is an Fo map
Measured intensities/ observed amplitudes, usually combined with α(calc)- phases from the model or from MIR/MR/Patterson
What is an Fo-Fc map/ difference map
Phases from model and (observed minus calculated) amplitudes; will see positive and negative density
Positive means may need to add atoms (green)
Negative means may need to remove atoms (red)
(Fo-Fc)α(calc)
What is a 2Fo-Fc map
Twice observed minus calculated amplitude, phases from model
Has features of Fo and difference map
Most common map in protein work
(2Fo-Fc)α(calc)
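A small sketch of how the three map types differ for a single reflection, assuming the amplitudes are already on the same scale and the phase comes from the model (calculated phase). The function name `map_coefficient` and the numbers are illustrative only.

```python
import cmath
import math

def map_coefficient(fo, fc, phase_calc_deg, kind="2Fo-Fc"):
    """Complex Fourier coefficient for one reflection: map amplitude combined with the model phase."""
    amplitudes = {
        "Fo": fo,
        "Fo-Fc": fo - fc,          # difference map: can be positive or negative
        "2Fo-Fc": 2.0 * fo - fc,   # Fo plus the difference term
    }
    if kind not in amplitudes:
        raise ValueError("unknown map type: " + kind)
    return cmath.rect(amplitudes[kind], math.radians(phase_calc_deg))

# Hypothetical reflection where the model under-predicts the observed amplitude
for kind in ("Fo", "Fo-Fc", "2Fo-Fc"):
    print(kind, map_coefficient(100.0, 80.0, 0.0, kind))
```

When Fo is larger than Fc the difference coefficient is positive (something may be missing from the model); when Fc is larger it is negative (the model may contain something the data do not support).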
What is an omit map
Special map used to test part of the model that seems uncertain
Amplitudes are 2Fo-Fc and phases from a model with some atoms left out
Things left out remain in the map if supposed to be in structure
(2Fo-Fc)α(calc) [part of model omitted]
What is refinement
Moving atoms to minimise the difference between theoretical diffraction data calculated from the current model and the data measured experimentally
Done after initial fitting to density
Minimise |Fo-Fc|: for a perfect model Fo=Fc
More on how refinement is done
Shifting protein and solvent atoms into best xyz positions for atoms while preserving correct geometry
Also need to calculate occupancy (what % of unit cells have atom/ligand etc) and temperature factor (relates to thermal motion of atoms) for each atom
More on temperature factors
B-factors
Number assigned to each atom in a structure, measuring thermal motion, typically 0-100 (units of square angstroms)
Atoms with high motion look different in maps
Can indicate disorder or bad data
B<20 excellent, 20-40 common, >60 mobile or disordered
Waters have high B factors- can be good
Plot of B vs residue number tells mobile regions in protein structure
Reporting crystallography results
Result of refinement is optimised model
Recorded in PDB file format
Chain no, residue, atom, xyz, occupancy, B-factor
Deposited in PDB along with structure factors
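A minimal sketch of reading the fields listed above out of fixed-width PDB ATOM records and averaging B-factors per residue (the numbers behind a B vs residue plot). The column positions follow the standard PDB format; the helper `format_atom_record` and all values are invented here just to make self-consistent example lines, and atom-name alignment is simplified.

```python
def format_atom_record(serial, atom, residue, chain, number, x, y, z, occupancy, b_factor):
    """Build a demo ATOM line with the standard fixed-width columns."""
    return ("ATOM  {:>5d} {:<4s} {:>3s} {}{:>4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
            .format(serial, atom, residue, chain, number, x, y, z, occupancy, b_factor))

def parse_atom_record(line):
    """Slice one ATOM line into chain, residue, atom, xyz, occupancy and B-factor."""
    return {
        "atom": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "residue_number": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }

def mean_b_per_residue(lines):
    """Average B-factor for each (chain, residue number)."""
    grouped = {}
    for line in lines:
        if line.startswith("ATOM"):
            rec = parse_atom_record(line)
            grouped.setdefault((rec["chain"], rec["residue_number"]), []).append(rec["b_factor"])
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}

# Hypothetical records (not from a real structure)
records = [
    format_atom_record(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 10.0),
    format_atom_record(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842, 1.00, 20.0),
    format_atom_record(3, "N", "GLY", "A", 2, 25.112, 24.880, 3.649, 1.00, 40.0),
]
print(mean_b_per_residue(records))
```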
Crystallography interpretation and analysis
After completion of refinement, analyse structure quality, topology, tertiary structure and active site
Develop a model
Devise experiments to test model
Three ways to determine crystallography structure quality
Quality of X-ray diffraction data sets
Quality of X-ray structure determination and refinement
Quality of overall protein model
Ways that quality of data sets is measured (how well are spots measured)
Resolution
I/sigma (signal to noise)
Completeness
Multiplicity
R(merge) R(pim) CC(1/2)
Resolution
Indicator of quality
High resolution- away from beam stop and low number= fine structural details
Low resolution- close to beam stop and high number= coarse structural details
Determined by Bragg angle of diffracted spots
Higher angle= higher resolution
What different angstrom measurements correlate to in resolution
6= overall shape
3= trace chain
2.5= carbonyl
2.0= holes in aromatics
1.5= see individual atoms
1= see H atoms
Signal to noise
I= intensity, sigma= standard deviation of I
Ratio of signal to noise
Can be per reflection, per data shell, per data set
Common high-resolution cut-off is I/sigma of 2-3 in the outermost shell
Full data sets: 5-10 is fair, >20 is good
Analysed by computer programmes
Completeness and redundancy
How much of available data is being measured
How many times is each reflection being measured
Consider in context of signal to noise
Analysed by computer
Gives % completeness per shell
Average multiplicity= redundancy
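A rough sketch of the two numbers on this card, assuming each observation has already been reduced to its unique (h, k, l) index and that the number of theoretically possible unique reflections is known. The function name and example values are made up for illustration.

```python
from collections import Counter

def completeness_and_multiplicity(observations, n_possible):
    """observations: list of (h, k, l) indices, one entry per measurement."""
    counts = Counter(observations)
    n_unique = len(counts)
    completeness = 100.0 * n_unique / n_possible   # % of possible reflections measured at least once
    multiplicity = len(observations) / n_unique    # average number of measurements per reflection
    return completeness, multiplicity

# Hypothetical shell: 4 reflections possible, 3 of them measured, 6 measurements in total
obs = [(1, 0, 0)] * 3 + [(0, 1, 0)] * 2 + [(0, 0, 1)]
print(completeness_and_multiplicity(obs, n_possible=4))
```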
R-merge
How well spots which are measured multiple times agree
% of difference in intensities
Historically important, goal was 10% or less
High-multiplicity data sets give high R-merge values- newer stats R-pim and CC(1/2) were developed for them, so R-merge is no longer as important
CC(1/2) and R-pim
R-pim is redundancy corrected R-merge
CC(1/2) is correlation coefficient between two random halves of the data, reported per shell; ideal number not clear- usually about 0.5 or higher, below 0.15 is non-significant correlation
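A short sketch of R-merge and R-pim using their usual definitions: both sum the absolute deviations of repeated measurements from their mean and divide by the summed intensities, and R-pim weights each reflection by sqrt(1/(n-1)) for its n measurements. The input layout and numbers are assumptions made for this example.

```python
import math

def r_merge_and_r_pim(measurements):
    """measurements: dict mapping a reflection index to its list of repeated intensities."""
    num_merge = 0.0
    num_pim = 0.0
    denom = 0.0
    for intensities in measurements.values():
        n = len(intensities)
        if n < 2:
            continue  # a reflection measured once says nothing about agreement
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev
        num_pim += math.sqrt(1.0 / (n - 1)) * abs_dev  # redundancy correction
        denom += sum(intensities)
    return num_merge / denom, num_pim / denom

# Hypothetical repeated measurements of two reflections
data = {(1, 0, 0): [100.0, 110.0, 90.0, 100.0], (0, 1, 0): [50.0, 50.0]}
r_merge, r_pim = r_merge_and_r_pim(data)
print(round(100 * r_merge, 1), round(100 * r_pim, 1))  # as percentages
```

Because of the sqrt(1/(n-1)) factor, R-pim goes down as each reflection is measured more times, while R-merge does not.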
How is x-ray structure determination and refinement quality judged
R-factors
Geometry
Temperature factors
R-factors
Sum of |Fo-Fc| / sum of Fo
Dependent on geometry and completeness
Rwork= crystallographic R factor where <15% excellent, <20% good, 20-23% ok, >25% means errors. Calculated with the 95% of reflections which are used in carrying out refinement
Rfree calculated with unbiased data- the 5% of reflections never used in refinement
If both decrease, model is improving; if Rwork decreases and Rfree doesn't then model is likely over-fitted
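A minimal sketch of the R-factor arithmetic, assuming Fo and Fc are already on the same scale (real programs also fit a scale factor) and that each reflection carries a flag saying whether it belongs to the free set. Names and values are illustrative.

```python
def r_factor(f_obs, f_calc):
    """R = sum(|Fo - Fc|) / sum(Fo) over the given reflections."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by the free flag and compute R on each set separately."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_free

# Hypothetical amplitudes; the last reflection is held out of refinement
f_obs = [100.0, 200.0, 50.0, 150.0]
f_calc = [90.0, 210.0, 55.0, 120.0]
free_flags = [False, False, False, True]
print(r_work_and_r_free(f_obs, f_calc, free_flags))
```

A large gap between the two values, as in this toy case, is the pattern the card describes as likely over-fitting.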
Geometry
How well structure fits expected structural properties of proteins
Goal is bond length deviation <0.02 angstroms
Goal is bond angle deviation <1.8 degrees
Must be restrained during refinement
Effect on R-factor: if geometry is ignored then can make R-factor as low as desired
See Ramachandran plots
How to check quality of model- quality assessment tools
Web based-
PDB validation: Overall refinement metrics and Overall chain graphics
Procheck
What is the cryo-EM revolution
Most rapidly developing technique in structural biology
Growing field and features in major publications like Nature, Science and Cell each week
Key starting point for vaccine and drug design
Has been a huge increase in resolution due to improved detectors, microscopes, software and sample prep
Steps in cryo-EM single particle reconstruction (SPR)
Sample preparation
Sample freezing
Screening and data collection
2D and 3D reconstruction
Refinement and model building
Interpretation
How is cryo-EM sample prep done
By generating a highly pure and homogeneous protein
Good for membrane proteins as don't need a crystal
Also by negative staining with heavy-atom-containing stain- fast, easy, good overall view of protein and good as pre-cryo screening step
How is cryo-EM sample freezing done
Dilute protein prepared in aqueous solution
Frozen on carbon layer sitting on 3mm diameter copper grid
Plunged into liquid ethane sitting in liquid nitrogen
Having ideal sample prep hardest step
Want water to freeze as a glass (vitreous ice) without forming ice crystals
Cryo-EM screening and data collection
Want particles present in different orientations on EM grid to have all projections (wide distribution)
Want good density on micrograph where individual particles are seen
Want drift to be able to be well managed (apparent movement of specimen)
Pick good particles, computer can align and see orientations and determine if all are there
CryoEM 2D and 3D reconstruction
Projections can be simple or deceiving
Need robust number of presentations and orientations for 2D construct to be made
After picking those for 2D construction, they must be shifted, translated, rotated and paired. Then grouped and averaged (alignment)
Simple objects can have complex projections
2D images become 2D classes
2D classes grouped into making 3D structure
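A toy sketch of why aligned particle images are grouped and averaged, assuming the images are already perfectly aligned and differ only by random noise. Each "image" here is a flattened list of pixel values, and everything (sizes, noise level, names) is invented for illustration; real 2D classification also has to find the shifts and rotations.

```python
import random

def average_images(images):
    """Pixel-by-pixel mean of a stack of aligned images (each a flat list of pixels)."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]

def rms_error(image, reference):
    """Root mean square difference between an image and the noise-free reference."""
    return (sum((a - b) ** 2 for a, b in zip(image, reference)) / len(reference)) ** 0.5

random.seed(0)
true_particle = [0.0] * 8 + [1.0] * 8 + [0.0] * 8   # hypothetical noise-free projection
noisy_images = [
    [p + random.gauss(0.0, 1.0) for p in true_particle]  # noise as large as the signal
    for _ in range(200)
]
class_average = average_images(noisy_images)
print(round(rms_error(noisy_images[0], true_particle), 2),
      round(rms_error(class_average, true_particle), 2))
```

A single noisy image is far from the true projection, while the average of many is much closer, which is the reason single-particle work needs many particles per class.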
Refinement and model building in cryo-EM
Done on 3D classification, fine detailed improvements
Can then build protein into the refined structure
Why is structural biology research done
Starting point for biomedical research aimed at understanding molecular basis of disease
Major jumping off point for experiments in drug and protein design
Eg once the HIV protease structure was solved, it showed how the enzyme functions in disease and allowed inhibitors to be designed from it