Protein Folding Flashcards
Give an overview of protein folding?
There are around 30,000 ORFs in the human genome and around 100,000 different proteins (due to PTMs and complexes) - a good example of evolution
Protein structures:
- Fold with high fidelity
- Dynamic
- Bind tightly and specifically
- Control and degradation - proteins fold/unfold during their life, e.g. the proteasome unfolds proteins for degradation
They fold in cells - a very crowded environment
Molecular chaperones assist folding in vivo
They slow down the reaction - by holding the ‘protein’, increasing the yield of the functional native state and preventing aggregation
Protein sequences encode: structure, function, stability and control
What is the protein folding problem?
How the structure is encoded in the amino acid sequence
How the polypeptide chain finds the correct fold rather than many other alternatives
This matters because diseases of folding are a major threat to human health today
What are protein folding energy landscapes?
They look like an upside-down cone (a funnel)
Entropy in the x/y domain and enthalpy in the z direction
Demonstrates a pathway of how proteins arrive at their native state
Unfolded proteins lie at the top of the funnel
The ideal is a smooth landscape
Real landscapes are never smooth - they are always rugged because the sequence is trying to fulfil many different requirements
Ruggedness can also reflect the effects of selection for a different amino acid during evolution
While folding, the protein may become trapped and must overcome an enthalpic barrier to escape
Leads to opportunity for misfolding
Why are protein folding landscapes hard to map?
Landscapes are hard to map because:
Proteins fold rapidly
Folding is heterogeneous - therefore there isn’t just one pathway down to the native state
We want to understand the conformational properties of unfolded/partially folded states but we can’t make crystals of these intermediates
What are some applications of protein folding?
Structure prediction: understanding structure-sequence relationships
Medicine: protein folding diseases
Biotechnology: e.g. refolding inclusion bodies, engineering protein stability
Protein folding in vivo: ribosomes, chaperone-assisted folding, protein trafficking
De novo protein design: tailor made enzymes (David Baker)
How can proteins unfold reversibly?
Proteins can be unfolded reversibly by changing:
Temperature, pH, or by adding/diluting chaotropes
Chaotropes - guanidinium ion or urea (they have many amino groups so they can hydrogen bond - displacing other molecules)
They also reduce the size of the hydrophobic effect
- 8 M urea denatures the protein and mercaptoethanol reduces (cleaves) its disulphide bonds
- Removal of the denaturant and reductant allows the protein to renature and re-form disulphide bonds in the presence of oxygen
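A chaotrope-induced unfolding transition is often summarised with a two-state model in which the folding free energy falls linearly with denaturant concentration. The sketch below assumes that model; it is not stated on the card, and the stability (`dg_water`) and slope (`m_value`) are hypothetical values chosen only so that the midpoint falls at 4 M.

```python
from math import exp

R = 8.314e-3  # gas constant in kJ/(mol K)

def fraction_unfolded(denaturant_molar, dg_water=20.0, m_value=5.0, temp_k=298.0):
    """Two-state estimate of the fraction of unfolded protein.

    dg_water: assumed unfolding free energy in water (kJ/mol), hypothetical.
    m_value:  assumed dependence on denaturant (kJ/mol per M), hypothetical.
    """
    # Assumed linear dependence of stability on denaturant concentration.
    dg_unfold = dg_water - m_value * denaturant_molar
    return 1.0 / (1.0 + exp(dg_unfold / (R * temp_k)))

for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(f"{conc:.0f} M denaturant: {fraction_unfolded(conc):.3f} unfolded")
```

Because the model is an equilibrium, lowering the concentration again returns the same fraction as before, which is what reversibility means here.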
What did the experiment for reversible unfolding of proteins show?
The primary sequence determines the structure
An example of spontaneous self-assembly
The realisation that mutation causes diseases of protein misfolding
What is the Levinthal Paradox?
If folding occurred by exhaustive search of all conformations, even a relatively small protein would take longer than the age of the earth to fold
Therefore, folding is NOT a random process
Levinthal hypothesised that folding is kinetically determined = there must be a protein folding pathway
Proteins fold along defined pathways on funnel shaped landscapes
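The arithmetic behind the paradox can be sketched in a few lines. The chain length, number of conformations per residue and sampling rate below are illustrative assumptions, not values from the card; with them the search time comes out on the order of 10^27 years.

```python
# Back-of-envelope Levinthal estimate. All three inputs are assumptions.
RESIDUES = 100              # assumed chain length of a "small" protein
STATES_PER_RESIDUE = 3      # assumed conformations available to each residue
SAMPLES_PER_SECOND = 1e13   # assumed rate of sampling new conformations

SECONDS_PER_YEAR = 3.15e7
AGE_OF_EARTH_YEARS = 4.5e9  # approximate

def exhaustive_search_years(residues, states, rate):
    """Years needed to visit every conformation once at the given rate."""
    conformations = states ** residues
    return conformations / rate / SECONDS_PER_YEAR

years = exhaustive_search_years(RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
print(f"{years:.1e} years = {years / AGE_OF_EARTH_YEARS:.1e} x the age of the Earth")
```

Changing the assumed inputs shifts the number by many orders of magnitude, but for any plausible choice the result stays far beyond the age of the Earth.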
What was the basis of the experiment used to determine information required to fold a chain of amino acids into a functioning protein?
Ranganathan looked at the conservation and co-evolution (CC) of the amino acid sequence
If two amino acids are important for folding and are in contact in the native structure, a mutation in one is highly likely to be accompanied by a mutation in the other
The contacting residues may be distant in the sequence but close in the structure
E.g. If a residue is made smaller the contacting residue may be made larger to maintain the residue contact in the native structure
Describe the experiment used to determine information required to fold a chain of amino acids into a functioning protein?
He used a statistical coupling analysis of 120 members of the WW domain family
Looking at how many residues in a protein sequence were conserved and coevolving
Used a WW domain - that binds a proline rich motif (PPxP)
To see if they were mutating independently
This method theorises that regardless of spatial location or underlying mechanism, the conserved functional coupling of sites in a protein should drive their mutual coevolution
Phage display and other binding assays show that the CC sequences are functional and conserve specificity
Folding to the native state involves efficient packing of hydrophobic atoms within the interior of proteins
This leads to a higher average sequence conservation in the core of proteins
Of the 36 aa in this WW domain only 8 sites define fold and function
The folding problem may be much less complex than previously thought
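The idea of conserved versus coevolving positions can be illustrated on a toy alignment. The sketch below uses column entropy for conservation and mutual information for coupling; this is a simplified stand-in, not the statistical coupling analysis used in the study, and the four-residue alignment is invented for illustration.

```python
from collections import Counter
from math import log2

def column(alignment, i):
    """Residues found at position i across all sequences."""
    return [seq[i] for seq in alignment]

def entropy(symbols):
    """Shannon entropy in bits; 0 means the position is fully conserved."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def mutual_information(alignment, i, j):
    """Bits of information shared by positions i and j (high = coevolving)."""
    col_i, col_j = column(alignment, i), column(alignment, j)
    return entropy(col_i) + entropy(col_j) - entropy(list(zip(col_i, col_j)))

# Hypothetical alignment: position 0 is conserved, positions 1 and 2 always
# change together, position 3 varies largely independently of position 1.
toy_alignment = ["GAKL", "GAKV", "GLDL", "GLDV", "GAKL", "GLDV"]

print("entropy of position 0:", entropy(column(toy_alignment, 0)))
print("MI(1, 2):", round(mutual_information(toy_alignment, 1, 2), 3))
print("MI(1, 3):", round(mutual_information(toy_alignment, 1, 3), 3))
```

Positions 1 and 2 score much higher than positions 1 and 3, which is the signature a coevolution analysis looks for.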
How do we define how a protein folds?
Describe the structures of intermediate partially folded states
Describe the energetics of the process (rates and barrier heights)
We need to understand:
- Chain collapse
- Structural properties of intermediates
- When does tertiary structure form
- Does non-native structure form
- When is the reaction complete
- How do proteins misfold
What case study can we use to study protein folding?
Lysozyme
Hen lysozyme:
- small (129 aa)
- 4 SS bonds
- enzyme (glycosidase)
- soluble, globular protein
- mixed α/β fold - therefore interesting to see how it folds
- X-ray + NMR structure
What do we do to lysozyme?
Lysozyme is denatured in 6M guanidinium chloride
Its refolding can be determined following rapid dilution of denaturant (using stopped flow methods)
Lysozyme - give an overview of measuring how proteins fold?
Initiating folding/unfolding - timescales range from ps to hours
Add a chaotrope (e.g. urea) or change the temperature, for example
Monitoring folding - CD (circular dichroism)/fluorescence
Need to combine as many methods as possible to obtain a detailed picture of the folding/unfolding process - as it is in real time
What is folding kinetics - stopped flow?
Unfolded protein in one syringe and a buffer in the other
The two solutions mix and fill a cuvette; the flow is then stopped and the signal is recorded over time
There is a burst phase - signal change that occurs within the ‘dead time’ of the instrument
The ‘burst’ appears as an immediate change because it is complete before the first measurement can be made
What are some other methods of monitoring the folding of lysozyme?
Stopped flow tryptophan fluorescence
Real time NMR
Stopped flow CD - circular dichroism
Folding of the protein from 6M guanidinium chloride shows very different behaviour with the different spectroscopic probes
Looks at the % of native molecules with time
Describe stopped flow tryptophan fluorescence?
The denatured state is more fluorescent than the native state (around 130%)
Start - denatured state
Collapse - when we dilute the protein out of the chaotrope the fluorescence signal drops rapidly to a level similar to the native state (110%) = chain collapse (within the dead time - 3 milliseconds) - burial of aromatic residues begins
Further burial of aromatic residues (including tryptophan) from the solvent - it goes to a lower fluorescent state (partially folded state - 60% - a kinetic trap) than the native state
This shows the environment of the aromatic residues is non-native (20 milliseconds)
Forming the correct native state involves precise packing of aromatic residues - by excluding solvent (takes 300 milliseconds)
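The fluorescence trace described above can be sketched as a sequential reaction, collapsed state -> partially folded state -> native state. This is a simplification under stated assumptions: the 20 ms and 300 ms figures are treated as first-order time constants, every molecule is assumed to follow the same route, and the signal levels are the percentages quoted on the card.

```python
from math import exp

# Relative fluorescence of each species (% of native), taken from the card.
F_COLLAPSED, F_INTERMEDIATE, F_NATIVE = 110.0, 60.0, 100.0

# Assumption: the quoted times are time constants of first-order steps.
K1 = 1 / 0.020  # collapsed -> partially folded, per second
K2 = 1 / 0.300  # partially folded -> native, per second

def fluorescence(t_s):
    """Signal at time t_s (seconds) after the dead time for C -> I -> N."""
    collapsed = exp(-K1 * t_s)
    intermediate = K1 / (K2 - K1) * (exp(-K1 * t_s) - exp(-K2 * t_s))
    native = 1.0 - collapsed - intermediate
    return F_COLLAPSED * collapsed + F_INTERMEDIATE * intermediate + F_NATIVE * native

for t_ms in (0, 20, 60, 300, 1000):
    print(f"{t_ms:>4} ms: {fluorescence(t_ms / 1000):.1f} %")
```

The trace starts near 110%, dips below the native level while the partially folded state is populated, then recovers to 100%, which is the undershoot that marks a non-native environment for the aromatic residues.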
Describe the use of tryptophan within lysozyme?
Contains 6 tryptophan residues in its sequence
Two of these residues are highly exposed to solvent in the native protein (within the beta sheet) despite their hydrophobicity as they are involved in substrate binding
This is because they line the active site and are therefore involved in substrate recognition - it binds carbohydrates
Describe real time NMR - lysozyme?
This determined which tryptophan residues were exposed
Native - hydrophobic packing is highly specific
Non-native - hydrophobic collapse is very non-specific
Lysozyme folds via two or more intermediates, involving rapid collapse to a species containing non-native burial of tryptophan side-chains
Describe stopped flow CD (circular dichroism)?
2 types: far UV and near UV
- Far UV (190-240 nm) - secondary structure forming
- Near UV (240-300 nm) - tertiary structure forming
Far UV:
Measured on a CD spectropolarimeter
Far-UV CD reports on the torsion angles in the backbone of the protein
The native state has a much lower CD signal than the denatured state
Native-like secondary structure forms at the same time as collapse, in the first ms of folding
Non-native interactions give rise to an ‘over-shoot’ in the CD (and undershoot in fluorescence)
The rate-limiting step involves reorganisation of the non-native contacts as the native structure forms
What methods can we use to obtain residue-specific information?
1. Mutational analysis (phi-values)
2. Hydrogen exchange NMR
Describe hydrogen exchange?
Backbone amide hydrogens that are hydrogen bonded in β-sheets/α-helices are protected; those on the outside are exposed
The hydrogens are exchanged for deuterium
Exposed hydrogens are exchanged fast, whereas the buried hydrogens are protected and therefore take much longer for exchange
The rate of HX is critically dependent upon pH
The rate of HX depends on the formation of secondary structure and burial from solvent
Therefore, HX combined with 1H NMR is a powerful method capable of revealing residue-specific information about folding
All NH can be assigned to individual residues by NMR
A 2D NMR spectrum takes around 20 min to acquire, whereas lysozyme folds in 1 sec - so direct measurement is too slow to follow folding
What is a more time effective method than hydrogen exchange?
Pulsed hydrogen exchange labelling
1. Start - denatured protein in D2O - all amides are deuterated
2. Over time - Partially folded protein - all amides are deuterated
3. Label in H2O at high pH - partially folded protein
Amides involved in persistent secondary structure are protected from exchange, all other amides exchange with H2O
4. Quench at low pH - partially folded protein - amides are differentially labelled
5. Complete folding - native protein - amides labelled H or D are detected by NMR or mass spectrometry
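The labelling logic in the steps above can be sketched as a lookup: at the moment of the H2O pulse, an amide that is already protected keeps its deuterium and any other amide exchanges to hydrogen. The amide names and protection times below are hypothetical placeholders, not measured values.

```python
# Hypothetical times (ms) at which each amide becomes protected during refolding.
PROTECTION_TIME_MS = {
    "alpha_amide_1": 5,
    "alpha_amide_2": 8,
    "beta_amide_1": 60,
    "beta_amide_2": 90,
}

def pulse_label(protection_time_ms, pulse_at_ms):
    """Isotope carried by each amide after an H2O pulse at pulse_at_ms.

    Protected amides keep D; unprotected amides exchange to H.
    """
    return {
        amide: "D" if protected_at <= pulse_at_ms else "H"
        for amide, protected_at in protection_time_ms.items()
    }

# Repeating the experiment with different refolding delays before the pulse
# builds up a time course of protection for every residue.
for delay_ms in (1, 20, 200):
    print(delay_ms, "ms:", pulse_label(PROTECTION_TIME_MS, delay_ms))
```

Because the label is read out on the native protein after folding is complete, the slow acquisition time of the detection method no longer limits the time resolution.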
What does pulsed quench flow show about lysozyme folding?
Amides in the α- and β-domains are protected cooperatively
Amides in the α-domain are protected before those in the β-domain
Amides in each domain are protected in 2 phases (fast and slow)
HX NMR tells us - lysozyme folds by simplifying the search and folds by domains
Describe the overall folding of the lysozyme?
The lysozyme folding pathways at pH 5.2, 20°C
Denatured state -> collapsed state (<3 ms) - the tryptophans are buried and the state is very dynamic
3/4 of collapsed state -> α-domain intermediate - the α-domain is folded, the β-domain is not, and some of the tryptophan residues still need to become exposed
α-domain intermediate -> native state (300 ms)
1/4 of collapsed state -> native state - they don’t misfold
Lysozyme folds through multiple routes and by domains
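The parallel routes above can be sketched as two exponential phases weighted by the fraction of molecules on each route. The 1/4 and 3/4 split and the 300 ms figure come from the card; treating 300 ms as a time constant and the 10 ms value for the direct route are assumptions made only for illustration.

```python
from math import exp

FAST_FRACTION = 0.25  # from the card: 1/4 of molecules fold directly
SLOW_FRACTION = 0.75  # from the card: 3/4 fold via the α-domain intermediate
TAU_SLOW_S = 0.300    # 300 ms from the card, assumed to be a time constant
TAU_FAST_S = 0.010    # hypothetical time constant for the direct route

def native_fraction(t_s):
    """Fraction of molecules that have reached the native state after t_s seconds."""
    fast = FAST_FRACTION * (1 - exp(-t_s / TAU_FAST_S))
    slow = SLOW_FRACTION * (1 - exp(-t_s / TAU_SLOW_S))
    return fast + slow

for t_ms in (0, 10, 50, 300, 1500):
    print(f"{t_ms:>4} ms: {native_fraction(t_ms / 1000):.2f} native")
```

The curve rises in two phases: a quick step to roughly a quarter native, then a slower rise as the intermediate converts.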
Give a summary of small protein folding in vitro?
Small Proteins (< 100 residues)
Small proteins fold on relatively smooth landscapes
Intermediates are not populated
Only a few contacts are needed to define the native fold
The remaining structure consolidates around this ‘nucleus’
Give a summary of large protein folding in vitro?
Folding of large proteins
Larger proteins fold on rough energy landscapes
There are multiple pathways
Intermediates are populated
Proteins fold by domains
Simulation of the 129-mer protein hen lysozyme shows multiple pathways and the existence of distinct intermediates