Structural Biology - Protein Folding Flashcards
Why is it important to understand protein folding?
Predict the 3D structure from the primary sequence
Understand misfolding related to human diseases
Design proteins with a new function
What are intrinsically disordered proteins?
Don’t have a defined 3D / globular structure
Because sequences don’t allow for formation of hydrophobic core
Are very flexible
30% of proteins are intrinsically disordered proteins
A small number can aggregate and cause disease
Neurodegenerative diseases (Alzheimer’s - amyloid beta, Parkinson’s - alpha-synuclein) and cancer (p53)
Briefly explain the process of protein folding
After ribosomal synthesis, most proteins (but not all) fold into a 3D structure
Spontaneous process
Transition from denatured to native state
Each protein structure has its own free energy
Native (folded) state has minimal free energy due to increased native interactions and increased residue contacts
Explain the Anfinsen experiment
Proved that the native state is the conformation with the minimal free energy
Denature ribonuclease A (has four disulphide bonds) with 8M urea and beta mercaptoethanol (BME) to unfold the protein to a random coil state with no enzymatic activity
Remove BME, which leads to oxidation of sulfhydryl/thiol groups (reforms disulphide bonds)
Remove urea, which leads to scrambled protein with no activity
Add trace amounts of BME which converts scrambled protein to native state
BME does not direct folding: it only breaks disulphide bonds, which lets the protein sample different conformations until it reaches the native one
Conclusion: this process is driven by the conformational free energy that is gained by going into the native structure (active conformation)
Formation of N structure is encoded in the sequence of the protein
Thermodynamics is important in protein folding - every conformation has a different energy but lowest free energy is favoured
Explain the chemical stability of protein folding
Chemical stability is the ability of maintaining the chemical structure of native states/polypeptide chain under various conditions (ex. temperature, pH, denaturing agents)
Want intact covalent bonds, oxidation states, metal coordination
Deamidation of Asn and Gln residues (to Asp and Glu, respectively)
Hydrolysis of the peptide bond of Asp residues at low pH
Oxidation of Met at high temperature to methionine sulfoxide (Sign of aging in proteins)
Elimination of disulfide bonds
Thiol-catalyzed disulfide interchange at neutral pH (oxidation of disulphides, very reactive)
Oxidation of cysteine residues to thiyl radicals, Cys-S·
Explain the conformational/thermodynamic stability of protein folding
Proteins fold into the Native structure because it is more stable than the denatured state
Conformational stability is the ability to adopt a well defined conformation instead of a random coil
It is given by the difference in free energy between D and N states DeltaG(D-N)
Non-covalent interactions
Psi and phi in backbone of protein rotate which allow polypeptide to assume various conformations
Tells us about flexibility of secondary structure (alpha helices/beta strands) and how these adapt (compress/elongate) to allow formation of 3D structure
Some conformations are disallowed
Glycine has the greatest conformational freedom (widest range of allowed psi and phi angles) because it has no side chain
Explain proteins folding in terms of the free energy change
Free energy change upon folding, depends on enthalpy and entropy
Unfavourable entropy change by folding a flexible polypeptide
Favourable enthalpy change from intramolecular side chain interactions (Hydrophobic interactions, Hydrogen bonds, Disulphide bonds)
Favourable entropy change from burying hydrophobic groups in the molecule (Release of water molecules from hydrophobic side chains - strongest force that drives protein folding)
In total net negative delta G
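The balance above can be sketched with the Gibbs relation, delta G = delta H - T x delta S. The numbers below are hypothetical and only illustrate how two large opposing terms can leave a small negative delta G; they are not measured values for any protein.

```python
def folding_free_energy(delta_h, delta_s, temperature):
    """Return delta G = delta H - T * delta S (units follow the inputs)."""
    return delta_h - temperature * delta_s

# Hypothetical values, chosen only to illustrate the balance of large terms.
delta_h = -50.0       # kcal/mol, favourable enthalpy of folding (assumed)
delta_s = -0.135      # kcal/(mol*K), net entropy change of folding (assumed)
temperature = 298.15  # K

delta_g = folding_free_energy(delta_h, delta_s, temperature)
print(f"delta G of folding = {delta_g:.2f} kcal/mol")
```

With these assumed inputs the enthalpy term and the entropy term nearly cancel, leaving a net negative delta G of a few kcal/mol.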
Explain the different parameters which influence protein folding
Delta G varies with: pH, Temperature, Pressure, ionic strength, molecular crowding, hydrophobic effect, H-bonds, van Der Waals, disulphide bonds
Example: proteins have optimal activity at a specific temperature
Crowding: thick, dense solution like polysaccharides (dextran) gives a more stable protein
How does protein stability affect their function?
Average stability (free energy) of a small protein is only 5-15kcal/mol
To fold a protein, need to break interactions in denatured state and form interactions in the native state
Balance between them is very small
So, proteins are not very stable which is important because it allows them to be dynamic to be able to perform a function and allows cells to recycle proteins
Explain how covalent interactions (disulphide bonds) determine protein folding
Oxidation forms disulphide bond, reduction breaks it
Process is reversible
Oxidation process can be intramolecular (in same protein) or inter-molecular (with different proteins ex. antibody light and heavy chains)
Cellular enzymes (protein disulphide isomerases) help form disulphide bonds
Explain how compaction determines protein folding
Proteins are compact
Compactness = amount of surface area of an object relative to a perfect sphere of comparable volume (perfect sphere = 1 but other object will have <1)
Alpha helices and beta sheets are very compact
Folding is directed by internal residues
Hydrophobic effect drives folding (hydrophobic amino acids cluster together in core of protein to minimise exposure to water)
Interactions between amino acids drive folding (ionic, vdw, disulphide)
Need a good balance between compactness and flexibility to allow function
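A minimal sketch of the compactness definition above, assuming it is the ratio of the surface area of a sphere with the same volume to the surface area of the object (so a perfect sphere scores 1 and anything else scores less). The function name is hypothetical.

```python
import math

def compactness(surface_area, volume):
    """Area of the equal-volume sphere divided by the object's own area."""
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    sphere_area = 4.0 * math.pi * radius ** 2
    return sphere_area / surface_area

# Unit sphere: area 4*pi, volume 4*pi/3
sphere_score = compactness(4.0 * math.pi, 4.0 * math.pi / 3.0)
# Unit cube: area 6, volume 1
cube_score = compactness(6.0, 1.0)

print(f"sphere: {sphere_score:.3f}, cube: {cube_score:.3f}")
```

The cube scores below 1 because it exposes more surface than a sphere enclosing the same volume.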
Explain the hierarchical pattern of protein folding
Folding follows a hierarchical pattern (step-wise)
Subdomains form spontaneously
Domains are stable, independent folding units
Large proteins can contain several domains
Tertiary structure forms when these pack together
Explain the adaptability of proteins when folding
Proteins are adaptable
ex. rigid hydrophobic core/packing of non-polar side chains
Large mutations don’t affect backbone of protein
Example: Phenylalanine (aromatic) to alanine mutation in T4 lysozyme showed that a benzene was occupying position of side chain and structure was preserved after mutation
Explain the sequence versatility of proteins (conservation of sequence and folding)
20% sequence identity (80% difference) between 2 proteins = can have the same fold/structure
But can also have the opposite: 88% identity and a different structure
Only some residues are important in determining the unique structure
List different techniques to measure protein stability
Absorbance ex. trp, tyr, chromophoric probes (but only gives small signal)
Fluorescence ex. trp, fluorophoric probes, aromatic residues (big signal)
CD
NMR
Differential scanning calorimetry (DSC) to determine delta H of unfolding
Catalytic activity
Explain the protein denaturation curve
Denaturation: loss of native structure integrity and loss of activity
Proteins can be denatured by: Heat or cold, pH extremes, organic solvents, chaotropic agents (urea, guanidinium hydrochloride)
Denaturation curve is sigmoidal
Tm = melting temperature, the temperature at which protein is 50% denatured and 50% folded
D50% = denaturing concentration at which 50% is denatured and 50% is folded
Denaturation leads to a decrease in fluorescence intensity and increase in wavelength maxima
Ex. Scan for different concentrations of urea and take each point at which difference is the largest to form a sigmoidal curve
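The sigmoidal shape can be reproduced with a simple two-state model. The sketch below assumes that stability falls linearly with denaturant concentration (stability = dg_water - m_value x [denaturant]); this linear model and the values of dg_water and m_value are assumptions for illustration, not taken from the cards.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def fraction_denatured(denaturant, dg_water, m_value, temperature=298.15):
    """Fraction of protein in the D state for a two-state D <-> N equilibrium."""
    stability = dg_water - m_value * denaturant           # DeltaG(D-N) at this concentration
    ratio_d_to_n = math.exp(-stability / (R * temperature))  # [D]/[N]
    return ratio_d_to_n / (1.0 + ratio_d_to_n)

dg_water = 8.0  # kcal/mol, stability without denaturant (assumed)
m_value = 2.0   # kcal/(mol*M), sensitivity to denaturant (assumed)
d50 = dg_water / m_value  # concentration where stability is zero

for urea in range(0, 9):
    print(f"{urea} M urea: {fraction_denatured(urea, dg_water, m_value):.3f} denatured")
```

In this model D50% is the concentration at which the stability is zero, so the populations of D and N are equal.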
Explain how circular dichroism (CD) works
CD measures the differential absorption of left and right circularly polarised light by optically active media
Delta e = eL – eR (e = epsilon, the molar absorption coefficient)
Chiral molecules (handedness) are either L or R and are optically active
Circularly polarised light can be clockwise or anticlockwise
Polarised light interacts with chiral molecules, there is differential absorption
Chiral molecules absorb left and right circularly polarised light to different extents
Light is absorbed at diff wavelengths by the protein
Molar ellipticity (difference in absorption between left handed and right handed circularly polarised light)
If light is absorbed equally in both directions (eL=eR) get linearly (plane) polarised light, molar ellipticity=0
If light is absorbed more in one direction than the other (eL is not equal to eR), get elliptically polarised light
CD spectrum gives secondary structure information
Different secondary structure (alpha helix, beta sheet, random coils) have different CD signatures
CD spectrum shows molar absorption difference (of left and right handedness) vs wavelength
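The two ways of reporting a CD signal (absorption difference and molar ellipticity) are interconvertible. A short sketch using the standard conversion factor of 3298.2 between them, plus the usual conversion from a measured ellipticity in millidegrees; the function names and example numbers are hypothetical.

```python
def delta_epsilon_to_molar_ellipticity(delta_epsilon):
    """Convert delta epsilon (M^-1 cm^-1) to molar ellipticity (deg cm^2 dmol^-1)."""
    return 3298.2 * delta_epsilon

def molar_ellipticity_from_mdeg(theta_mdeg, concentration_molar, path_cm):
    """Molar ellipticity (deg cm^2 dmol^-1) from measured ellipticity in mdeg."""
    return theta_mdeg / (10.0 * concentration_molar * path_cm)

# Hypothetical measurement: -20 mdeg, 10 micromolar protein, 0.1 cm cuvette
theta = molar_ellipticity_from_mdeg(-20.0, 1e-5, 0.1)
print(f"molar ellipticity = {theta:.3g} deg cm^2 dmol^-1")
```

Equal absorption of both handedness (delta epsilon of zero) gives zero ellipticity, matching the plane-polarised case on the card.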
Explain the background of circularly polarised light
Light (electromagnetic radiation) is an oscillating wave of E-fields (electric) and B-fields (magnetic)
Focusing only on E-field
Oscillations can be Planar (horizontal and vertical components in phase, only in one plane) or Circular (horizontal and vertical out of phase, in 2 planes) or Elliptic (a mix)
Circular polarized light is 2 waves in different planes and one has a 90 degree phase shift
The phase shift can be -90 degrees or +90 degrees
Circular polarised light is chiral (handedness)
Left circularly polarized light = counter clockwise
Right = clockwise
Pros and cons of CD spectra
Pros:
CD spectrum recorded in the far-UV (190-240nm)
3D fold of secondary structure gives a particular CD signature
Allows monitoring of folding and unfolding
It can be used to monitor protein folding but also biological activity
Cons:
Lack of structural resolution (Only shows secondary structure, not at amino acid level)
Often over interpreted so need to combine with other techniques
Explain what the Levinthal paradox is
Assume 3 conformations per amino acid in the denatured state (psi and phi angle allowed conformations)
For 100 amino acid protein there are 3^100 potential conformations
If the chain can sample 1x10^12 (trillion) conformations/sec it will take 2x10^28 years to reach the native state
There are too many different possible conformations for a protein to fold by a random search of conformations
A protein folds by following defined pathways
Most single domain proteins fold in milliseconds to seconds
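The arithmetic behind the paradox can be checked directly, using the numbers from the card (3 conformations per residue, 100 residues, 10^12 conformations sampled per second). The only added assumption is a year of 365.25 days.

```python
conformations = 3 ** 100              # 3 states for each of 100 residues
sampling_rate = 1e12                  # conformations sampled per second
seconds_per_year = 365.25 * 24 * 3600

search_time_years = conformations / sampling_rate / seconds_per_year
print(f"conformations: {conformations:.2e}")
print(f"random search time: {search_time_years:.1e} years")
```

The result is of the order of 10^28 years, against real folding times of milliseconds to seconds.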
Explain what makes a protein fold so fast and how intermediates can be trapped
A series of conformations are encoded in the protein sequence
Search for minimum energy conformation is not random
Direction towards native structure is a funnelled energy landscape
Free energy decreases down the diagram
At bottom (lowest free energy) is the native state
Energy landscape has local minima
Partially folded proteins (intermediates) can be trapped
This will slow down/inhibit folding
If hydrophobic groups are exposed, protein will aggregate or stick to other proteins
Explain how FRET works and when it is used
FRET (fluorescence resonance energy transfer) measures two state folding (no intermediate species D<–>N, single domain proteins)
Two fluorophores are labelled on molecules of interest
One fluorophore is the donor, one is the acceptor
Donor absorbs light at a specific wavelength and emits it at a longer wavelength in the visible spectrum
Acceptor has absorption spectrum that overlaps with emission spectrum of donor
Excite donor with specific absorption wavelength
Donor can transfer energy to acceptor when they are in close proximity (1-10nm)
Acceptor fluorophore is excited and emits light at its characteristic wavelength
EXAMPLE
Red fluorescent protein absorbs yellow light and emits red light (that has lower energy) - Acceptor
Green fluorescent protein absorbs blue and emits green - Donor
When close together, green donor will pass energy to acceptor chromophore and will emit in red
How is FRET used to determine protein folding?
FRET used for two state folding
In D state: separation between dyes is larger and transfer efficiency is small
In N state: separation is smaller and transfer efficiency is large
At equilibrium, still movement between D and N states (protein is folding and unfolding)
At D50 protein will be 50% denatured and 50% native
Denaturation is reversible, corresponds to jumps in FRET
No intermediate steps in FRET
Process is highly cooperative
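The difference in transfer efficiency between D and N follows from the steep distance dependence of FRET, E = 1 / (1 + (r/R0)^6), where R0 is the distance giving 50% transfer. The sketch below assumes R0 = 5 nm and hypothetical dye separations for the two states.

```python
def fret_efficiency(distance_nm, r0_nm=5.0):
    """Transfer efficiency for a donor-acceptor pair (r0_nm is an assumed value)."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)

native_distance = 2.5      # nm, dyes close together in N (assumed)
denatured_distance = 10.0  # nm, dyes far apart in D (assumed)

print(f"N state: E = {fret_efficiency(native_distance):.3f}")
print(f"D state: E = {fret_efficiency(denatured_distance):.3f}")
```

Because of the sixth-power dependence, a few nanometres of extra separation is enough to switch the signal from almost full transfer to almost none.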
How is FRET measured with a device and what is the result of it in graphical form?
Stopped flow device:
Syringe with denatured and non-denatured protein
Are mixed together to initiate refolding or unfolding
Mixture flows into a quartz cell and is read by a fluorescence detector
Measure difference in signal between N and D state
Get single exponential relaxation curve for 2 state process (look at notes)
Can measure folding and unfolding rate to get delta G of protein
DeltaG(folding, D to N) = -RT ln(kfolding/kunfolding)
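A minimal sketch of how measured rate constants give the folding free energy, using DeltaG = -RT ln(kfolding/kunfolding) with the natural logarithm of the rate ratio. The rate constants below are hypothetical.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def folding_free_energy_from_rates(k_folding, k_unfolding, temperature=298.15):
    """Free energy of folding (D to N) from two-state rate constants."""
    return -R * temperature * math.log(k_folding / k_unfolding)

k_folding = 1000.0   # per second (assumed)
k_unfolding = 0.001  # per second (assumed)

delta_g = folding_free_energy_from_rates(k_folding, k_unfolding)
print(f"DeltaG(folding) = {delta_g:.2f} kcal/mol")
```

A folding rate a million times faster than the unfolding rate corresponds to roughly -8 kcal/mol, within the 5-15 kcal/mol range quoted for small proteins.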
What do absorbance/fluorescence vs time curves for D to N states look like for 2 state and 3 state folding?
For 2-state (D-N): Single exponential curve
Subtract experimental data from theoretical exponential curve to get a straight line
If points are close to the X-axis fit is good
For 3 or 4 state proteins: add up multiple exponentials
Formation of intermediates occurs at different speeds
Fit data to double exponential curve (D-I-N) or fit curve to triple exponential curve (D-I-I-N) to see which one gives a straight line
Formation of intermediate occurs very quickly (4ms)
Formation of N state takes longer (400ms)
Clear separation of formation of intermediate and N state
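The residual test described above can be sketched with synthetic data. The trace below is generated from two phases, treating 4 ms and 400 ms as time constants with assumed amplitudes; it is then compared with a single-exponential and a double-exponential model. No curve fitting is done here, the model parameters are simply set by hand.

```python
import math

def single_exponential(t, amplitude, tau):
    """Signal rising to 'amplitude' with time constant 'tau'."""
    return amplitude * (1.0 - math.exp(-t / tau))

def double_exponential(t, amp_fast, tau_fast, amp_slow, tau_slow):
    """Sum of a fast phase (D to I) and a slow phase (I to N)."""
    return (single_exponential(t, amp_fast, tau_fast)
            + single_exponential(t, amp_slow, tau_slow))

times_ms = [0, 1, 2, 4, 8, 20, 50, 100, 400, 1000, 2000]
# Synthetic three-state trace; amplitudes 0.4 and 0.6 are assumed
data = [double_exponential(t, 0.4, 4.0, 0.6, 400.0) for t in times_ms]

residuals_single = [d - single_exponential(t, 1.0, 400.0)
                    for t, d in zip(times_ms, data)]
residuals_double = [d - double_exponential(t, 0.4, 4.0, 0.6, 400.0)
                    for t, d in zip(times_ms, data)]

print("largest single-exponential residual:", max(abs(r) for r in residuals_single))
print("largest double-exponential residual:", max(abs(r) for r in residuals_double))
```

The single-exponential residuals deviate strongly at early times, where the fast phase dominates, while the double-exponential residuals lie on the x-axis.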
Explain the principle of the Φ (phi) value analysis, what it is used for and assumptions
Experimental method to study the structure of the TS at residue level
Specific probes are used to distinguish between D and N state (CD, fluorescence, NMR)
Series of conservative (non-disruptive) deletion mutations that truncate side chains, one residue at a time
Removal of methyl groups
T to S, I to V (1methyl) V to A (2 methyl) I/L to A, F to L, X to A (3 methyl, more aggressive)
Any mutation should end up with alanine and not glycine, because glycine has no side chain, which affects the conformation of the backbone
Compare differences in thermodynamics and folding rates between Wild type and mutant
Measure Φ values at different time points to make kinetic profiles
Assumptions: Assume that mutations don’t affect the structure of the N state and don’t create new interactions during folding
Explain transition state theory and folding rate
Occurs for all chemical reactions
Used to analyse two state folding
Free energy (G) vs reaction coordinate
N state has lower free energy than D state
Energy barrier from D to TS
Structure of TS tells us what interactions need to form to make the N state
TS has high energy: very short lived and unstable so can’t be characterised structurally
Folding rate is proportional to the exponential of the negative activation free energy (energy difference between TS and D state) divided by RT
k = A exp(-DeltaG(TS–D)/RT), where A is a pre-exponential factor
Use mutational analysis to determine two-state folding kinetics (and structure of TS)
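Because the rate depends exponentially on the barrier height divided by RT, small changes in the activation free energy give large changes in rate. A short sketch, assuming the pre-exponential factor is the same for both proteins being compared so that it cancels in the ratio:

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def relative_folding_rate(barrier_change, temperature=298.15):
    """Ratio k_new / k_old when the D-to-TS barrier changes by barrier_change (kcal/mol)."""
    return math.exp(-barrier_change / (R * temperature))

# Raising the barrier by about 1.36 kcal/mol at 25 degrees C slows folding about 10-fold
print(f"{relative_folding_rate(1.364):.3f}")
```

Lowering the barrier by the same amount would speed folding up by the same factor.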
How is Φ value analysis used for a 2-state folding process and how is the Φ value interpreted?
Mutations change the thermodynamic stability of N state
DeltaDeltaG(D-N) = DeltaG(D-N)(mut) - DeltaG(D-N)(wt)
Folding rate (k) is related to activation free energy (energy diff between TS and D)
k = A exp(-DeltaG(TS–D)/RT), where A is a pre-exponential factor
The faster the folding rate, the lower the energy barrier between D and TS
Some mutations can change the folding rates (k)
Measure folding rate of mutant and wild type and get difference in delta G
DeltaDeltaG(TS–D) = -RT ln (kwt /kmut)
The Φ value reports how far the native interactions of the mutated residue are already formed in the transition state
Ratio between delta delta G TS-D and N-D
When Φ=1, mutated residue is native-like in the TS: that region is structured in the TS and is part of the folding nucleus of the protein (mutation affects kinetic and thermodynamic stability equally)
When Φ=0, mutated residue is non-native and unstructured in the TS; its interactions are only formed in the N state (mutation did not affect folding rate)
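A minimal sketch of the calculation, using the two differences as written on the cards: the kinetic term -RT ln(kwt/kmut) and the stability term DeltaG(D-N)(mut) - DeltaG(D-N)(wt). Both are negative for a destabilising mutation that slows folding, so their ratio is positive. The function name and the example numbers are hypothetical.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def phi_value(k_wt, k_mut, stability_wt, stability_mut, temperature=298.15):
    """Ratio of the mutation's effect on the folding barrier to its effect on stability."""
    ddg_kinetic = -R * temperature * math.log(k_wt / k_mut)  # change in TS-D barrier term
    ddg_stability = stability_mut - stability_wt             # change in DeltaG(D-N)
    return ddg_kinetic / ddg_stability

stability_wt, stability_mut = 8.0, 6.5  # kcal/mol, assumed stabilities

# Mutation that leaves the folding rate unchanged
phi_unstructured = phi_value(100.0, 100.0, stability_wt, stability_mut)

# Mutation whose full stability loss (1.5 kcal/mol) shows up in the folding rate
k_mut_slow = 100.0 * math.exp(-1.5 / (R * 298.15))
phi_structured = phi_value(100.0, k_mut_slow, stability_wt, stability_mut)

print(f"unchanged rate: {phi_unstructured:.2f}, slowed rate: {phi_structured:.2f}")
```

The two cases correspond to the limiting interpretations on the card; real mutants usually give intermediate values, and very small stability changes make the ratio unreliable.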
Explain the 2 competing folding pathways of D1pPDZ and how the Φ value is used for structural refinement of TSE?
PDZ domains are small domains
Adopt a six stranded beta sandwich fold with two alpha helices (bbbabbab)
D1pPDZ =PDZ domain of the D1 C terminal processing protease (D1p)
Has kinetic competition between 2 folding pathways: to native and to misfolding to off pathway metastable intermediate
Analysed by circularly permuted protein: change location of N and C termini
When mutants were produced, only got stable I state
N state is unstable so no advantage in forming it
Misfolded intermediate has alternative packing of N terminal of beta hairpin
TSE = transition state ensemble (the set of conformations at the top of the folding barrier); D1pPDZ is a protein that forms a stable non-native intermediate (intermediate that doesn’t lead to formation of N state)
What is the average protein concentration in a human?
Average MW of proteins in human cell = 50kDa
Average protein concentration = 20-30% (very high)
Average protein concentration (molar) = 4-6mM (varies with cell type)
But many will not reach more than 0-5mM
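The molar figure follows from the mass figure and the average molecular weight. The sketch below assumes the 20-30% is weight per volume (1% = 10 g per litre), which the cards do not state explicitly.

```python
def molar_concentration_mM(percent_w_v, molecular_weight_da):
    """Convert a % (w/v) protein concentration to millimolar."""
    grams_per_litre = percent_w_v * 10.0  # 1% w/v = 1 g per 100 mL = 10 g/L
    return grams_per_litre / molecular_weight_da * 1000.0

average_mw = 50_000  # Da, average protein in a human cell

low = molar_concentration_mM(20.0, average_mw)
high = molar_concentration_mM(30.0, average_mw)
print(f"{low:.1f} to {high:.1f} mM")
```

Under that assumption, 20-30% protein at 50 kDa reproduces the 4-6 mM range.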
What different factors affect protein folding in vivo?
Protein is surrounded by many other proteins, lipids, polysaccharides in vivo
Crowding
Protein aggregation
Cellular components
Compartments
What are chaperones and what are the different types?
Proteins that (mostly) use ATP to bind unfolded or misfolded proteins and set them on the folding pathway again
Bacterial trigger factor: no ATP needed, ribosome associated, acts early
HSP70: requires ATP, eukaryotes, bacterial homologue/name in E. coli is DnaK, reverses denaturation/aggregation, works with HSP40 (DnaJ in E. coli)
Chaperonins (HSP60): have bacterial and eukaryotic homologues, large, multi-subunit, cage-like (Type I for bacteria, mitochondria, chloroplasts, Type II for archaea/eukaryotes)
HSP90 (eukaryotic): facilitates late stage folding of signalling proteins, unique regulatory role and induces active conformation
Nucleoplasmins: decameric, acidic, nuclear (mostly in nucleus), assembles nucleosomes
Protein disulphide isomerase (PDI)
Peptidyl-prolyl isomerase (PPI)
HSP70 and HSP60 are needed by E. coli to survive above 30 °C
What is the structure of the ribosome and what does S mean?
E. coli ribosome is 25 nm diameter, 2.52 MDa (2,520kDa) in mass
Two unequal subunits that dissociate at < 1mM Mg2+
S (Svedberg) is the unit of the sedimentation coefficient: it describes the speed at which a molecule sediments under a strong centrifugal force (1 S = 10^-13 seconds)
Bigger molecules travel faster under significant force
50S = travels about 50 micrometres per second under a force of 1 million g
30S subunit is 0.93 MDa with 21 proteins and a 16S rRNA
50S subunit is 1.59 MDa with 31 proteins and two rRNAs: 5S and 23S
Ribosomes are 60% RNA
20,000 ribosomes in a cell, 20% of cell’s mass
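The 50S figure can be checked from the definition of the Svedberg (1 S = 10^-13 seconds): sedimentation speed is the sedimentation coefficient multiplied by the applied acceleration. The sketch takes standard gravity as 9.81 m/s^2.

```python
SVEDBERG_SECONDS = 1e-13  # 1 S expressed in seconds
STANDARD_GRAVITY = 9.81   # m/s^2

def sedimentation_speed_um_per_s(s_value, g_force):
    """Sedimentation speed in micrometres per second for a particle of s_value Svedberg."""
    acceleration = g_force * STANDARD_GRAVITY            # m/s^2
    speed_m_per_s = s_value * SVEDBERG_SECONDS * acceleration
    return speed_m_per_s * 1e6                           # convert m/s to micrometres/s

speed_50s = sedimentation_speed_um_per_s(50, 1e6)
speed_30s = sedimentation_speed_um_per_s(30, 1e6)
print(f"50S: {speed_50s:.1f} um/s, 30S: {speed_30s:.1f} um/s")
```

At one million g the 50S subunit moves at close to 50 micrometres per second, and the smaller 30S subunit moves proportionally slower.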
What are the three phases of protein synthesis?
Initiation: binding of mRNA and initiator aminoacyl-tRNA to small subunit, initiation factors bind, then large subunit binds
Elongation: synthesis of all peptide bonds, tRNAs bind the aminoacyl (A) and peptidyl (P) sites
Termination: release factors (not a tRNA) recognize the stop codon and release the finished polypeptide
What is the trigger factor and why is it important in protein synthesis?
Newly synthesized proteins leave the ribosome through a narrow exit tunnel in the large subunit, which starts at the peptidyl transferase centre (PTC)
When synthesis is still ongoing, nascent protein chains may form misfolded intermediates
These are dangerous to the cell: exposed hydrophobic regions lead to aggregation OR the intermediates can be substrates for proteolysis
Trigger factor is an ATP-independent chaperone (in bacteria) that also displays PPIase activity
It binds to ribosomal protein L23 and receives the synthesized protein
Allows protein to fold by ex. recognizing hydrophobic regions
If protein is very long more than one Trigger Factor binds to the protein
What is the structure of the trigger factor and individual domains?
N-terminal domain: Has some chaperone activity, L23 ribosome binding
Linker: links N and P domain
P domain: Peptidyl-prolyl activity, Auxiliary (some) chaperone activity
C-terminal domain: Main chaperone activity
Protein is constitutively expressed, but increased expression at low temperatures (when cell is cold shocked) - it is NOT a heat shock protein
Domains are flexible (rotation) and each domain is flexible on the inside (local flexibility)
Allows protein to bind substrates with other conformations
What are heat shock proteins?
Heat shock proteins are one of the biggest families of chaperones
Are expressed at low levels normally in the cell
Are overexpressed when the cells are under high temperatures
Prevent aggregation and misfolding
Bind to nascent polypeptides to prevent premature folding
Facilitate membrane translocation/import by preventing folding prior to membrane translocation
Facilitate assembly/disassembly of multiprotein complexes (ex. nucleosome)