Module 5- Interactions/ Analysis/ Modifications Flashcards
Protein abundance in a yeast cell
~42 million proteins per yeast cell
Abundance varies from 3/5 copies to 7.5E5 copies per cell
Median abundance of 2622
Protein abundance in a human cell roughly
~1/2 of a cell's dry mass is protein
~300mg/mL of protein in a cell
Human has ~20,000 protein encoding genes
Generation of protein diversity in humans
Comes from isoforms and PTMs
1 gene can lead to a range of different proteins, which differ through genetic mechanisms (splicing creating isoforms) and PTMs
Different protein interactions leading to biological functions
Genetic pathways- signalling pathways/ sequential interactions
Pathway scaffolding
Enzymatic reactions
Molecular machines- form stable complexes that change little
Things important in protein-protein interactions
Domains are cornerstone
Intrinsically disordered regions important
Features of intrinsically disordered regions
Not well defined structures
Highly modified and therefore have a lot of diversity- more tolerant to PTMs, insertions and deletions
Domains bind to disordered regions
What is a yeast-two-hybrid assay
A transcription factor's DNA-binding domain and activation domain, which normally work together in one protein, are separated and fused to the two test proteins (bait and prey); if bait and prey interact, the two domains are brought back together and transcription is activated
Two possible readouts: transcription of an essential gene, giving growth, or transcription of a reporter gene, such as one causing fluorescence
Pair-wise interactions
Affinity purification mass spectrometry
Extract proteins and complexes from cells, use antibodies to purify the complexes, then do mass spec to identify their components. Doesn’t just give pair-wise interactions
First example of experiment done- what does it mean that they had the largest data set compared to literature curated data but there was still heaps missing?
Luck et al. 2020
They compared the dataset from their experiment with data curated from the literature and had the biggest and best dataset
Isn’t complete because it was a yeast two-hybrid test, so not all interactions in mammalian cells will occur in yeast cells, eg ones needing a PTM, ones that only occur within a larger complex, or membrane proteins that are mis-localised and so are not with their partner
First example of experiment done- what two methods were used to validate interactions found in yeast two-hybrid, making hypothesis to be tested
Method 1: protein X attached to JAK protein, protein Y attached to STAT
If X and Y interact, JAK phosphorylates STAT, and the phosphorylated STAT then causes transcription of the reporter gene
Method 2: X and Y are each attached to proteins that give a fluorescent result when brought together
Different types of protein-protein interaction interfaces
Pre-formed interface
Conformational change leading to an adaptation of shape to cause interaction
Folded/ ordered domains binding disordered structure
Disordered structure folding upon binding with domain
Two disordered structures coming together and folding
General properties of interfaces/ how are they characterised
Overall amount of surface area buried
Chemical composition of the buried surfaces (enriched for aromatic residues)
Shape and charge complementarity of occluded surfaces (close packing)
Specific interactions such as hydrogen bonds and decreased flexibility
Features of size of the interface/ buried surface area (BSA)
Difference in surface area of two proteins alone compared to when they are in a complex
Can cause underestimation as it assumes that proteins come together without conformational change, which isn’t always the case
Average ~18,000 which is an overestimate, as mainly only large stable complexes have been solved
Fewer weak transient complexes have been solved, and these tend to be smaller
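A minimal sketch of the BSA calculation on this card, assuming the solvent-accessible surface areas (in Å²) have already been calculated with some other tool. The function name and the example numbers are hypothetical, for illustration only.

```python
def buried_surface_area(sasa_a, sasa_b, sasa_complex):
    """BSA = surface area of the two proteins alone minus that of the complex (all in A^2)."""
    return sasa_a + sasa_b - sasa_complex


# Hypothetical surface areas, not taken from any real structure
bsa = buried_surface_area(sasa_a=9000.0, sasa_b=7000.0, sasa_complex=14400.0)
print(bsa)  # 1600.0
```

If the "alone" areas are measured on the chains exactly as they sit in the complex, this is the no-conformational-change assumption mentioned above.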
Foglizzo and how interface was found
Size exclusion of the protein showed that it was a dimer, and each subunit had three functional units. Interface area was small and there were three possible ways it could have come together with this small interface
Made mutations in each possible interface and ran size exclusion on each; one interface mutation led to a monomer, so this must be the real interface, as the mutation meant they could not come together as a dimer
More experiments could then be done to find out how it interacts
Composition of interface residues
Ratio (how often a residue is found in interfaces relative to the rest of the surface) of <1= residue less likely to be in interface eg acidic or polar
>1= residue more likely to be in interface eg aromatic and arg
Core and rim of interface
Core of interface is buried and has no contact with solvent, likely to have aromatic and hydrophobic residues, look like protein interior
Rim of interface has parts buried and exposed to solvent, likely to have more polar residues here, looks like rest of non-binding surface
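A rough sketch of the residue ratio used to describe interface composition (>1 = enriched in the interface, <1 = depleted). It assumes the ratio is a residue's frequency in the interface divided by its frequency over the whole surface, which is one common definition and may not be exactly the one used in the lecture. The residue lists are made-up toy data.

```python
from collections import Counter


def interface_propensity(interface_residues, surface_residues):
    """Frequency of each residue type in the interface divided by its frequency on the whole surface."""
    iface = Counter(interface_residues)
    surf = Counter(surface_residues)
    n_iface = len(interface_residues)
    n_surf = len(surface_residues)
    return {aa: (iface[aa] / n_iface) / (surf[aa] / n_surf) for aa in surf}


# Hypothetical toy protein: 10 surface residues, 4 of which are in the interface
surface = ["TRP"] * 2 + ["ASP"] * 6 + ["ARG"] * 2
interface = ["TRP"] * 2 + ["ASP"] * 1 + ["ARG"] * 1
ratios = interface_propensity(interface, surface)

for aa, ratio in sorted(ratios.items()):
    label = "enriched" if ratio > 1 else "depleted"
    print(f"{aa}: {ratio:.2f} ({label} in interface)")
```

In this toy example the aromatic residue (TRP) and ARG come out above 1 and the acidic residue (ASP) below 1, matching the trend on the cards.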
Water molecules in interface interactions
Often make key connections, often sitting around edge of rim near polar residues
May be found in unexpected areas, not likely to be there though
Obligate and non-obligate
Obligate= always in a complex
Non-obligate= regulated interactions, sometimes monomers, sometimes dimers, sometimes in complexes
Shape and charge complementarity- scoring
Different scores used to measure packing of proteins in complex
How close residues are, how many holes there are, how they come together- complementarity
When interfaces are small this is hard to determine
Close packed= 0.7, likely real interface
Crystal packing, not close= 0.3/0.4
Conformational change and complementarity and size
Smaller interface has less conformational change- less entropic cost= more pre-organised and fit together
Larger interface has more conformational change- more entropic cost
Conformation and anchor residues
Small interfaces tend to have anchor residues which come together and help binding interactions
Other residues then move around these anchors to optimise contacts
Three types of ways proteins may come together and interact and which is more likely
Compact and interact together- conformational change
No conformational change, just come together
Extended interaction- residues reach and contact each other- optimise contact= most likely
What is AlphaFold
Takes knowledge of sequence and structure and builds a protein model
Early days, can possibly be applied to protein complexes
AlphaFold being used to predict complexes in yeast example
Humphreys et al. 2021
Many structures were predicted
Not good for transient interactions
Better prediction for large stable complexes
Some binary complexes that were found are normally part of a larger complex, but more info is needed
Again, reiterates early days
Things predicted to be at the interfaces in human protein interaction network
Predict disease causing mutations are at interface and phosphorylation sites which lead to regulation of protein interactions
What is a pDockQ based on
Proximity- number of residues in close proximity, beta carbons within 10 angstroms, greater number in close proximity, more likely to be a contact
pLDDT- AlphaFold prediction score, how confident the prediction is
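A small sketch of the proximity part of the score: counting residue pairs across the two chains whose beta carbons are within 10 angstroms. The coordinates are invented, and the way pDockQ then combines this count with pLDDT is not given on the card, so it is not reproduced here.

```python
import math


def count_interface_contacts(cb_chain_a, cb_chain_b, cutoff=10.0):
    """Count residue pairs (one from each chain) whose C-beta atoms are within `cutoff` angstroms."""
    contacts = 0
    for a in cb_chain_a:
        for b in cb_chain_b:
            if math.dist(a, b) <= cutoff:
                contacts += 1
    return contacts


# Hypothetical C-beta coordinates (x, y, z) in angstroms
chain_a = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
chain_b = [(6.0, 0.0, 0.0), (0.0, 8.0, 0.0), (50.0, 50.0, 50.0)]
n_contacts = count_interface_contacts(chain_a, chain_b)
print(n_contacts)  # 2
```

More contacts means more residues in close proximity, so the predicted interface is more likely to be real.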
Orthogonal data eg cross-links in protein interaction determination
Cross-linking reagents can be used; they are bifunctional, with a spacer and reactive ends which bind to different proteins
Mass spec can be used to identify cross-linked residues
Can sort cross-link data based on pDockQ; models with higher-confidence scores showed more cross-links that fit (linked residues closer together than the spacer allows, ie less than a certain number of angstroms)
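One way to picture the cross-link check, as a sketch only: for each cross-linked residue pair, measure the distance in the predicted model and see whether it is within the limit the spacer allows. The limit is passed in as a parameter because the card does not give the number; the 30.0 used below is a placeholder, and the coordinates are invented.

```python
import math


def fraction_crosslinks_satisfied(crosslinks, max_distance):
    """Fraction of cross-linked residue pairs whose distance in the model is within max_distance (angstroms)."""
    if not crosslinks:
        return 0.0
    satisfied = sum(1 for a, b in crosslinks if math.dist(a, b) <= max_distance)
    return satisfied / len(crosslinks)


# Hypothetical coordinates of cross-linked residue pairs in a predicted complex
crosslinks = [
    ((0.0, 0.0, 0.0), (10.0, 0.0, 0.0)),  # 10 angstroms apart
    ((0.0, 0.0, 0.0), (0.0, 40.0, 0.0)),  # 40 angstroms apart
]
frac = fraction_crosslinks_satisfied(crosslinks, max_distance=30.0)  # placeholder limit
print(frac)  # 0.5
```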
Location of proteins and affinity of interactions
Obligate oligomers- high affinity
Non-obligate permanent PPIs- don’t need to interact, but when they do they have high affinity
Non-obligate triggered transient PPIs- high affinity, regulatory
Non-obligate co-localised PPIs- moderate affinity, regulatory
Non-obligate weak transient PPIs- dependent entirely on concentration, low affinity
Ways and approaches to validate data out of high throughput
Qualitative and quantitative
GST pulldown (qualitative)
Isothermal titration calorimetry, ITC (quantitative)
Surface plasmon resonance SPR (quantitative)
Features of protein-protein interaction domains
Independently folded, 35-150 aas, can still bind target if expressed independently
Binding properties of isolated domain reflect those of intact proteins
N- and C-termini close in space with ligand binding site on opposite face
Folding allows domains to be connected without disrupting function
What are SH2 (~115) and SH3 (~300) domains in human genome
Src homology
Src was the first cancer-causing protein discovered, showed cancer is caused by mutations
Regulate kinase activity (SH1); Src has Tyr527 near the C-terminal end of the protein, commonly mutated in cancer
Features of proline binding motifs making them play a key role as a docking site for signalling proteins
Unusual shape of pyrrolidine ring
Constrained dihedral angles
Substituted amide nitrogen
Relative stability of cis isomer
How does isothermal titration calorimetry (ITC) work
Reaction cell filled with protein solution, injection syringe filled with ligand solution
Small volumes of ligand injected into cell, triggering binding reaction
Exothermic- sample becomes warmer, giving a sequence of downward peaks
When binding saturation is reached, remaining heat effects (if present) are due to mechanical and dilution effects
Area under each peak plotted versus molar ratio, gives Ka
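Once Ka is known, two standard relationships turn it into more familiar numbers: Kd = 1/Ka and ΔG = -RT ln(Ka). A minimal sketch, assuming Ka is in M^-1 and the experiment is at 25 °C; the Ka value is hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def kd_from_ka(ka):
    """Dissociation constant (M) is the reciprocal of the association constant (M^-1)."""
    return 1.0 / ka


def delta_g_from_ka(ka, temperature_k=298.15):
    """Standard free energy of binding in kJ/mol, from dG = -RT ln(Ka)."""
    return -R * temperature_k * math.log(ka) / 1000.0


ka = 1.0e6  # hypothetical Ka in M^-1
print(kd_from_ka(ka))       # Kd in M (about 1 micromolar)
print(delta_g_from_ka(ka))  # roughly -34 kJ/mol
```

A larger Ka means a smaller Kd and a more negative ΔG, ie tighter binding.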
How does surface plasmon resonance (SPR) work
Measures based on kinetics, measures kon and koff rates
One binding partner immobilised on a surface, the other flowed over the top
Reflected light tells how much is bound, can measure the association and dissociation
Compare sensorgrams for different interactions
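A sketch of how kon and koff shape a sensorgram, assuming simple 1:1 binding (real data may not follow this model). Under that assumption Kd = koff/kon, the association phase rises towards a plateau and the dissociation phase decays exponentially. All rate constants and the Rmax value are hypothetical.

```python
import math


def kd_from_rates(kon, koff):
    """Equilibrium dissociation constant (M) from kon (M^-1 s^-1) and koff (s^-1)."""
    return koff / kon


def association_response(t, conc, kon, koff, rmax):
    """Response at time t (s) while analyte at concentration conc (M) flows over the surface."""
    kobs = kon * conc + koff          # observed rate of approach to the plateau
    req = rmax * kon * conc / kobs    # plateau (equilibrium) response
    return req * (1.0 - math.exp(-kobs * t))


def dissociation_response(t, r0, koff):
    """Response at time t (s) after analyte flow stops, starting from response r0."""
    return r0 * math.exp(-koff * t)


kon, koff = 1.0e5, 1.0e-3  # hypothetical rate constants
kd = kd_from_rates(kon, koff)
print(kd)  # about 1e-8 M, ie 10 nM
# When the analyte concentration equals Kd, the plateau is half of Rmax
print(association_response(t=1.0e5, conc=1.0e-8, kon=kon, koff=koff, rmax=100.0))
```

Two interactions with the same Kd can have very different sensorgrams (fast on/fast off versus slow on/slow off), which is why comparing sensorgrams is informative.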
PPI paper 1 and how they validated novel interactions found at high confidence
Burke et al. 2023
Cross-linking data between proteins
Disease causing mutations prevalent at interfaces
Phosphorylation sites at interfaces
Recognition of prolines for proline binding motifs
Often stretches of proline (polyproline type II helices) are favourable for binding
3 residues per turn, rings and carbonyl groups regularly positioned- backbone restricted
Carbonyls make no intra-molecular H bonds, so are free to make interactions, which binding proteins take advantage of
SH3 domain and proline rich regions
SH3 has two antiparallel β-sheets and two variable loops (RT and n-Src)
Binds ligand in polyproline II helix
Recognition relies on N-substituted proline
Variability in loops confers some selectivity
More on how SH3 binds proline-rich regions
One or two residues at end of PPII helix recognised by loops eg Arg- loops required for specificity
Two xP grooves for proline binding
Three aromatic residues make areas for proline to bind (xP grooves)