Expression of Recombinant Proteins in E. coli Flashcards
Why study proteins?
- Proteins are the workhorses of the cell: structure, transport, catalysis, etc.
- Knowledge of protein function and action can help with e.g. drug design and engineering enzymes for use in e.g. biocatalysis
How do we study proteins?
- Purification: biochemical/biophysical characterization of isolated proteins, structural studies (X-ray, NMR, cryo-EM)
- In vivo studies: protein labelling techniques
- Cloning and overexpression of proteins: higher yield of purified proteins, possibility to mutate amino acid residues specifically (engineering function, studying function)
Protein-based therapeutics
- Protein-based therapeutics include: hormones, cytokines, vaccines, monoclonal antibodies
- One third are produced in E. coli
- Currently, 140 protein-based therapeutics have been approved and 500 are in clinical trials
Protein Purification
What is it?
- Purify a single protein from a mixture of proteins
Why is it needed?
- Separation from proteins with similar function to measure activity/function of one particular protein
- Comparison of mutant proteins
- Structural studies (X-ray, NMR)
Protein Purification – pre-DNA Technology Era
- Proteins were purified from the native organism/tissue
- No overexpression; most proteins typically occur at <1% of total protein: purifying cytochrome c from ca. 1 kg of horse heart yields 3-4 mg of pure protein
- With the rise of recombinant DNA technology and next-generation sequencing, these methods are rapidly disappearing from modern research
- Other developments: new overexpression systems, Tag-based protein purification, cell free expression systems
- ‘Old-fashioned’ methods are still used for proteins that require extensive and very specific post-translational modifications
Protein Production & Purification - DNA Technology Era
- Developments in high-throughput genome sequencing: the coding sequence of our protein of interest is typically known
- The gene of interest can be amplified by PCR and cloned into a protein overexpression vector: inducible overexpression results in high levels of protein in dense cultures, the gene can be manipulated, and purification tags are easy to introduce
Protein Production In a Nutshell
- DNA of the protein-encoding gene (geneAB) is transcribed into mRNA
- mRNA is translated into protein (ProteinAB) by the ribosome
- Several regions upstream of geneAB are crucial for both transcription and translation
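The two steps above can be sketched in a few lines of Python. This is only an illustration: the toy sequence standing in for geneAB is hypothetical, and the codon table is deliberately partial (just the codons the example needs).

```python
# Minimal sketch: coding-strand DNA -> mRNA -> protein.
# The codon table is partial on purpose; unknown codons become "X".
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "AAA": "K", "GGU": "G",
    "UAA": "*", "UAG": "*", "UGA": "*",
}

def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a coding (sense) strand written 5'->3'."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Translate from the first AUG until the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":  # stop codon reached
            break
        protein.append(amino_acid)
    return "".join(protein)

gene_ab = "ATGGCTAAAGGTTAA"  # hypothetical toy sequence standing in for geneAB
mrna = transcribe(gene_ab)
protein_ab = translate(mrna)
print(mrna, protein_ab)
```

The sketch ignores the upstream regions entirely; it only shows the flow of information from gene to protein.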
Elements Required for Transcription/Translation
- The 3’ end of the 16S rRNA is complementary to the Shine-Dalgarno sequence, also known as the ribosome binding site (RBS)
- Ensures that the ribosomal complex binds at the correct location on the mRNA
- Both elements need to be precisely oriented relative to the ATG start codon
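To make the positioning requirement concrete, the sketch below measures the spacing between a Shine-Dalgarno motif and the next ATG. The consensus motif AGGAGG and the example sequence are assumptions for illustration; real ribosome binding sites vary in sequence and in the spacing they tolerate.

```python
SD_CONSENSUS = "AGGAGG"  # assumed consensus motif; real sites deviate from it

def sd_spacing(region: str, sd: str = SD_CONSENSUS):
    """Return the number of nt between the end of the SD motif and the
    first ATG after it, or None if either element is missing."""
    seq = region.upper()
    sd_pos = seq.find(sd)
    if sd_pos == -1:
        return None
    sd_end = sd_pos + len(sd)
    atg_pos = seq.find("ATG", sd_end)
    if atg_pos == -1:
        return None
    return atg_pos - sd_end

upstream = "TTAACTAGGAGGAATTACCATGGCT"  # hypothetical upstream region
print(sd_spacing(upstream))
```

A check like this is a quick sanity test after cloning into a vector: if the spacing between RBS and start codon has changed, translation efficiency may change too.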
Restriction endonuclease–based cloning
- Create PCR product with restriction sites on either end
- Digest PCR product and plasmid vector
- Ligate digested PCR product and digested plasmid vector
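The three steps can be mimicked on strings. This is a rough sketch, not a cloning tool: it models only the top strand, the sequences are made up, and EcoRI (G^AATTC) and BamHI (G^GATCC) are chosen here purely as familiar examples since the cards do not name any enzymes.

```python
# Recognition site and top-strand cut offset for two example enzymes.
ENZYMES = {"EcoRI": ("GAATTC", 1), "BamHI": ("GGATCC", 1)}

def digest(seq, enzyme):
    """Cut the top strand of a linear sequence at every site of one enzyme."""
    site, offset = ENZYMES[enzyme]
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

def double_digest(seq, first="EcoRI", second="BamHI"):
    """Return (left, middle, right) for a sequence that has exactly one
    site for each enzyme, with the first site upstream of the second."""
    left, rest = digest(seq, first)
    middle, right = digest(rest, second)
    return left, middle, right

pcr_product = "TTGAATTCATGGCTAAATAAGGATCCTT"  # hypothetical, sites on either end
vector = "CCCGAATTCAAAGGATCCGGG"              # hypothetical linearised MCS region

_, insert, _ = double_digest(pcr_product)
left_arm, _stuffer, right_arm = double_digest(vector)
recombinant = left_arm + insert + right_arm   # "ligation" of compatible ends
print(recombinant)
```

Using two different enzymes fixes the orientation of the insert, because each end of the insert is compatible with only one end of the cut vector.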
Recombinant Protein Production in E. coli
- PCR (geneAB)
- restriction endonuclease-based cloning
- geneAB and a purification tag are joined in a plasmid vector suitable for overexpression in E. coli
- transformation
- induce enzyme production in E. coli
Inducing enzyme production in E. coli
- Inoculate medium with E. coli containing the overexpression construct and grow at 37 °C to mid-exponential phase (OD600 ≈ 0.5)
- Induce expression of the protein of interest by adding inducer, followed by growth at xx °C (typically between 20 and 37 °C) for 4 hours to overnight
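For planning when to add the inducer, a back-of-the-envelope estimate of the time to reach OD600 ≈ 0.5 can help. The sketch assumes ideal exponential growth with a constant doubling time; the 20 min doubling time and the starting OD are hypothetical values, not taken from the cards, and real cultures have a lag phase.

```python
import math

def minutes_to_target_od(start_od, target_od=0.5, doubling_time_min=20.0):
    """Time for an exponentially growing culture to go from start_od to
    target_od, assuming a constant doubling time (an idealisation)."""
    if start_od <= 0 or target_od <= start_od:
        raise ValueError("need 0 < start_od < target_od")
    return doubling_time_min * math.log2(target_od / start_od)

# Hypothetical culture starting at OD600 = 0.05
print(round(minutes_to_target_od(0.05), 1), "min")
```

In practice the OD600 is measured rather than predicted; the estimate only indicates when to start checking.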
Requirements for production of a protein starting from a gene (i.e. DNA)
- Transcription: RNA polymerase must be able to bind to the promoter and transcribe the gene into mRNA
- Translation: mRNA has to be translated
- How is it achieved?: Insert the gene of interest between the upstream regions needed for starting transcription (promoter) and translation (Shine-Dalgarno, RBS) and the downstream region for terminating transcription (terminator)
Plasmid Vectors for Protein Production in E. coli
- Plasmid vectors for protein production typically contain a multiple cloning site (MCS) at the correct location relative to those 3 elements
- Regulatory elements for transcription are exploited to switch protein production on/off by addition of a specific small molecule to growth medium
Promoter Systems for Protein Production in E. coli
- Integrated into a specific protein expression plasmid
- Based on proteins that prevent RNA polymerase from binding to their associated promoters when the inducer molecule is absent
- Addition of specific small molecules to the growth medium results in repression (small molecule = repressor) or induction (small molecule = inducer) of protein overexpression
- If the repressor can be consumed by the bacterium in which the expression plasmid propagates, so-called autoinduction can be used
Examples of promoter systems
- lac-repressor: based on the regulatory sequences and proteins of the lac operon
- PBAD promoter: based on the arabinose-inducible araBAD system
- tet-repressor: based on the tetracycline repressor
Promoter Systems - PBAD
- In the absence of l-arabinose, the AraC protein is present in a form that doesn’t induce binding of RNApol to the PBAD promoter site
- When l-arabinose binds to AraC, it causes a conformational change that ultimately facilitates binding of RNApol to the promoter site
- High levels of d-glucose in the growth medium indirectly repress expression (glucose lowers cAMP levels, so less cAMP-CAP complex is available to activate the promoter), even in the presence of l-arabinose
- During growth E. coli consumes d-glucose; once it is gone, the l-arabinose that is present induces expression (the latter is not consumed)
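The PBAD behaviour described on this card reduces to a small truth table. The sketch below treats both signals as simple on/off inputs, which is a simplification: real expression levels are graded rather than binary.

```python
def pbad_expression(arabinose_present: bool, glucose_high: bool) -> bool:
    """Expression needs l-arabinose bound to AraC and no glucose repression."""
    return arabinose_present and not glucose_high

for arabinose in (False, True):
    for glucose in (False, True):
        state = "ON" if pbad_expression(arabinose, glucose) else "off"
        print(f"arabinose={arabinose!s:5} glucose_high={glucose!s:5} -> {state}")
```

Only one of the four combinations switches expression on, which is why glucose can be used to hold the system off until it has been consumed.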
Promoter Systems - tet-repressor
- Repressor-based systems regulate gene expression by blocking the binding of RNA polymerase (RNApol) to the promoter, preventing transcription
- Systems consist of a repressor protein that binds to its operator sites as a dimer
- The operator site is located such that the bound repressor prevents RNApol from binding to the promoter site, so no mRNA is made
- Addition of inducer causes a conformational change in the TetR dimer that prevents it from binding to its operator sites
- RNApol can now bind to the promoter and transcribe orfAB into mRNA, producing ProteinAB
lac operon
- 3 structural genes (each produces a protein): lacZ - encodes β-galactosidase, lacY - lactose permease, lacA - thiogalactoside transacetylase
- 3 regulatory genes/sequences: lacP - Promoter, lacI - Repressor, lacO - Operator
How to use the lac operon for recombinant protein expression?
- Replace the lacZ-lacY-lacA cassette with the gene of interest (GOI)
- Induce expression by addition of lactose or synthetic analogue Isopropyl β-d-1-thiogalactopyranoside (IPTG)
- Presence of d-glucose actually enhances repression by LacI (indirectly, similar to PBAD system), even in presence of inducer
- IPTG isn’t metabolised, whereas d-glucose is, so autoinduction can be used for lac-based systems as well
Autoinduction for Protein Production in E. coli
- Inoculate medium containing a consumable repressor (d-glucose) and an inducer (e.g. IPTG) with E. coli containing the overexpression construct and grow at xx °C (typically between 20 and 37 °C)
- During growth, the repressor (d-glucose) is consumed whereas the level of inducer (e.g. IPTG) stays the same
- Ultimately all repressor (d-glucose) is consumed (starting levels are chosen so that this happens during mid-exponential phase) and overexpression starts ‘automatically’, without the manual addition of inducer that is normally needed
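The timing logic of autoinduction can be illustrated with a toy simulation. All numbers below (starting glucose, biomass, consumption rate, doubling per step) are hypothetical, in arbitrary units, and growth is assumed to continue unchanged after glucose runs out; the only point is that expression switches on at the step where the repressor hits zero while the inducer stays constant.

```python
def autoinduction_timeline(glucose=10.0, inducer=1.0, biomass=0.05,
                           growth_per_step=2.0, glucose_per_biomass=4.0,
                           steps=8):
    """Return a list of (step, biomass, glucose, expressing) tuples.
    All quantities are in arbitrary units and purely illustrative."""
    timeline = []
    for step in range(steps):
        # Expression needs inducer present and the repressor fully consumed.
        expressing = glucose <= 0 and inducer > 0
        timeline.append((step, biomass, glucose, expressing))
        # Cells consume glucose in proportion to biomass; inducer is untouched.
        glucose = max(0.0, glucose - glucose_per_biomass * biomass)
        biomass *= growth_per_step
    return timeline

for step, biomass, glucose, expressing in autoinduction_timeline():
    print(step, round(biomass, 2), round(glucose, 2), "ON" if expressing else "off")
```

Raising the starting glucose level delays the switch to a later step, which is how the onset of expression is timed.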
pET-Series Vectors
- use a ‘two-step’ expression system
- Gene cloned in plasmid MCS (insert) is transcribed by a recombinant T7 RNA polymerase
- T7 RNA polymerase is expressed from a chromosomal location that is under LacI control
- Protein is only expressed in E. coli strains that express T7 RNA polymerase
- Most commonly used: E. coli BL21 (DE3)
- All ‘genetics’ are carried out in different E. coli strains more suited for DNA purification
pET-Series Vectors: Expression
- LacI repressor protein blocks transcription at 2 points: lacO site on host genome thus repressing transcription of the T7 RNA polymerase gene, lacO site on the pET-vector thus repressing transcription of the recombinant gene
- However, even in the absence of inducer (IPTG) there can be some expression of T7 RNApol from the lac promoter on the E. coli BL21(DE3) genome
- Expression of T7 lysozyme from the pLysS plasmid actively blocks T7 RNApol, fixing the ‘leakiness’ of the system
- IPTG induction produces excess T7 RNApol, overcoming block from limited amount of T7 lysozyme
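The interplay of leaky T7 RNApol expression, T7 lysozyme and IPTG can be pictured as simple bookkeeping. The numbers are hypothetical and in arbitrary units, and the sketch leaves out the second lacO site on the pET vector; it only shows why a limited amount of lysozyme removes the leak but not the induced signal.

```python
def active_t7_polymerase(iptg: bool, plyss: bool,
                         leak=1.0, induced=100.0, lysozyme_capacity=5.0):
    """Amount of T7 RNApol left free to transcribe the target gene.
    leak, induced and lysozyme_capacity are illustrative arbitrary units."""
    produced = induced if iptg else leak
    blocked = min(produced, lysozyme_capacity) if plyss else 0.0
    return produced - blocked

for iptg in (False, True):
    for plyss in (False, True):
        print(f"IPTG={iptg!s:5} pLysS={plyss!s:5} -> "
              f"{active_t7_polymerase(iptg, plyss)} active T7 RNApol")
```

Without IPTG the pLysS strain has no free polymerase, while after induction most of it remains active.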
General Rules for Designing PCR Primers
- Must be at least 18 nt in length
- Must finish at the 3’ end in either G or C
- Tm (midpoint melting temperature in °C) values of the primer pair must be within 5 °C of each other
- Annealing temperature for PCR is 5 °C less than the lowest Tm value of the primer pair
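These rules are easy to turn into a checklist. The sketch below takes the Tm values as inputs rather than computing them, and it reads the length rule as a minimum of 18 nt, which is an assumption. The primer sequences and Tm values in the example are made up.

```python
def check_primer_pair(fwd, rev, tm_fwd, tm_rev):
    """Return (list of rule violations, suggested annealing temperature)."""
    problems = []
    for name, primer in (("forward", fwd), ("reverse", rev)):
        if len(primer) < 18:  # assumption: 18 nt is treated as a minimum
            problems.append(f"{name} primer is shorter than 18 nt")
        if primer[-1].upper() not in "GC":
            problems.append(f"{name} primer does not end in G or C at the 3' end")
    if abs(tm_fwd - tm_rev) > 5:
        problems.append("Tm values differ by more than 5 °C")
    annealing = min(tm_fwd, tm_rev) - 5
    return problems, annealing

# Hypothetical primers, written 5'->3', with hypothetical Tm values
problems, annealing = check_primer_pair(
    "ATGGCTAAAGGTCTGACC", "TTAGCCTTTCAGACCGTTG", tm_fwd=55.0, tm_rev=57.5)
print(problems, annealing)
```

An empty list means the pair passes all three checks; the second value is the annealing temperature implied by the last rule.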
Tm equation
- Tm = 69.3 + 0.41 × GC% − 650 / (primer length in nt)
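As a worked example, the equation on this card translates directly into a short function. The primer used below is a made-up 20-mer with 50% GC content, chosen so the arithmetic is easy to follow by hand: 69.3 + 0.41 × 50 − 650 / 20 = 69.3 + 20.5 − 32.5.

```python
def melting_temperature(primer: str) -> float:
    """Tm in °C from the card's equation: 69.3 + 0.41 * GC% - 650 / length."""
    seq = primer.upper()
    gc_percent = 100 * (seq.count("G") + seq.count("C")) / len(seq)
    return 69.3 + 0.41 * gc_percent - 650 / len(seq)

primer = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer, 50% GC
print(round(melting_temperature(primer), 1))
```

Note that GC% enters as a percentage (0 to 100), not as a fraction.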
Fundamentals of Primer Design
- ssDNA will only hybridize/anneal in an antiparallel fashion
- DNA can only be synthesised biologically in a 5’ to 3’ direction
- Restriction sites can be artificially introduced with the primers
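The three points combine when writing out a primer pair: the forward primer copies the start of the coding strand, the reverse primer is the reverse complement of its end, and restriction sites are added at the 5' ends. In the sketch below the gene, the two sites (EcoRI and BamHI sequences) and the two extra 5' bases are all illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Antiparallel partner strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_primers(gene, anneal_len=18, fwd_site="GAATTC",
                   rev_site="GGATCC", extra_5prime="TT"):
    """Return (forward, reverse) primers, both written 5'->3'.
    Site sequences and the extra 5' bases are example choices."""
    fwd = extra_5prime + fwd_site + gene[:anneal_len].upper()
    rev = extra_5prime + rev_site + reverse_complement(gene[-anneal_len:])
    return fwd, rev

# Hypothetical coding strand of a gene of interest
gene = "ATGGCTAAAGGTCTGACCGAAAACCTGTACTTCCAGGGTTAA"
fwd, rev = design_primers(gene)
print(fwd)
print(rev)
```

The sites sit at the 5' ends so that the 3' ends, from which the polymerase extends, match the template.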
Temperature Program for PCR
- Repeated cycles of denaturation (94 °C), annealing (?? °C) and elongation (72 °C)
- Annealing temperature for PCR should be set 5 °C below the lowest Tm value of the primer pair
- When introducing non-matching bases at the primer ends (for RE sites), the optimal Tm initially depends on the part of the primer that anneals to the GOI
- After about 5 cycles the optimal Tm shifts, because PCR products from previous cycles now serve as templates for further cycles
- The ‘matching’ part of the primers now includes the restriction site
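One way to read these cards is as a two-phase annealing schedule: the first cycles use the Tm of the part of each primer that matches the GOI, later cycles use the Tm of the full primer. The sketch below restates the Tm equation from the earlier card so that it runs on its own; the primer sequences are made up and the switch after 5 cycles follows the approximate figure given above.

```python
def tm(primer: str) -> float:
    """Tm in °C: 69.3 + 0.41 * GC% - 650 / length (equation from the cards)."""
    seq = primer.upper()
    gc_percent = 100 * (seq.count("G") + seq.count("C")) / len(seq)
    return 69.3 + 0.41 * gc_percent - 650 / len(seq)

def annealing_schedule(fwd_match, rev_match, fwd_full, rev_full, early_cycles=5):
    """Annealing temperature = lowest Tm of the pair minus 5 °C,
    using the matching part early and the full primer afterwards."""
    early = min(tm(fwd_match), tm(rev_match)) - 5
    late = min(tm(fwd_full), tm(rev_full)) - 5
    return {f"cycles 1-{early_cycles}": early,
            f"cycles {early_cycles + 1}+": late}

# Hypothetical matching parts, with restriction sites added at the 5' end
fwd_match, rev_match = "ATGCATGCATGCATGCATGC", "TACGTACGTACGTACGTACG"
schedule = annealing_schedule(fwd_match, rev_match,
                              "GAATTC" + fwd_match, "GGATCC" + rev_match)
for phase, temperature in schedule.items():
    print(phase, round(temperature, 1), "°C")
```

With these example primers the longer full-length primers give a higher Tm, so the annealing temperature can be raised after the first few cycles.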