Lecture 1: Protein sources Flashcards
Why do we need pure proteins?
Many biochemical experiments require pure protein samples for in vitro studies.
- Reconstituting functional systems from components
- Analysing enzyme function
- Determining structure
- Determining binding properties
Both purity and the purification strategy matter, because mistakes are costly.
• An impurity may form a crystal, so we waste time on the wrong protein.
• A bad purification strategy can lead to denaturation as well.
• We could also accidentally remove an apparent impurity that has an important function.
What do we need to consider before purification?
Before purifying a protein there are some things we must consider.
- Is it single domain or multi-domain?
- Is it an intracellular, extracellular or membrane protein?
- Is it folded, natively unfolded or partially folded?
- Is it post-translationally modified?
What sources can we use for proteins?
Some proteins are very abundant in specific natural sources.
- Haemoglobin is found in red blood cells.
- Lysozyme is found in egg white.
- Acetylcholine receptors are found in the electric organs of rays.
Unfortunately, most proteins are not sufficiently abundant in natural sources, so we make them in heterologous expression systems. There are four main ones:
- E. coli: very cheap and easy. It will often do the job, so it is the first port of call.
- Yeast: eukaryotic so it can make more complex proteins. It is also fairly cheap.
- Insect cells/baculovirus: a more complex eukaryotic system which often works for mammalian proteins.
- Mammalian cells: used for human/mammalian proteins if nothing else works.
How can E. coli be used to create proteins?
E. coli can be grown in large quantities in a short time, using inexpensive media and standard lab conditions.
- We insert the gene of interest using a plasmid. The gene must not have introns. We PCR the gene from genomic DNA then insert using restriction enzymes.
- If there are introns we have to use cDNA made from mRNA with reverse transcriptase (RT), or we can make a synthetic gene designed without introns.
- The plasmids must have a promoter, an origin of replication and a selectable marker.
- The most common expression system is the T7 expression system. It uses a T7 promoter and T7 polymerase.
- Expression of the polymerase is repressed by the lac repressor. Adding IPTG relieves the repression and initiates protein expression.
- The plasmid is transformed into the E. coli using antibiotic selection. The cells are grown, either in shaker culture or in a fermenter, to make lots of them.
- When the culture reaches an optical density showing that the cells are in log-phase growth, expression is induced with IPTG.
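The cloning requirements above (an intron-free gene, inserted with restriction enzymes) can be checked on a candidate sequence before ordering primers. The following is a minimal sketch, not a lab protocol. The function name `check_insert` and the choice of four enzymes are assumptions made for illustration. The recognition sequences are the standard ones for those enzymes.

```python
# Recognition sites for a few commonly used restriction enzymes.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC", "EcoRI": "GAATTC", "XhoI": "CTCGAG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_insert(orf, enzymes):
    """Return a list of problems that would complicate cloning this ORF.

    orf     -- coding sequence as a DNA string (hypothetical input)
    enzymes -- names of the enzymes planned for cloning (keys of SITES)
    """
    orf = orf.upper()
    problems = []
    if len(orf) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not orf.startswith("ATG"):
        problems.append("does not start with ATG")
    # Split into codons, reading in frame from the first base.
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    # A stop codon before the last codon may indicate an intron or a frame error.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature in-frame stop codon")
    # An internal site would be cut when the insert is digested.
    for name in enzymes:
        if SITES[name] in orf:
            problems.append(f"internal {name} site")
    return problems


print(check_insert("ATGGCTGAATTCAAATAA", ["NdeI", "EcoRI"]))
```

An empty list means none of these simple checks failed. It does not guarantee that the gene will express.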
What are some issues with E. coli? How can they be overcome?
- E. coli very rarely glycosylates proteins or adds membrane anchors (e.g. GPI anchors or palmitoyl groups).
- E. coli proteins also tend to be simpler and have one domain, so E. coli sometimes struggles to make more complex proteins.
- E. coli membrane composition is different from that of mammalian cells. E. coli membranes do not contain cholesterol, which may be a requirement for functional membrane proteins.
- Due to different codon usage between organisms, E. coli sometimes struggles to translate eukaryotic genes. This can be overcome by supplying the rare tRNAs to an E. coli strain on a plasmid. Rosetta cells are one commonly used strain that does this.
- Another strain which is used is Origami, which has an oxidising cytoplasm that allows disulphide bonds to form correctly, unlike the reducing cytoplasm of normal E. coli. Origami carries mutations in thioredoxin reductase and glutathione reductase.
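The codon usage problem can be estimated from the coding sequence alone. The sketch below counts codons that are rare in E. coli. The set `RARE_CODONS` is an assumption: it lists the six codons usually quoted as being supplemented by Rosetta-type strains, and other sources use slightly different lists. The function name `rare_codon_report` is also made up for this example.

```python
# Assumed rare-codon set (DNA form of AGG, AGA, AUA, CUA, CCC, GGA).
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def rare_codon_report(cds):
    """Return (hits, fraction) for an in-frame coding sequence.

    hits     -- list of (codon_number, codon) for each rare codon, 1-based
    fraction -- rare codons as a fraction of all codons
    """
    # Accept RNA or DNA input by converting U to T.
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    hits = [(n + 1, codon) for n, codon in enumerate(codons) if codon in RARE_CODONS]
    fraction = len(hits) / len(codons) if codons else 0.0
    return hits, fraction


hits, fraction = rare_codon_report("ATGAGGAGACTGCCCTAA")
print(hits, fraction)
```

A high fraction, or a run of consecutive rare codons, would be one reason to try a tRNA-supplemented strain or a codon-optimised synthetic gene.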
How can yeast be used to create proteins?
The main species which are used are S. cerevisiae and P. pastoris. They are eukaryotic, so they can make modifications like palmitoylation and glycosylation.
The genes are added using plasmids. The vectors can be inducible or constitutive. AOX1 promoters can be used as a methanol inducible system.
How can baculoviruses and insect cells be used to create proteins?
A more complex system involves using insect cells (Sf9 or High Five cells).
• These cells can perform complex folding and post translational modifications.
• They are susceptible to baculovirus infection, which naturally causes the production of large amounts of polyhedrin, a packaging protein. We can replace the polyhedrin gene with our gene and drive large amounts of protein production.
• The viruses are made by co-transfection of a plasmid containing the gene of interest and a linearised baculovirus DNA into the cell. Recombination leads to new viruses.
• The cells are kept shaking in roller bottles and take 24 hours to divide. They can be infected with the virus and harvested after 3-4 days.
How can mammalian cells be used to create proteins?
Mammalian tissue culture cells are often the best way to generate folded human proteins. There are different types of cells which we can use.
- HEK 293: Human embryonic kidney cells.
- COS: African green monkey kidney cells.
- CHO: Chinese hamster ovary cells.
Genes are added with plasmids.
• The plasmids have an E. coli origin of replication (ORI) and antibiotic resistance genes to allow manipulation in bacteria. The gene is expressed from a CMV promoter.
• The plasmids can be transfected into the cells using chemical methods such as liposomes. The cells are grown in roller bottles if adherent, or in suspension culture.
• They are grown for a few days and then harvested for the protein. The glutamine synthetase gene is used as a selectable marker.
• Proteins can be expressed in either an intracellular or secreted form. If a protein is normally extracellular and has disulphide bonds, it should be secreted. This is done by adding an N-terminal secretion signal before
Why do we make designer proteins and how do we do it?
There are many ways in which we might want to modify our protein.
• A common addition is a fusion tag, which helps with purification because the tagged protein binds to a specific resin. For example, we use polyhistidine (His), GST or MBP (maltose binding protein) tags.
• They can help the protein to be soluble (MBP) or insoluble (KSI).
• A protease cleavage site is added to the gene, so the tag can be removed.
• We might want to remove parts of the protein. For multi-domain proteins we may only want to create a single domain.
• We might also want to remove flexible or disordered parts which may interfere with function.
• We can use bioinformatics to predict which part of the protein we want to make and which bits we should remove.
• Disorder prediction servers like RONN and PONDR can predict ordered core regions. Homology or fold prediction can find specific domains, e.g. if we want to study just one domain from the protein of interest.
• We may also sometimes remove post-translational modification sites, as they may interfere with crystal formation. We can find them with bioinformatics and remove them with point mutations.
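The tag, cleavage site and truncation ideas above can be combined at the sequence level. The sketch below is illustrative only. It assumes a six-histidine tag and a TEV protease site (ENLYFQG, cut between Q and G) as the cleavage site. The function names, the example protein and the domain boundaries are all hypothetical. In practice the boundaries would come from the domain or disorder predictions.

```python
HIS6 = "HHHHHH"        # polyhistidine purification tag
TEV_SITE = "ENLYFQG"   # TEV protease cleaves between Q and G


def design_construct(protein, start, end, tag=HIS6, site=TEV_SITE):
    """Build Met + tag + cleavage site + domain.

    start and end are 1-based, inclusive residue numbers of the domain to keep.
    """
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("domain boundaries fall outside the protein")
    domain = protein[start - 1:end]
    return "M" + tag + site + domain


def after_cleavage(construct, site=TEV_SITE):
    """Return the product left once the tag is cut off at the first site."""
    # The last residue of the site (G) stays on the protein after cleavage.
    cut = construct.index(site) + len(site) - 1
    return construct[cut:]


full_length = "MKTAYIAKQRQISFVKSHFSRQ"   # made-up example sequence
construct = design_construct(full_length, 3, 8)
print(construct)
print(after_cleavage(construct))
```

Note that with this site one extra glycine remains at the N-terminus after the tag is removed. This is a design consequence worth checking for any protease chosen.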