Analysis of Gene Expression Flashcards
What is differential gene expression?
Different cells expressing different genes
Development of a single cell into a complex organism depends on formation of different cell types
Genes are expressed at different levels
What is differential gene expression caused by?
Regulatory proteins
It is caused by changes in expression of an unchanging set of genes
Cells make selective use of their genes – they turn expression on or off depending on cues
This is controlled at many stages - most common control point is transcription - by regulatory proteins
Production of more RNA = more active protein
What do regulatory proteins do?
Genes have long control regions often >10,000 bp
These bind regulatory proteins - activators/repressors
The proteins bind to short stretches of DNA
They work synergistically to amplify transcription and therefore expression
Different cells have different regulatory proteins
These affect RNA polymerase binding and transcription
How can we measure gene expression?
We can look at abundance of mRNA or protein levels
We do this in more than one cell type to compare expression between them
What are some methods of detection of expression looking at abundance of mRNA?
Northern blot analysis
Quantitative PCR
Microarrays
RNA sequencing
In situ hybridisation
They all rely on nucleic acid hybridisation (driven by Watson-Crick base pairings)
We know the sequence of the gene so we can design a complementary probe
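Probe design from a known sequence can be sketched in a few lines. This is a minimal illustration of Watson-Crick pairing only; the function names and the example sequence are hypothetical, and real probe design also considers length, melting temperature and specificity.

```python
# Watson-Crick pairing rules for DNA
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would hybridise to `seq`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_probe(mrna: str) -> str:
    """Return a DNA probe complementary to an mRNA fragment (U is read as T)."""
    return reverse_complement(mrna.upper().replace("U", "T"))


target = "AUGGCCAUUG"  # hypothetical mRNA fragment, written 5'->3'
probe = design_probe(target)
print(probe)
```

The probe is written 5'->3', so it pairs with the target in the antiparallel orientation.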
Detection of expression looking at abundance of mRNA: describe northern blot analysis?
Harvest RNA from the cell types
RNAs are separated by size by denaturing agarose gel electrophoresis
RNAs are transferred to a filter
A labelled probe hybridises to the mRNA
The probe is detected and quantified
This is a very direct method - can be quantitative
A darker band means there was more mRNA for the probe to bind - therefore more expression
This is quantified by a machine measuring the density of the band
Detection of expression looking at abundance of mRNA: describe quantitative PCR?
Harvest RNA from different cell types
Reverse transcription of mRNA into cDNA
Quantify PCR amplification with either fluorescent primers or dye that binds dsDNA
Rate of product appearance relates to concentration of mRNA
Benefits: quantitative, rapid and can detect several targets in one tube
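The link between rate of product appearance and mRNA concentration can be sketched with the commonly used comparative Ct calculation. The Ct values (cycle at which signal crosses a threshold), the reference gene, and the assumption that product doubles every cycle are illustrative assumptions, not taken from these cards.

```python
def fold_change(ct_target_a: float, ct_ref_a: float,
                ct_target_b: float, ct_ref_b: float) -> float:
    """Relative expression of a target gene in sample A versus sample B.

    Assumes perfect doubling per cycle and a reference gene whose
    expression is the same in both samples.
    """
    delta_a = ct_target_a - ct_ref_a  # normalise sample A to its reference gene
    delta_b = ct_target_b - ct_ref_b  # normalise sample B to its reference gene
    delta_delta = delta_a - delta_b
    return 2 ** (-delta_delta)


# Hypothetical Ct values: the target appears 3 cycles earlier in cell type A
print(fold_change(22.0, 18.0, 25.0, 18.0))
```

A lower Ct means the product appeared sooner, so there was more starting mRNA.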
Detection of expression looking at abundance of mRNA: describe microarrays?
DNA oligos representing 1000s of genes are immobilised on a chip
Cell mRNA copied to cDNA using reverse transcriptase and then labelled with red or green dye
cDNAs are hybridised to the chip, washed and scanned
Red = expression in A
Green = expression in B
Yellow = both A and B
Can detect and quantify thousands of transcripts simultaneously
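The red/green/yellow readout can be sketched as a log ratio of the two dye intensities. The two-fold cut-off and the intensity values are assumptions chosen for illustration; intensities are assumed to be positive and already background-corrected.

```python
import math


def classify_spot(red: float, green: float, fold: float = 2.0) -> str:
    """Classify one spot from its red (sample A) and green (sample B) signal."""
    log_ratio = math.log2(red / green)
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "red: higher in A"
    if log_ratio <= -threshold:
        return "green: higher in B"
    return "yellow: similar in A and B"


# Hypothetical intensities for three spots
for red, green in [(800.0, 100.0), (100.0, 800.0), (500.0, 450.0)]:
    print(red, green, classify_spot(red, green))
```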
Detection of expression looking at abundance of mRNA: describe RNA sequencing?
Determine abundance of all RNAs in a cell
Harvest total RNA
Select/amplify mRNAs
Perform RNA sequencing
Compare expression levels of all genes
We can look at every single gene expressed in an organism - but not cheap
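Comparing expression levels across genes needs read counts to be normalised for gene length and sequencing depth. One common choice, assumed here rather than stated in the cards, is transcripts per million (TPM). Gene names, counts and lengths below are hypothetical.

```python
def tpm(counts: dict, lengths_kb: dict) -> dict:
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per kilobase corrects for longer genes collecting more reads
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rates.values())
    # Scaling to one million corrects for sequencing depth
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}


counts = {"geneA": 100, "geneB": 300}      # hypothetical read counts
lengths_kb = {"geneA": 1.0, "geneB": 3.0}  # hypothetical transcript lengths
print(tpm(counts, lengths_kb))
```

Here geneB has three times the reads but is three times longer, so both genes come out at the same expression level.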
Detection of expression looking at abundance of mRNA: describe In situ hybridisation?
Very different method:
Tissue is prepared by fixing and permeabilization
Addition of labelled DNA or RNA probe (fluorescently tagged)
Probe detection by microscopy
Benefits
Can simultaneously show abundance of transcript expression in all tissues of an organism
No need to separate out all tissues of an organism – they can remain in situ
Reveals information of both mRNA abundance and location
What are some methods of detection of expression looking at abundance of protein?
2D gels - MS
Specific antibodies
Detection of expression looking at abundance of proteins: describe 2D gels followed by mass spectrometry?
Isolate specific cell types
Lyse cells - release proteins
Separate proteins on a 2D protein gel
First dimension: the gel has a fixed pH gradient, and proteins move to their isoelectric point (pH where their net charge becomes 0)
Second dimension: then separate on another gel by size
Stain the separated proteins - Coomassie blue dye
They form a pattern of spots
We can compare spot patterns - possible to identify differences in protein expression
Identify by mass spectrometry - orbitrap
Detection of expression looking at abundance of proteins: describe specific antibodies?
Isolate specific cell types
Lyse cells - release proteins
Separate proteins on a protein gel
Transfer proteins to a membrane
Probe membrane with an antibody
Detect and quantify the antibody
What can we use for protein expression detection if there is no easy marker?
Reporter gene
If we can’t detect the expression we could add a reporter gene
This will produce reporter mRNA and therefore a reporter protein
Common reporter genes - GFP, β-galactosidase, β-glucuronidase and luciferase
Where reporter protein is detected, the gene is being expressed
How can we find out how gene expression is regulated?
Identify the gene regulatory sequences
Identify gene regulatory proteins
Give an example of a gene we can identify the gene regulatory sequence and the gene regulatory proteins?
Even skipped gene (Eve)
Eve is essential for development of Drosophila
It helps define formation of the segmented body plan
Acts very early in the organization of the embryo
Eve expression occurs in 7 discrete stripes
The Eve regulatory region stretches for 20 kb; >7 kb upstream and >13 kb downstream
5 regulatory sequence modules control expression in 7 stripes
Stripe modules exert control of Eve expression by interacting with over 20 regulatory proteins
The 480 nt stripe 2 module - binds 4 regulatory proteins: hunchback+, bicoid+, Kruppel- and Giant- (+ = activator, - = repressor)
To determine this - Eve ORF was substituted for a reporter ORF to see where the gene was expressed
How do we identify the gene regulatory sequences
Make deletions through: restriction enzyme digestion
Site directed mutagenesis
Gene synthesis
Describe identification of gene regulatory sequences through restriction enzyme digestion?
Sequence entire regulatory region
Generate restriction map
Remove sequences by double restriction enzyme digests
Introduce the altered genes into Drosophila eggs
Removal of BstEII - BssHII fragment abolished stripe 2 expression
These 480 nts contain all signals needed for stripe 2 expression
Delete all other sequences from the gene regulatory regions
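Deletion mapping by double digest can be sketched on a toy sequence. The recognition sequences used (GGTNACC for BstEII, GCGCGC for BssHII) are the standard ones as far as I know; the sequence itself is hypothetical, and the model ignores cut positions within the sites and how the ends are rejoined.

```python
import re

# Recognition sequences (N = any base); overhangs and ligation are ignored here
BSTEII = re.compile("GGT[ACGT]ACC")
BSSHII = re.compile("GCGCGC")


def delete_between(seq: str, first_site: re.Pattern, second_site: re.Pattern) -> str:
    """Remove everything from the start of the first site to the end of the second."""
    first = first_site.search(seq)
    if first is None:
        raise ValueError("first site not found")
    second = second_site.search(seq, first.end())
    if second is None:
        raise ValueError("second site not found after the first")
    return seq[:first.start()] + seq[second.end():]


# Hypothetical regulatory region: flank + BstEII site + module + BssHII site + flank
region = "AAAA" + "GGTCACC" + "TTTTTTTT" + "GCGCGC" + "CCCC"
print(delete_between(region, BSTEII, BSSHII))
```

The altered sequence would then be tested in embryos to see whether the stripe is lost.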
Describe identification of gene regulatory sequences through site directed mutagenesis and gene synthesis?
Site directed mutagenesis:
Can make precise nucleotide changes to define required sequences:
Precisely shorten region
Remove internal sequences
Make single/multiple nt changes
Gene synthesis:
You can make any sequence you want but it is very expensive
Using these methods alone or in combination allows the regulatory sequences to be defined
How do we identify the stripe 2 (eve) gene regulatory proteins?
Two methods:
Electrophoretic mobility-shift analysis
Affinity chromatography
Describe identification of gene regulatory proteins through electrophoretic mobility-shift analysis?
- dsDNA fragment (20-35 bp) containing a protein binding site is prepared (end-labelled)
- A DNA probe is incubated with a protein fraction - protein-DNA complexes form
- A non-specific competitor is also added to eliminate non-specific interactions
- Run in gel electrophoresis under native conditions to separate the free probe from the bound probe
- The gel is dried and position of the probe is detected using X-ray film
The probe + cell fraction won’t move as far as the probe alone on the gel
Antibody can also be used to find out the presence of a protein
Due to the extra mass of the antibody (slower) this produces a super shift on the gel
If no antibodies are available, use MS analysis
It is very simple and highly sensitive
Describe identification of gene regulatory proteins through affinity chromatography?
- Make a DNA fragment of the regulatory region
- Attach to a solid matrix - eg agarose
- Add cell lysate to column -> regulatory proteins bind the immobilized DNA
- The column is washed and the bound proteins are eluted with salt
- Analyse by MS to identify the proteins
How can we determine where regulatory proteins bind exactly?
2 main methods:
DNA footprint analysis
Chromatin immunoprecipitation (ChIP) analysis
Describe identification of location of gene regulatory proteins through DNA footprint analysis?
This allows us to identify nucleotides that are in contact with a DNA binding protein
Purify the binding protein
Incubate the DNA with the binding protein - forms a ‘hot’ regulatory region
The binding protein is in excess of the probe
Add DNase I - this binds to the minor groove and produces random nicks (conditions are set so that each strand is cut about once)
As the cuts are random, the resulting bands can vary in intensity
The binding protein will protect its own binding site
Wash away the binding protein
Look at the sizes of the remaining DNA fragments
The products are separated in electrophoresis to reveal the ‘footprint’
There is a gap in the sizes of the labelled DNA fragments
The gap is where the binding protein binds
We can generate binding curves and equilibrium constants from this data
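Reading a footprint from the gel can be sketched as finding the run of fragment sizes that disappears when protein is bound. Comparing against a no-protein control lane is an assumption of this sketch, and the band positions are hypothetical.

```python
def footprint(control_bands, protein_bands):
    """Return (start, end) of the longest run of bands lost when protein is bound.

    Band positions are fragment lengths in nucleotides. Returns None if no
    bands are missing.
    """
    missing = sorted(set(control_bands) - set(protein_bands))
    best = None
    run_start = prev = None
    for pos in missing:
        if prev is None or pos != prev + 1:
            run_start = pos  # a new run of consecutive missing bands begins
        prev = pos
        if best is None or pos - run_start > best[1] - best[0]:
            best = (run_start, pos)
    return best


control = list(range(1, 41))  # hypothetical: every cut position gives a band
# With protein bound, positions 15-24 are protected; 30 is a stray faint band
with_protein = [p for p in control if not 15 <= p <= 24 and p != 30]
print(footprint(control, with_protein))
```

Taking the longest run rather than every missing band keeps isolated weak bands from being mistaken for the binding site.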
Describe identification of location of gene regulatory proteins through Chromatin immuno-precipitation (ChIP)?
Allows in vivo identification of DNA sequences associated with proteins (DNA is in native chromatin state)
Treat living cells with formaldehyde or UV to cross-link - creating a temporary physical bond between DNA-protein and chromatin-protein complexes
Chromatin fragmentation - cleave DNA into 300 bp fragments by sonication, restriction endonucleases or micrococcal nuclease
Immunoprecipitation - of the protein of interest and its associated chromatin fragments - using a specific antibody
This antibody needs to be compatible - even after cross-linking, which could adversely affect it
DNA capture/isolation - antibody-chromatin complexes are collected using beads containing proteins or secondary antibodies
A series of wash steps reduces non-specific interactions
Isolate - heat (remove cross-links), digest antibody/chromatin proteins with proteinase K and purify the DNA with affinity chromatography
Compare fragment sequences - deduce binding protein binding site
Do this through qPCR and then alignment
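Comparing fragment sequences to deduce a binding site can be sketched as looking for a short sequence present in every recovered fragment. This is a toy stand-in for real alignment; the fragments and the site length are hypothetical.

```python
def shared_kmers(fragments, k):
    """Return the set of k-long sequences present in every fragment."""
    def kmers(seq):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    common = kmers(fragments[0])
    for fragment in fragments[1:]:
        common &= kmers(fragment)  # keep only k-mers seen in all fragments so far
    return common


# Hypothetical ChIP fragments that all overlap the same binding site
fragments = ["AAATTGACGTCACC", "GGTGACGTCATTT", "CCTGACGTCAGGA"]
print(shared_kmers(fragments, 8))
```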
What can we conclude about Eve after these identification experiments?
This allows us to understand how Eve expression is regulated in terms of inducers and repressors
Activators bound = repressors can’t bind = Eve expressed
Repressors bound = activators can’t bind = Eve not expressed
This is found in stripe 2 for Eve, as neither repressor is bound there
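The activator/repressor logic for stripe 2 can be written as a small truth function. Requiring both activators together is an assumption of this sketch; the cards only state that bicoid and hunchback activate and that Kruppel and Giant repress.

```python
def eve_stripe2_on(bicoid: bool, hunchback: bool,
                   kruppel: bool, giant: bool) -> bool:
    """Return True if the stripe 2 module would drive Eve expression."""
    activators_bound = bicoid and hunchback  # assumption: both are needed
    repressors_bound = kruppel or giant      # either repressor blocks expression
    return activators_bound and not repressors_bound


# Inside stripe 2: activators present, repressors absent
print(eve_stripe2_on(True, True, False, False))
# Next to stripe 2: a repressor is present
print(eve_stripe2_on(True, True, True, False))
```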
How do we determine gene function?
We produce mutants where gene function has been perturbed
Two approaches:
Forward genetics (classical genetics)
First - look for a phenotype
Second - look for a genetic change
Slow, laborious, costly
Reverse genetics
First - make a genetic change
Second - look for a phenotype
Rapid and precise
If a gene can be linked to a phenotype then this can reveal clues about its function
Describe reverse genetics?
This can specifically perturb gene expression in two ways:
Genetic - associated with irreversible changes to the genome
E.g. nt/gene deletion, promoter deletion, splicing mutant or poly(A) mutant
This perturbs expression - some might not even produce mRNA
Epigenetic - No changes to the genome (changes elsewhere)
E.g. Regulatory proteins or RNA interference
What were some older strategies that were used for genetic/epigenetic changes in organisms?
They used to rely on introduction of altered genes and homologous recombination to swap genetic material
There are drawbacks in mutational efficiency and specificity
Transposons - jump into the area you want to change (could go to the wrong place)
Insert copies into a fused egg (single cell - pronucleus)
Recombination can be unpredictable
If diploid - must replace both WT genes
Efficiency is low and cost is very high
Describe RNA interference molecules - used in epigenetic changes?
Native pathway of RNA degradation and defence found in many higher organisms - plants, animals and fungi
RNAi (RNA interference) - 20-30 nt noncoding RNAs, with associated proteins, can control the expression of genetic information
Controls vital processes e.g. cell growth and tissue differentiation, and is implicated in cardiovascular disease, neurological disorders and many types of cancer
Some believe we owe our sentience (capacity to feel, perceive, or experience subjectively) to small RNAs, as increased miRNAs in a genome seems to correlate to the complexity of an organism
3 related pathways that share the same central complex:
siRNA (small interfering RNAs)
miRNA (micro RNAs)
piRNA (Piwi-interacting RNAs)
What is involved in the formation of RNAi?
Involves the generation of double stranded RNAs that can target existing mRNAs in a cell and block their expression
All 3 pathways start with dsRNA and involve a complex of RNA and proteins called the RNA-induced silencing complex (RISC)
RISC silences a target mRNA via degradation and/or translational repression
It induces non-endonucleolytic translational repression before or after initiation, followed by deadenylation and degradation
Describe the formation of miRNA?
Transcribed as ssRNA from host cell genome - pri-miRNA
They fold into dsRNAs, greater than 1000 nts long
Processed by cellular machinery to short dsRNAs - Microprocessor
Microprocessor contains Drosha - an RNase III family enzyme that can perform endonucleolytic cleavage
Dicer + miRNA + dsRNA binding protein (dsRBP) then recruit argonaute to form the RISC loading complex
Guide RNA selected, passenger RNA ejected = dsRNA is separated
Guide RNA + argonaute = RISC
Guide RNA binds target mRNA at the 3’ end by base pairing – RISC mostly blocks translation
Describe the formation of siRNA?
Source of dsRNAs is exogenous, often viral
dsRNAs are perfectly complementary
Processed by cellular machinery to short 21-25 nt dsRNAs - only ‘Dicer’
Dicer + siRNA + dsRNA binding protein (dsRBP) then recruit argonaute to form the RISC loading complex
Guide RNA selected, passenger RNA ejected = dsRNA is separated
Guide RNA + argonaute = RISC
Guide RNA binds target mRNA by base pairing – RISC mostly cleaves it
Argonaute has nuclease activity
What are the key differences between formation of miRNA and siRNA?
- Initial source and structure is different
miRNAs are transcribed from the genome and form partially double stranded RNAs
siRNAs are exogenous, and usually perfectly complementary dsRNAs
- The way the guide RNA binds the target is different
miRNAs bind target mRNAs with partial complementarity; they contain mismatches and extended terminal loops
siRNAs bind target mRNAs with perfect complementarity for base pairings
- The activity of argonaute is different
miRNA - argonaute binds mRNA and blocks its translation – rarely cleavage
siRNA - argonaute binds mRNA and most often causes its degradation
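The perfect versus partial pairing distinction can be sketched by counting mismatches between a guide RNA and its target site. The sequences are hypothetical, and treating zero mismatches as "cleavage" and anything else as "translational block" is a simplification of the "mostly" and "rarely" wording in these cards.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_mismatches(guide: str, target_site: str) -> int:
    """Count unpaired positions when the guide anneals antiparallel to the target.

    Both sequences are written 5'->3' and must be the same length.
    """
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must be the same length")
    # The guide that would pair perfectly is the reverse complement of the target
    ideal_guide = "".join(RNA_COMPLEMENT[b] for b in reversed(target_site))
    return sum(1 for a, b in zip(guide, ideal_guide) if a != b)


def likely_outcome(guide: str, target_site: str) -> str:
    mismatches = count_mismatches(guide, target_site)
    if mismatches == 0:
        return "siRNA-like: perfect pairing, mRNA cleavage"
    return "miRNA-like: partial pairing, translational block"


guide = "UUCGAAGUAC"  # hypothetical short guide RNA
print(likely_outcome(guide, "GUACUUCGAA"))  # perfectly complementary site
print(likely_outcome(guide, "GUACAACGAA"))  # site with two mismatches
```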
How is RNAi used in the lab and beyond?
RNAi is used for gene silencing (knocking down gene expression) in order to determine gene function
To target a single gene
To target multiple genes
As a therapy for human disease
How does RNAi form dsRNA?
Synthetic RNAi triggers are generally perfectly base-paired dsRNAs or short hairpin RNAs
Transfecting a plasmid into the cell expressing a short hairpin RNA (shRNA)
The two arms of the hairpin are complementary to each other, so the transcript folds back on itself
The plasmid will express a shRNA with the sequence of the target gene - that will fold into a dsRNA, before being incorporated into RISC
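The layout of a short hairpin insert can be sketched as sense arm + loop + antisense arm. The target sequence is hypothetical and the loop is just one example sequence assumed for illustration; promoter, terminator and cloning sites are left out.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_shrna_insert(target: str, loop: str = "TTCAAGAGA") -> str:
    """Build a DNA insert whose transcript folds into a hairpin.

    The loop default is an assumed placeholder, not taken from the cards.
    """
    sense = target.upper()
    antisense = reverse_complement(sense)  # pairs with the sense arm after folding
    return sense + loop + antisense


target = "GATTACAGATTACAGATTA"  # hypothetical 19 nt target sequence
insert = design_shrna_insert(target)
print(insert)
```

Because the second arm is the reverse complement of the first, the transcript pairs with itself to give the dsRNA stem.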
What are some uses of RNAi in studying virus entry?
Low throughput example:
Transfect siRNA specific for ARCN1 gene into cells - this is important for flu virus entry (uses COPI vesicles)
Infect with eGFP-influenza virus
Look for phenotype by microscopy
High throughput example:
Transfect siRNAs specific for ALL 19K human genes into cells in 20x 96-well plates
Infect plates with eGFP-influenza virus
Look for loss of eGFP signal by screening all wells
Identified 12 out of 19,000 cellular genes needed for virus entry
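Screening wells for loss of eGFP signal can be sketched as a threshold against an untreated control. The 20% cut-off, the signal values and the placeholder gene names other than ARCN1 are assumptions for illustration.

```python
def find_hits(signals: dict, control_signal: float, max_fraction: float = 0.2) -> list:
    """Return genes whose knockdown drops eGFP signal to a small fraction of control.

    `signals` maps the gene targeted in each well to its eGFP reading.
    """
    return sorted(
        gene for gene, signal in signals.items()
        if signal / control_signal <= max_fraction
    )


# Hypothetical plate readings; GENE_X and GENE_Y are placeholder names
signals = {"ARCN1": 50.0, "GENE_X": 980.0, "GENE_Y": 1010.0}
print(find_hits(signals, control_signal=1000.0))
```

A gene is a hit when silencing it stops the eGFP virus getting in, so the well goes dark.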
How is RNAi used in the clinic?
Patisiran – an siRNA packaged into a lipid nanoparticle
FDA approved in 2018
Treatment of a human disease - hereditary transthyretin (hATTR) amyloidosis
siRNA delivery by injection every 3 weeks
Targets the 3’ UTR of the TTR gene
Blocks mutant TTR production by 70%
High efficacy in controlling hATTR
Give a summary of RNAi?
RNA interference is an epigenetic mechanism of gene expression perturbation that affects mRNA stability and translation efficiency
RNAi relies on generation of double stranded RNAs, with one strand – the guide RNA – being complementary to a target mRNA
Incorporation of the guide RNA into RISC leads to perturbation of gene expression from the target mRNA
What is the CRISPR-Cas9 system?
CRISPR - Clustered Regularly Interspaced Short Palindromic Repeats
They are repeat sequences, separated by short spacer sequences
A system for editing and regulating genomes
CRISPR are transcribed into RNA - crRNA - which is bound by Cas proteins
These complexes bind to dsDNA at a complementary genomic sequence (the protospacer), next to a protospacer adjacent motif (PAM), and cleave the DNA
The break is then repaired by non-homologous end-joining or homology directed repair, which is where DNA may be changed or inserted
tracrRNA is also needed to link the crRNA to Cas9
What is the basis of the CRISPR mechanism?
Many prokaryotes have a genetic locus where short DNA fragments of past invaders are integrated - in the invader these DNA sequences are known as ‘protospacers’
Sequences are stored between the short palindromic repeats (SPR) of the CRISPR locus
These fragments are called spacers and serve as vaccinations to combat future infections
The RNAs transcribed from these sequences associate with Cas proteins (CRISPR associated proteins) e.g. Cas9
Cas proteins process these RNAs into crRNAs (CRISPR RNAs)
The crRNA joins with an RNA called tracrRNA (trans-activating CRISPR RNA) to form the guide RNA; in the lab the two are fused into a single guide RNA (sgRNA)
Cas + sgRNA seek out identical protospacer sequences to direct their destruction by nuclease attack (cleave it)
What are some critical features of Cas and the crRNA?
The sgRNA allows Cas9 to identify the protospacer target DNA through base-pairing
Cas9 only cuts where the protospacer sits next to a specific sequence called the protospacer adjacent motif (PAM) – the PAM for Cas9 is NGG
Cas9 cuts both strands of the protospacer target to generate a double stranded break - 3 bp upstream (on the 5’ side) of the PAM
This double stranded break is the key to gene editing
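Finding candidate Cas9 sites can be sketched as scanning for NGG and reporting the protospacer and cut position. The 20 nt protospacer length is an assumption (it is not stated in these cards), the sequence is hypothetical, and only the strand given is scanned.

```python
import re


def find_cas9_sites(dna: str, protospacer_len: int = 20) -> list:
    """Return (protospacer, pam, cut_index) for each NGG PAM on this strand.

    `cut_index` is the position where the strand is split: dna[:cut_index]
    and dna[cut_index:], i.e. 3 bp upstream of the PAM.
    """
    sites = []
    # The lookahead lets overlapping PAMs (e.g. in "GGG") all be reported
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < protospacer_len:
            continue  # not enough sequence upstream for a full protospacer
        protospacer = dna[pam_start - protospacer_len:pam_start]
        sites.append((protospacer, match.group(1), pam_start - 3))
    return sites


# Hypothetical target: 2 nt flank + 20 nt protospacer + TGG PAM + flank
dna = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for protospacer, pam, cut in find_cas9_sites(dna):
    print(protospacer, pam, dna[:cut] + "|" + dna[cut:])
```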
What is the end mechanism of CRISPR?
Editing relies on using homologous recombination to repair the double strand break
- Cas9/sgRNA causes a ds break by the PAM sequence
- Supply a ‘repair template’ with the desired sequence
- Homologous recombination occurs to replace host DNA with repair template
- Genes can be knocked out by e.g. adding stop codons
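The repair template step can be sketched as string replacement between two homology arms. This is a toy model of the outcome only, not of the recombination machinery; the sequences and arm lengths are hypothetical.

```python
def apply_repair_template(genome: str, left_arm: str, right_arm: str, new_seq: str) -> str:
    """Replace whatever lies between the two homology arms with `new_seq`."""
    start = genome.find(left_arm)
    if start == -1:
        raise ValueError("left homology arm not found")
    inner_start = start + len(left_arm)
    inner_end = genome.find(right_arm, inner_start)
    if inner_end == -1:
        raise ValueError("right homology arm not found after the left arm")
    return genome[:inner_start] + new_seq + genome[inner_end:]


# Hypothetical gene: the AAA codon between the arms is swapped for a TAA stop codon
genome = "ATGGCC" + "AAA" + "GGGTTT"
edited = apply_repair_template(genome, "ATGGCC", "GGGTTT", "TAA")
print(edited)
```

Swapping in a stop codon this way truncates the protein, which is one way to knock the gene out.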