Methods for gene expression quantification and next generation sequencing Flashcards
How does qPCR with a reporter probe work?
qPCR = quantitative real-time polymerase chain reaction, used to measure how much DNA is present
qPCR with a reporter Probe tests one gene at a time
Reverse transcriptase converts the gene's mRNA to cDNA, which contains only exon sequences (no introns)
Fluorescent probe binds to target cDNA if present
Probe contains fluorophore and quencher
Quencher stops fluorophore from releasing light
When Taq polymerase replicates the DNA, its exonuclease activity degrades the bound probe and the fluorophore is released from the quencher
Fluorescence is no longer quenched and can be quantified
The number of copies after several rounds of replication is directly proportional to the initial mRNA level
Measure fluorescence at every cycle
How is gene expression measured with qPCR (how are the results interpreted)?
Measure fluorescence at every cycle, usually 40 PCR cycles
DNA in each sample is amplified through cycles of PCR so fluorescence increases
More DNA in sample = threshold passed sooner/after fewer cycles = higher mRNA content = higher gene expression
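A minimal sketch of why more starting DNA means the threshold is passed sooner. It assumes ideal PCR (copy number doubles every cycle) and treats the threshold as a copy number; the function name and all starting amounts are made up for illustration.

```python
def cycles_to_threshold(initial_copies, threshold_copies, max_cycles=40):
    """Return the first cycle at which the copy number reaches the threshold.

    Assumes ideal PCR: the copy number doubles in every cycle.
    Returns None if the threshold is not reached within max_cycles.
    """
    copies = initial_copies
    for cycle in range(1, max_cycles + 1):
        copies *= 2
        if copies >= threshold_copies:
            return cycle
    return None


# Hypothetical values: the second sample starts with 8x more template.
threshold = 1_000_000
ct_low = cycles_to_threshold(100, threshold)
ct_high = cycles_to_threshold(800, threshold)
print(ct_low, ct_high)  # 14 11
```

The sample with 8x more template passes the threshold 3 cycles earlier, because 2^3 = 8.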
What does the threshold value tell us in qPCR results?
Threshold level = set above the level of background fluorescence so that it intersects the curve at the start of the exponential phase; the background is based on the amount of light produced in a non-specific reaction with no primers
Ct (threshold cycle) = spot where curve intersects threshold line. Shows number of cycles it took to detect a real signal from samples
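Variation or expression difference is read from the difference between Ct values: each cycle of difference corresponds to roughly a 2-fold difference in starting DNA
In the Down syndrome sample, the threshold is passed later than in the control, so the sample contains less DNA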
Look at image in notes
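A small sketch of turning two Ct values into a relative amount. It assumes perfect doubling per cycle (efficiency of 2), which real reactions only approximate; the function name and Ct values are hypothetical.

```python
def fold_difference(ct_sample, ct_control, efficiency=2.0):
    """Starting amount of the sample relative to the control.

    efficiency=2.0 assumes the DNA doubles every cycle (an idealisation).
    A sample that crosses the threshold later gives a value below 1.
    """
    return efficiency ** (ct_control - ct_sample)


# Hypothetical Ct values: the sample crosses the threshold 2 cycles later.
print(fold_difference(ct_sample=26, ct_control=24))  # 0.25
```

A sample that is 2 cycles later than the control started with about a quarter of the DNA.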
How are qPCR results normalised?
Have to compensate for technical differences so results must be normalised
Control may have started with 2x more cells so would get 2x more RNA so would have to divide control measurements by 2
Housekeeping genes: essential, low mutation rate, highly/always expressed at the same level
Should be expressed the same amount in every cell
Ct values of HK genes in control and experiment are used to normalise qPCR
If the graph shows the HK gene at expression level 4 in one sample and level 8 in the other, then the second sample started off with 2x the DNA
2 levels of control: 1) count cells 2) measure amount of RNA (may differ due to inaccurate pipetting)
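One common way to apply housekeeping-gene normalisation is the 2^-ddCt calculation. The notes do not name a method, so treat this as an illustrative sketch; it assumes perfect doubling per cycle, and all Ct values are hypothetical.

```python
def normalised_fold_change(ct_target_exp, ct_hk_exp, ct_target_ctrl, ct_hk_ctrl):
    """Fold change of a target gene (experiment vs control), normalised to an HK gene."""
    delta_exp = ct_target_exp - ct_hk_exp      # target relative to HK, experiment
    delta_ctrl = ct_target_ctrl - ct_hk_ctrl   # target relative to HK, control
    return 2.0 ** -(delta_exp - delta_ctrl)


# Hypothetical case: the control had 2x more input, so every gene,
# including the HK gene, crosses the threshold 1 cycle earlier.
naive = 2.0 ** (24 - 25)                       # target gene only: looks like 0.5x
corrected = normalised_fold_change(25, 20, 24, 19)
print(naive, corrected)  # 0.5 1.0
```

Without normalisation the target gene looks halved in the experiment; after dividing out the HK gene the difference disappears, so it was only a difference in input amount.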
Limitations and advantages of qPCR
Advantage: qPCR is fast and cheap
Limitation: limited in how many genes can be tested at once. Can only check one gene at a time so only works for small sets of genes (not genome wide)
What is the major advantage of microarrays?
Allows for detection of thousands of genes at the same time (unlike qPCR)
How do microarrays work? What is an Affy GeneChip?
Affy (Affymetrix) GeneChip is a very small chip carrying probes for 15,000 - 20,000 genes
Each gene has 15-20 pairs of probes synthesised on the chip
Probes = short, single stranded DNA sequences complementary to a specific gene
RNA is extracted and converted to cDNA by reverse transcription
cDNA is labelled with a fluorescent dye
cDNA hybridises with complementary DNA probes
Intensity of fluorescent signals is measured at each spot
Intensity = amount of cDNA bound = expression level of gene
How is expression on the Affy GeneChip measured with a computer?
Analyse chip with computer
Get a signal, detection and p-value
Signal: expression measurement for the corresponding probe
Detection: determines the absolute call for a measurement (A=absent, M=marginal, P=present)
Two slides are required, one slide used for the experiment and one for the control
Expression fold change calculated by comparing probe readings on both slides
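A minimal sketch of the fold change comparison between the two slides, reported as log2(experiment / control). The probe IDs and signal values are invented, and the pseudocount (added to avoid dividing by zero) is an assumption, not something the notes specify.

```python
import math


def log2_fold_changes(experiment, control, pseudocount=1.0):
    """Per-probe log2(experiment / control) for probes present on both slides."""
    return {
        probe: math.log2(
            (experiment[probe] + pseudocount) / (control[probe] + pseudocount)
        )
        for probe in experiment
        if probe in control
    }


# Hypothetical signal values keyed by made-up probe IDs.
experiment = {"probe_A": 799.0, "probe_B": 99.0, "probe_C": 49.0}
control = {"probe_A": 199.0, "probe_B": 99.0, "probe_C": 199.0}
changes = log2_fold_changes(experiment, control)
print(changes)  # {'probe_A': 2.0, 'probe_B': 0.0, 'probe_C': -2.0}
```

A value of 2.0 means 4x higher in the experiment, 0.0 means no change, and -2.0 means 4x lower.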
Limitations of microarrays
Data is very noisy
Expression levels are determined by a spot of light against a noisy background
Probes are not available for all genes - approx. 75-80% of all human genes
Genes with low expression may not be detected
Data require a large degree of statistical processing
Result only shows that a gene is expressed but gives no information about which transcript (can’t distinguish between different isoforms of a protein)
How does Illumina sequencing work?
Based on DNA libraries
DNA is cut into fragments and adapters (short DNA fragments 25-50nt) ligated to ends of DNA
Attach DNA to flow cell
2 types of oligos are attached to the flow cell
Oligo is complementary to adaptor region on one of the fragments
Add library onto flow cell: adaptors anchor DNA molecule onto flow cell
Bridge amplification: Strand folds over, adaptor region hybridizes with the second type of oligo on the flow cell
Can sequence with one primer (single-end sequencing = one read per fragment) or with two primers (paired-end sequencing = two reads per fragment, one from each end)
Fragment becomes double stranded: PCR creates a complement of the fragment
Denaturation: Bridge is denatured resulting in 2 single stranded copies that are attached to the flow cell
Clonal amplification of all fragments: Process is repeated over and over again forming millions of clusters
Sequencing: Polymerize the complementary strand using fluorescently labelled nucleotides
Nucleotides are reversible terminators (only one can be added at a time)
In each cycle, only one base can be added - to add another nucleotide, the terminator has to be removed
After addition of each nucleotide, clusters are excited with a laser, emitting a fluorescent signal
Each colour corresponds to a particular base that has been added
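The sequencing cycles can be sketched as a loop: one complementary nucleotide is added per cycle, its colour is read, and the colour is translated back into a base. The colour assignment below is hypothetical (real chemistries differ), and strand orientation is ignored to keep the example short.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical colour assignment, for illustration only.
BASE_TO_COLOUR = {"A": "green", "C": "red", "G": "blue", "T": "yellow"}
COLOUR_TO_BASE = {colour: base for base, colour in BASE_TO_COLOUR.items()}


def sequence_by_synthesis(template, cycles):
    """Return the read produced from one cluster, one base per cycle."""
    read = []
    for position in range(min(cycles, len(template))):
        incorporated = COMPLEMENT[template[position]]  # one terminator nucleotide added
        colour = BASE_TO_COLOUR[incorporated]          # laser excitation, colour imaged
        read.append(COLOUR_TO_BASE[colour])            # base call, then terminator removed
    return "".join(read)


print(sequence_by_synthesis("TTCAG", cycles=5))  # AAGTC
```

The read length equals the number of cycles run, because each cycle can add only one base.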
How does RNA-seq work?
RNA-seq uses NGS technology to measure gene expression
mRNA isolation
Illumina sequencing
Align sequences against reference genome (against each gene/exon)
Number of reads mapping to each gene is determined
Corresponds to gene expression level
If one sample has more RNA, it gives more reads, so normalisation is also needed here
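A toy version of the read counting step, assuming the reads are already aligned and each is represented only by its start position on a single chromosome. The gene names, coordinates and read positions are invented; real pipelines use dedicated aligners and counting tools.

```python
# Hypothetical gene coordinates: name -> (start, end), both inclusive.
GENES = {"geneA": (100, 499), "geneB": (1000, 1999)}


def count_reads(read_positions, genes):
    """Count aligned reads whose start position falls inside each gene."""
    counts = {name: 0 for name in genes}
    for position in read_positions:
        for name, (start, end) in genes.items():
            if start <= position <= end:
                counts[name] += 1
                break  # a read is assigned to at most one gene
    return counts


reads = [120, 130, 450, 700, 1500, 1501, 1502, 1999, 2500]
counts = count_reads(reads, GENES)
print(counts)  # {'geneA': 3, 'geneB': 4}
```

Reads at 700 and 2500 fall outside both genes and are not counted.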
Advantages of RNA-seq
Very accurate information on gene expression levels, even genes with low expression
Can identify exact transcript being expressed
Can get 1 million reads for one sequencing event
Can identify unknown transcripts with novel splice sites (discover different isoforms of a protein)
How is RNA-Seq data normalised?
Compensate for technical differences with sequencing
Normalisation scales the read counts so they can be compared
Two main ways to normalise are: TPM (transcripts per million) and RPKM (reads per kilobase of transcript per million mapped reads)
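A short sketch of both calculations using their standard definitions. The gene names, read counts and transcript lengths are hypothetical.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_reads)
        for gene in counts
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then scale to 1e6."""
    rate = {gene: counts[gene] / (lengths_bp[gene] / 1_000) for gene in counts}
    scale = sum(rate.values())
    return {gene: rate[gene] / scale * 1_000_000 for gene in counts}


# Hypothetical sample: geneB has 3x the reads but is also 3x longer.
counts = {"geneA": 100, "geneB": 300}
lengths_bp = {"geneA": 1_000, "geneB": 3_000}
rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
print(rpkm_values)
print(tpm_values)
```

Both genes end up with the same normalised value, because the extra reads for geneB are explained by its length. TPM values always sum to one million within a sample, which makes samples easier to compare.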
How is rRNA removed from samples during RNA-seq and why is it important?
90% of RNA in cell is ribosomal RNA (rRNA)
Need to eliminate rRNA because we only want RNA from genes
rRNA is not polyadenylated, mRNA is polyadenylated
Oligo dT probes bind to dA (polyA) in mRNA
rRNA doesn’t bind oligo dT and instead stays in solution
Why is single cell RNA-seq advantageous and what is it used for?
Bulk RNA-seq doesn’t discriminate between: different cell types, mutations, different cell cycle stages, epigenetic modifications, stochastic gene expression
Use single cell RNA-seq to analyse single cells in a heterogeneous population
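A tiny illustration of what bulk measurement hides, using invented expression values for one gene in six cells from two hypothetical cell types.

```python
from statistics import mean

# Hypothetical expression of one gene in six cells from two cell types.
single_cell = {
    "type_1": [0, 0, 0],
    "type_2": [90, 100, 110],
}

# Bulk RNA-seq effectively reports one pooled value for all cells.
all_cells = [value for cells in single_cell.values() for value in cells]
bulk_signal = mean(all_cells)

# Single cell RNA-seq keeps the cells apart, so each type can be summarised.
per_type = {cell_type: mean(cells) for cell_type, cells in single_cell.items()}

print(bulk_signal)  # 50
print(per_type)     # {'type_1': 0, 'type_2': 100}
```

The bulk value of 50 matches neither cell type: half the cells do not express the gene at all and the other half express it at about 100.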