Methods for gene expression quantification and next generation sequencing Flashcards

1
Q

How does qPCR with a reporter probe work?

A

Quantitative real time polymerase chain reaction used to measure how much DNA is present
qPCR with a reporter Probe tests one gene at a time
Reverse transcriptase converts mRNA of the gene to cDNA, which contains only exon sequences (no introns)
Fluorescent probe binds to target cDNA if present
Probe contains fluorophore and quencher
Quencher stops fluorophore from releasing light
As Taq polymerase extends the new strand, its 5'-3' exonuclease activity degrades the probe and the fluorophore is released
Fluorescence is no longer quenched and can be quantified
The number of copies after several rounds of replication is directly proportional to the initial mRNA level (while amplification is still exponential)
Measure fluorescence at every cycle
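
The proportionality above can be sketched as follows. This is a toy model that assumes the copy number multiplies by a fixed factor every cycle (perfect doubling when efficiency is 1.0); real reactions are less efficient and eventually plateau. The function name `copies_after_cycles` and the starting values are hypothetical.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Copies after PCR, assuming each cycle multiplies copies by (1 + efficiency)."""
    return initial_copies * (1.0 + efficiency) ** cycles


# Two hypothetical samples: the second starts with twice as much cDNA.
low = copies_after_cycles(100, 20)
high = copies_after_cycles(200, 20)

# The 2x difference in starting material is preserved after amplification.
ratio = high / low
print(ratio)
```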

2
Q

How is gene expression measured with qPCR (how are the results interpreted)?

A

Measure fluorescence at every cycle, usually 40 PCR cycles
DNA in each sample is amplified through cycles of PCR so fluorescence increases
More DNA in sample = threshold passed sooner/after fewer cycles = higher mRNA content = higher gene expression
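
A minimal sketch of this interpretation, assuming a toy fluorescence curve that doubles each cycle and is capped at a plateau. The helper names, the starting signals and the threshold of 10.0 are all made up for illustration, not instrument values.

```python
def simulate_curve(initial_signal, cycles=40, plateau=1000.0):
    """Toy curve: signal doubles each cycle and is capped at a plateau."""
    return [min(initial_signal * 2 ** c, plateau) for c in range(1, cycles + 1)]


def threshold_cycle(fluorescence, threshold):
    """First cycle (1-based) at which the signal reaches the threshold, else None."""
    for cycle, signal in enumerate(fluorescence, start=1):
        if signal >= threshold:
            return cycle
    return None


# Hypothetical samples: one with 100x more starting template than the other.
ct_high = threshold_cycle(simulate_curve(0.01), threshold=10.0)
ct_low = threshold_cycle(simulate_curve(0.0001), threshold=10.0)
print(ct_high, ct_low)
```

The sample with more starting template crosses the threshold in fewer cycles, which is read as higher expression.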

3
Q

What does the threshold value tell us in qPCR results?

A

Threshold level = set above the level of background fluorescence; intersects the curve at the start of the exponential phase. Based on the amount of light produced in a non-specific reaction with no primers
Ct (threshold cycle) = spot where curve intersects threshold line. Shows number of cycles it took to detect a real signal from samples
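Expression difference between samples is calculated from the difference between their Ct values (each cycle of difference is roughly a 2-fold difference in starting amount)
In Down syndrome, the threshold is passed later than in the control, so the sample contains less of the target DNA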
Look at image in notes
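
As a rough illustration of comparing Ct values, the sketch below converts a Ct difference into a fold difference. It assumes the product doubles every cycle (100% efficiency); the function name and the Ct values are hypothetical.

```python
def fold_difference(ct_sample, ct_control):
    """Amount of target in the sample relative to the control, assuming doubling per cycle."""
    return 2.0 ** (ct_control - ct_sample)


# Hypothetical example: the sample crosses the threshold 2 cycles later than the control,
# so it started with about a quarter of the target.
print(fold_difference(ct_sample=26, ct_control=24))
```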

4
Q

How are qPCR results normalised?

A

Have to compensate for technical differences so results must be normalised
Control may have started with 2x more cells so would get 2x more RNA so would have to divide control measurements by 2

Housekeeping genes: essential, low mutation rate, highly/always expressed at the same level
Should be expressed the same amount in every cell
Ct values of HK genes in control and experiment are used to normalise qPCR
If the graph shows the HK gene at level 4 in one sample and at level 8 in the other, the second sample started off with 2x the DNA
2 levels of control: 1) count cells 2) measure amount of RNA (may differ due to inaccurate pipetting)
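
One common way to apply housekeeping-gene normalisation is the 2^-ddCt calculation, sketched below. The card itself does not name a specific formula, so treat this as one possible approach; it assumes doubling per cycle, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_exp, ct_hk_exp, ct_target_ctrl, ct_hk_ctrl):
    """Fold change of the target gene in experiment vs control (2^-ddCt)."""
    # Normalise each sample to its own housekeeping (HK) gene.
    delta_ct_exp = ct_target_exp - ct_hk_exp
    delta_ct_ctrl = ct_target_ctrl - ct_hk_ctrl
    # Compare the normalised values between the two samples.
    delta_delta_ct = delta_ct_exp - delta_ct_ctrl
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the control was loaded with 2x more material,
# so both of its genes cross the threshold one cycle earlier.
fold = relative_expression(ct_target_exp=25, ct_hk_exp=21,
                           ct_target_ctrl=24, ct_hk_ctrl=20)
print(fold)
```

After normalisation the apparent one-cycle difference in the target gene disappears, because the HK gene shows the same shift.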

5
Q

Limitations and advantages of qPCR

A

Advantage: qPCR is fast and cheap
Limitation: limited in how many genes can be tested at once. Can only check one gene at a time, so it only suits small numbers of genes (not genome-wide)

6
Q

What is the major advantage of microarrays?

A

Allows for detection of thousands of genes at the same time (unlike qPCR)

7
Q

How do microarrays work? What is an Affy GeneChip?

A

Affy GeneChip is a very small chip containing probes for 15,000 - 20,000 genes
Each gene has 15-20 pairs of probes synthesised on the chip
Probes = short, single stranded DNA sequences complementary to a specific gene
RNA is extracted and converted to cDNA by reverse transcription
cDNA is labelled with a fluorescent dye
cDNA hybridises with complementary DNA probes
Intensity of fluorescent signals is measured at each spot
Intensity = amount of cDNA bound = expression level of gene

8
Q

How is expression on the AffyGeneChip measured with a computer?

A

Analyse chip with computer
Get a signal, detection and p-value
Signal: expression measurement for the corresponding probe
Detection: determines the absolute call for a measurement (A=absent, M=marginal, P=present)
Two slides are required, one slide used for the experiment and one for the control
Expression fold change calculated by comparing probe readings on both slides
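
The fold-change comparison between the two slides can be sketched as a log2 ratio. This is a simplified illustration: the pseudocount (added to avoid dividing by zero) and the signal values are assumptions, not part of the GeneChip software.

```python
import math


def log2_fold_change(experiment_signal, control_signal, pseudocount=1.0):
    """log2 ratio of experiment to control signal for one probe set."""
    return math.log2((experiment_signal + pseudocount) / (control_signal + pseudocount))


# Hypothetical signals for one gene on the experiment and control slides.
print(log2_fold_change(7.0, 1.0))
```

Positive values mean higher expression in the experiment, negative values mean higher expression in the control, and 0 means no change.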

9
Q

Limitations of microarrays

A

Data is very noisy
Expression levels are determined by a spot of light against a noisy background
Probes are not available for all genes - they cover only approx. 75-80% of all human genes
Genes with low expression may not be detected
Data require a large degree of statistical processing
Result only shows a gene is expressed but gives no information about which transcript (can’t distinguish between different isoforms of a protein)

10
Q

How does Illumina sequencing work?

A

Based on DNA libraries
DNA is cut into fragments and adapters (short DNA fragments 25-50nt) ligated to ends of DNA
Attach DNA to flowcell
2 types of oligos are attached to the flow cell
Oligo is complementary to adaptor region on one of the fragments
Add library onto flow cell: adaptors anchor DNA molecule onto flow cell
Bridge amplification: Strand folds over, adaptor region hybridizes with the second type of oligo on the flow cell
Can sequence with one primer (single-end sequencing = one read per fragment) or with two primers (paired-end sequencing = one read from each end of the fragment)
Fragment becomes double stranded: PCR creates a complement of the fragment
Denaturation: Bridge is denatured resulting in 2 single stranded copies that are attached to the flow cell
Clonal amplification of all fragments: Process is repeated over and over again forming millions of clusters
Sequencing: Polymerize the complementary strand using fluorescently labelled nucleotides
Nucleotides are reversible terminators (only one can be added)
In each cycle, only one base can be added - to add another nucleotide, the terminator has to be removed (unblocked)
After addition of each nucleotide, clusters are excited with a laser, emitting a fluorescent signal
Each colour corresponds to a particular base that has been added
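
The colour-to-base step can be illustrated with a toy base caller. It assumes four intensity channels per cycle, given in the order A, C, G, T, and simply picks the brightest channel; the channel order, the function name and the intensity values are hypothetical, and real base calling also corrects for channel cross-talk and signal decay.

```python
BASES = ("A", "C", "G", "T")


def call_bases(cycles):
    """Call one base per cycle from (A, C, G, T) channel intensities of a single cluster."""
    read = []
    for intensities in cycles:
        # The brightest channel is taken as the incorporated base.
        brightest = max(range(4), key=lambda i: intensities[i])
        read.append(BASES[brightest])
    return "".join(read)


# Hypothetical intensities for three sequencing cycles.
cluster = [(0.9, 0.1, 0.0, 0.1), (0.1, 0.2, 0.8, 0.1), (0.0, 0.1, 0.1, 0.7)]
print(call_bases(cluster))
```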

11
Q

How does RNA-seq work?

A

RNA-seq uses NGS technology to measure gene expression
mRNA isolation
Illumina sequencing
Align sequences against reference genome (against each gene/exon)
Number of reads mapping to each gene is determined
Corresponds to gene expression level
If one sample has more RNA, it yields more reads, so normalisation is also needed here
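
The read-counting step can be sketched as below, assuming each aligned read is reduced to a chromosome and a start position, and each gene to a single half-open interval. Gene names and coordinates are invented for illustration; real pipelines work on exons and handle strands and ambiguous reads.

```python
def count_reads_per_gene(read_positions, genes):
    """Count reads whose start position falls inside each gene interval [start, end)."""
    counts = {name: 0 for name in genes}
    for chrom, pos in read_positions:
        for name, (gene_chrom, start, end) in genes.items():
            if chrom == gene_chrom and start <= pos < end:
                counts[name] += 1
    return counts


# Hypothetical annotation and aligned read start positions.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 500, 900)}
reads = [("chr1", 150), ("chr1", 199), ("chr1", 600), ("chr2", 150), ("chr1", 200)]
gene_counts = count_reads_per_gene(reads, genes)
print(gene_counts)
```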

12
Q

Advantages of RNA-seq

A

Very accurate information on gene expression levels, even genes with low expression
Can identify exact transcript being expressed
Can get 1 million reads for one sequencing event
Can identify unknown transcripts with novel splice sites (discover different isoforms of a protein)

13
Q

How is RNA-Seq data normalised?

A

Compensate for technical differences with sequencing
Normalisation scales the read counts so they can be compared
Two main ways to normalise are: TPM (transcripts per million) and RPKM (reads per kilobase of transcript per million mapped reads)
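
A minimal sketch of both measures, assuming raw counts and transcript lengths are available per gene. Here the total of the counted reads stands in for "mapped reads", which is a simplification; the gene names, counts and lengths are made up.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    return {g: counts[g] * 1e9 / (total_reads * lengths_bp[g]) for g in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then scale to sum to 1e6."""
    rates = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    total_rate = sum(rates.values())
    return {g: rates[g] / total_rate * 1e6 for g in counts}


# Hypothetical data: geneB has twice the reads but is also twice as long.
counts = {"geneA": 10, "geneB": 20}
lengths = {"geneA": 1000, "geneB": 2000}
rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
print(rpkm_values, tpm_values)
```

Both genes end up with the same normalised value, because the extra reads for geneB are explained by its length. TPM values always sum to one million within a sample, which makes samples easier to compare.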

14
Q

How is rRNA removed from samples during RNA-seq and why is it important?

A

90% of RNA in cell is ribosomal RNA (rRNA)
Need to eliminate rRNA because we only want RNA from protein-coding genes (mRNA)
rRNA is not polyadenylated, mRNA is polyadenylated
Oligo dT probes bind to dA (polyA) in mRNA
rRNA doesn’t bind oligo dT and instead stays in solution

15
Q

Why is single cell RNA-seq advantageous and what is it used for?

A

Bulk RNA-seq doesn’t discriminate between: different cell types, mutations, different cell cycle stages, epigenetic modifications, stochastic gene expression
Use single cell RNA-seq to analyse single cells in a heterogeneous population

16
Q

Explain the process of single cell RNA-seq

A

Isolation of cells from sample
Individual cells are lysed to extract RNA content
cDNA synthesis
Library preparation: adapters added to cDNA and amplified with PCR
Unique molecular identifiers (UMI) - short nucleotide sequences attached to each cDNA molecule before amplification, so reads that are PCR copies of the same original transcript molecule can be identified and counted once
Next gen sequencing
Cell type identification - cells with similar gene expression profiles cluster together in a specific location on the plot
Results illustrated by using heat maps or dimension reduction analysis tools like PCA or t-SNE
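
The role of UMIs in counting can be sketched as follows, assuming each read has already been reduced to a (cell barcode, gene, UMI) triple. All barcodes, gene names and UMIs are invented, and the sketch ignores sequencing errors in the UMI itself.

```python
from collections import defaultdict


def umi_counts(reads):
    """Count distinct UMIs per (cell, gene); PCR duplicates share a UMI and count once."""
    seen = defaultdict(set)
    for cell, gene, umi in reads:
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Hypothetical reads: the first two are PCR copies of the same molecule.
reads = [
    ("cell1", "geneA", "AACG"),
    ("cell1", "geneA", "AACG"),
    ("cell1", "geneA", "TTGC"),
    ("cell2", "geneA", "AACG"),
]
molecule_counts = umi_counts(reads)
print(molecule_counts)
```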

17
Q

What is ChIP-seq and what is it used for?

A

Chromatin immunoprecipitation sequencing
If a DNA binding protein is known (e.g. a transcription factor), ChIP-seq can be used to identify its binding regions
ChIP-seq directly sequences TF-bound DNA which can be mapped back onto genome for precise localisation
Can identify histone modifications

18
Q

What is the mechanism of ChIP-seq?

A

Cells treated with crosslinking agent (formaldehyde) to ‘freeze’ protein-DNA interactions/make permanent
Usually the interaction between TF and enhancer is very short-lived, so it is hard to capture with an antibody without crosslinking
Cells are lysed and the chromatin is sheared into small fragments (100-500 bp) by sonication (ultrasound waves)
DNA fragments include those with target protein bound
Immunoprecipitation uses an antibody that specifically binds to protein of interest (TF)
Magnetic beads are attached to the antibody
Tube is attached to a magnet, attracts beads with antibody, protein (TF) and DNA binding region (enhancer)
Unbound DNA is removed by washing
Next generation sequencing
Mapped back to reference genome

19
Q

How are ChIP-seq results analysed?

A

Analyse with a computer
Peaks indicate a region where the protein of interest binds (enhancer or histone modification)
Histone modification = broad peak
TF = sharp peak
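
A toy illustration of what a "peak" is: contiguous positions where read coverage reaches a minimum depth. The depth cutoff and coverage values are made up, and real peak callers use statistical models against a control sample rather than a fixed cutoff.

```python
def find_peaks(coverage, min_depth):
    """Return (start, end) half-open intervals where coverage stays >= min_depth."""
    peaks = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth >= min_depth and start is None:
            start = pos  # a peak begins
        elif depth < min_depth and start is not None:
            peaks.append((start, pos))  # the peak ended at the previous position
            start = None
    if start is not None:
        peaks.append((start, len(coverage)))
    return peaks


# Hypothetical per-base coverage: a narrow peak followed by a wider one.
coverage = [0, 1, 5, 6, 5, 1, 0, 4, 4, 4, 4, 4, 4]
peaks = find_peaks(coverage, min_depth=4)
widths = [end - start for start, end in peaks]
print(peaks, widths)
```

Comparing the widths gives a crude sense of the sharp (TF-like) versus broad (histone-mark-like) distinction.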

20
Q

Limitations of ChIP-seq

A

Requires a large number of cells (10-20 million)
Crosslinking and sonication introduce experimental variability and may degrade the protein
Relies on commercial antibody quality, varies from protein to protein

21
Q

What is the mechanism of CUT&RUN?

A

Cleavage under targets, release using nuclease
Cells are permeabilised and an antibody specific to protein of interest is introduced into the cell
pA-MNase fusion protein is added and cuts DNA left and right of the protein of interest (TF)
MNase only works in the presence of calcium
After cleavage, TF complex with antibody and MNase diffuses out of the cell
Purification
Make library
Sequencing

22
Q

What are the advantages of CUT&RUN?

A

Reduced background noise, higher resolution, simpler procedure than ChIP-seq
Requires fewer cells: 500k cells vs 10 million
No need to cross-link: reduces artefacts
No need to sonicate
Much cleaner signal: so fewer reads needed

23
Q

What is DNase-seq and what is it used for?

A

Used to identify open chromatin regions
Cells are permeabilized
Cells are treated with DNase I that cleaves DNA at open chromatin regions (regions where chromatin is more accessible)
DNA fragments (corresponding to regions of open chromatin) are purified
Library is created: adding sequencing adapters
Sequencing
Peaks indicate a euchromatin region

24
Q

What are DNase I hypersensitive sites (HS)?

A

Nucleosomal structure is less compacted - called euchromatin or open chromatin region
Are associated with genetic regulatory elements: promoters, enhancers, silencers, TF binding sites
DNase I (endonuclease) preferentially cleaves DNA in regions where chromatin is more accessible

25
Q

What is ATAC-seq and why is it advantageous over DNase-seq?

A

Assay for transposase-accessible chromatin using sequencing
Alternative to DNase-seq
Uses mutated hyperactive transposase Tn5 instead of DNase
Tn5 transposase: enzyme that cuts accessible DNA and ligates sequencing adapters in the same step
DNA fragments are isolated, sequenced, then mapped
Advantages: requires smaller sample size (1000 fold less than DNase-seq), very fast (less than 3 hours)
Can identify which TF binds by detecting motifs in the DNA
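
Motif detection in an accessible region can be sketched as a simple string scan on both strands. The motif "GATA" and the sequence are placeholders chosen for illustration; real motif analysis uses position weight matrices rather than exact matches.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def motif_hits(sequence, motif):
    """0-based start positions where the motif or its reverse complement occurs."""
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(sequence) - k + 1) if sequence[i:i + k] in targets]


# Hypothetical accessible region containing the motif on both strands.
region = "AAGATAACCTATCGG"
hits = motif_hits(region, "GATA")
print(hits)
```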

26
Q

What is MNase-seq and what is it used for?

A

MNase: enzyme that preferentially cleaves DNA in regions not protected by nucleosomes (‘naked’ DNA)
After digestion, only nucleosomes are left
Used to map nucleosome positioning
Need calcium for MNase activity

27
Q

What is RIP-Seq and what is it used for?

A

Used to study RNA-protein interactions
To identify RNA molecules that interact with RNA-binding proteins (RBPs)
Cells are cross-linked using formaldehyde to preserve RNA-protein interactions
Cells are lysed
Immunoprecipitation using antibody that binds to RBP and associated RNA molecules
RNase treatment to digest RNA not protected by bound protein
Sequencing
Low stringency = low specificity

28
Q

What is CLIP-Seq and what is it used for?

A

Used to study RNA-protein interactions
To identify RNA molecules that interact with RNA-binding proteins (RBPs)
Cells are cross-linked using UV light to covalently link RNA and proteins
Cells are lysed
Immunoprecipitation using antibody that binds to RBP and associated RNA molecules
RNase treatment to digest RNA not protected by bound protein
Adapter ligation
Reverse transcription of RNA into cDNA
PCR and sequencing
Cross-linking-induced mutation sites: mutations introduced by UV light during crosslinking
cDNA and mutations are mapped back to reference genome: allows identification of RBP-RNA binding sites

29
Q

What is chromatin interaction and what techniques are available to measure it?

A

Long range interactions of regulatory elements, such as promoters, enhancers, TAD boundaries
Important for gene regulation in health and disease
ChIA-PET, 3C, 4C, Hi-C
Techniques based on probability: usually enhancers are 10 kb away from promoter/gene
Long experiment, doesn’t always work

30
Q

What is ChIA-PET?

A

ChIP (chromatin immunoprecipitation) using antibody specific to protein of interest (TF or histone modification)
Proximity ligation: ligating DNA fragments that are in close proximity
Ligated DNA is fragmented with restriction enzymes
Generates paired-end tags (PETs) representing interacting chromatin fragments
PCR and sequencing

31
Q

What did the ENCODE pilot project do?

A

ENCODE (Encyclopedia of DNA Elements): project to identify functional elements in the human genome sequence (enhancers, histone modifications, TF binding sites)
Used combination of many techniques