Cancer Transcriptomics Flashcards
What does the transcriptome tell us?
- Analysis of the entire collection of RNA sequences in a cell
- Shows which genes are turned on
- Different cells show different patterns of gene expression
- Liver genes are expressed specifically in the liver
- Transcriptome actively changes
- By collecting and comparing transcriptomes of different types of cells, we can gain a deeper understanding of what constitutes “normal” cell function, what changes occur in a diseased cell, etc.
- Some molecular features can only be observed at the mRNA level: alternative isoforms, fusion transcripts, RNA editing
What is the estimated number of RNA molecules per cell?
Roughly 10^7 per cell.
What are methods to quantify mRNA expression levels?
- PCR methods (RT-PCR, qPCR, etc.): quantification of a single gene
- RNA in situ (RNAscope): quantification and localisation of a single gene
- Probe-based: relative (microarray) or absolute (NanoString) quantification of pre-selected genes
- RNA-sequencing: absolute quantification of all types of RNA species
What are the different RNA preparation methods?
1. Total RNA: broad transcript representation; abundant RNAs dominate; high unprocessed RNA; high genomic DNA
2. rRNA reduction: broad transcript representation; abundant RNAs de-emphasised; high unprocessed RNA; high genomic DNA
3. cDNA capture: limited transcript representation (targeted); abundant RNAs de-emphasised; low unprocessed RNA; low genomic DNA
4. PolyA selection: limited transcript representation (polyA); abundant RNAs de-emphasised; low unprocessed RNA; low genomic DNA
How can sequencing depth limit research questions?
Dependent on capture method
Absolute minimum of 10 million reads
- Variation in genes with above-median expression stabilises at about 10 million reads per sample among technical replicates (Wang et al. 2011)
40 to 60 million reads
- Can identify alternative splicing with high confidence
> 100 million reads
- Quantify low-abundance transcripts
- Identify fusion genes
FFPE material
- 150 million reads per sample recommended to attain high-quality RNA data
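As a memory aid, the thresholds above can be written as a small lookup. This is only a sketch: the name `supported_analyses` is made up for illustration, each threshold is treated as an inclusive lower bound, and the dependence on capture method is ignored.

```python
# Minimum depths (in millions of reads) copied from the card above.
# Treating them as inclusive lower bounds is a simplification.
DEPTH_GUIDE = [
    (10, "stable quantification of genes with above-median expression"),
    (40, "alternative splicing with high confidence"),
    (100, "low-abundance transcripts and fusion genes"),
    (150, "high-quality data from FFPE material"),
]


def supported_analyses(million_reads):
    """Return the analyses whose minimum depth is met by `million_reads`."""
    return [label for minimum, label in DEPTH_GUIDE if million_reads >= minimum]


for depth in (5, 25, 50, 120, 150):
    print(depth, supported_analyses(depth))
```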
What is the goal of RNA-sequencing experiments?
- Compare expression of genes between different samples
- Compare expression of genes with other genes within the same sample
What is needed to achieve the goal of RNA sequencing experiments?
1. Quantification
- ‘Counting’ the number of reads mapped to a gene
- Technical problems with simply ‘counting’ (technical biases)
- Between samples: samples with more reads have higher counts
- Across genes: longer genes have higher counts
2. Normalisation
- Process of removing technical variation
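The two counting biases can be illustrated with a toy model. It assumes reads are sampled in proportion to transcript copies times transcript length, which is a common simplification and not something the card states. The genes and numbers are hypothetical.

```python
def expected_counts(copies, lengths_kb, total_reads):
    """Expected reads per gene if sampling is proportional to
    (transcript copies x transcript length). Toy model only."""
    weights = [c * l for c, l in zip(copies, lengths_kb)]
    total_weight = sum(weights)
    return [total_reads * w / total_weight for w in weights]


# Two hypothetical genes with the SAME number of transcript copies;
# gene B is four times longer than gene A.
copies = [100, 100]
lengths_kb = [1.0, 4.0]

shallow = expected_counts(copies, lengths_kb, total_reads=1_000_000)
deep = expected_counts(copies, lengths_kb, total_reads=2_000_000)

print(shallow)  # across genes: the longer gene collects more reads
print(deep)     # between samples: all counts grow with sequencing depth
```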
To address technical biases, what needs to be performed across samples and across features?
Normalisation
What variety of normalisation methods exist?
- RPKM: reads per kilobase of transcript per million reads of library
Corrects for total library coverage
Corrects for gene length
Comparable between different genes within the same dataset
- FPKM: fragments per kilobase of transcript per million reads of library
Only relevant for paired-end libraries
Read pairs are not independent
- TPM: transcripts per million
Normalised to transcript copies instead of reads
Corrects for cases where the average transcript length differs between samples
Can be used to compare samples of different origin
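A minimal sketch of the two calculations, assuming the library size is simply the sum of the counts passed in (real pipelines usually use total mapped reads). FPKM uses the same formula as RPKM with fragment counts in place of read counts. The counts and lengths are made up.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million reads.
    Library size is taken as sum(counts), an assumption of this sketch."""
    library_size = sum(counts)
    return [c * 1e9 / (library_size * l) for c, l in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then scale so that
    the values of a sample always sum to one million."""
    rates = [c / (l / 1000) for c, l in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Hypothetical sample with three genes.
counts = [100, 400, 500]
lengths_bp = [1000, 4000, 2000]

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

print(rpkm_values)      # genes 1 and 2 get equal values after length correction
print(tpm_values)
print(sum(tpm_values))  # approximately 1,000,000 for any sample
```

Because TPM values sum to the same total in every sample, a gene's TPM can be read as its share of the transcript pool, which is what makes it easier to compare across samples than RPKM.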
What are the challenges in RNA-sequencing?
- Sample purity, quantity and quality
RNA is fragile compared to DNA (easily degraded)
- Mapping strategies
Small exons may be separated by large introns
Aligning RNA-sequencing reads to the genome is challenging
- Relative abundance of RNA species varies widely (by a factor of 10^5 to 10^7)
Since RNA-sequencing works by random sampling, a small fraction of highly expressed genes (e.g. ribosomal and mitochondrial genes) may consume the majority of reads
- RNA species come in a wide range of sizes
Small RNAs must be captured separately
PolyA selection of large RNAs may result in 3’ end bias
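The random-sampling point can be illustrated with a small simulation. The gene names and relative abundances are invented for illustration, and the fixed seed only makes the run repeatable.

```python
import random


def sample_reads(abundances, n_reads, seed=0):
    """Draw `n_reads` reads, each assigned to a gene with probability
    proportional to that gene's relative abundance."""
    rng = random.Random(seed)
    genes = list(abundances)
    weights = [abundances[g] for g in genes]
    counts = dict.fromkeys(genes, 0)
    for gene in rng.choices(genes, weights=weights, k=n_reads):
        counts[gene] += 1
    return counts


# Hypothetical relative abundances spanning five orders of magnitude.
abundances = {
    "rRNA_like": 1_000_000,
    "mito_like": 100_000,
    "housekeeping": 1_000,
    "rare_transcript": 10,
}

counts = sample_reads(abundances, n_reads=100_000)
top_share = counts["rRNA_like"] / sum(counts.values())

print(counts)
print(f"share of reads taken by the most abundant species: {top_share:.2f}")
```

In this toy setting the rare transcript is expected to receive about one read in 100,000, which is why low-abundance transcripts need either deeper sequencing or depletion of the dominant species.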
What is used in the downstream analysis of single‐cell RNA‐seq?
t-distributed stochastic neighbour embedding (t-SNE) plots
a dimensionality reduction step for visualising the data in two dimensions
What is transcript-omics?
Investigation of gene expression patterns based on the relative amount of mRNA under a given condition
What does transcript-omics in translational science involve?
Compare normal tissue with diseased tissue
Classification of different tissue types or cellular populations
Gene expression pattern related to specific clinical characteristics
Long list of gene signatures capturing different phenotypes, responses to drugs, etc.
What does a single sample predictor involve?
Extract RNA from the sample and measure its expression levels. Compare the expression pattern with those of known subgroups from previous research. If the pattern resembles that of a known subgroup, the outcome observed for that subgroup can be used to predict patient survival.
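One way to sketch this idea is a nearest-centroid rule: correlate the new sample with the average expression profile (centroid) of each known subgroup and pick the best match. The card does not name a method, so the choice of Pearson correlation, the subgroup names, and all values here are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def predict_subgroup(sample, centroids):
    """Assign the sample to the subgroup whose centroid it correlates with most."""
    scores = {name: pearson(sample, c) for name, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical centroids over four made-up genes (log-scale expression).
centroids = {
    "subgroup_A": [8.0, 7.5, 2.0, 1.5],
    "subgroup_B": [2.0, 1.0, 7.0, 8.5],
}
new_sample = [7.2, 8.1, 2.5, 1.0]

label, scores = predict_subgroup(new_sample, centroids)
print(label, scores)
```

The prediction depends only on the single sample and the fixed centroids, so no other patients need to be profiled at the same time.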
What is the NanoString Prosigna?
What is the NanoString Prosigna used for?
FDA-cleared assay for subtyping breast cancer
Predicts the patient’s Risk of Recurrence
- Estimates the probability of distant recurrence over 10 years
- Based on PAM50, intrinsic subtype, nodal status, tumour size and proliferation score
- Intrinsic subtypes provide valuable prognostic information to guide clinical decisions