single cell transcriptomics Flashcards
1
Q
heterogeneity in cell populations
A
- cell types
- somatic mutations
- cell cycle stage
- epigenetic modifications
- stochastic gene expression
2
Q
limitations of bulk assays
A
- assuming homogeneous relationships can lead you to the wrong conclusion
- rare cell types can become lost
- can’t see real time changes
- need to order by differentiation progress, not time
3
Q
process of SCT
A
- isolate cells
- lyse
- reverse transcribe and amplify cDNA
- qPCR or RNAseq
- up to 10,000s of genes in 10,000s of cells
4
Q
methods of single cell isolation
A
- low throughput:
- manual/automated micropipetting
- cytoplasmic aspiration
- high throughput:
- FACS
- microfluidics
5
Q
qPCR
A
- quantitative/real-time PCR
- gene specific PCR primers
- include housekeeping genes (e.g. GAPDH)
- fluorescent dye to detect PCR product
- measure Ct value for each gene
- threshold cycle number
- normalise data
6
Q
qPCR normalisation
A
- higher Ct means less cDNA
- arbitrary maximum Ct value
- calculate ΔCt for each gene
- ΔCt = max Ct - gene Ct
- higher Δ means more cDNA
- normalise with hk genes
- assume hk expression constant
- calculate gene ΔCt - hk ΔCt
- each cycle doubles the product (log2 scale), so normalise by subtraction, not division
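The ΔCt arithmetic on this card can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the maximum Ct of 40 and the example Ct values are assumptions chosen for the example, and the function names are hypothetical.

```python
# Assumed arbitrary maximum Ct; the card does not give a specific value.
MAX_CT = 40.0


def delta_ct(ct, max_ct=MAX_CT):
    """Delta Ct = max Ct - gene Ct, so a higher value means more cDNA."""
    return max_ct - ct


def normalised_expression(gene_ct, hk_ct, max_ct=MAX_CT):
    """Gene delta Ct minus housekeeping delta Ct (log2 scale, hence subtraction)."""
    return delta_ct(gene_ct, max_ct) - delta_ct(hk_ct, max_ct)


# Hypothetical Ct values for one cell.
gene_ct = 25.0  # gene of interest
hk_ct = 20.0    # housekeeping gene, assumed constant across cells

log2_relative = normalised_expression(gene_ct, hk_ct)
relative_abundance = 2 ** log2_relative  # back-transform from log2 scale

print(log2_relative)       # -5.0
print(relative_abundance)  # 0.03125
```

Because the maximum Ct cancels in the subtraction, the normalised value does not depend on which arbitrary maximum is chosen.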
7
Q
RNAseq
A
- sequence cDNA library
- map reads to reference
- count read number for each gene
- need quality control
- can have coverage bias (5’/3’) in some protocols
8
Q
technical dropouts
A
- zero counts
- common
- when some mRNA not captured during reverse transcription
- capture efficiency:
- % of mRNA molecules in cell lysate detected
- often 10-20%
- more frequent in low expression genes
- varies between cells
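Why dropouts hit low expression genes hardest can be shown with a simple model. The sketch below assumes each mRNA molecule is captured independently with the same probability; that independence assumption is a simplification introduced here, not something the card states, and the function name is hypothetical.

```python
def dropout_probability(n_molecules, capture_efficiency):
    """Probability that none of a gene's mRNA molecules is captured.

    Assumes independent capture of each molecule (a simplifying assumption).
    """
    if not 0.0 <= capture_efficiency <= 1.0:
        raise ValueError("capture_efficiency must be between 0 and 1")
    return (1.0 - capture_efficiency) ** n_molecules


# Capture efficiency of 10%, the low end of the 10-20% range on the card.
efficiency = 0.10

low_expression = dropout_probability(5, efficiency)     # 5 molecules per cell
high_expression = dropout_probability(100, efficiency)  # 100 molecules per cell

# The low expression gene is far more likely to give a zero count.
print(low_expression > high_expression)  # True
```

Under this model a gene with 5 molecules drops out in roughly 59% of cells, while a gene with 100 molecules almost never does.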
9
Q
RNAseq normalisation
A
- convert raw read counts into expression levels per cell
- correct for cell to cell variation
- in capture, amplification, sequencing efficiency
- method depends on protocol used
- spike in or UMI
10
Q
extrinsic spike-ins
A
- add RNA of known sequence and quantity to lysate
- internal control
- equal quantity in each lysate
- normalise counts by number of reads mapped to spike-in RNA
- assumes same capture, amplification and sequencing efficiencies
- be cautious:
- no 5’ cap or polyA tail
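A minimal sketch of spike-in normalisation, assuming the same quantity of spike-in RNA was added to every lysate. The gene names, read counts and function name are hypothetical values made up for illustration.

```python
def spike_in_normalise(gene_counts, spike_in_reads):
    """Divide each gene's read count by the reads mapped to spike-in RNA."""
    if spike_in_reads <= 0:
        raise ValueError("spike_in_reads must be positive")
    return {gene: count / spike_in_reads for gene, count in gene_counts.items()}


# Two hypothetical cells; cell_b was captured/sequenced half as efficiently.
cell_a = spike_in_normalise({"geneX": 200, "geneY": 50}, spike_in_reads=1000)
cell_b = spike_in_normalise({"geneX": 100, "geneY": 25}, spike_in_reads=500)

# After normalisation the two cells show the same expression of geneX.
print(cell_a["geneX"] == cell_b["geneX"])  # True
```

The correction only works to the extent that spike-in RNA and endogenous mRNA share the same capture, amplification and sequencing efficiencies.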
11
Q
UMIs
A
- unique molecular identifiers
- barcode on each cDNA molecule
- 6-10 nt added before amplification
- track how much of the amplified DNA comes from each original molecule
- count number of unique UMIs associated with each gene
- assume library sequenced to saturation
- corrects for variation in amplification efficiency but not other sources
- e.g. reverse transcription
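UMI counting can be sketched as collapsing reads that share a gene and a UMI. The reads below are hypothetical, and the sketch ignores sequencing errors in the UMI itself, which real pipelines usually have to handle.

```python
from collections import defaultdict


def count_umis(reads):
    """Count unique UMIs per gene from (gene, umi) read pairs.

    PCR duplicates share a UMI, so they collapse to one original molecule.
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Hypothetical mapped reads: geneA has 4 reads but only 2 original molecules.
reads = [
    ("geneA", "AACGTT"),
    ("geneA", "AACGTT"),  # amplification duplicate
    ("geneA", "AACGTT"),  # amplification duplicate
    ("geneA", "GGTCAA"),
    ("geneB", "TTAGCC"),
]

print(count_umis(reads))  # {'geneA': 2, 'geneB': 1}
```

Raw read counts would report geneA at 4 and geneB at 1; the UMI counts remove the amplification bias but not losses at reverse transcription.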
12
Q
normalisation without spike in or UMI
A
- same as used by bulk RNAseq data
- assume hk gene expression or total mRNA content the same
- normalise read counts by hk expression/total mRNA
- can also combine techniques
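One common form of the total mRNA approach is scaling each cell to a fixed total, as in counts per million. The sketch below assumes total mRNA content is the same in every cell; the scale factor of one million, the counts and the function name are illustrative choices, not taken from the card.

```python
def total_count_normalise(gene_counts, scale=1_000_000):
    """Scale a cell's read counts so that they sum to `scale`."""
    total = sum(gene_counts.values())
    if total == 0:
        raise ValueError("cell has no reads")
    return {gene: count * scale / total for gene, count in gene_counts.items()}


# Two hypothetical cells with the same composition but different depth.
shallow = total_count_normalise({"gene1": 50, "gene2": 150})
deep = total_count_normalise({"gene1": 500, "gene2": 1500})

print(shallow["gene1"])  # 250000.0
print(shallow == deep)   # True
```

If cells genuinely differ in total mRNA content, this scaling removes that biological difference along with the technical one.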
13
Q
SC data analysis techniques
A
- clustering
- dimensionality reduction
- differential expression
- pseudotemporal ordering
- network inference
14
Q
single cell clustering
A
- cluster by transcriptomic profile to:
- analyse sub-population structure
- identify cell sub-types/rare cell types
- cluster genes by their expression states across cells to:
- identify co-varying genes
15
Q
SC clustering methods
A
- partitional
- produces disjoint groups
- k-means
- hierarchical clustering
- divisive or agglomerative
- hierarchical tree
- can provide more information
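As an illustration of partitional clustering, here is a minimal k-means sketch (Lloyd's algorithm) in plain Python. The expression values and starting centroids are hypothetical, and the starting centroids are fixed by hand to keep the example deterministic; real analyses would normally use a library implementation with random restarts.

```python
import math


def kmeans(points, centroids, n_iter=10):
    """Assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for each cell.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical cells described by normalised expression of two genes.
cells = [(1.0, 1.2), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9)]
labels, centroids = kmeans(cells, centroids=[cells[0], cells[2]])

print(labels)  # [0, 0, 1, 1]
```

Each cell ends up in exactly one group, which is the disjoint partition described on the card; a hierarchical method would instead return a tree that can be cut at different levels.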