Transcriptome Analysis Flashcards

1
Q

Why do we care about transcription?

A

It is the primary means of interpreting information in the genome
It plays a central role in evolution
It is often misregulated in disease

2
Q

Complex traits

A
  • > 85% of GWAS associations lie in non-coding regions
  • These associations are enriched for eQTLs and overlap with regulatory elements

3
Q

Basic principles of gene regulation

A

Gene expression varies in quantity, space, time, and in response to stimuli
We typically measure steady-state RNA
RNA is regulated at the level of transcription, promoter usage, splicing, poly(A) site usage, stability, and localization

4
Q

Perspectives to study RNA

A

spatial localization
abundance quantification
transcript isoforms and structure
emphasis on response to stimuli

5
Q

Spatial localization of RNA

A

Techniques: in situ hybridization, immunohistochemistry, gene fusions

Can provide very precise (sub)cellular resolution

Often performed on fixed tissues, but live imaging is becoming more common

Often difficult to quantify because of technical variation

6
Q

Immunohistochemistry

A

Treat tissue with labeled antibodies that bind the protein of interest

7
Q

Quantifying RNA abundance

A

Techniques: Northern blots, qPCR, microarrays, NanoString, RNAseq
Isolate cells, extract RNA, measure steady-state RNA
Isolating cells can be difficult

8
Q

transcript isoform usage and structure

A

Techniques: qPCR, NanoString, long-read or paired-end RNAseq
Microarrays were not particularly good for this
Short-read RNAseq data has inherent limitations

9
Q

Response to stimulus

A

Perturb the system, then measure gene expression

- e.g. knock down a TF and measure RNA

10
Q

Knock down TF and measure RNA

How do you know if the change is direct?

A

Pulse-chase experiment

11
Q

Methods based on pulse-chase experiments

A

Nascent transcription quantification (GRO-seq)

measuring RNA stability
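
As a rough illustration of how a chase time course can be turned into a stability estimate, the sketch below fits a half-life to labeled-RNA measurements. It assumes simple first-order (exponential) decay, which is an assumption of this example and not something the card states; the time points, signal values, and the function name `half_life_from_decay` are all hypothetical.

```python
import math

def half_life_from_decay(times, labeled_rna):
    """Estimate half-life from a chase time course.

    Assumes first-order decay, so ln(abundance) falls linearly with time.
    The slope is fitted by ordinary least squares.
    """
    logs = [math.log(x) for x in labeled_rna]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
             / sum((t - mean_t) ** 2 for t in times))
    decay_rate = -slope              # k in abundance(t) = A0 * exp(-k * t)
    return math.log(2) / decay_rate  # half-life = ln(2) / k

# Hypothetical chase: labeled RNA measured at these hours after the pulse.
times = [0, 1, 2, 4, 8]
true_half_life = 2.0
signal = [100 * 0.5 ** (t / true_half_life) for t in times]

estimate = half_life_from_decay(times, signal)
```

With noise-free simulated data the estimate recovers the half-life used to generate it; real measurements would scatter around the fitted line.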

12
Q

EST Sanger sequencing

A

Great for gene identification and characterization; long reads enabled isoform reconstruction, but too expensive for accurate quantification

13
Q

SAGE

A

Serial Analysis of Gene Expression

cDNAs are cleaved into short (< 20 bp) fragments, concatemerized, and sequenced
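
The idea behind sequencing concatemers is that each short tag stands in for one transcript, so counting tags approximates expression. The sketch below is a deliberately simplified toy: it assumes fixed-length tags joined directly end to end, and ignores the linkers, anchoring-enzyme sites, and ditag orientation of a real protocol. The tag length and sequences are hypothetical.

```python
from collections import Counter

TAG_LENGTH = 10  # hypothetical tag length for this toy example

def count_tags(concatemer, tag_length=TAG_LENGTH):
    """Split a concatemer read into consecutive fixed-length tags and count them."""
    tags = [concatemer[i:i + tag_length]
            for i in range(0, len(concatemer) - tag_length + 1, tag_length)]
    return Counter(tags)

# Two made-up tags; tag_a appears twice in the concatemer, tag_b once.
tag_a = "ACGTACGTAC"
tag_b = "TTGACCGGTA"
read = tag_a + tag_b + tag_a

counts = count_tags(read)
```

Here `counts[tag_a]` is 2 and `counts[tag_b]` is 1, which would be read as the transcript behind `tag_a` being about twice as abundant in this toy library.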

14
Q

RNAseq molecular biology

A

Extract RNA
Purify RNAs of interest (mRNA, miRNA)
Fragment, prime
Convert to cDNA
Attach adapters
Sequence single- or paired-end reads
15
Q

RNAseq analysis outline

A

Reads are aligned to the genome (some applications align to the transcriptome)
Assemble transcripts and quantify transcript abundance
Test for differential expression (data are count based)

16
Q

RNAseq complications

A

Alignment: short reads, large gaps, ~1% error

  • using annotated gene models helps
  • paired-end and longer reads help

Experimental design (replicates): because the experiment is fairly expensive and complicated, many people do not perform (enough) replicates

Confounding variables:
  • randomization is critical in experimental design

Small n, large p:
  • empirical Bayes approaches

17
Q

Computation cost

A

TopHat: ~1 hr per 1 M reads on a standard workstation
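
To get a feel for what this rate means in practice, the short sketch below scales it to a whole library. The 30 M read library size is a hypothetical example, and the rate is only the rough figure quoted on this card.

```python
HOURS_PER_MILLION_READS = 1.0  # rough figure quoted on this card

def estimated_alignment_hours(n_reads, hours_per_million=HOURS_PER_MILLION_READS):
    """Scale the per-million-read rate linearly to a library of n_reads."""
    return n_reads / 1_000_000 * hours_per_million

# Hypothetical library of 30 million reads.
hours = estimated_alignment_hours(30_000_000)
```

At that rate a 30 M read library would take on the order of 30 hours on one workstation, which is why alignment time matters when planning an experiment.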

18
Q

Confounding variables:

A

Variables that are difficult to control for can have large effects on RNAseq data
RNA extraction date, person performing library construction, kit batch, sequencing run, temperature, time of day…

19
Q

hidden variables

A

latent variable techniques: PCA, factor analysis, PEER, SVA
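
A minimal sketch of the PCA idea, using simulated data: an unrecorded batch effect is added to random expression values, and the first principal component recovers it. Everything here (the sample and gene counts, the batch layout, the effect sizes) is made up for illustration, and the top component is found with a plain power iteration so the example needs only the standard library. Real analyses would use dedicated tools such as PEER or SVA.

```python
import random

random.seed(0)
n_samples, n_genes = 12, 100

# Hypothetical hidden factor: two processing batches that were never recorded.
batch = [i % 2 for i in range(n_samples)]
gene_effect = [random.gauss(0, 2) for _ in range(n_genes)]

# Simulated expression matrix (samples x genes): noise plus the batch effect.
expr = [[random.gauss(0, 1) + batch[i] * gene_effect[g] for g in range(n_genes)]
        for i in range(n_samples)]

# Center each gene across samples.
means = [sum(row[g] for row in expr) / n_samples for g in range(n_genes)]
centered = [[row[g] - means[g] for g in range(n_genes)] for row in expr]

# Sample-by-sample Gram matrix; its top eigenvector gives the PC1 sample
# scores up to scale and sign.
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered] for r1 in centered]

def top_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly multiply by the matrix and renormalize."""
    vec = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        vec = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

pc1 = top_eigenvector(gram)
corr_with_batch = pearson(pc1, batch)
```

In this simulation the PC1 scores separate the two batches almost perfectly, so including PC1 as a covariate would absorb the hidden batch effect.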

20
Q

FPKM

A

Fragments per kilobase of transcript per million mapped reads
A standard unit of measurement for RNA abundance from RNAseq
Normalizes by transcript length and read depth

Relative measure: depends upon the abundance of all transcripts
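
The definition can be made concrete with a small calculation. The sketch below computes FPKM for a toy library and then shows the "relative measure" point: raising one transcript's count lowers the FPKM of the others even though their own counts are unchanged. The transcript names, counts, and lengths are hypothetical.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """FPKM = fragments / (length in kb) / (total mapped fragments in millions)."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)

def library_fpkm(library):
    """Compute FPKM for every transcript in {name: (fragments, length_bp)}."""
    total = sum(count for count, _ in library.values())
    return {name: fpkm(count, length, total)
            for name, (count, length) in library.items()}

# Hypothetical toy library: transcript -> (fragment count, length in bp).
toy = {"txA": (500, 2000), "txB": (500, 1000), "txC": (1000, 4000)}
values = library_fpkm(toy)

# Same library, but txC is now three times as abundant; txA's count is unchanged.
shifted = dict(toy, txC=(3000, 4000))
shifted_values = library_fpkm(shifted)
```

In the first library `txA` and `txC` get the same FPKM because `txC` has twice the fragments but is twice as long. In the second, `txA` drops even though its fragment count did not change, which is why FPKM values are hard to compare across samples with very different compositions.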

21
Q

Surrogate variable analysis

A

Ranks features by strength of association while accounting for unmeasured hidden variables