Transcriptomics Flashcards

1
Q

What is the transcriptome?

A

The complete set of transcripts (= mRNAs) and their relative levels of expression in a biological entity.

It links genotype to phenotype.

2
Q

Why is transcriptomics good to use as a proxy for protein measurements?

A

mRNA abundance is easier to measure than protein abundance.
mRNA levels are usually positively correlated with protein levels (but not always, and there is a time lag between the two).

3
Q

What may be the cause of diversity between humans and chimps?

A

Humans and chimps are about 99% genetically similar, yet there are dramatic phenotypic differences.
Gene regulation is therefore proposed to be the source of the diversity.
Mary-Claire King and Allan Wilson studied protein similarity between chimps and humans and found a very high degree of similarity.

4
Q

Why must we not rely on just transcriptomics?

A

It is hypothesis-generating rather than evidence-generating.

Many RNAs are not protein-coding, e.g. miRNAs.

5
Q

What are cis/trans regulatory elements?

A

Cis-regulatory elements are DNA sequences on the same DNA molecule as the gene they regulate, usually close to its promoter. Trans-regulatory elements encode diffusible products (e.g. TFs) that can regulate genes distant from the gene from which they were transcribed.

6
Q

What are 2 examples of morphological variation due to cis regulation?

A

Drosophila wing spots: the black spot is due to an extra TF binding site.
Sticklebacks: marine fish have pelvic spines, whereas freshwater (FW) fish do not. The gene is still present in FW fish but is not activated, owing to a change in a TF binding site.

7
Q

order of events in transcription

A

DNA is transcribed into pre-mRNA.
Capping (5' end) and polyadenylation (3' end).
Splicing: introns are excised and exons are joined, giving the mature mRNA.
The mRNA is transported into the cytoplasm for translation by ribosomes.
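
The splicing step can be pictured with a toy sketch. The sequence, the intron coordinates and the function name `splice` are all made up for illustration; real splicing is carried out by the spliceosome at splice-site signals, not by fixed coordinates.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns (0-based start, end-exclusive) and join the remaining exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1 + intron + exon 2
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAUUAA"
mature_mrna = splice(pre_mrna, [(6, 18)])
print(mature_mrna)
```

Here the intron occupies positions 6 to 17, so the two exons are joined directly in the mature mRNA.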

8
Q

How many genes are alternatively spliced in humans and Drosophila?

A

40-75%

H: each gene is estimated to have 3-8 different transcripts.

9
Q

Sex determination through alternative splicing in Drosophila

A

Males and females differ in Sex-lethal (Sxl) expression: females have two X chromosomes and so make homodimeric Sisterless protein, whereas males have one X chromosome and so make heterodimeric Sisterless. Sisterless is a TF for Sxl.
In females, Sxl protein regulates the splicing of Tra (Transformer) and also represses inclusion of the ‘poison exon’, which contains a stop codon, in its own mRNA, so full-length Sxl protein is expressed.
In males, the poison exon is spliced into the Sxl mRNA and a truncated Sxl protein is produced.

Doublesex gene: the Tra/Tra2 complex binds to repeat sequences in the female-specific exon of the doublesex (dsx) gene, leading to male- and female-specific splice forms.

10
Q

10 basic steps in RNA-seq

A
  1. Obtain mRNA or total RNA from the sample
  2. Remove contaminating DNA; select mRNA using poly-T (oligo(dT)) beads that capture the poly(A) tail
  3. Remove rRNA: the most highly expressed RNA in the cell but the least informative
  4. Fragment the RNA
  5. Reverse transcribe into cDNA
  6. Preserve strand information (strand-specific RNA-seq)
  7. Ligate sequencing adaptors (of known sequence)
  8. PCR amplification
  9. Select a range of fragment sizes
  10. Sequence the cDNA ends
11
Q

What is an issue in data from transcriptome analysis?

A

Massive batch effects.
Investigated by Gilad and Mizrahi-Man (2015).
A PCA analysis had found more differences between species within the same organ's transcriptome than between different organs of the same individual; this pattern was attributed largely to batch effects.

12
Q

What 4 things should be considered in RNA-seq experimental design?

A

Technical replicates are unnecessary, as Illumina sequencing has low technical variation, unlike microarrays.
To minimise batch effects, process all samples together at the same time.
Biological replicates are essential: 3+ from independent batches.
For alignment to a reference genome, use a splice-aware aligner for isoform-specific RNA-seq.

13
Q

What is a splice aware aligner?

A

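An aligner that maps RNA-seq reads to the reference genome while allowing for introns, which are absent from the reads. When a read spans an exon-exon junction, the aligner opens a gap in the alignment and looks downstream in the genome to continue.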
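
A toy sketch of the idea, assuming exact matching and at most one intron per read. The function `spliced_align`, the sequences and the `min_anchor` parameter are hypothetical; real aligners use genome indexes, tolerate mismatches and score splice sites.

```python
def spliced_align(read, genome, min_anchor=4):
    """Toy splice-aware alignment.

    Returns a list of (genome_start, length) blocks, or None if no alignment is found.
    """
    # Case 1: the read lies within a single exon and matches contiguously.
    position = genome.find(read)
    if position != -1:
        return [(position, len(read))]
    # Case 2: split the read in two, trying the longest left part first.
    for split in range(len(read) - min_anchor, min_anchor - 1, -1):
        left, right = read[:split], read[split:]
        left_start = genome.find(left)
        if left_start == -1:
            continue
        # Look downstream of the left block, leaving a gap (the intron).
        right_start = genome.find(right, left_start + split + 1)
        if right_start != -1:
            return [(left_start, split), (right_start, len(right))]
    return None


# Hypothetical genome: exon 1 + intron + exon 2
genome = "ACGTACGGTC" + "GTAAGTCCCTTTTCAG" + "TTGACCATGA"
read = "ACGGTC" + "TTGACC"  # read spanning the exon-exon junction
blocks = spliced_align(read, genome)
print(blocks)
```

The two returned blocks sit on either side of the intron, so the distance between the end of the first block and the start of the second gives the size of the gap that was opened.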

14
Q

2 approaches of transcriptome assembly

A

De novo: assemble, then align the RNA-seq reads. Scaffold the contigs, then extend them with unassembled reads.
Reference-based: align, then assemble. Use de novo assembly for any unaligned reads.

15
Q

examples of splice aware aligners

A

TopHat2, MapSplice, SOAPSplice, Passion, SpliceMap, RUM, ABMapper, CRAC, GSNAP, HMMSplicer, Olego, BLAT, HISAT

16
Q

a program used for data quality control

A

FastQC (run on the FASTQ read files)

17
Q

Basic workflow for differential gene expression with and without a reference genome and transcriptome.

A
  1. Experimental design
  2. Sequencing
  3. Data quality control
  4. Read mapping (to the reference genome; if there is no reference genome, do transcriptome assembly instead)
  5. Differential expression analysis (use the reference transcriptome if available; if not, use the assembled transcriptome)
18
Q

why must reads be normalised before differential gene expression analysis?

A

Samples are sequenced to different depths (more or fewer reads), so raw counts are not directly comparable between samples.
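
A small sketch of why depth matters, using made-up counts and a simple counts-per-million scaling (one possible depth normalisation, chosen here only for illustration). The gene names and the function `counts_per_million` are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Same relative expression, but the second sample was sequenced 10x deeper.
shallow_sample = {"geneA": 100, "geneB": 300, "geneC": 600}
deep_sample = {"geneA": 1000, "geneB": 3000, "geneC": 6000}

shallow_cpm = counts_per_million(shallow_sample)
deep_cpm = counts_per_million(deep_sample)
print(shallow_cpm)
print(deep_cpm)
```

The raw counts differ tenfold between the two samples, yet after scaling by depth the values agree, which is the comparison a differential expression analysis needs.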

19
Q

How to normalize reads for DGE analysis

A

RPKM: reads per kilobase (of transcript) per million mapped reads.

RPKM = reads mapped to gene / (gene length (kb) x total mapped reads (millions))
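
As a worked example of the RPKM calculation, here is a minimal sketch with made-up numbers; the function and parameter names are illustrative only.

```python
def rpkm(reads_mapped_to_gene, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    gene_length_kb = gene_length_bp / 1_000
    total_reads_millions = total_mapped_reads / 1_000_000
    return reads_mapped_to_gene / (gene_length_kb * total_reads_millions)


# Hypothetical gene: 500 reads, 2,000 bp long, in a library of 10 million mapped reads.
value = rpkm(500, 2_000, 10_000_000)
print(value)  # 500 / (2 kb x 10 million) = 25.0
```

Dividing by gene length corrects for longer genes collecting more reads, and dividing by library size corrects for sequencing depth.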

20
Q

what distribution does RNA seq usually follow?

A

Negative binomial
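
A short sketch of what this implies, assuming the common mean-dispersion parameterisation in which variance = mean + dispersion x mean^2 (the card does not specify one). The numbers and the function name are hypothetical.

```python
def negative_binomial_variance(mean, dispersion):
    """Variance of a negative binomial count with the given mean and dispersion."""
    return mean + dispersion * mean ** 2


mean_count = 100
poisson_variance = mean_count  # for a Poisson, variance equals the mean
nb_variance = negative_binomial_variance(mean_count, dispersion=0.25)
print(poisson_variance, nb_variance)
```

With a dispersion of zero the negative binomial reduces to the Poisson; a positive dispersion adds the extra variance typically seen between biological replicates.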

21
Q

What sorts of higher-level analysis are done with RNA-seq data?

A
  • search for biological meaning
  • assign biological functions using homology info
  • map genes to pathways
  • group genes by molecular function, biological process, cellular component or pathway
22
Q

GO

A

Gene Ontology.
Gives a bin into which the gene falls.
Suggests the function from a hierarchy of functions.

23
Q

How to search for gene function?

A

Entrez Gene is a database, but searching one gene at a time is slow,

so use the GO database instead.

24
Q

3 major categories of GO structure

A
  1. Molecular function
    e.g. TF, RNA binding
  2. Biological process
    Broad goals achieved by a group of molecular functions, e.g. transcription, pre-mRNA splicing.
  3. Cellular component
    Subcellular location, macromolecular complexes
25
Q

What is KEGG?

A

Kyoto Encyclopedia of Genes and Genomes

Its pathway maps can show up- and down-regulation of genes.

26
Q

Why is enrichment analysis good?

A

You can select just the genes with altered expression to analyse, rather than a massive list of all genes in the transcriptome.
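
One common way to score enrichment is a one-sided hypergeometric test, sketched below. The card does not name a specific test, so this choice, the function name and all the numbers are assumptions for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """P(at least `overlap` annotated genes among `selected`) under random sampling.

    population: all genes measured
    annotated:  genes in the population carrying the term (e.g. one GO term)
    selected:   differentially expressed genes
    overlap:    differentially expressed genes that carry the term
    """
    total_ways = comb(population, selected)
    upper = min(annotated, selected)
    tail_ways = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail_ways / total_ways


# Hypothetical numbers: 20 genes measured, 5 carry the term,
# 5 are differentially expressed, and 3 of those carry the term.
p_value = enrichment_p_value(population=20, annotated=5, selected=5, overlap=3)
print(p_value)
```

A small p-value suggests the term appears among the altered genes more often than expected by chance; in practice many terms are tested at once, so a multiple-testing correction would also be needed.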