Lecture 5 - Computational analysis Flashcards

1
Q

It is now simple to measure the expression levels of thousands of genes

A

simultaneously

2
Q

Methods such as RNA-seq allow for measurement of

A

transcriptome-wide expression levels without a reference genome

3
Q

RNA-seq is useful for

A

high-throughput sequencing of RNA

4
Q

RNA-seq allows for quantification of

A

gene expression and differential expression analyses

5
Q

RNA-seq allows for characterization of

A

alternative splicing

6
Q

de novo means

A

from the beginning

7
Q

de novo transcriptome assembly allows for

A

quantification and exploration of boutique organisms (no genome sequence necessary)

8
Q

RNA-seq steps

A
  1. Extraction of mRNA
  2. PCR amplification
  3. Sequencing (single or paired end)
9
Q

Poly A selection is a method of

A

isolating poly(A)+ transcripts, usually by oligo-dT affinity

10
Q

Ribodepletion depletes

A

ribosomal RNAs using sequence specific biotin-labeled probes

11
Q

Reads

A

the sequenced portion of cDNA fragments

12
Q

Coverage

A

(read length × number of reads) / haploid genome length
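
Coverage is commonly estimated as read length times number of reads, divided by haploid genome length. A minimal sketch of that arithmetic follows; the function name and the example numbers are illustrative assumptions, not values from the lecture.

```python
def coverage(read_length, num_reads, genome_length):
    """Expected average depth: C = L * N / G."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return read_length * num_reads / genome_length


# Illustrative numbers only: 100 bp reads, 30 million reads, 3 Gb haploid genome.
print(coverage(100, 30_000_000, 3_000_000_000))
```

Doubling either the read length or the number of reads doubles the estimated coverage.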

13
Q

Single-end

A

cDNA fragments are sequenced from only one end (1x100)

14
Q

Paired-end

A

cDNA fragments are sequenced from both ends (2x100)

15
Q

Strand-specific

A

You know whether the read originated from the + or - strand

16
Q

Counts =

A

(Xi) the number of reads that align to a particular feature i (gene, isoform, miRNA, etc.)

17
Q

Library size =

A

(N) number of reads sequenced

18
Q

FPKM =

A

Fragments per kilobase of exon per million mapped reads
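
As a rough sketch, FPKM can be computed from the fragment count of a feature, the feature length in base pairs, and the total number of mapped fragments. The function and parameter names below are assumptions made for illustration.

```python
def fpkm(fragments, feature_length_bp, total_mapped):
    """FPKM = fragments * 1e9 / (feature length in bp * total mapped fragments).

    The 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.
    """
    if feature_length_bp <= 0 or total_mapped <= 0:
        raise ValueError("length and total_mapped must be positive")
    return fragments * 1e9 / (feature_length_bp * total_mapped)


# Illustrative numbers: 500 fragments on a 2,000 bp feature, 10 million mapped.
print(fpkm(500, 2_000, 10_000_000))
```

Dividing by length is what lets features of different sizes be compared within one sample.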

19
Q

CPM =

A

Counts per million mapped reads
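
A minimal sketch of CPM, scaling each count Xi by the library size N. It assumes the library size is simply the sum of the counts supplied; the gene names and counts are made up.

```python
def counts_per_million(counts):
    """Scale raw counts X_i by library size N: CPM_i = X_i * 1e6 / N."""
    # Assumption: N is taken as the sum of the counts passed in.
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library size is zero")
    return {feature: x * 1e6 / library_size for feature, x in counts.items()}


toy_counts = {"geneA": 50, "geneB": 150, "geneC": 800}
print(counts_per_million(toy_counts))
```

Unlike FPKM, CPM does not correct for feature length, so it is mainly suited to comparing the same feature across samples.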

20
Q

FDR =

A

False discovery rate (the expected proportion of false positives, i.e. Type I errors, among the results called significant)
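
One common way to control the FDR across many genes is the Benjamini-Hochberg procedure. The card does not name a method, so treat this as an assumed example rather than the lecture's approach; the p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted value falls below a chosen threshold (0.05 is a frequent choice) are then reported as significant.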

21
Q

FASTA files are

A

text files with sequences (amino acids or nucleotides)
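
In a FASTA file each record starts with a `>` header line, followed by one or more lines of sequence. A small sketch of a parser is shown here; the function name and the toy records are illustrative assumptions.

```python
def parse_fasta(text):
    """Return {header: sequence} for FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins; drop the leading '>'.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence may be wrapped over several lines.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


toy = ">seq1 example\nACGT\nTTGA\n>seq2\nMKV\n"
print(parse_fasta(toy))
```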

22
Q

FASTQ files are

A

text files containing header, sequence, and quality information
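
A FASTQ record typically spans four lines: an `@` header, the sequence, a `+` separator, and a quality string with one character per base. The sketch below decodes one record, assuming the common Phred+33 quality encoding (older data may use a different offset); the read shown is made up.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into (name, sequence, scores)."""
    header, sequence, plus, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Assumption: Phred+33 encoding, so '!' is quality 0 and 'I' is 40.
    scores = [ord(ch) - 33 for ch in quality]
    return header[1:], sequence, scores


print(parse_fastq_record(["@read1", "ACGT", "+", "II5!"]))
```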

23
Q

A SAM file is a

A

tab-delimited text file that contains sequence alignment information
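
Each alignment line in a SAM file has 11 mandatory tab-separated fields, followed by optional tags. A hedged sketch of reading one line is shown here; the dictionary layout is an illustrative choice, and the example read is made up. Header lines, which start with `@`, are not handled.

```python
SAM_FIELDS = ("qname", "flag", "rname", "pos", "mapq", "cigar",
              "rnext", "pnext", "tlen", "seq", "qual")


def parse_sam_line(line):
    """Split one SAM alignment line into its mandatory fields and tags."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) < len(SAM_FIELDS):
        raise ValueError("expected at least 11 tab-separated fields")
    record = dict(zip(SAM_FIELDS, parts[:11]))
    for key in ("flag", "pos", "mapq", "pnext", "tlen"):
        record[key] = int(record[key])
    record["tags"] = parts[11:]
    # Bit 0x10 of FLAG marks a read aligned to the reverse strand.
    record["is_reverse"] = bool(record["flag"] & 0x10)
    return record


toy_line = "read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0"
print(parse_sam_line(toy_line))
```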

24
Q

BAM files are

A

the compressed binary version of SAM files (they’re smaller and can be indexed for fast access)

25
Q

Compared to single-end RNA-seq, paired end gives

A

better alignment

26
Q

Paired end RNA-seq is essential for

A

splicing analyses and de novo assemblies

27
Q

Biological replicates are ______ while technical replicates are ______

A

necessary; not necessary

28
Q

Longer reads =

A

better alignments

29
Q

Implicit internal standards =

A

housekeeping genes

30
Q

Explicit external standards =

A

spike in RNA

31
Q

Technical replicates control for

A

variation in your procedure

32
Q

Biological replicates control for

A

variation such as growth or environmental effects

33
Q

Most gene expression experiments assume

A
  1. Most genes don’t change
  2. Only a few genes have significant changes in expression
34
Q

RNA and protein expression profiles _______ correlate well

A

do not always

35
Q

Sequence alignment is a way of

A

arranging sequences of DNA, RNA, or protein to identify regions of similarity

36
Q

Two types of sequence alignment

A
  1. local
  2. global
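
To illustrate the difference, the sketch below scores a global alignment (Needleman-Wunsch, end to end) and a local alignment (Smith-Waterman, best matching substrings). The algorithm names and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example, not taken from the cards.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch: best score aligning all of a against all of b."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman: best score over any pair of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at 0 so an alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(global_score("ACGT", "ACGT"), local_score("TTACGTTT", "GGACGGG"))
```

The local score rewards the shared `ACG` core and ignores the unrelated flanks, whereas a global alignment would be penalized for them.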
37
Q

NGS read alignment allows us to

A

determine where sequence fragments (reads) came from

38
Q

Differential expression analysis is

A

the assessment of differences in read counts of genes between two or more experimental conditions
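
As a simplified illustration, the size of a difference between two conditions is often summarized as a log2 fold change. The sketch below computes only that effect size from made-up replicate counts; a real analysis would also normalize the counts and fit a statistical model across replicates to judge significance. The function name and pseudocount are assumptions.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of mean counts (treated over control).

    The pseudocount (an assumed value of 1) avoids division by zero
    when a gene has no reads in one condition.
    """
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Made-up counts for one gene across three replicates per condition.
print(log2_fold_change([6, 7, 8], [30, 31, 32]))
```

A value of 2 means roughly four-fold higher expression in the treated condition, and a negative value means lower expression.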

39
Q

Gene Ontology (GO) Consortium seeks to

A

provide consistent descriptions of gene products across databases

40
Q

The GO comprises 3 structured ontologies that describe gene products in terms of their associated

A
  1. Biological processes
  2. Cellular components
  3. Molecular functions
41
Q

Most commonly used databases for data deposition

A

Gene Expression Omnibus (GEO)
Sequence Read Archive (SRA)
dbGaP