Lecture 5 - Computational analysis Flashcards by Grant Meyer

It is now simple to measure the expression levels of thousands of genes

simultaneously

Methods such as RNA-seq allow for measurement of

transcriptome-wide expression levels without a reference genome

RNA-seq is useful for

high-throughput sequencing of RNA

RNA-seq allows for quantification of

gene expression and differential expression analyses

RNA-seq allows for characterization of

alternative splicing

de novo means

from the beginning

de novo transcriptome assembly allows for

quantification and exploration of boutique organisms (no genome sequence necessary)

RNA-seq steps

Extraction of mRNA
PCR amplification
Sequencing (single or paired end)

Poly A selection is a method of

isolating poly(A)+ transcripts, usually using oligo-dT affinity

Ribodepletion depletes

ribosomal RNAs using sequence-specific biotin-labeled probes

Reads

the sequenced portion of cDNA fragments

Coverage

(read length x number of reads) / haploid genome length
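
As a worked example, coverage is conventionally computed as read length times number of reads, divided by haploid genome length. The sketch below assumes that formula; the function name and the example numbers are illustrative, not from the lecture.

```python
def coverage(read_length, num_reads, haploid_genome_length):
    """Average depth: (read length x number of reads) / haploid genome length."""
    return read_length * num_reads / haploid_genome_length

# Hypothetical run: 10 million reads of 100 bp over a 100 Mb haploid genome
print(coverage(100, 10_000_000, 100_000_000))  # 10.0
```

A result of 10.0 means each base is covered by about ten reads on average.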

Single-end

cDNA fragments are sequenced from only one end (1x100)

Paired-end

cDNA fragments are sequenced from both ends (2x100)

Strand-specific

You know whether the read originated from the + or - strand

Counts =

(Xi) the number of reads that align to a particular feature i (gene, isoform, miRNA, etc.)

Library size =

(N) number of reads sequenced

FPKM =

Fragments per kilobase of exon per million mapped reads

CPM =

Counts per million mapped reads
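
The count (Xi), library size (N), CPM, and FPKM definitions can be tied together in a short sketch. The function names and the example numbers are hypothetical, and the library size here is taken to be the number of mapped reads.

```python
def cpm(count, library_size):
    """Counts per million mapped reads for one feature."""
    return count * 1e6 / library_size

def fpkm(fragments, exon_length_bp, library_size):
    """Fragments per kilobase of exon per million mapped reads."""
    # 1e9 = 1e3 (bases per kilobase) x 1e6 (reads per million)
    return fragments * 1e9 / (exon_length_bp * library_size)

# Hypothetical gene: Xi = 500 reads, 2,000 bp of exon, N = 20 million mapped reads
print(cpm(500, 20_000_000))         # 25.0
print(fpkm(500, 2000, 20_000_000))  # 12.5
```

CPM corrects only for sequencing depth, while FPKM also corrects for feature length, so FPKM is the one to use when comparing genes of different lengths within a sample.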

FDR =

False discovery rate (the expected proportion of false positives, i.e. Type I errors, among all results called significant)
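
One common way to control the FDR is the Benjamini-Hochberg procedure. The lecture does not name a method, so its use here is an assumption, and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four genes
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(q, 3) for q in adjusted])
```

Genes whose adjusted value falls below a chosen threshold (0.05 is a frequent choice) are called significant at that FDR.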

FASTA files are

text files with sequences (amino acids or nucleotides)
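
To make the format concrete, here is a minimal FASTA reader. It assumes the usual layout of a ">" header line followed by one or more sequence lines; the function name and the example records are hypothetical.

```python
def parse_fasta(text):
    """Map each record name to its sequence, joining wrapped sequence lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word after ">" as the record name
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

example = ">seq1 example nucleotide record\nACGT\nACGT\n>seq2\nMKV\n"
print(parse_fasta(example))  # {'seq1': 'ACGTACGT', 'seq2': 'MKV'}
```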

FASTQ files are

text files containing header, sequence, and quality information
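
A FASTQ record normally spans four lines: an "@" header, the sequence, a "+" separator, and a quality string. The sketch below assumes that layout and Phred+33 quality encoding, which is typical of current Illumina output but is not stated in the lecture.

```python
def parse_fastq(text):
    """Yield (read_id, sequence, phred_scores) from 4-line FASTQ records."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, sequence, _separator, quality = lines[i:i + 4]
        read_id = header[1:]  # drop the leading "@"
        # Assumption: Phred+33 encoding, so quality = ASCII code - 33
        phred = [ord(ch) - 33 for ch in quality]
        yield read_id, sequence, phred

example = "@read1\nACGT\n+\nII5!\n"
for read_id, sequence, phred in parse_fastq(example):
    print(read_id, sequence, phred)  # read1 ACGT [40, 40, 20, 0]
```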

A SAM file is a

tab-delimited text file that contains sequence alignment information
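
Each SAM alignment line carries 11 mandatory tab-separated fields. A minimal sketch of reading one line follows; the field names come from the SAM specification, while the function name and the example read are hypothetical.

```python
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]

def parse_sam_line(line):
    """Parse the 11 mandatory SAM fields; return None for '@' header lines."""
    if line.startswith("@"):
        return None
    columns = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_FIELDS, columns[:11]))
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[key] = int(record[key])
    # Bit 0x10 of FLAG marks a read aligned to the reverse strand
    record["reverse_strand"] = bool(record["FLAG"] & 0x10)
    return record

example = "read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
record = parse_sam_line(example)
print(record["RNAME"], record["POS"], record["reverse_strand"])  # chr1 100 True
```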

BAM files are

the compressed binary version of SAM files (smaller, and indexable for fast lookup)

Compared to single-end RNA-seq, paired end gives

better alignment

Paired end RNA-seq is essential for

splicing analyses and de novo assemblies

Biological replicates are ______ while technical replicates are ______

necessary; not necessary

Longer reads =

better alignments

Implicit internal standards =

housekeeping genes

Explicit external standards =

spike in RNA

Technical replicates control for

variation in your procedure

Biological replicates control for

variation such as growth or environmental effects

Most gene expression experiments assume

1. Most genes don't change 2. Only a few genes have significant changes in expression

RNA and protein expression profiles _______ correlate well

do not always

Sequence alignment is a way of

arranging sequences of DNA, RNA, or protein to identify regions of similarity

Two types of sequence alignment

1. local 2. global
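
The difference between the two types can be sketched with dynamic programming. The lecture does not name algorithms, so the use of Needleman-Wunsch (global) and Smith-Waterman (local) scoring is an assumption, as are the scoring values and the example sequences.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Best alignment score: Needleman-Wunsch (global) or Smith-Waterman (local)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps; local alignment does not
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]

# Two sequences that share only the core "ACGT"
print(alignment_score("TTACGTTT", "GGACGTGG", local=True))   # 4
print(alignment_score("TTACGTTT", "GGACGTGG", local=False))  # 0
```

The local score rewards only the shared core, while the global score is pulled down by the mismatched flanks that it is forced to align end to end.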

NGS read alignment allows us to

determine where sequence fragments (reads) came from

Differential expression analysis is

the assessment of differences in read counts of genes between two or more experimental conditions
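
As a rough illustration, the simplest comparison between two conditions is a log2 fold change on depth-normalized counts. This sketch covers only that step; a real analysis would also use the replicates to test for significance. The function name, the pseudocount, and the numbers are assumptions.

```python
import math

def log2_fold_change(count_a, library_a, count_b, library_b, pseudocount=1.0):
    """log2 ratio of counts per million, condition B over condition A."""
    cpm_a = count_a * 1e6 / library_a
    cpm_b = count_b * 1e6 / library_b
    # The pseudocount avoids division by zero for unexpressed genes
    return math.log2((cpm_b + pseudocount) / (cpm_a + pseudocount))

# Hypothetical gene: 100 reads of 10 million in A, 400 reads of 20 million in B
print(round(log2_fold_change(100, 10_000_000, 400, 20_000_000), 3))
```

Normalizing by library size first matters here: the raw count rose fourfold, but the CPM only doubled because condition B was sequenced twice as deeply.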

Gene Ontology (GO) Consortium seeks to

provide consistent descriptions of gene products across databases

The GO comprises 3 structured ontologies that describe gene products in terms of their associated

1. Biological processes 2. Cellular components 3. Molecular functions

Most commonly used databases for data deposition

1. Gene Expression Omnibus (GEO) 2. Sequence Read Archive (SRA) 3. dbGaP

(41 cards)