HC 4.2 Omics and Gene Expression: Transcript Level Analysis Flashcards
Lecture 4
Transcription involves which omes?
genome and transcriptome
Types of genes
-Protein-coding
-Non-coding
A principal step of RNA-seq is selecting RNA molecules. Which selections are possible?
-Size selection
-Type selection with ribodepletion or poly(A) selection
The details of the RNA-seq analysis depend on …
the experimental context and RNA molecules measured
Data analysis workflow for gene expression
-Selection
-Fragmentation and reverse transcription
-Sequencing and mapping
-Quantification
Goals of mRNA-seq
Capturing transcript complexity by analysing the isoforms (transcripts) of genes, or working with a non-model organism that has a poorly characterized genome
4 options for mRNA-seq analysis via a transcriptome
- De novo assembly of transcriptome
- Well characterized genome: reference-based transcriptome assembly
- Combined reference based and de novo assembly
- Model organism: download the transcriptome from Ensembl or NCBI and use it for mapping
Principle of assembly
Reconstructing long sequences from overlapping sequence fragments.
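The principle can be illustrated with a minimal Python sketch that merges fragments wherever the end of one matches the start of the next. The fragment sequences, the function names `overlap` and `merge`, and the `min_len` threshold are all made up for illustration; real assemblers do not compare reads pairwise like this.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of a that is a prefix of b.

    Overlaps shorter than min_len are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a: str, b: str, min_len: int = 3) -> str:
    """Append b to a, collapsing their overlap; raise if they do not overlap."""
    n = overlap(a, b, min_len)
    if n == 0:
        raise ValueError("fragments do not overlap")
    return a + b[n:]


# Hypothetical fragments, already in the right order for this toy example.
fragments = ["ATGGCGT", "GCGTACG", "TACGTTA"]

contig = fragments[0]
for fragment in fragments[1:]:
    contig = merge(contig, fragment)

print(contig)  # ATGGCGTACGTTA
```

Each merge step extends the sequence by the non-overlapping part of the next fragment, which is how short fragments add up to one longer sequence.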
Big challenge with de novo assembly
Finding the overlaps among the millions of reads generated
How are De Bruijn graphs made?
-Split the sequence reads into k-mers (nucleotide subsequences of length k taken from the reads)
-Connect the k-mers based on their overlaps > directed graph with arrows (de Bruijn graph)
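The two steps above can be sketched in Python as follows. This sketch uses one common formulation in which every k-mer becomes an arrow from its first k-1 bases to its last k-1 bases; the reads, the choice of k = 4, and the helper names `kmers`, `de_bruijn` and `walk` are assumptions made for illustration only.

```python
from collections import defaultdict


def kmers(read: str, k: int) -> list[str]:
    """Return all substrings of length k in the read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn(reads: list[str], k: int) -> dict[str, set[str]]:
    """Build a graph in which each k-mer is an edge: prefix (k-1) -> suffix (k-1)."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk(graph: dict[str, set[str]], start: str) -> str:
    """Follow the arrows from start for as long as there is exactly one way forward."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (node,) = graph[node]
        if node in seen:  # stop on a cycle instead of looping forever
            break
        seen.add(node)
        contig += node[-1]
    return contig


# Hypothetical overlapping reads.
reads = ["ATGGC", "GGCGT", "CGTAC"]
graph = de_bruijn(reads, k=4)

print(walk(graph, "ATG"))  # ATGGCGTAC
```

Because reads are only cut into k-mers and looked up by prefix, no read has to be compared with every other read, which is what makes this approach workable for millions of reads.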
Problem with de novo assembly and isoforms
When several k-mers overlap enough to connect to the previous one, the graph branches. Multiple candidate isoforms are then assembled, and the actual transcriptome is not reconstructed completely.
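A small Python sketch can make the branching visible. Two invented isoforms share the same first exon and differ afterwards; for simplicity the full isoform sequences are used directly instead of short reads. The sequences, k = 4, and the helper name `de_bruijn` are illustrative assumptions.

```python
from collections import defaultdict


def de_bruijn(seqs: list[str], k: int) -> dict[str, set[str]]:
    """Build a graph in which each k-mer is an edge: prefix (k-1) -> suffix (k-1)."""
    graph = defaultdict(set)
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Hypothetical isoforms: same first exon, different second exon.
shared_exon = "ATGGCA"
isoform_a = shared_exon + "TTCG"
isoform_b = shared_exon + "CCGA"

graph = de_bruijn([isoform_a, isoform_b], k=4)

# Nodes with more than one outgoing arrow are the points where the path is ambiguous.
branches = {node: sorted(successors)
            for node, successors in graph.items()
            if len(successors) > 1}

print(branches)  # {'GCA': ['CAC', 'CAT']}
```

At the branch node the graph alone does not say which continuation belongs to which transcript, so the assembler has to report several candidate paths.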
What are long assembled sequences called?
Contigs
Purpose of a de Bruijn graph
Method to construct long sequences from short sequences
What is the result of an assembly?
A contig
Question raised by the complexity of isoforms in gene expression quantitation
Is it an isoform or an assembly (next different piece