Lecture 10 (Linneweber) Flashcards
RNA Content in eukaryotic cells & Challenges of RNA composition in a cell?
RNA Content in eukaryotic cells
- Cells contain 10–30 picograms of RNA, including 300,000–400,000 mRNA transcripts.
- There are 10,000–20,000 mRNA types, varying by cell type.
- RNA is more abundant than DNA (6.6 picograms) in cells.
- About 4% of total cellular RNA is mRNA.
- By mass, rRNA is the dominant RNA species.
- tRNAs are abundant in number but contribute comparatively little mass.
- Extracting mRNA is challenging due to its low abundance, as rRNA dominates the RNA mass (relevant for RNA-seq).
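To make the abundance problem concrete, the figures above can be combined in a short calculation. This is only a back-of-the-envelope sketch: the function name is hypothetical, and it assumes the 4% refers to a mass fraction.

```python
def mrna_mass_pg(total_rna_pg, mrna_fraction=0.04):
    """Return the mRNA mass (pg) contained in a given total RNA mass (pg).

    Assumption: the 4% from the card above is treated as a mass fraction.
    """
    return total_rna_pg * mrna_fraction


# Total RNA per cell ranges from 10 to 30 pg according to the card.
low = mrna_mass_pg(10)
high = mrna_mass_pg(30)
print(f"mRNA per cell: {low:.1f}-{high:.1f} pg")
```

Under these assumptions a cell holds roughly one picogram of mRNA or less, which is why enrichment or rRNA depletion is needed before sequencing.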
mRNA Processing
1: Capping: Definition & Function
mRNA Capping
- Addition of a 7-methylguanosine cap to the 5’ end of mRNA.
- Connected via a 5’-to-5’ triphosphate bond.
- The riboses of the nucleotides next to the cap are methylated at the 2’-OH position.
Function
- Protects mRNA from degradation by exonucleases.
- Facilitates recognition (mark) by cap-binding proteins for processing and translation.
- Efficient export out of the nucleus
- Increases the efficiency of translation
mRNA Processing
2A: Polyadenylation: Definition & Function
Polyadenylation
- Addition of a poly(A)-tail (a chain of adenine nucleotides) to the 3’ end of mRNA.
- Occurs after transcription, catalyzed by poly(A) polymerase at highly conserved recognition motifs in the mRNA, consisting of a signal (AAUAAA), a spacer, and a cleavage site (CA).
- The choice of polyadenylation site mostly affects the length of the 3’-UTR.
Functions
- Protects mRNA from degradation by exonucleases.
- Enhances stability and translation efficiency (20-fold) of mRNA.
- Plays a role in mRNA transport from the nucleus to the cytoplasm.
mRNA Processing
2B: Polyadenylation: Mechanism
Polyadenylation
1. The C-terminal domain (CTD) of RNA Polymerase II recruits endonuclease and polyadenylation factors to the polyadenylation site after producing an RNA containing the required sequence elements (e.g., AAUAAA)
2. The sequence AAUAAA near the 3’ end of the pre-mRNA is recognized, and the RNA is cleaved at the CA position, by cleavage and polyadenylation specificity factor (CPSF) and cleavage stimulation factor (CstF).
3. Poly(A) polymerase (PAP) adds 50–250 adenine nucleotides to the cleaved 3’ end. (Adenine is used because ATP is far more abundant than UTP, CTP…)
4. Binding of poly(A)-binding proteins (PABP II) protects the tail against degradation, serves as a recognition factor, and enhances translation.
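The recognition step can be illustrated with a small search routine. It is a sketch, not a model of CPSF/CstF: the function name is hypothetical, the 10–30 nt spacer window is an assumption (the card only says "spacer"), and the example sequence is made up.

```python
def find_polya_sites(rna, signal="AAUAAA", min_spacer=10, max_spacer=30):
    """Return (signal_start, cleavage_index) pairs, both 0-based.

    cleavage_index is the position of the A in the first 'CA' found inside
    the spacer window downstream of the signal; cleavage would occur right
    after that A. The window size is an assumption for illustration.
    """
    sites = []
    start = rna.find(signal)
    while start != -1:
        window_start = start + len(signal) + min_spacer
        window_end = start + len(signal) + max_spacer
        ca = rna.find("CA", window_start, window_end)
        if ca != -1:
            sites.append((start, ca + 1))
        start = rna.find(signal, start + 1)
    return sites


# Made-up pre-mRNA 3' region: signal, 15-nt spacer, CA site, downstream GU-rich stretch.
pre_mrna = "GGG" + "AAUAAA" + "U" * 15 + "CA" + "GUGUGU"
sites = find_polya_sites(pre_mrna)
print(sites)

# Cleave after the A of the CA site and add a tail (length within the 50-250 range).
signal_start, cut = sites[0]
polyadenylated = pre_mrna[: cut + 1] + "A" * 200
```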
mRNA Processing
2C: Alternative Polyadenylation
Alternative Polyadenylation (APA)
- Serves as a regulatory mechanism in gene expression.
- Polyadenylation can occur at different sites: downstream, upstream, or intermediate sites on the same pre-mRNA.
Frequency and Site of polyadenylation depends on:
- Signal sequence quality: Stronger sequences are more likely to be recognized by Poly A factors.
- Regulatory factors: Proteins can enhance or block specific polyadenylation sites, creating competition among sites.
How do viruses use CAP and polyA tails?
Polio
How do viruses use CAP and polyA tails?
- Cap: Polioviruses bypass the need for a 5’ cap by using an internal ribosome entry site (IRES) to initiate translation directly.
- Poly(A) tail: The virus’s RNA contains a poly(A) tail, mimicking host mRNA to enhance stability and translation efficiency.
- Protease expression: The virus produces proteases that specifically cleave PABP (poly(A)-binding protein) and the cap-dependent initiation factor eIF4G, disrupting host mRNA translation while prioritizing viral RNA.
Regulation of gene activity (eukaryotes)
Possible Regulations
Regulation of gene activity
- Short-term (reaction to changed conditions)
- Long-term (cell differentiation)
- Positive (activating)
- Negative (repressive)
- at various substeps of gene expression (not only transcriptionally)
- The output (amount of product) is set by changing the rates of individual steps (e.g., splicing, degradation, translation); some steps may be rate-limiting.
RNA Splicing
Definition
RNA Splicing Definition
RNA splicing is the mechanism by which two segments of RNA,
either intermolecular (trans) or intramolecular (cis), are joined,
which requires the removal of the intervening sequence.
Nobel Prize Experiment for the Discovery of Introns
Experiment for the Discovery of Introns
- DNA viruses that produce RNAs with introns were used.
- After incubation, spliced RNA was produced.
- Reverse transcription generated complementary DNA (cDNA).
- Electron Microscopy (EM) visualized hybridized pre-mRNA and cDNA.
- Looped-out regions observed, indicating introns (non-coding sequences removed during splicing).
Classes of Introns
Classes of Introns
- Group I & Group II (self-splicing)
- Spliceosomal Introns (Protein-dependent, but also ribozymes, derived from Group II)
Group II Introns
Structure
Group II Intron Structure
- Found in Organelles (Mitochondria or Chloroplast) & Bacteria
- Form a common secondary structure (six-domain stem-loop, with Domain 6 including the branch point adenosine for splicing)
- The structure is necessary to bring the reaction sites close together.
- The positioning of the two metal ions relative to each other in the catalytic center is essential for stabilizing the intermediate lariat state.
- Self splicing only in vitro; in vivo helping proteins (maturases) that allow correct folding are required for splicing
Group II Introns
Splicing Mechanism
Splicing Mechanism
1. First Transesterification: The 2’-OH of Adenosine in Domain 6 (Intron) attacks the 5’ splice site, forming a lariat structure with the intron.
2. Lariat Intermediate: The intron forms a loop (lariat) structure through a new 2’-5’ phosphodiester bond, leaving a free 3’-OH on Exon 1.
3. Second Transesterification: The free 3’-OH of Exon 1 attacks the 3’ splice site, joining Exons 1 and 2 together.
4. The exons are ligated to form the mature RNA, and the intron is released as a lariat structure.
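The two transesterification steps can be mimicked with plain string operations. This is a toy sketch: the function name, the index convention, and the sequences are illustrative assumptions, and no chemistry or structure is modeled.

```python
def splice(pre_rna, five_ss, branch, three_ss):
    """Toy model of the two transesterifications.

    five_ss  : index of the first intron nucleotide
    branch   : index of the branch-point A inside the intron
    three_ss : index of the first nucleotide of exon 2
    """
    if not five_ss < branch < three_ss:
        raise ValueError("branch point must lie inside the intron")
    if pre_rna[branch] != "A":
        raise ValueError("branch point must be an adenosine")

    exon1 = pre_rna[:five_ss]
    exon2 = pre_rna[three_ss:]

    # Step 1: the branch A attacks the 5' splice site. The intron 5' end is
    # now joined 2'-5' to the A (loop); exon 1 keeps a free 3'-OH.
    lariat = {
        "loop": pre_rna[five_ss : branch + 1],
        "tail": pre_rna[branch + 1 : three_ss],
    }

    # Step 2: the 3'-OH of exon 1 attacks the 3' splice site; exons are joined.
    mature = exon1 + exon2
    return mature, lariat


# Made-up sequences for illustration only.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACUUUAG", "GGAUAA"
five_ss = len(exon1)
branch = five_ss + intron.index("CUAAC") + 3  # second A of CUAAC as branch point
three_ss = five_ss + len(intron)

mature, lariat = splice(exon1 + intron + exon2, five_ss, branch, three_ss)
print(mature)
print(lariat)
```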
RNA-Chaperones
Function
RNA Chaperones
- Most RNAs can adopt multiple alternative folding states depending on temperature (e.g., Watson-Crick base pairs).
- While RNA typically self-folds into a structure, incorrect folding can result in local energy minima, creating “energy traps.”
- The most stable energy minimum is usually the correct folding state
- Note: These local energy minima are competing with each other
RNA chaperones assist to reach the correct folding state by either:
- Type 1: Actively folding and stabilizing the correct structures
- Type 2: Resolving misfolded RNAs by unfolding incorrect structures for refolding.
Type 1 RNA Chaperones: Trapping the right structure
Mechanism
Mechanism
- The Intron rapidly folds into various intermediate energy states.
- Chaperones interact with the RNA, trapping it as soon as it reaches specific “correct” energy minima.
Type 2 RNA Chaperones: dissolving misfolded RNAs
Mechanism
Mechanism
- CYT-19 has low affinity for native RNA structures but high affinity for misfolded structures
- It utilizes ATP hydrolysis (helicase activity) to unfold the misfolded RNA (breaks base-pairs)
- This process allows the RNA to refold, providing another opportunity to achieve the correct structure
Spliceosomal Introns
Exon Recognition
Spliceosomal Introns
- Average size of a human exon is 150 nucleotides
- Introns average 3,500 nucleotides (up to 500,000 nucleotides)
How are intron–exon boundaries accurately defined?
- Recognition of intronic sequences (define Intron)
- Recognition of exonic splicing enhancer
- Coupling of transcription and splicing to avoid exon skipping (avoids missing an exon)
Spliceosomal Introns
1: Recognition of intronic sequences
Recognition of intronic sequences
- 3 important sequence elements: 5’, 3’, Branch site
- Conserved GU at 5’ Splice Site (GT in DNA)
- Conserved AG at 3’ Splice Site
- U & C stretch upstream of 3’ Splice Site
- Further upstream: bulged A (branch site)
- 15% of all inherited diseases are caused by point mutations that disrupt splice sites.
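The sequence elements listed above can be checked on a candidate intron with a few string tests. This is a simplified sketch under stated assumptions: the function name, the minimum tract length of 6, and the 40-nt search window are arbitrary illustrative choices, and real splice-site prediction uses position weight matrices rather than exact matches.

```python
import re


def check_intron(intron, min_tract=6, window=40):
    """Report which conserved elements a candidate intron (RNA) contains."""
    has_5ss = intron.startswith("GU")
    has_3ss = intron.endswith("AG")

    # Region upstream of the 3' AG (excluding the 5' GU), limited to `window` nt.
    upstream = intron[2:-2][-window:]

    # Polypyrimidine tract: a run of at least `min_tract` U/C residues.
    tract = re.search(r"[CU]{%d,}" % min_tract, upstream)
    has_tract = tract is not None

    # Branch site: at least one A upstream of the tract.
    has_branch_a = has_tract and "A" in upstream[: tract.start()]

    return {
        "5ss_GU": has_5ss,
        "3ss_AG": has_3ss,
        "pyrimidine_tract": has_tract,
        "branch_A": has_branch_a,
    }


wild_type = "GUAAGUCUAACUUUUCUCUCAG"  # made-up example intron
mutant = "GCAAGUCUAACUUUUCUCUCAG"     # point mutation U -> C at the 5' splice site

print(check_intron(wild_type))
print(check_intron(mutant))
```

A single point mutation in the GU dinucleotide is enough to make the first check fail, which illustrates why splice-site mutations are a frequent cause of inherited disease.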
Spliceosomal Introns
2: Spliceosome Complex Structure
Spliceosome
- Large assembly of 5 snRNAs (small nuclear RNAs) and over 150 different proteins. The snRNAs with their associated proteins form snRNPs (small nuclear ribonucleoprotein particles: U1, U2, U4, U5, U6), which are the key trans-acting factors, alongside splicing factors and other proteins.
- Catalytic center is made of RNA (U6 coordinates the metal ions –> identical to Group II introns)
- Total size is 50-60 S
Evolution of the Spliceosome
Evolution of the Spliceosome
- Catalytic center is made of RNA (U6 coordinates the metal ions –> identical to Group II introns)
- The domains of Group II introns became independent as U snRNAs and are now capable of assembling on introns and helping to splice –> provides machinery in trans (for other RNAs)
- This was shown by deleting U6 and adding Domain 5 (from a Group II intron) –> Complementation –> Functional Conservation
Spliceosomal Introns
2: Spliceosome Mechanism
Mechanism
- Different snRNPs interact (U1&U2) to enable the branch point’s 2’-OH to attack the 5’ splice site.
- U1 snRNP recognizes & hybridizes with the 5’ splice site, while U2 snRNP hybridizes with the branch point A.
- Stronger hybridization interactions improve recognition accuracy and splicing efficiency.
- CTD of RNA Pol II helps identify the exons because they emerge in an order (5’ splice site, Branch point, 3’ splice site) –> Exon recognition is supported by Co-transcriptional splicing
- The splicing cycle is driven by dynamic RNA-RNA interactions, which are very energy-demanding to break up after usage (through RNA helicases)
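The idea that stronger hybridization improves recognition can be sketched by counting Watson-Crick pairs between a 5’ splice site and the pairing region of U1. Treat the guide sequence below as an assumption for illustration (it is the exact reverse complement of the consensus site CAG|GUAAGU); the function name is hypothetical, and wobble pairs and modified bases are ignored.

```python
# Assumption: 5'->3' sequence of the U1 snRNA region that pairs with the 5' splice site.
U1_GUIDE = "ACUUACCUG"

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def u1_pairing_score(site, guide=U1_GUIDE):
    """Count Watson-Crick pairs between a 5' splice site and the guide.

    Both sequences are written 5'->3'; pairing is antiparallel, so the
    guide is read in reverse.
    """
    if len(site) != len(guide):
        raise ValueError("site and guide must have the same length")
    return sum((s, g) in WATSON_CRICK for s, g in zip(site, reversed(guide)))


consensus = "CAGGUAAGU"  # last 3 exon nt + first 6 intron nt
mutant = "CAGGCAAGU"     # U -> C in the conserved GU

print(u1_pairing_score(consensus))
print(u1_pairing_score(mutant))
```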
RNA Seq (Sequencing Transcripts from Tissues or Single Cells)
Key Concept, Purpose & Core Metrics
RNA-Seq
Key Concepts
- Idea is to sequence all (or as many as possible) of the transcripts within a tissue or a single cell
- A DNA intermediate (cDNA) is usually used
- Single-End Sequencing: DNA fragments sequenced from one end.
- Paired-End Sequencing: DNA fragments sequenced from both ends; used for whole-genome sequencing.
- Fragment Coverage: Typically, only fragment ends are sequenced.
- Usually mRNA gets sequenced
Purposes
- Whole genome sequencing
- Studying changes in gene expression
- Metagenomic studies from environmental samples (e.g. microorganisms)
Core Metrics
- Mapping: Positioning of the individual reads relative to the reference sequence –> can derive the reads per base
- Sequence Depth: Number and distribution of reads per base
- Coverage: How well the reference sequence is covered (average reads per base)
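The core metrics can be computed from mapped read positions with a few lines. This sketch assumes reads are already mapped and given as 0-based, half-open (start, end) intervals on the reference; the function name and the toy data are illustrative.

```python
def per_base_depth(reads, ref_len):
    """Return a list with the number of reads covering each reference base.

    reads: iterable of (start, end) intervals, 0-based and half-open.
    """
    depth = [0] * ref_len
    for start, end in reads:
        for pos in range(max(start, 0), min(end, ref_len)):
            depth[pos] += 1
    return depth


reads = [(0, 5), (3, 8), (6, 10)]  # toy mapping result
depth = per_base_depth(reads, ref_len=12)

mean_depth = sum(depth) / len(depth)                    # average reads per base
breadth = sum(1 for d in depth if d > 0) / len(depth)   # fraction of bases covered

print(depth)
print(f"mean depth: {mean_depth:.2f}, covered fraction: {breadth:.2f}")
```

Reporting both numbers is useful because a high average depth can hide regions that no read covers at all.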
RNA Seq (Sequencing Transcripts from Tissues or Single Cells)
mRNA Enrichment Methods, Workflow & Variants
mRNA Enrichment Methods
- Poly-A Enrichment: Enrich mRNA using poly-T oligonucleotide beads that pair with the poly-A tails
- rRNA Depletion: Remove rRNA or tRNA by degradation or size exclusion
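Poly-A enrichment can be thought of as a filter on the 3’ end of each RNA. The sketch below is an in-silico analogy only: the function name, the minimum tail length of 20, and the sample sequences are assumptions for illustration.

```python
def polya_enrich(transcripts, min_tail=20):
    """Keep only RNAs whose 3' end carries at least `min_tail` adenines.

    transcripts: dict mapping a name to an RNA sequence (5'->3').
    """
    tail = "A" * min_tail
    return {name: seq for name, seq in transcripts.items() if seq.endswith(tail)}


sample = {
    "mRNA_1": "AUGGCC" + "A" * 50,
    "rRNA_like": "GGCUACCUGG",      # no poly(A) tail, so it is not captured
    "mRNA_2": "AUGUUU" + "A" * 25,
}

print(sorted(polya_enrich(sample)))
```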
RNA-Seq Workflow
1. RNA gets fragmented chemically (around 150 nt) and primed with random oligos.
2. RNA gets reverse transcribed into cDNA.
3. Defined sequences (Adapters) get attached to cDNA.
4. Hybridize primers to the Adapters
5. Bridge PCR amplifies library.
6. Selection of fragment sizes (optional).
7. Short reads sequenced (e.g., Illumina).
8. Bioinformatics Analysis: Align short reads to exon sequences for analysis.
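The first two workflow steps (fragmentation and reverse transcription) can be sketched on sequence strings. This is a simplification: real chemical fragmentation breaks the RNA at random positions rather than in fixed windows, random priming is not modeled, and both function names are hypothetical.

```python
# Map each RNA base to its complementary DNA base.
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def fragment(rna, size=150):
    """Cut an RNA into consecutive pieces of at most `size` nucleotides."""
    return [rna[i : i + size] for i in range(0, len(rna), size)]


def reverse_transcribe(rna):
    """Return the first-strand cDNA (5'->3') complementary to an RNA (5'->3')."""
    return rna.translate(RNA_TO_CDNA)[::-1]


transcript = "AUGC" * 100  # made-up 400-nt transcript
pieces = fragment(transcript)
cdna = [reverse_transcribe(p) for p in pieces]

print([len(p) for p in pieces])
print(reverse_transcribe("AUGC"))
```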
Variants
- Long-Read Sequencing: Reverse-transcribe full-length mRNAs directly for sequencing without PCR amplification (not all DNA fragments amplify with equal efficiency)
- Direct RNA Sequencing: Sequence RNA directly using RNA adapters (e.g., Nanopore sequencing).
Illumina - Solexa Sequencing
Overview, Advantages & Limitations
Illumina Sequencing
Overview:
- High-throughput sequencing technology for DNA and RNA analysis.
- Generates short reads with high accuracy.
Advantages
- High accuracy and scalability
- Suitable for genome, exome, and transcriptome sequencing
Limitations:
- Limited to short reads (~150–300 bp).
- Challenging for repetitive regions or complex structural variations.
Illumina - Solexa Sequencing
Workflow
Illumina Sequencing Workflow
1) Sample Preparation
- DNA is fragmented and two different types of adapters with known sequences are ligated.
- DNA is denatured and attached to a flow cell (coated with primers that fit adapters)
2) Cluster Generation (Bridge PCR)
- Add nucleotides and allow hybridization of DNA Fragments to the primers (with blockage tag to prevent further elongation) –> forms Bridge structures
- Removal of all the sense or of all reverse strands (need identical molecules)
- Amplified through bridge PCR, forming dense clusters of identical sequence
3) Sequencing by Synthesis
- Incorporate fluorescently labeled nucleotides.
- Fluorescence detected after each nucleotide addition.
- Laser excites the label, and a camera records the signal of each cluster.
4) Data Analysis
- Reads are aligned to a reference genome.
- Variants, coverage, and expression levels analyzed.
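Step 3 (sequencing by synthesis) can be illustrated as a loop with one nucleotide incorporated per cycle. This sketch is idealized: it assumes error-free incorporation and detection, a single template standing in for a whole cluster, and a hypothetical function name.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template_3to5, cycles):
    """Return the read (5'->3') produced by copying a template given 3'->5'.

    Each cycle incorporates one labeled nucleotide complementary to the
    next template base; its signal is recorded as the base call.
    """
    read = []
    for base in template_3to5[:cycles]:
        incorporated = COMPLEMENT[base]  # labeled nucleotide added this cycle
        read.append(incorporated)        # camera records the cluster's signal
    return "".join(read)


print(sequence_by_synthesis("TACGGA", cycles=4))
```

The number of cycles sets the read length, which is one reason the method is limited to short reads.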