Gene Structure Flashcards
What is alternative splicing?
Alternative splicing generates different proteins from one gene
Genes contain multiple exons
Changing which exons are found in the final RNA changes the function of the protein
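A minimal sketch of the idea, assuming a hypothetical gene with constitutive exons E1 and E4 and two cassette exons E2 and E3 that can each be included or skipped (exon names and sequences are invented for illustration):

```python
from itertools import product

# Hypothetical gene: exon names and sequences are made up for illustration.
EXONS = {"E1": "AUGGCU", "E2": "GAAUUC", "E3": "CCGGGA", "E4": "UGGUAA"}
CASSETTE = ["E2", "E3"]  # exons that may be skipped; E1 and E4 are always kept


def isoforms(exons=EXONS, cassette=CASSETTE):
    """Return {tuple of included exon names: spliced mRNA sequence}."""
    result = {}
    for choice in product([True, False], repeat=len(cassette)):
        skipped = {name for name, keep in zip(cassette, choice) if not keep}
        included = tuple(name for name in exons if name not in skipped)
        result[included] = "".join(exons[name] for name in included)
    return result


for names, mrna in isoforms().items():
    print("-".join(names), mrna)
```

Two optional exons already give four different mRNAs from one gene.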
What is the mechanism of splicing?
The transesterification reactions themselves don’t require ATP (spliceosome assembly and rearrangements do)
Can splice over large distances (10s of kilobases)
Pre-mRNA contains introns and exons
2’ OH group of the branch point adenosine attacks the phosphodiester bond between the last base of exon 1 and the first base of the intron
Lariat structure formation
3’ OH of the last base of exon 1 attacks the phosphodiester bond between the lariat intron and exon 2
Formation of spliced mRNA and lariat structure
Look at image
What are the spliceosome and spliceosome co-factors responsible for?
Bringing splice sites into close proximity
Using the correct splice site
Exon skipping
Avoiding cryptic splice sites
What is a cryptic splice site?
Selection of alternative splice sites: tissue and developmental-stage specific
Splice site selection must be tightly regulated
Many genetic diseases can be caused by point mutations that activate cryptic splice sites or delete splice sites
Cryptic splice sites can be useful when pre-existing and damaging when introduced by a mutation
Changes frame of protein so rest of protein sequence is missense
Missing important part of coding sequence so protein loses its function
What is the structure of the spliceosome?
Made of 5 small nuclear RNAs (snRNAs)
snRNAs associate with proteins (at least 50) to form 4 small nuclear ribonucleoprotein particles (snRNPs)
snRNPs: U1, U2, U4, U5, U6 (4 and 6 combine)
Spliceosome is made of snRNPs + accessory proteins
These attach to pre-mRNA transcript and form complexes
Specific complexes are formed in a sequential manner
What are the steps of assembly and action of the spliceosome?
1) U1 and U2 assemble onto pre-mRNA in a co-transcriptional manner
U1 binds to splice donor site
U2 binds to the branch point near the 3’ end of the intron (U2AF marks the acceptor site)
2) The U1 and U2 snRNPs interact with each other to form the pre-spliceosome (complex A)
Brings ends of introns together
3) The preassembled tri-snRNP U4-U6-U5 is recruited to form complex B (pre-catalytic complex)
4) Complex B undergoes a series of rearrangements to form the catalytically active complex B*
U1 and U4 are eliminated from the complex –> now made of U5, U6, U2
5) Complex B* carries out first catalytic step of splicing, generating complex C which contains free exon 1 and the intron-exon2 intermediate
6) Complex C undergoes rearrangements and then carries out second catalytic step
Results in post-spliceosome complex that contains lariat intron and spliced exons
7) Release of spliced mRNA and lariat
RNA helicase unwinds RNA of lariat so RNA and U2, U5, U6 can be recycled
DSCAM gene as an example of transcript variation
Example: DSCAM (Down syndrome cell adhesion molecule) gene
Expressed in developing neurons
Protects neural projections in dendrites from forming connections with themselves
If protein on a dendrite is the same then it will avoid it
Some exons are always included and other exons are randomly selected
24 exons permits 38,016 protein variants
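The 38,016 figure is the product of the number of alternatives in each variable exon cluster. A quick check, assuming the commonly cited cluster sizes for Drosophila Dscam (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17; these numbers are not stated in the notes):

```python
from math import prod

# Assumed cluster sizes (commonly cited for Drosophila Dscam; not from these notes).
ALTERNATIVES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One alternative is chosen from each cluster, so the counts multiply.
variants = prod(ALTERNATIVES.values())
print(variants)  # 38016
```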
How is alternative splicing a method of regulation? Transcriptional control
Insertion/deletion of specific domains
Ex. It can regulate antibody and neuropeptide production
How is alternative splicing a method of regulation? mRNA production
Splicing can affect amount of mRNA created
A premature stop codon in an exon that is not the final exon
If this exon is included it leads to nonsense-mediated RNA degradation
An unspliced intron leaves the RNA unprocessed, so it is not transported into the cytoplasm; if it is transported, the protein is truncated by a premature stop codon in the intron
Exons are retained or skipped
Introns are excised or retained
5’ and 3’ splice site positions are moved (cryptic splice sites) to make exons longer or shorter
What are the effects of alternative splicing?
Rate of translation of mRNA
mRNA degradation and susceptibility
Insertion/deletion of amino acids
Insertion/deletion of functional domains
Polypeptide truncation due to premature stop codon
Protein properties and functions can change due to splicing:
Make a smaller/larger protein
Soluble or membrane bound
Subcellular location changes
Affinity changes for substrate
What are the 3 groups of alternatively spliced transcripts?
5’ transcript ends differ from one another
3’ ends differ from one another
Middle portions differ
5’ ends differ
5’ ends differ due to different transcription start sites
Example skipping the starting exon
Example in mouse alpha amylase gene:
In the liver transcription is initiated further down so the starting exons are skipped
Leads to salivary gland and liver having different affinities in tissues
3’ ends differ
5’ ends the same, 3’ ends differ
When different poly A sites are used for transcription termination
Example in immunoglobulin chains of antibodies:
Alternative 1: polyA site is after exon 4
Alternative 2: polyA site is after exon 6, so there are 2 extra exons which code for a transmembrane anchor
Centre differs
5’ ends the same, 3’ ends the same but middle differs
Can’t be explained by differential promoter use or cleavage
Example: troponin T gene in skeletal muscle
64 different ways found in different muscle types
Tissue specific splicing factors act on the pre-mRNA to decide which exons are included
Used to test people thought to have had heart attacks, as heart muscle expresses a cardiac-specific form of troponin T
Alternative splicing vs exon shuffling
Alternative splicing and exon shuffling are different mechanisms for creating diversity
Alternative splicing acts on RNA to create diversity during lifespan of an organism
Exon shuffling creates diversity on evolutionary scale over many generations
What is Exon shuffling? How does it occur?
Exon shuffling: exons jump around in the genome so exons from different genes are rearranged or combined to give new gene structure
Many eukaryotic proteins are mosaics of motifs
How it occurs:
Illegitimate non-homologous recombination during meiosis - slightly misaligned
LINEs (long interspersed nuclear elements): An exon near the LINE element can be transcribed and included in the RNA, then moved to a different location
DNA transposons (similar to LINEs): can collect exons and move them into genes
What are tissue specific splicing factors?
Tissue specific splicing factors: proteins that recognise cis-acting elements within the RNA transcript
Bind to pre-mRNA before it is spliced and decide which exons are included
Promote or inhibit splice sites in different cases
Factor binds 1st intron and promotes splicing of exon 1 to exon 2
Factor binds 1st intron and inhibits splicing of exon 1 to exon 2, allowing splicing of exons 1 and 3
Sex lethal (Sxl) gene
Autoregulates its own splicing
If sxl is present: inhibits inclusion of exon 3 which contains a stop codon = production of sxl protein
In males: No expression from early promoter so never get sxl protein
In females: Early promoter causes burst of sxl so stop codon is never included so get sxl protein
Sxl determines sex in somatic cells
Transformer (Tra) gene
Tra splicing is regulated by sxl
In males: no sxl –> tra with early stop codon –> short protein with no function
In females: sxl binds at proximal splice site in intron 1 –> prevents U2AF binding there –> U2AF binds cryptic splice site in exon 2 (distal splice site) instead –> allows splicing out of stop codon –> tra protein is expressed
Sxl promotes exclusion of an exon
Doublesex (Dsx) gene
Dsx is responsible for sex determination
Tra is a splicing factor and regulates dsx (doublesex) gene
Dsx encodes a transcriptional repressor that determines development
Tra promotes inclusion of an exon
Females: Tra is recruited to exon splicing enhancer (ESE) by tra 2 –> tra binds to exon 4 –> recruits U2AF –> inclusion of exon 4 –> exon 4 contains stop site –> transcription termination
Dsx mRNA ends with exon 4 –> shorter protein isoform with different properties
Males: exon 4 is not included
2 isoforms of dsx protein –> regulates genes in different ways to get more female or male characteristics
Fruitless (fru) gene
Fru is responsible for sex determination
Tra also regulates splicing of fru (fruitless) gene
Encodes a transcriptional regulator that determines development
In females: tra promotes splicing from the end of exon 2 –> inclusion of stop codon –> translation termination, so no functional protein (truncated protein) and no male isoform is produced
In males: tra is absent –> get splicing from start of exon 2 (before stop codon) –> excludes stop codon –> functional protein
Absence of Tra gives a male specific isoform of fruitless which is important for male specific behaviour
Explain the summary of the sex determination hierarchy in Drosophila
Splicing factors can act positively (ex. Tra) to promote the use of a splice site
Or act negatively (ex. Sxl) to inhibit the use of a splice site
In females: 2 copies of X –> Sxl expression –> regulates its own splicing –> regulates Tra –> functional tra –> splices dsx and fru differently to get female characteristics
Sxl also regulates MSL-2 which is important for dosage compensation
In males: no Sxl expression –> no Tra –> male versions of dsx and fru to give male characteristics
How does fruitless control mating behaviour?
Fruitless controls male mating behaviour
Male orientates itself at 45 degree to female, taps abdomen of female, sings, licks and attempts copulation
How is alternative splicing used for response to signals with SLO gene?
SLO gene encodes potassium channel which is important for action potential in neurons
STREX domain –> SLO reacts quicker to allow K+ to pass through channel
During action potential calcium is high –> binds to CaM kinase proteins –> regulates factor that binds to CAR region in pre-mRNA –> exclusion of STREX domain –> potassium channel is less sensitive and slows down action potential
What are the regulating factors?
ESE - exonic splicing enhancer
ISE - intronic splicing enhancer
ESS - exonic splicing silencer
ISS - intronic splicing silencer
SR proteins - Stimulate splicing
hnRNPs - hinder splicing, bind to exon silencing elements
What are SR proteins?
Splicing factors with serine/arginine-rich (SR) repeat regions
Bind 5’ splice site and promote binding of U1 snRNP to promote splicing
Can bind within exonic splice enhancers (ESEs) within downstream exon to promote U2AF binding and promote splicing
What happens if RNA polymerase is elongating at a slower or faster rate?
Splicing occurs as transcription is still going on
Rate of elongation can affect splicing pattern
Slow: more likely to include exons with weak acceptor sites
Fast: more likely to skip exons with weak acceptor site (as protein won’t bind tightly)
What controls regulation of splicing?
RNA sequences - elements within RNA that can recruit proteins
Constitutive or tissue specific trans-acting factors
(example Tra-2 is always present)
Splice site strength: ability to bind to factors (U1 / snRNP) and presence/absence of ESEs
Origins of introns early and late theory
Intron early theory:
Introns originated early, in the ancestors of prokaryotes, which later lost them during evolution to have more compact genomes
Introns were kept in more complex organisms
No evidence for anything that resembles spliceosomal introns in bacteria
Intron late theory:
Introns only evolved in eukaryotes
Bacteria and archaea never had spliceosomal introns or spliceosome machinery
Origins of introns theory with LECA
Last eukaryotic common ancestor (LECA)
A bacterium invaded an archaea-like cell, was co-opted to generate energy for the cell, and became the mitochondrion of eukaryotes
Bacteria contained retroelements (group II introns / self splicing RNAs) which invaded the archaeal genome
Happened after endosymbiosis
Created precursors of introns / origin of spliceosomal introns in pieces
How can introns be a burden to the host?
Introns form a large part of the genome, and the spliceosome complex needed to remove them is huge
So need to transcribe more RNA which requires energy and time (60 nt / s)
Vulnerability, as errors in splicing cause mutated proteins, ex. splicing needs accurate recognition of cis-regulatory sequences
Roles of introns
Sequence dependent functions ex. intron contains non coding RNA like microRNA
Length dependent functions
ex. Large introns take a long time to transcribe
Splicing dependent functions
ex. interaction between splicing machinery and RNA polymerase
Life phases of an intron
Genomic intron
Transcribed intron
Intron being spliced
Excised intron
EJC-harbouring transcript (marks where exons have been spliced together)
What is a genomic intron?
Still in the DNA
Location of gene’s cis-regulatory elements
Contain transcription initiation sites (modulate main promoter action)
Enhancers, silencers, TF binding sites
Often found in most 5’ introns
40% of TF binding sites are within introns
Alternative transcription initiation due to genomic intron - AFP
Alpha-fetoprotein (AFP)
plasma protein made in the liver and yolk sac in the foetus
regulates osmotic pressure
Tissue specific expression
Alternative transcription initiation due to genomic introns
Use of upstream TSS of exon 1 or TSS in the first intron
Alternative transcription termination due to genomic intron
Intron sequences regulate polyadenylation and cleavage
Different transcription termination depending on which polyA site is used (Intron needs to be harbouring the site)
Example Flt-1 gene
Soluble form is more abundant than the membrane bound form
Membrane bound form has a later polyA site so includes exon 14 (longer protein)
Soluble form has an earlier polyA site so excludes exon 14 (shorter protein)
Nested genes of genomic introns
Introns can encode nested genes
Same orientation or reverse strand
800 in Drosophila
May have their own promoter and different expression profile
Non-coding (ex. microRNA) and protein-coding genes
How does length of intron affect timing of when protein is made?
Length of intron affects the timing of when the protein is made
RNA polymerase II has an elongation rate of a few kb / min (about 60 nt / s, roughly 3.6 kb / min)
Intron transcription may take hours
Time delay between gene activation and translation of protein
Must splice and export from nucleus before translation
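A rough worked example of the delay, using the 60 nt / s elongation figure quoted elsewhere in these notes as the default rate. The function name and the intron lengths are hypothetical, chosen only for illustration:

```python
def transcription_minutes(length_nt, rate_nt_per_s=60):
    """Minutes needed for RNA polymerase II to transcribe length_nt nucleotides."""
    return length_nt / rate_nt_per_s / 60


# Hypothetical intron lengths chosen for illustration.
for length in (1_000, 100_000, 1_000_000):
    print(f"{length:>9} nt: {transcription_minutes(length):7.1f} min")
```

At this assumed rate a 1 Mb intron takes more than four hours to transcribe, which is the kind of delay the card refers to.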
HES7 gene transcribed introns
HES7 gene in mice (Mus musculus)
Transcription factor
Forms a negative feedback loop
Controls timing of somite segmentation during embryonic development
Somites form vertebrae, bones, cartilage
Made in sequence as organism is growing
Oscillations in HES7 expression
Peak in oscillation forms somites so somites are formed at regular intervals
Timing of expression and feedback loops in transcribed introns
Timing of oscillations affected by transcription
Longer mRNA = transcription and translation takes longer = longer oscillations
Negative feedback loop
Delays due to time for transcription, splicing, translation
HES7 is a transcription factor and represses its own transcription
As protein levels increase, transcription decreases
Once protein levels are low, transcription begins again
Leads to oscillations in protein production
Unstable protein required
Introns are important for producing the correct period of oscillation
mRNA stability, regulatory elements for feedback loops, length of mRNA
Knockout HES7 mouse embryo, then reintroduced gene but without the introns –> oscillations occurring very frequently
How are introns spliced? What do they affect?
Splicing occurs co-transcriptionally
Linked via RNAPII C-terminal domain
Splicing can affect initiation, elongation, termination
How do introns affect initiation of transcription?
U1 of spliceosome binds to 5’ end of intron in the pre-mRNA
U1 promotes binding of pre-initiation complex components TFIIH and TFIID
Intron at the start of the gene enhances transcription
How do introns affect elongation of transcription?
RNA polymerase doesn’t always elongate, sometimes falls off
Machinery to make sure RNA polymerase stays on
U1 promotes Tat-SF1, which binds to RNA polymerase machinery and enhances elongation
How do introns affect termination of transcription?
Endonucleolytic cleavage and polyA tail addition
If a potential termination (polyA) site is close, CPSF protein binds
U2 binds to 3’ end so if U2 is close to CPSF, CPSF is enhanced which increases the probability that transcription will end
U1 binds to 5’ end so if U1 is close, CPSF is inhibited which prevents termination
Prevents early termination in an intron ex. a cryptic termination site
What are excised introns?
When an intron is excised it forms a lariat structure and undergoes debranching and degradation
Embedded RNA genes may be expressed in a removed intron
Can contain non-protein coding RNAs (ncRNAs), such as microRNAs (miRNAs) and small nucleolar RNAs (snoRNAs)
Excised introns - mirtrons
Intron is excised (mirtron) by splicing to form pre-miR
Pre-miR is exported from the nucleus and cut up by Dicer to form microRNA duplex
Unwound and loaded onto RISC which regulates expression of RNA
Either leads to mRNA degradation or affects translation
Excised introns - snoRNAs
60-150 nucleotides long
Fundamental to RNA modifications in archaea and eukaryotes
Modify RNAs (tRNAs, rRNA, snRNA)
Ex. Methylated before they move into the nucleolus
Released after splicing
What are EJC harbouring transcripts?
Exon junction complex
Binds 25 nucleotides upstream of exon-exon junction (where the intron was) on mRNA transcript
Acts as a marker of where the intron was
4 core proteins (MAGO, Y14, eIF4AIII, MLN51)
Present from splicing until translation
Roles of EJC
Nuclear transport
Translation activation
mRNA localisation
Nonsense mediated decay (NMD)
EJC nuclear transport role
Mature mRNAs bind to mRNA specific transport factors
Shuttles through nuclear pore complexes
Transport rates are 10x higher for spliced transcripts with EJC which increases the expression of the transcript
Spliceosome or EJC recruits ALY/REF export factor which allows it to export transcript to the cytoplasm more efficiently
EJC translation activation role
Presence of EJC on the mature mRNA enhances translation
EJC core component MLN51 interacts with eIF3 which is important for translation initiation
EJC Cytoplasmic localisation role
Subcellular regions targeted within cytoplasm
Localisation permitted by shuttling proteins
Oskar mRNA needs EJC for localisation
important for anterior posterior polarization of embryo
EJC Nonsense mediated decay (NMD) role
Degrades mRNAs with a premature stop codon to prevent expression of truncated proteins
To prevent dominant negative/gain of function proteins
Normal stop codon: Ribosome kicks off EJC from mRNA
Premature stop codon:
splicing dependent, if EJC is more than 50nt downstream of a termination codon it is termed as premature
Ribosome pauses at stop codon
Proteins bound to ribosome and EJC can interact with each other
UPF1 and UPF2 interact causing phosphorylation which tags the RNA for degradation
Only occurs when stop codon is upstream of the EJC
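The distance rule above can be sketched as a small check. This is a simplified illustration, not a real predictor: the function name, the coordinate convention and the example positions are all hypothetical, and only the "more than 50 nt downstream" threshold comes from the notes:

```python
def triggers_nmd(stop_codon_pos, ejc_positions, min_distance=50):
    """True if any EJC sits more than min_distance nt downstream of the stop codon.

    Positions are 0-based nucleotide indices on the spliced mRNA (a hypothetical
    coordinate convention chosen for this sketch).
    """
    return any(ejc - stop_codon_pos > min_distance for ejc in ejc_positions)


ejcs = [300, 700]                 # hypothetical EJC positions
print(triggers_nmd(900, ejcs))    # stop codon after the last EJC -> False
print(triggers_nmd(400, ejcs))    # stop codon 300 nt before an EJC -> True
```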
What are overlapping genes?
Adjacent genes located on either DNA strand sharing one or more nucleotides in coding sequence
Found in mitochondria, microbes, eukaryotes
In humans 10% of genes are overlapping
Complete/internal/embedded/nested overlaps: Small gene within a larger gene
Partial/terminal overlaps: Involving only small 5’ or 3’ overlap of coding sequences
Same strand overlapping (unidirectional)
3’ end of one gene overlapping with 5’ end of another gene
Genes may be regulated by a common promoter (so expressed at same time)
Very common in bacteria
Different strand overlapping
Convergent: 3’ ends overlap
Divergent: 5’ ends overlap. Bidirectional promoters can be active to drive expression at the same time
Examples of types of overlapping genes
Genes sharing same locus on same strand but coding for different proteins (Genes have same TSS and same first exon but rest of exons and termination sites are different)
Genes sharing same promoter region (Are not physically overlapping)
Nested gene (Gene is overlapping or falling within a single intron)
Embedded gene (Exons of smaller overlapping gene are falling within introns but not same intron)
Genes on opposite strands with overlapping locus but no overlap within the exonic region
Tail to tail overlap in the exonic region (Example only 3’ UTRs are overlapping)
Head to tail overlap involving 5’-UTRs and coding sequences
Draw image in notes
What is gene phase?
Have to consider the phase that the overlapping genes are in relative to one another
Overlaps can be in the same reading frame or shifted 1-2 base pairs
One gene is taken as the reference gene and the other is compared base by base against it
If overlapping genes result in same reading frames = in phase
What are in phase overlaps?
In phase overlaps common in bacteria and viruses
Result in the same reading frame
2 categories: involving different initiation and different termination of translation
In phase overlaps initiation
Alternative translation start site
New internal promoter formation (different transcription starts)
Genes share terminator
Different N-terminal domains
Same C-terminal domains
Ex. Bind same substrate with C-terminal but catalyse different reactions using N-terminal domains
In phase overlaps termination
Same initiator codon
Termination occurs at distinct codons
Ex CS3 genes in E. coli form 5 polypeptides that form hair like appendages in bacteria
Example of in phase initiation Thermus flavus
Thermus flavus aspartokinase has different translation start sites
askA: alpha subunit to give 405aa protein
askB: starts at 3’ end of askA, beta subunit to give 161aa protein
Dependent on Shine-Dalgarno sequence to help initiate protein synthesis
Out of phase overlaps in bacteria
Genes overlapping in ways that don’t result in identical reading frames
Can be on same strand or different strands
Same strand:
If base/open reading frame shifted one along = phase 1
If base/open reading frame shifted two along = phase 2
If both are in phase 0 then they are considered in phase
Short overlaps are often in phase 2, large overlaps in phase 1
Due to genetic code probabilities - greater probability of stop codons so harder to get larger overlaps in phase 2
Different strand:
Phase 0 is the backwards reading frame (opposite strand)
No bias between phase 1 and phase 2 overlaps
Genes overlapping in phase 2 will produce the same amino acid sequence
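For the same-strand case, the phase is just the offset between the two start codons modulo 3. A minimal sketch, with a hypothetical function name and made-up coordinates:

```python
def overlap_phase(start_ref, start_other):
    """Reading-frame phase (0, 1 or 2) of a same-strand gene relative to a reference gene.

    Both arguments are the positions of the first base of each start codon.
    """
    return (start_other - start_ref) % 3


print(overlap_phase(100, 400))  # 300 nt apart -> 0 (in phase)
print(overlap_phase(100, 401))  # shifted one base along -> 1
print(overlap_phase(100, 402))  # shifted two bases along -> 2
```

Opposite-strand overlaps need a separate convention and are not covered by this sketch.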
Out of phase overlaps in eukaryotes
Arf (P14) and Ink4a (P16)
Tumour suppressor made of two genes: Arf (P14) and Ink4a (P16)
2 TSS/first exons (1alpha and 1beta) that are transcribed from different promoters
One for Arf at 5’ end and one for Ink4a further down the gene
Both splice to exon 2 but reading frames are different
Arf gene will stop at end of exon 2 but Ink4a will continue to exon 3
This is an out of phase overlap between the two genes
Partial overlap of genes
Small overlaps on 5’ or 3’ end
Common for prokaryotes with functionally dependent genes
Terminator site of 1 gene overlaps with initiator of another
Example of partial overlap of genes with trp operon
5 genes involved in tryptophan synthesis
Some have very short overlaps
Overlap by one nucleotide between the stop codon of trpE and start codon of trpD
Same with trpB and trpA (Shine-Dalgarno sequence in trpB)
Translational coupling is dependent on this overlap
Allows protein to be synthesized in equimolar ratios
Proximity of trpB stop codon to trpA start codon influences trpA translation
By changing overlap between genes don’t get equimolar production of proteins
Degree of overlap can affect translation rates
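A one-nucleotide overlap between a stop codon and the next start codon can be pictured with a five-base junction. The sketch assumes the junction reads UGAUG (stop codon UGA followed by start codon AUG), which is not spelled out in the notes:

```python
junction = "UGAUG"  # assumed junction sequence, written 5' to 3'

stop_codon = junction[0:3]    # last codon of the upstream gene
start_codon = junction[2:5]   # first codon of the downstream gene
shared = junction[2]          # the single nucleotide both codons use

print(stop_codon, start_codon, shared)  # UGA AUG A
```

A ribosome finishing the upstream gene is therefore already sitting on the downstream start codon, which is what makes coupled translation possible.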
What is translational recoding?
Ribosomes can be directed to:
Use alternative start sites
Bypass or recode termination codons
Or site specific programmed shift of reading frame (PSRF) - can change frames while translating
What is a ribosomal frame shift?
When ribosome pauses on mRNA and moves forwards or backwards 1 nucleotide before continuing
Changes the reading frame
Get more than one protein per mRNA
Depends on the mRNA regulatory sequence and structure
All mRNA structure must be unfolded (mRNA forms secondary structure which must be unfolded for translation)
This can affect codon/anti-codon binding and leads to uncoupling
Formation of energetic barriers is important for PRF
Slippage can happen at any stage of translation
PRF can diversify the proteome
Ribosome structure
A site=aminoacyl tRNA site that allows entry of new tRNA attached to an amino acid
P site=peptidyl tRNA site that holds the growing polypeptide chain
E site=exit site, where the used (deacylated) tRNA leaves the ribosome
Small subunit=binds mRNA and checks codon-anticodon pairing of incoming tRNA
Large subunit=catalyses peptide bond formation
-1 programmed ribosomal frame shifting mRNA requirements
Very common in prokaryotes
mRNA requires a slippery sequence, a spacer sequence and a downstream stimulatory sequence
Slippery sequence: where the shift takes place
7 nucleotides
X XXY YYZ (original reading frame) –> XXX YYY Z (shifted reading frame)
XXX: 3 identical nucleotides
YYY: AAA/UUU
Z: any nucleotide but not often G
Spacer sequence: 12 nucleotides or less
Downstream stimulatory sequence:
Pseudoknots, kissing stem loop
Energetic barrier for ribosome to overcome
One of the secondary structures that the ribosome has to unfold before it translates
Aids positioning over slippery site
Causes ribosome to pause over the slippery site
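The X XXY YYZ pattern can be turned into a simple search for candidate slippery sites. This is a sketch only: the function name is hypothetical, "Z is not often G" is simplified to "Z is never G", and a real site also needs the spacer and downstream stimulatory structure, which are not checked here:

```python
import re

# X XXY YYZ: XXX = three identical bases, YYY = AAA or UUU, Z = any base
# except G (treating "not often G" as "never G" is a simplification).
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2(?:AAA|UUU)[ACU]))")


def find_slippery_sites(mrna):
    """Return (index, heptamer) for every candidate slippery sequence in mrna."""
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(mrna)]


print(find_slippery_sites("AUGCCGGGUUUAGCA"))  # [(5, 'GGGUUUA')]
```

The lookahead lets overlapping candidates be reported rather than only the first one found.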
-1 programmed ribosomal frame shifting when does slippage occur?
Slippage may occur during distinct points of translation elongation cycle
During accommodation of the A-site tRNA
Or during EF-G catalysed translocation (EF-G catalyses translocation of tRNA and mRNA down the ribosome)
Just after peptidyl transfer
-1 programmed ribosomal frame shifting mechanism
Elongating ribosome encounters frameshifting signal (pseudoknot) and pauses
Ribosome slips back one base
Reading frame from G GGU UUA –> GGG UUU A
Ribosome unwinds the pseudoknot and continues translating in new -1 frame
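The effect of slipping back one base can be shown by reading codons before and after the slip. The mRNA below is hypothetical and built around the G GGU UUA example. The function names are also made up for this sketch:

```python
def codons(mrna, start=0):
    """Split mrna into complete codons, reading from index start."""
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


def read_with_minus1_shift(mrna, slip_after):
    """Read slip_after codons in frame 0, slip back one base, then keep reading."""
    before = codons(mrna)[:slip_after]
    after = codons(mrna, start=3 * slip_after - 1)
    return before, after


# Hypothetical mRNA: the slippery sequence G GGU UUA sits at index 5-11.
mrna = "AUGCCGGGUUUAGCAUAA"
print(codons(mrna))                     # frame 0
print(read_with_minus1_shift(mrna, 4))  # slip after the 4th codon (UUA)
```

In frame 0 the message ends at the UAA stop codon. After the slip the ribosome reads AGC AUA instead and passes that stop codon out of frame, which is how an extended ORF can be produced.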
+1 PRF example
OAZ1 in yeast
Mammalian equivalent ornithine decarboxylase antizyme (OAZ)
OAZ is involved in ubiquitin independent degradation of ornithine decarboxylase (ODC)
ODC produces polyamines
Usually OAZ has stop codon but +1 PRF shift adds an extra sequence to the end that targets ODC for degradation
Autoregulation/negative feedback loop: ODC increases polyamine levels –> polyamines stabilize pseudoknot in OAZ mRNA–> causes the +1PRF on OAZ mRNA –> frameshift allows production of functional OAZ protein –> OAZ promotes degradation of ODC –> less polyamines are produced
-1 PRF example
HIV virus
Gag is produced as a 55kDa precursor protein that forms the virus particle (capsid)
-1 PRF leads to formation of 160kDa GagPol polyprotein extended ORF which encodes for a protease, reverse transcriptase and integrase
This only occurs 5% of the time
If amount of frameshift is disrupted, can impede the growth of the virus
A higher % (so more GagPol is made) makes the virus less efficient
Drug developed to stabilize stem-loop structure and increase amount of PRF to make viruses less efficient
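The frameshift frequency sets the ratio of Gag to GagPol. A small sketch of that arithmetic, assuming every translation event yields either Gag or GagPol and nothing else (the function name is hypothetical):

```python
def gag_to_gagpol_ratio(frameshift_fraction):
    """Gag molecules made per GagPol molecule for a given -1 PRF frequency."""
    return (1 - frameshift_fraction) / frameshift_fraction


for fraction in (0.05, 0.10, 0.20):
    print(f"{fraction:.0%} frameshifting -> {gag_to_gagpol_ratio(fraction):.0f} Gag per GagPol")
```

At 5% the ratio is about 19 Gag per GagPol. Raising the frameshift frequency quickly lowers that ratio, which is the imbalance the drug strategy aims for.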
Why do we have overlapping genes and PRF?
Allows for genome compression
PRF provides another method to increase the diversity of the proteome (can create many different proteins from the same stretch of DNA/RNA)
Allows control of stoichiometry: PRF and shared promoters allow proteins to be expressed at stable levels relative to each other - allows for coordinated control
Advantages of overlapping genes in viruses
Small genome size so gene compression is important
Have limited space in the capsid so need to compress genome
Smaller genome and smaller capsid allows viruses to replicate faster
Disadvantage of overlapping genes
In evolution overlapping genes can cause evolutionary constraint
Want to make a mutation that allows adaptation to the environment –> changing base pairs in one gene can affect the coding sequence of the overlapping gene
Eukaryotes overlapping genes
Have larger genomes so have more types of overlapping genes
Contain introns so overlapping genes may be located in introns
More abundant different strand overlaps
Lower proportion of divergent different strand overlaps
At 5’ end there are TF binding sites, promoters and enhancers, so overlapping there may lead to constraint
Prokaryotes overlapping genes
Genes are almost entirely coding sequence (no introns), so overlaps between coding sequences are common
Unidirectional overlapping is the most common - operons could be a driving force
PRF may be more prevalent due to genome size restrictions
Gene regulation by antisense transcription
Antisense transcription can impact gene expression at 3 different stages: transcription initiation, during transcription and post transcription
Antisense is the non-coding strand
Antisense transcription affecting initiation
ANRIL on antisense strand recruits polycomb complex to induce histone modification or DNA methylation–> represses transcription initiation
Antisense transcription affecting during transcription
What would happen when two overlapping antisense genes are transcribed at the same time?
RNA polymerases would collide
Antisense transcription affecting post transcription
BACE1 antisense strand makes RNA that binds to BACE1 mRNA, which protects the mRNA from being degraded by an miRNA
Sense-antisense pairs as self-regulatory circuits
Fine tuning: antisense expression slightly modulates expression of the sense gene
Bistable switch: strong mutual repression
Can go quickly from off state to on state of a gene
What is RNA editing?
RNA editing covers mechanisms that change the sequence of RNA transcripts encoded by genes in a wide range of organisms
Only found in eukaryotes
RNA editing vs RNA splicing
Overall similarities:
mRNAs, tRNAs, rRNAs are substrates for RNA editing and splicing
Alternative splicing and editing generate protein diversity
Splicing and editing are developmentally regulated
Overall differences:
Splicing removes RNA sequences encoded by a gene
Editing adds/changes the information encoded by a gene
Splicing is often an RNA catalysed reaction (snRNPs), while editing is always protein catalysed
RNA editing in trypanosoma parasite
Has a kinetoplast (extended mitochondrion)
Editing involves insertion or deletion of uridines
Two thirds of mitochondrial genes are edited
Information for editing is encoded by gRNAs transcribed from minicircles
RNA editing process can be developmentally regulated to alter the proteome
Editing occurs post-transcriptionally
The kinetoplast
Kinetoplast: network of circular DNA inside the mitochondria that contains many copies of the mitochondrial genome
It has around 4,000,000 bp
DNA forms looping structure of maxicircles and minicircles
10,000 minicircles (1kb in length) and 50 maxicircles (20kb)
Maxi circles encode components of the mitochondrial oxidative phosphorylation machinery
Discovery of RNA editing in COXII
Cytochrome oxidase II (COXII) gene is very well conserved in eukaryotes
Found premature stop codon in a very well conserved gene
Sequenced RNA
Found insertion of 4 uridines in RNA that removed the premature stop codon
Cytochrome oxidase III (COXIII) gene
Found a lot of uridine insertions (heavily edited)
Editing occurs post-transcriptionally
What is the function of insertion or deletion of uridines / RNA editing?
Form start codons
Correct frameshift mutations
Create complete ORFs
Remove premature stop codons
Form appropriate stop codons
Minicircles
Minicircles encode guide RNAs (not like ones from CRISPR)
About 1kb in size
Thousands of copies in kDNA (10,000)
Heterogeneous sequences
Maxicircles
Encodes mRNA and rRNA genes
About 22Kb size
Tens of copies in kDNA (50)
How do minicircles and maxicircles edit RNA?
Mitochondrial genes and minicircles are transcribed as polycistronic RNAs
Guide RNAs produced from minicircles bind to mRNAs produced from maxicircles and guide RNA editing
Mature edited mRNA is translated
All information to insert or remove uridines comes from the circles (gRNAs from minicircles)
Mechanism of Trypanosome (T. brucei) RNA editing
Annealing of guide RNA that has base pair complementarity to target mRNA
Insertions: template guides TUTase to insert uridine into mRNA, ligation and translation
Deletions: exonuclease cuts mRNA and removes uridine, ligation and translation
Editing occurs post-transcriptionally
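The outcome of guide-directed editing can be sketched in Python. This is a simplified model under stated assumptions: the guide RNA is represented by the edited sequence it specifies (a hypothetical `template` string) rather than by real base pairing, the function name and toy sequences are invented for illustration, and the individual enzyme steps (cutting, TUTase, exonuclease, ligase) are collapsed into counting uridine insertions and deletions.

```python
def edit_with_guide(pre_mrna, template):
    """Make pre_mrna match template by inserting or deleting uridines only.

    Returns (edited_mrna, uridines_inserted, uridines_deleted).
    Raises ValueError if the sequences differ at any non-U base.
    """
    edited = []
    inserted = deleted = 0
    i = 0  # current position in the pre-edited mRNA
    for base in template:
        if i < len(pre_mrna) and pre_mrna[i] == base:
            # Bases already agree: keep the mRNA base.
            edited.append(base)
            i += 1
        elif base == "U":
            # Template specifies a U the mRNA lacks: U-insertion.
            edited.append("U")
            inserted += 1
        else:
            # mRNA carries extra uridines here: U-deletion.
            while i < len(pre_mrna) and pre_mrna[i] == "U":
                deleted += 1
                i += 1
            if i >= len(pre_mrna) or pre_mrna[i] != base:
                raise ValueError("sequences differ by more than uridines")
            edited.append(base)
            i += 1
    # Any uridines left at the 3' end of the mRNA are deleted too.
    while i < len(pre_mrna) and pre_mrna[i] == "U":
        deleted += 1
        i += 1
    if i != len(pre_mrna):
        raise ValueError("sequences differ by more than uridines")
    return "".join(edited), inserted, deleted


# Toy example: two uridines are inserted after "AG", two removed before "GA".
result = edit_with_guide("AGCAUUUGA", "AGUUCAUGA")
print(result)
```

The check on non-U bases reflects the point that this kind of editing only ever adds or removes uridines.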
Structure of gRNA
Triphosphate at 5’ end
Polyuridine at 3’ end added post-transcriptionally
Anchor sequence - base pairs to target pre-mRNA
Guiding sequence - directs insertion or deletion of uridine into mRNA
Editing occurs post-transcriptionally and can lead to either insertions or deletions
These guide RNAs should not be confused with ones used in CRISPR!
Editosome
20S editosome
U-insertion and U-deletion domains
RNA binding and zinc-finger proteins to help bind mRNA
Enzymes used for cutting or adding in uridines - TUTase, exonuclease, ligase
How does alternative RNA editing generate protein diversity in trypanosomes?
Trypanosomes use RNA editing to rapidly alter mitochondrial function and change their metabolism at different stages of the life cycle
Editing supports the shift from quiescent to proliferative mode, which needs more mitochondrial activity, by producing the correct proteins
Example: COXIII mRNA is heavily edited to make COXIII protein, but it can also be alternatively edited to make AEP-1 by changing the N-terminus
What is AEP-1?
AEP-1 binds in the kinetoplast
Important for integrity of kinetoplast DNA network
Affects fitness of trypanosomes
RNA editing in slime mold
RNA editing occurs in Physarum (a slime mold)
Additions of GU and CU
C to U changes (deamination)
Extremely accurate
Occurs co-transcriptionally (unlike in trypanosomes, where editing occurs post-transcriptionally)
RNA editing events in plant mitochondria and plastids
Mitochondria: C to U transitions occur, U to C transitions are rare, mRNAs, rRNAs, tRNAs are edited, high frequency (2%), post transcriptional
Plastids: C to U transitions occur, U to C transitions are rare, mRNAs are edited, low frequency (0.04%), post transcriptional
Editing tends to be higher in mitochondria in plants
RNA editing mechanism in plants
Protein binds to a specific sequence and brings in a deaminase to alter the RNA (C to U is a deamination)
NO guide RNA that has base pair complementarity
Mammalian mRNA editing
C to U and A to I changes occur
Apolipoprotein B mRNA editing (C to U editing)
Apolipoprotein B mRNA undergoes C to U editing that produces 2 proteins: one specific to liver (ApoB-100, unedited) and one specific to intestine (ApoB-48, edited)
RNA editing of C to U produces a premature stop codon, which truncates the protein and gives the intestine-specific form
Needs formation of a hairpin loop and a 5’ efficiency element to recruit ACF (APOBEC1 complementation factor), which works with the deaminase APOBEC1 to perform the editing
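How a single C to U change truncates a protein can be sketched in Python. In apolipoprotein B the edited codon changes from CAA (glutamine) to UAA (stop). The transcript below is a hypothetical toy sequence, not real ApoB mRNA, and the function names and the small codon dictionary are assumptions made only for this example.

```python
# Only the codons needed for this toy example (standard genetic code).
CODONS = {"AUG": "M", "GAU": "D", "CAA": "Q", "UUU": "F", "GGU": "G", "UAA": "*"}


def c_to_u(mrna, index):
    """Deaminate the cytidine at `index` to a uridine."""
    if mrna[index] != "C":
        raise ValueError("target base is not a cytidine")
    return mrna[:index] + "U" + mrna[index + 1:]


def translate(mrna):
    """Translate from the first base and stop at the first stop codon."""
    protein = ""
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein += amino_acid
    return protein


unedited = "AUGGAUCAAUUUGGUUAA"   # liver-like: CAA codon left unedited
edited = c_to_u(unedited, 6)      # intestine-like: CAA becomes UAA

print(translate(unedited))  # full-length toy protein
print(translate(edited))    # truncated toy protein
```

The same transcript therefore gives a long product when unedited and a short product when edited, which mirrors the liver and intestine forms described above.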
Adenosine deaminases (ADARs)
Adenosine deaminases that act on RNA (ADARs) are a class of RNA editing enzymes
In vertebrates there are 3 different types of ADAR proteins
Contain 1-3 double-stranded RNA binding motifs
ADARs found in a wide range of organisms from yeast to mammals
Converts adenosines to inosines (A to I)
mRNAs, tRNAs, viral RNAs and non-coding RNAs can be substrates
Alters specific codons to change amino acids or changes stop codons
ADAR structure and mechanism
ADAR is a single protein (not a complex)
A sequence in an intron adjacent to the exon base pairs with the exon to form double-stranded RNA, which allows recruitment of the ADAR protein
Editing of serotonin receptor in mammals
Higher editing = reduced efficiency of serotonin receptor
Over and under editing of this mRNA are associated with genetic diseases (Prader-Willi syndrome) and with depression
Editing of AMPA glutamate receptor in mammals
Glutamate receptor must be edited from glutamine to arginine
Lack of editing of GluA2 Q to R results in a large influx of Ca2+ into neurons and causes death
Low levels of GluA2 Q to R editing have been observed in patients with major depressive disorder and schizophrenia
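The Q to R recoding can be sketched in Python. At the GluA2 Q/R site the glutamine codon CAG is edited to CIG, and because inosine is read as guanosine during translation, the codon is decoded as CGG (arginine). The function names and the two-entry codon dictionary are hypothetical helpers for this illustration only.

```python
# Only the two codons needed for this example (standard genetic code).
CODONS = {"CAG": "Gln (Q)", "CGG": "Arg (R)"}


def a_to_i(codon, index):
    """Deaminate the adenosine at `index` to inosine (written as I)."""
    if codon[index] != "A":
        raise ValueError("target base is not an adenosine")
    return codon[:index] + "I" + codon[index + 1:]


def read_codon(codon):
    """Decode a codon, treating inosine as guanosine as the ribosome does."""
    return CODONS[codon.replace("I", "G")]


edited_codon = a_to_i("CAG", 1)

print("CAG", "->", read_codon("CAG"))
print(edited_codon, "->", read_codon(edited_codon))
```

Sequencing reads show the same behaviour, which is why A to I editing sites appear as A to G changes when RNA is compared with genomic DNA.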
RNA editing and development: ADAR1
ADAR1 mutants are embryonic lethal
Phenotypes include impaired haematopoiesis and defects in liver formation
Embryonic stem cells have high editing levels
ADAR1 levels impact efficiencies of cellular reprogramming
Exact roles and contributions to these developmental defects are not yet clear
RNA editing and cancer
ADAR1 down regulation can reduce proliferation of chronic leukaemia in mice
ADAR2 down regulation inhibits cellular proliferation in different types of brain tumours
ADAR silencing in breast cancer leads to less cell proliferation and more apoptosis
Higher levels of RNA editing in the nervous system
Editing is important in the nervous system/brain
ADAR mutants (knock-outs) lead to brain related phenotypes (behavioural phenotypes)
3 ADAR proteins in mammals
ADAR2 mutants die from seizures
ADAR3 can’t do RNA editing but can block RNA editing by ADAR2; it is only expressed in the brain
Alu repeats in the genome are heavily edited
Alu harbouring genes are also enriched for neuronal genes
RNA editing as a driving force of brain evolution
Theory
More editing occurs in the brain
The bigger the brain, the more RNA editing is occurring
More RNA editing = more developed brain and nervous system
Intensity of RNA editing is higher in humans than in mice
Alu repeats suggest an increase in editing from monkeys to humans
Why does the artery have a lot more editing compared to other tissues?
Because it has to adapt to a wider range of conditions
ADAR1, ADAR2, ADAR3 functions
ADAR1 edits repetitive sites
ADAR2 edits nonrepetitive sites
ADAR3 acts as an inhibitor of editing
How much RNA editing occurs in squids compared to mammals?
Mammals have around 200 reported recoding sites (RNA editing that changes the protein sequence)
More than 50,000 recoding sites in the squid nervous system
Why are there high levels of editing in squid and flies?
Squid and flies have higher editing than humans
Cold-blooded organisms use RNA editing to rapidly respond to changes in the environment (e.g. temperature)
When temperature decreases, editing increases in a potassium channel in Drosophila
Theory: A to I change (I is read as G) replaces an amino acid with a large R group by one with a smaller R group, which reduces the activation energy so the protein can work at a lower temperature