GILESTRO - TE co-option Flashcards
Transposable Elements (TE) and other repeats account for …
50% of the human genome
2 main methods of transposition:
Class 1: retrotransposon
* To propagate, a retrotransposon is first transcribed into RNA, which is reverse transcribed into cDNA and then inserted at another location in the genome
* goes through an RNA intermediate step
* Copy and paste mechanism
Autonomous retrotransposon
(encodes its own reverse transcription machinery)
* Retroviral origin – some viruses have colonized mammalian genomes and integrated into them (these elements have LTRs)
* LINEs – 21% of the human genome
Non-autonomous retrotransposon
(depends on the reverse transcription machinery of other retrotransposons)
* SINEs
* SVA transposon
Class 2: DNA transposon
- Does not have an RNA intermediate
- Cut and paste mechanism: the transposon is excised from one location and reintegrated elsewhere
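The difference between the two classes can be sketched with a toy model. This is only an illustration: the genome is represented as a Python list of labels, the RNA intermediate of class 1 is not modelled, and the function names `copy_and_paste` and `cut_and_paste` are made up for this example.

```python
import random

def copy_and_paste(genome, te, rng):
    """Class 1 sketch: the original copy stays and a new copy is inserted."""
    new_genome = list(genome)
    # The RNA/cDNA steps are skipped; only the net result (one extra copy) is modelled.
    position = rng.randint(0, len(new_genome))
    new_genome.insert(position, te)
    return new_genome

def cut_and_paste(genome, te, rng):
    """Class 2 sketch: the element is excised, then reintegrated."""
    new_genome = list(genome)
    new_genome.remove(te)  # excision of the first occurrence
    # Toy simplification: the random position may coincide with the old one.
    position = rng.randint(0, len(new_genome))
    new_genome.insert(position, te)
    return new_genome

rng = random.Random(0)
genome = ["geneA", "TE", "geneB", "geneC"]  # hypothetical toy genome

after_copy = copy_and_paste(genome, "TE", rng)
after_cut = cut_and_paste(genome, "TE", rng)

print(after_copy.count("TE"))  # 2: copy number increases
print(after_cut.count("TE"))   # 1: copy number is unchanged
```

Under these assumptions, copy and paste raises the copy number with every event, while cut and paste only moves the element.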
TEs lose their ability to transpose over time due to the accumulation of mutations in the ORFs that encode the proteins that mediate transposition (ex. loss-of-function mutation in the transposase domain of DNA transposons)
TE are primarily parasites of the genome and must be silenced by:
- DNA methylation
- Histone methylation (primarily H3K9me3)
- Different types of small RNAs (esp. piRNAs, which interact with Argonaute proteins of the PIWI clade to repress TEs)
- KRAB zinc-finger proteins (KRAB-ZFPs)
o KRAB-ZFPs are the largest transcription factor family in mouse & human
o Characterized by a two-domain structure: an N-terminal KRAB domain & a tandem array of C2H2 zinc fingers at the C-terminus
o The KRAB domain recruits TRIM28, which functions as a scaffold for other repressive histone-modifying and histone-binding factors, including the H3K9me3 histone methyltransferase (SETDB1), that catalyze heterochromatin formation and transcriptional repression
o Binding of a KRAB-ZFP to a TE leads to repression of TE expression
Timing and context of transposition
- Transposition in the germ-line cells can lead to the mutation being inherited
- Transposition in somatic cells, which do not pass their DNA on to offspring, is not inherited but can still affect the individual
Mutualism between transposon & host genome
Some TEs have features that make them functionally useful to the host genome, leading to TE co-option (in some cell types, the TE is no longer repressed or methylated but is kept in an active state so that it can serve its useful function – TE co-opted by the cell)
Useful TE functions (that can lead to TE co-option):
- TE-derived promoters and enhancers
- TE as TAD (Topologically Associating Domain) boundaries
- TE-derived lncRNA (long non-coding RNA)
- TE provide new transcription factors by domain fusion
- TE-derived promoters and enhancers
- Some TEs are rich in binding sites for TFs that are important in a specific cell type
- Those TEs are co-opted to serve as a promoter or an enhancer for a nearby gene
Ex: SINE-VNTR-Alu Elements (SVAs)
* Most recently discovered group of transposons
* Usually form 3 kb elements consisting of
o Hexamer repeats
o Alu-like element
o VNTR element
o SINE-R element
* Hominoid specific (great apes)
* 6 sub-families of SVAs (A-F) are defined by small differences in the sequences
* SVA E & F are human-specific and are co-opted in the human brain neocortex & hippocampus (not repressed / not methylated)
o These human specific SVAs are enriched with TBR2 binding sites
o TBR2 regulates a family of neuron progenitors called intermediate neuronal progenitors (INPs)
o Therefore, SVA E & F can serve as TBR2-modulated enhancers, which allows for increased proliferative ability of INPs in the human hippocampus
o This is important for human hippocampus development. This co-option allowed the human hippocampus to undergo an evolutionary expansion during primate evolution: relative to other primates, it exceeds the size predicted from brain size by 50%
* If the SVA is repressed, the level of INP gene expression is more comparable between humans and other primates (ex. chimpanzees)
* Therefore, the presence of transposon (SVA) has really re-shaped the gene expression & the regulatory network that is involved in hippocampus development in humans
* Hippocampus is important for:
Episodic memory
Navigation – specific neurons encode locations
Learning
Emotional behavior (projects to amygdala and can regulate cortisol)
Adult neurogenesis
Basis of TE co-option
If a transposon proves useful, it is kept active and is not repressed in the cell
- TE as TAD (Topologically Associating Domain) boundaries
- TADs are sub-regions of chromosomes within which promoter-enhancer interactions can take place (highly self-interacting genome regions that play a critical role in regulating gene expression in the cell)
- DNA sequences within a TAD physically interact with each other more frequently than with sequences outside the TAD
- TADs regulate gene expression by limiting enhancer-promoter interactions to within each TAD
- TAD boundaries are usually defined by binding of CTCF
- Some TEs (SINEs in mouse & ERVs in humans) are sources of CTCF binding sites, so their insertion into the genome can create novel TAD boundaries. Changes in TAD boundaries result in changes in promoter-enhancer interactions and new gene expression patterns.
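The effect of a new boundary can be sketched with a toy model. It assumes a chromosome is a number line, boundaries are single coordinates, and an enhancer can contact a promoter only if both sit in the same TAD. Real TADs only make contacts more or less frequent rather than all-or-nothing, and all coordinates and function names here are hypothetical.

```python
from bisect import bisect_right

def tad_index(position, boundaries):
    """Index of the TAD that contains `position`.

    `boundaries` must be a sorted list of boundary coordinates.
    """
    return bisect_right(boundaries, position)

def can_interact(enhancer, promoter, boundaries):
    """Toy rule: contact is allowed only within the same TAD."""
    return tad_index(enhancer, boundaries) == tad_index(promoter, boundaries)

# Hypothetical coordinates, chosen only for illustration.
boundaries = [100, 500]
enhancer, promoter = 150, 400

before = can_interact(enhancer, promoter, boundaries)

# A TE insertion brings a new CTCF binding site at position 300.
# Simplification: the insertion does not shift the other coordinates.
new_boundaries = sorted(boundaries + [300])
after = can_interact(enhancer, promoter, new_boundaries)

print(before)  # True: enhancer and promoter share a TAD
print(after)   # False: the new boundary separates them
```

In this sketch a single TE-derived CTCF site is enough to rewire which enhancer can reach which promoter.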
- TE-derived lncRNA (long-non-coding RNA)
- Some TEs contain sequences that are transcribed into lncRNA
- lncRNA regulate gene expression
o Ex. Xist: a long non-coding RNA that regulates X chromosome inactivation; it acts in cis because it silences the X chromosome from which it is transcribed
o 400 active lncRNAs are derived from transposable elements
o Can act in cis or in trans
o Cis-acting lncRNAs regulate the expression of target genes located at or near the same genomic locus
o Trans-acting lncRNAs regulate the expression of target genes at other, independent chromosomal loci
- TE-derived lncRNAs are activated at specific developmental time points, in specific cell types (therefore, TE-derived lncRNAs are important for development)
- TE provide new transcription factors by domain fusion
- DNA transposons encode a transposase, which has a DNA-binding domain
- Transposition of a TE into an ORF, in frame with the host protein, can fuse the transposase DNA-binding domain to that protein. A protein that previously could not bind DNA, because it lacked a DNA-binding domain, can thereby become a transcription factor.
- Ex: the PAX family of transcription factors (one of the most important neurodevelopmental TF families in mammals & primates) evolved through this type of event
- After transposition, accumulation of mutations in the catalytic domain of the transposase leads to loss of transposase activity. The DNA-binding domain, the only part that still functions, can then become part of a larger nearby protein, giving that protein the ability to bind DNA and act as a novel TF
Effects of TE insertion
- TE insertion may lead to the evolution of a new trait
- The inserted TE will accumulate mutations and thus become polymorphic across populations. The different polymorphisms can then be associated with different traits across populations.
Ex of TE insertion effects
Loss of tails in primates (great apes)
* Involves the TBXT gene – a highly conserved gene important for development
* TE transposition inserted a second Alu element, Alu-Y, into the TBXT gene in the opposite orientation to the already existing Alu-X. The two Alu elements can pair with each other in the pre-mRNA during splicing, which causes exon 6, located between them, to be skipped. Loss of exon 6 results in loss of the tail in great apes.
o Experiment: removing exon 6 from mouse TBXT results in no tails
* Evolution selected for this insertion because of the evolutionary advantage of not having a tail in great apes. (Great ape TBXT co-opted the Alu-Y transposon)
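Why the orientation of the second Alu element matters can be sketched in a few lines. Two copies of a sequence in opposite orientation on the same transcript are reverse complements and can base-pair, while two copies in the same orientation cannot. The sequences, the flanking exon names and the rule "pairing means exon 6 is skipped" are simplifying assumptions for illustration; real Alu elements are only partially complementary.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def can_pair(seq_a, seq_b):
    """Toy rule: two stretches pair only if they are exact reverse complements."""
    return seq_b == reverse_complement(seq_a)

def splice(exons, skipped=()):
    """Join exon sequences in order, leaving out any skipped exon."""
    return "".join(seq for name, seq in exons if name not in skipped)

alu_like = "GGCCGGGCGC"  # made-up sequence, not a real Alu
same_orientation = alu_like
opposite_orientation = reverse_complement(alu_like)

# Hypothetical exons; only exon 6 is named in the text above.
exons = [("exon5", "AAA"), ("exon6", "CCC"), ("exon7", "GGG")]

def mature_transcript(first_alu, second_alu):
    # Assumption: pairing of the flanking elements loops out exon 6.
    skipped = ("exon6",) if can_pair(first_alu, second_alu) else ()
    return splice(exons, skipped)

print(mature_transcript(alu_like, same_orientation))      # AAACCCGGG
print(mature_transcript(alu_like, opposite_orientation))  # AAAGGG
```

Only the inverted copy produces the transcript without exon 6, which mirrors the splicing outcome described for TBXT.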
Other examples of human-specific TE-derived traits:
* Amylase in saliva in hominoids (arose through insertion of an ERV)
* Prolactin hormone production in endometrium (regulated by MER39 and MER20)
* Corticotropin releasing hormone & placensin in the placenta (regulated by THE1B (primate specific) and other TEs controlling expression)
Summary:
- TEs make up about half of the human genome (class 1 / class 2)
- Transposition must occur in germline cells for the insertion to be inherited
- TE repression is performed in many ways
- Mutualistic relationship between TE and host genomes can lead to TE co-option
- TE co-option: enhancers-promoters / TAD boundaries / TE-derived lncRNAs / TE-derived transcription factors
- Many important human phenotypes and traits have a TE origin