L6: Transposable Elements Flashcards

1
Q

What was the 1000 Genomes Project?

A

An effort to sequence the genomes of 1,000+ individuals from across the globe; it resulted in genomes from 2,504 individuals across 26 distinct populations.

2
Q

What were the major findings from the 1,000 genomes study? (6)

A
  • 88 million variants in 2,504 genomes
  • 84.7 million single nucleotide polymorphisms (SNPs)
  • 3.6 million short insertions/deletions (indels)
  • 60,000 structural variants (including large copy number variations)
  • Many SNPs segregating between populations
  • Rare alleles are often restricted to sub-populations
3
Q

How much does a typical genome differ from the reference human genome?

A

A typical genome differs from the reference human genome at 4.1 million to 5.0 million sites.

4
Q

What accounts for the vast majority of these variants found between a given genome and the reference genome?

A

SNPs and short indels account for >99.9% of variants

5
Q

Aside from these SNPs and short indels, describe the remaining sources of variation.

A

2,100 to 2,500 structural variants affecting ∼20 million bases of sequence:
* ∼1,000 ‘large’ deletions
* ∼160 copy-number variants
* ∼915 Alu insertions
* ∼128 L1 insertions
* ∼51 SVA insertions
* ∼10 inversions
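
As a quick arithmetic check (not part of the original card), the per-genome counts listed above can be summed to confirm that they fall inside the stated range of 2,100 to 2,500, and to see what share of them are transposable element insertions. The variable names are illustrative only.

```python
# Per-genome structural-variant counts as quoted in the card above.
sv_counts = {
    "large deletions": 1000,
    "copy-number variants": 160,
    "Alu insertions": 915,
    "L1 insertions": 128,
    "SVA insertions": 51,
    "inversions": 10,
}

# The three transposable-element categories in the list.
te_categories = ("Alu insertions", "L1 insertions", "SVA insertions")

total = sum(sv_counts.values())
te_total = sum(sv_counts[name] for name in te_categories)
te_share = te_total / total

print(f"listed structural variants: {total}")
print(f"TE insertions: {te_total} ({te_share:.1%} of the listed variants)")
```

On these figures, roughly half of the listed structural variants in a typical genome are Alu, L1 or SVA insertions.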

6
Q

Which of these listed sources of variation are transposable elements?

A

Alu, L1, SVA

7
Q

What in our genome consists mostly of transposable elements?

A

Most of our genome’s ‘junk DNA’ consists of transposable elements. Intriguingly, regulatory sequences often originate from TEs. They mimic the gene sequences surrounding them: in a blood cell, for example, they will act as promoters for red blood cell genes.

8
Q

When do transposable elements remain in our genome?

A

They stay in the genome if they insert into (or infect) a germ cell, because only germline insertions are inherited.

9
Q

What are transposable elements (TEs)?

A

TEs are repetitive genetic sequences that once had or still have the ability to transpose, that is, to mobilise and insert elsewhere in the genome. In contrast to genes, TEs are enormously diverse across species and often species-specific

10
Q

How much of our genome are transposable elements?

A

45%

11
Q

How can TEs be classified?

A

Broadly, TEs can be divided into two classes: DNA transposons excise and insert a DNA intermediate when they transpose (‘cut and paste’), whereas retrotransposons reverse transcribe RNA intermediates prior to integration (‘copy and paste’). DNA transposons are few and inactive in most mammals whereas retrotransposons are abundant
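
The difference between the two modes can be sketched with a toy model in which a genome is just a list of labels. This is only an illustration of the bookkeeping (one copy moved versus one copy added); the function names and the list representation are assumptions, not a biological simulation.

```python
def cut_and_paste(genome, source, target):
    """DNA transposon: excise the element at `source` and reinsert it.

    `target` is an index into the genome *after* excision.
    """
    element = genome[source]
    rest = genome[:source] + genome[source + 1:]
    return rest[:target] + [element] + rest[target:]


def copy_and_paste(genome, source, target):
    """Retrotransposon: the original stays; a new copy is inserted at `target`."""
    element = genome[source]
    return genome[:target] + [element] + genome[target:]


genome = ["geneA", "TE", "geneB", "geneC"]

moved = cut_and_paste(genome, source=1, target=2)
copied = copy_and_paste(genome, source=1, target=3)

print(moved, "-> TE copies:", moved.count("TE"))
print(copied, "-> TE copies:", copied.count("TE"))
```

Cut and paste leaves the copy number unchanged, whereas every copy-and-paste event adds a copy, which is consistent with retrotransposons being the abundant class.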

12
Q

How can retrotransposons be further classified?

A

Retrotransposons are further classified by whether they contain long terminal repeats (LTRs). Most LTR-containing retrotransposons in mammals are endogenous retroviruses (ERVs). Frequent recombination between LTRs leaves behind many solo LTRs in the genome.

Retrotransposons lacking LTRs include autonomous long interspersed elements (LINEs) and non-autonomous short interspersed elements (SINEs), which require LINE-derived proteins for their mobilization. In addition to LINEs and SINEs, humans encode additional primate-specific composite elements called SINE variable-number tandem-repeat Alu (SVA) elements
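
For revision, the classification can be written down as a small lookup table. This is a hypothetical sketch: the `TEFamily` type and its field names are invented for illustration, the SVA dependence on L1 is taken from a later card in this deck, and `None` marks anything these cards do not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TEFamily:
    name: str
    te_class: str                        # "DNA transposon" or "retrotransposon"
    has_ltr: Optional[bool]              # None: the LTR split applies to retrotransposons only
    needs_line_proteins: Optional[bool]  # None: not stated in these cards


FAMILIES = [
    TEFamily("DNA transposon", "DNA transposon", None, None),
    TEFamily("ERV", "retrotransposon", True, None),
    TEFamily("LINE", "retrotransposon", False, False),  # autonomous
    TEFamily("SINE", "retrotransposon", False, True),   # non-autonomous
    TEFamily("SVA", "retrotransposon", False, True),    # non-autonomous (later card)
]

non_ltr = [f.name for f in FAMILIES if f.has_ltr is False]
line_dependent = [f.name for f in FAMILIES if f.needs_line_proteins]

print("non-LTR retrotransposons:", non_ltr)
print("depend on LINE-derived proteins:", line_dependent)
```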

13
Q

Where do TEs originate?

A

The origin of most TEs is uncertain. Whereas ERVs are likely to have arisen from ancient viral infections, some non-LTR retrotransposons may have evolved from self-splicing group II introns in bacteria. These group II intron TE predecessors are still mobile in eukaryote organelles and gave rise to the spliceosome, thereby contributing to eukaryote evolution.

Many SINEs arose from cellular RNAs such as tRNAs. Whereas SINEs evolved several times independently, LINEs are all related to each other and can be traced back to eukaryotes. RNA-binding domains of diverse TE-encoded proteins even occur in all cellular life

14
Q

Do mammalian TEs still jump?

A

Although in most mammals DNA transposons and the majority of retrotransposons are inactive, some copies of distinct retrotransposon families can still mobilise including LINEs (L1; also known as LINE-1), SINEs (B1, B2) and ERVs (intracisternal A-type particles (IAPs) and early transposons (ETns)) in mice and L1 and Alu in humans.

15
Q

To what extent do the different TEs make up our genome?

A

L1: 16.9%
Alu: 10.6%
SVA: 0.2%
Other non-LTR retrotransposons: 6%

LTR retrotransposons: 8.3%

DNA transposons: 2.8%
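
As a consistency check (not part of the original card), the percentages above can be added up and compared with the overall figure of about 45% given in an earlier card. The dictionary keys simply repeat the labels used above.

```python
# Genome fractions (in percent) as listed in the card above.
te_percent = {
    "L1": 16.9,
    "Alu": 10.6,
    "SVA": 0.2,
    "Other non-LTR retrotransposons": 6.0,
    "LTR retrotransposons": 8.3,
    "DNA transposons": 2.8,
}

total = sum(te_percent.values())
retrotransposons = total - te_percent["DNA transposons"]

print(f"all TEs: {total:.1f}% of the genome")
print(f"retrotransposons only: {retrotransposons:.1f}% of the genome")
```

The listed classes sum to just under 45%, and almost all of that is retrotransposon sequence.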

16
Q

Comment on the length and activity of LINE1

A
  • Full-length LINE-1 elements are ~6 kb long
  • 99.9% of LINE-1 insertions are truncated at the 5′ end (= inactive)
  • Active throughout primate evolution
  • Still active in humans
17
Q

Describe the evolution of LINE-1

A

L1 elements replicate via an RNA intermediate that is copied into genomic DNA at the site of insertion. This mechanism of replication is not very efficient and generates mostly defective copies that are truncated at their 5′ end

These copies can be classified into families of hundreds to thousands of elements based on the shared nucleotide differences they inherit from their common progenitor (or group of closely related progenitors). Because the vast majority of L1 inserts are pseudogenes, they accumulate mutations at the neutral rate

Consequently, older families are more divergent than younger ones. Phylogenetic analyses of L1 families in murine rodents and in primates have shown that, over the long-term, a single lineage of L1 families amplifies and evolves, one family replacing its predecessor as the dominant family. Families of closely related variants can occasionally coexist for short periods of time until one family prevails and dominates the replicative process.
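
The link between divergence and age can be sketched numerically. If inactive copies mutate at a constant neutral rate, the average divergence of a family from its consensus grows roughly linearly with time, so age ≈ divergence / rate. Everything numeric below is an assumption made for illustration: the rate constant is a placeholder rather than a value from these cards, the two families are made up, and no correction for multiple substitutions at the same site is applied.

```python
# Assumed placeholder: neutral substitutions per site per year (not from the cards).
ASSUMED_NEUTRAL_RATE = 2.2e-9


def estimate_family_age(divergence_from_consensus, rate=ASSUMED_NEUTRAL_RATE):
    """Rough family age in years from average per-site divergence.

    `divergence_from_consensus` is the fraction of sites at which copies
    differ from the family consensus (0 <= value < 1).
    """
    if not 0 <= divergence_from_consensus < 1:
        raise ValueError("divergence must be a fraction in [0, 1)")
    if rate <= 0:
        raise ValueError("rate must be positive")
    return divergence_from_consensus / rate


# Hypothetical families with made-up average divergences.
families = {"young_family": 0.005, "old_family": 0.10}
ages = {name: estimate_family_age(d) for name, d in families.items()}

for name, age in sorted(ages.items(), key=lambda item: item[1]):
    print(f"{name}: roughly {age / 1e6:.1f} million years")
```

The point of the sketch is only the ordering: under a shared neutral rate, the more divergent family is always the older one.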

18
Q

How conserved are LINEs?

A

They are human-specific; other species have their own. Newborn children can carry LINE insertions that no one else has.

19
Q

What accounts for the ‘bulk’ of retrotransposition in the human population?

A

There are 80–100 ‘hot’ retrotransposition-competent L1s in an average human being.

20
Q

Which LINE is currently active?

A
  • Currently L1PA1 (= L1Hs) is active
  • ~1,500 L1Hs insertions in humans
  • ~128 non-reference L1 insertions per individual (not in the reference genome)
21
Q

What is the composition of LINE1 elements?

A

A full-length element is ~6 kb long and contains a 5′ untranslated region (5′UTR), two open reading frames (ORF1 and ORF2), and a 3′UTR. The 5′UTR has a regulatory function, ORF1 has an unknown function (RNA binding?), and ORF2 encodes an endonuclease and a reverse transcriptase. L1 has its own promoter (in the 5′UTR), which drives transcription of the whole sequence, and another regulator whose function is unknown.

22
Q

Where did LINE1 stem from?

A

It comes from a virus that infected vertebrates a long time ago and no longer exists.

23
Q

Are LINE1s autonomous? What does this mean?

A

They are autonomous: they can retrotranspose themselves, because they encode the proteins needed for their own mobilisation.

24
Q

What are Alus? From where are they derived?

A

They are SINEs: short interspersed nuclear elements, ~300 bp long. Alus are derived from 7SL RNA (signal recognition particle RNA).

25
Q

How conserved are Alus?

A

Alus are primate specific

26
Q

How frequent are Alus in our genome?

A

1.5 million Alu elements in the human genome (~11% of our genome)

27
Q

Are Alus autonomous?

A

Non-autonomous: Needs L1 for retrotransposition

28
Q

Are Alus still active?

A

Still active (and polymorphic) in the human genome

29
Q

How well are Alus captured by the human reference genome (HRG)?

A

~900 non-reference Alus per individual

30
Q

What are SVAs composed of?

A

SINE-VNTR-Alu elements; composed of a SINE, a variable number of tandem repeats (VNTR), an Alu and a CCCTCT hexamer.

31
Q

How conserved are SVA elements?

A

The newest retrotransposon in our genome; it emerged after the split from the rhesus lineage and before the split from orangutans. There are around 50 VNTR elements in the human and chimpanzee genomes; at some point the element acquired its own promoter.

32
Q

Do gibbons have SVAs?

A

Gibbons are considered lesser apes and are quite a variable group. They also have SVA, but with two other components, and these elements also expanded rapidly.

33
Q

How many SVAs are in our genome? How many are human specific?

A

~3,000 in our genome, of which ~1,500 are human-specific

34
Q

Are SVAs active?

A

Active and polymorphic in humans (~50 non-ref SVAs in each individual)

35
Q

Are SVAs autonomous?

A

Non-autonomous: Needs L1 for retrotransposition

36
Q

What function do LTRs carry out?

A

Long terminal repeats (LTRs) are the promoters for transcription of the viral proteins of human endogenous retroviruses (HERVs). These LTRs function as promoters for HERV expression, have strong RNA regulatory sequences and contain transcription factor binding sites.

37
Q

What is the composition of HERVs?

A

5’ LTR, gag, pol, △env, 3’LTR

The complete structure of a HERV is the same as that of exogenous retroviruses, consisting of four genes (gag, pro, pol and env) flanked by two long terminal repeats (LTRs) (Figure 1A).
These elements are around 10 kb long.

38
Q

Are HERVs active?

A

HERV-K was active recently in human evolution (~150,000 years ago; it emerged ~30 million years ago) but is now thought to be extinct (= no longer able to retrotranspose).

39
Q

What are HERV-K proteins associated with?

A

HERV-K proteins are expressed in pathological conditions. HERVs still make viral proteins; some environments can increase this activity, and our cells can recognise and react to it. Some diseases are associated with HERV activity, and it is unclear whether this is a cause or a consequence of the pathology.

40
Q

How are HERVs different to LINE1 in their composition and likely origin?

A

HERVs are more similar to the viruses we see today.

41
Q

How can you know whether an element is currently active?

A

If insertions of the element are polymorphic in the genome (present in some individuals but not in others), the element is active or has recently been active.
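
A minimal sketch of this test, assuming presence/absence calls for each insertion site are already available for a set of individuals. The site names and calls below are invented for illustration.

```python
def is_polymorphic(calls):
    """True if the insertion is present in some, but not all, individuals."""
    carriers = sum(calls)
    return 0 < carriers < len(calls)


# Hypothetical presence (True) / absence (False) calls in five individuals.
sites = {
    "fixed_old_insertion": [True, True, True, True, True],
    "polymorphic_insertion": [True, False, True, False, False],
    "empty_site": [False, False, False, False, False],
}

polymorphic_sites = [name for name, calls in sites.items() if is_polymorphic(calls)]
print("polymorphic insertion sites:", polymorphic_sites)
```

An insertion carried by everyone is old enough to have become fixed; one carried by only some individuals points to recent activity of its family.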

42
Q

How do solitary LTRs come about?

A

Solitary LTRs originate from full-length HERVs. Many solitary LTRs are present in our genome. Each HERV family gives rise to a specific solitary LTR type:
* HERVK: LTR5hs; HERV9: LTR12; HERVH: LTR7, …

They arise through homologous recombination between the identical LTRs flanking the proviral genes Gag, Prot, Pol and Env. Each LTR harbors polyadenylation signal, enhancer, and promoter elements, and can initiate transcription of the downstream genomic loci.
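
The outcome of that recombination can be sketched with a toy model in which a locus is a list of labelled segments. The function name and the list representation are assumptions made for illustration; the sketch only shows what is lost and what remains.

```python
def recombine_ltrs(locus):
    """Collapse everything between the first and last "LTR" into one solo LTR.

    Raises ValueError if the locus contains no "LTR" segment.
    """
    first = locus.index("LTR")
    last = len(locus) - 1 - locus[::-1].index("LTR")
    if first == last:
        return list(locus)  # a single LTR: nothing to recombine
    return locus[:first + 1] + locus[last + 1:]


provirus = ["upstream", "LTR", "gag", "pro", "pol", "env", "LTR", "downstream"]
solo = recombine_ltrs(provirus)

print("before:", provirus)
print("after: ", solo)
```

The internal genes are deleted together with one LTR, while the surviving LTR keeps its promoter and enhancer elements at the original position.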

43
Q

What resulted from one of the last retrovirus infections?

A

HERV-K

44
Q

What is HERV-K associated with?

A

HERV-K is pathologically expressed in brains of patients with ALS

45
Q

What is ‘the mothership’ of retrotransposition? Describe its mechanisms

A

LINE-1. Once L1 has been transcribed and translated into L1ORF1p and L1ORF2p, these assemble with the L1 RNA into the L1 ribonucleoprotein (RNP) complex, which is imported back into the nucleus and can initiate reverse transcription at the site of integration.

Once they reach a target site, they cleave and unwind one strand of DNA and begin synthesis, then cleave the second strand, and in this way insert a new LINE-1 copy. The complex might be recognised as a virus at this point, and regulatory mechanisms might cause it to fall off. Once the new copy is inserted, however, it is very hard to remove.

46
Q

Describe how Alus and SVAs could hijack LINE1-ORF2

A

In this model, Alu is docked on ribosomes and captures the L1 ORF2 protein as it is translated from an active L1 element mRNA. The polyA tail of Alu and SVA elements competes with the polyA tail of L1 for binding of nascent ORF2. By capturing ORF2p at the ribosome, Alu can efficiently substitute its RNA for the normal L1 mRNA during the process of target primed reverse transcription (TPRT) that occurs at sites of integration on chromosomes. SVA might use a similar mechanism, perhaps by first hybridizing to an active Alu RNA.

46
Q

Why is LINE1 termed the mothership?

A

Alus and SVAs hijack LINE-1 ORF2 for retrotransposition: they are the parasites of the parasite.

47
Q

Give 8 genomic implications of transposable element insertions

A
  • Direct gene regulation
  • Creation of new transcripts
  • Creation of new promoters
  • Inducing splice variants
  • Affecting mRNA stability
  • Genomic instability
  • Changing epigenetic environment
  • Changing the 3D genomic structure (G4)
47
Q

How frequently would this hijacking occur?

A

Out of 3,000 L1 retrotransposition events, about 300 would be hijacked by Alu elements and only about 1 by another mRNA.
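
Expressed as rates (simple arithmetic on the figures above; the variable names are illustrative):

```python
# Figures quoted in the card above.
l1_events = 3000
hijacked_by_alu = 300
hijacked_by_other_mrna = 1

alu_fraction = hijacked_by_alu / l1_events
other_fraction = hijacked_by_other_mrna / l1_events
alu_vs_other = hijacked_by_alu / hijacked_by_other_mrna

print(f"hijacked by Alu: {alu_fraction:.1%} of L1 events")
print(f"hijacked by another mRNA: {other_fraction:.3%} of L1 events")
print(f"Alu is hijacking about {alu_vs_other:.0f} times as often as other mRNAs")
```

On these numbers Alu captures about one in ten L1 retrotransposition events, whereas any other mRNA does so only about once in three thousand.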

48
Q

How could transposable element insertions affect direct gene regulation?

A

A TE can be inserted upstream of a gene and change the regulation of that gene, acting as a ‘viral enhancer’. This is seen in SVA-mediated regulation of gene expression. When they are unrepressed, you see a huge drop in expression of surrounding genes.

49
Q

How could transposable element insertions create new transcripts?

A

Transposable elements can be a source of long non-coding RNAs; SVAs can do this, but LTRs are especially well known for it.

50
Q

How could transposable element insertions create new promoters?

A

Transposable elements frequently act as new promoters, e.g. LTRs.

51
Q

Give an example of how transposable element insertions induce splice variants

A

The IRGM (immunity-related GTPase M) locus consists of three Irgm genes in succession in mice; in humans only one remains. It was disrupted by an Alu insertion in the last common ancestor of New World monkeys (NWM) and humans, which disrupted the ORF and caused the gene to be truncated and pseudogenised. Multiple stop codons and frameshift mutations accrued in all Old World and New World monkey lineages.

Three mutation events restored the IRGM gene in the common ancestor of apes and humans: an LTR12 (from HERV9) insertion that resurrected the gene by integrating and serving as a new promoter; a single-nucleotide mutation that introduced a new ATG codon after the Alu repeat; and the loss of a stop codon that is shared with Old World monkey species. The latter event is polymorphic in orangutans, giving both functional and nonfunctional copies in this species.

52
Q

How could transposable element insertions affect mRNA stability?

A

Transposable elements in the 3’ UTR decrease gene expression level / stability of mRNA

53
Q

How could transposable element insertions affect genomic instability? Give an example

A

LINE-Alu-VNTR-Alu-like (LAVA) elements are SVA-like elements on the loose in the gibbon genome. Gibbons have a shattered genome, with each species having a different number and organisation of chromosomes. This plasticity and the evolution of the gibbons are likely characterised by defects in epigenetic repression of these TEs.

54
Q

How could transposable element insertions change the epigenetic environment?

A

KAP1 (KRAB-associated protein 1) and the histone methyltransferase ESET are required for DNA methylation of endogenous retroviruses (ERVs). ERVs undergo de novo DNA methylation via this mechanism during the first few days of mammalian embryogenesis, which silences them.

55
Q

How do somatic TE insertions arise?

A

Somatic retrotransposition can happen at any time during embryogenesis. Retrotransposition events that occur in early pluripotent progenitor cells will result in somatic mosaicism: these unique cells will contribute to all tissues of the body of the individual, including the germ line.

56
Q

How can variability arise following somatic TE insertions?

A

Somatic retrotransposition that happens after germ-layer specification and organogenesis, however, results in germ-layer- or tissue-specific insertions. These will not contribute to the germ line. Somatic retrotransposition increases as neural stem cells differentiate into neurons and results in neurons with unique genomes. The rate of retrotransposition, and the regions in which it occurs, vary between individuals. High rates of retrotransposition seem to occur in the hippocampus in some individuals.
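
Why timing matters for mosaicism can be sketched with a toy lineage model. It assumes, purely for illustration, that every cell divides synchronously, that no cells die, and that the insertion arises in a single cell just before a given round of division; the function name and parameters are hypothetical.

```python
def carrier_fraction(total_divisions, insertion_at):
    """Fraction of final cells carrying an insertion in a toy lineage.

    Cells double at each of `total_divisions` rounds. The insertion arises in
    one cell immediately before round `insertion_at` (0 = in the zygote).
    """
    cells = [False]  # the zygote, without the insertion
    for division in range(total_divisions):
        if division == insertion_at:
            cells[0] = True  # the insertion arises in a single cell
        # Each cell divides; daughters inherit the mother's insertion status.
        cells = [has_insertion for has_insertion in cells for _ in range(2)]
    return sum(cells) / len(cells)


for timing in (0, 1, 3, 5):
    fraction = carrier_fraction(total_divisions=6, insertion_at=timing)
    print(f"insertion before division {timing}: {fraction:.1%} of cells carry it")
```

In this model an insertion that arises after k divisions ends up in 1/2^k of the cells, so early events are widespread across the body while late events are confined to a small clone.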

57
Q
A