L4: Segmental Duplications and Fusion Transcripts Flashcards

Question

If a segmental duplication is not evolutionarily beneficial, what else can arise from it?

Answer 1

Segmental duplications can lead to serious disorders: Highly repetitive and unstable regions are much more prone to deletions and duplications. Unequal crossover events can lead to the loss of genes that have acquired a critical new function.

Answer 2

Copy number variations in multiple human-specific genes have been linked to neurodevelopmental disorders:
* Developmental delay
* Autism
* Schizophrenia
* Epilepsy
* Spinal muscular atrophy
* Microcephaly
* Macrocephaly
* Others...

Answer 3

Increasing genomic complexity can lead to segmental duplications which can further this complexity and later give rise to new genes and functionality. This can also give rise to genetic deletions and duplications which can then give rise to neurological disorders.

Answer 4

The human genome has many regions that have undergone a large number of segmental duplications or other genomic rearrangements over the course of evolution. These regions are also prone to continued rearrangements, which can cause genetic disease. Knowing the evolutionary history of these regions can help us understand what goes wrong in these diseases.

Answer 5

Structural rearrangements can place whole genes or parts of genes in new contexts so that new “fusion transcripts” arise, but “fusions” between two adjacent genes are also very common. Basically all pairs of adjacent genes show some level of transcriptional readthrough, though usually at very low levels (~100 times lower than the two full-length genes). In some cases, such fusion genes could become evolutionarily beneficial and take on a novel function.

Answer 6

There has been much focus on the role of fusion transcripts in cancer, but they are also abundant in normal tissues (e.g. brain).

Answer 7

(The long-range loops shown on the karyoplot likely come from DNA sequences that have undergone structural rearrangements, and their new location is not properly marked in the reference genome.)

Answer 8

Fusion transcripts vary in expression level and in frequency in the population. Some are widespread; others are found in very few individuals.

Answer 9

Could differences in fusion transcript expression underlie disease susceptibility in some cases?

Answer 10

There is a tendency for fusion transcripts to cluster in regions enriched with segmental duplications, although they are also found in non-duplicated regions.

Answer 11

17q21.31 is a region enriched with segmental duplications

Answer 12

The region can occur in direct orientation (H1) or inverted orientation (H2) in the human population (25%). It contains multiple genes implicated in neuronal functioning, including CRHR1, MAPT and KANSL1. E.g.
H1: CRHR1 (=>) MAPT (=>) KANSL1
H2: KANSL1 (=>) CRHR1 (<=) MAPT (<=)

Answer 13

There are 8 structural haplotypes (sets of DNA variations, or polymorphisms, that tend to be inherited together) within this region. This shows how much variation stays hidden when only a single human reference genome is considered.

Answer 14

Region containing CRHR1 and MAPT: [g]
Region containing KANSL1: [y]
Region adjacent to that containing KANSL1 in original models: [b]

H1': Three haplotypes:
[g >] [< y] [b] [b]
[g >] [< y] [b]
[g >] [< y] [b] [b] [b]

H1D: Two haplotypes:
[g >] [< y] [y] [b] [b]
[g >] [< y] [y] [b] [b] [b]

H2': Two haplotypes:
[y >] [< g] [b] [b]
[y >] [< g] [b]

H2D: One haplotype:
[b] [y >] [< g] [y] [b]
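As a rough aid for reading the block diagram, the haplotypes can be written down as simple lists and counted. This is only a sketch: the dictionary name `HAPLOTYPES` and the helper `count_blocks` are made up for illustration, and the orientation arrows are dropped so that only block order and copy number are kept.

```python
# Blocks: "g" = CRHR1/MAPT region, "y" = KANSL1 region, "b" = adjacent region.
# Orientation arrows from the diagram are intentionally omitted here.
HAPLOTYPES = {
    "H1'": [["g", "y", "b", "b"], ["g", "y", "b"], ["g", "y", "b", "b", "b"]],
    "H1D": [["g", "y", "y", "b", "b"], ["g", "y", "y", "b", "b", "b"]],
    "H2'": [["y", "g", "b", "b"], ["y", "g", "b"]],
    "H2D": [["b", "y", "g", "y", "b"]],
}


def count_blocks(haplotype, block):
    """Number of times a block label occurs in one structural haplotype."""
    return haplotype.count(block)


# Total number of structural haplotypes across the four groups.
total_haplotypes = sum(len(haps) for haps in HAPLOTYPES.values())

# Distinct [y] block counts seen in each group.
y_copies = {
    name: sorted({count_blocks(h, "y") for h in haps})
    for name, haps in HAPLOTYPES.items()
}

print(total_haplotypes)
print(y_copies)
```

Written this way, the "D" groups are the ones carrying a second [y] block, while the number of [b] blocks varies within groups.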

Answer 15

17q21.31 microdeletion syndrome (Koolen de Vries syndrome):
* Developmental delay, intellectual disability
* Cheerful, social disposition
* Distinctive facial features
* Epilepsy
* Hypotonia
* Cardiac, kidney and skeletal abnormalities

Answer 16

When sequencing for the reference genome, fragments of DNA of ~300,000 bp were Sanger sequenced and assembled together (“contigs”). The consensus sequence was taken, and transposable elements or SNPs not aligned with the consensus were not integrated into the reference genome. This method of assembly does not allow correct mapping of loci containing recent duplications. The reference genome is nevertheless what is used when examining a given person's genome: their DNA is sequenced and the reads are mapped to the reference genome.

Answer 17

Whole genome sequencing can be used: coverage (density) plots show the number of reads mapped to each location in the reference genome. Regions with increased or reduced coverage indicate duplications or deletions, so an area of duplication appears as a rise in the density plot. Because coverage reflects how many sequences mapped to a region, it shows how many copies someone has of that region, but not where those copies sit in the genome. Analysing coverage from sequencing data therefore tells us about copy number, but not about the orientation and arrangement of segmental duplications.
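The coverage reasoning can be sketched in a few lines. This is a minimal illustration, not a real CNV caller: the function name `estimate_copy_number`, the depth values, and the assumption of a clean diploid baseline are all hypothetical, and real data would also need GC and mappability correction.

```python
from statistics import median


def estimate_copy_number(region_depths, baseline_depths, ploidy=2):
    """Estimate a region's copy number from read depth.

    The region's median depth is compared with the median depth of a
    baseline assumed to be present in `ploidy` copies.
    """
    ratio = median(region_depths) / median(baseline_depths)
    return round(ratio * ploidy)


# Hypothetical per-position read depths (illustrative numbers only).
baseline = [30, 31, 29, 30, 32, 28]   # assumed normal diploid sequence
duplicated = [44, 46, 45, 47, 43]     # raised coverage
deleted = [15, 14, 16, 15]            # reduced coverage

print(estimate_copy_number(duplicated, baseline))  # 3
print(estimate_copy_number(deleted, baseline))     # 1
```

The result is only a count. Nothing in the calculation says where the extra copy lies or which way round it is.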

Answer 18

Copy number and orientation can be detected with fluorescence in situ hybridisation (FISH). FISH uses fluorescent DNA probes to target specific chromosomal locations within the nucleus, resulting in coloured signals that can be detected using a fluorescence microscope.

Answer 19

Aligning sequences between different haplotypes allows us to estimate the time to the most recent common ancestor and reconstruct the evolutionary trajectory of the region. The H2 (“inverted”) orientation is the ancestral haplotype. The inversion occurred independently in humans and chimpanzees, and this region is prone to recurrent inversions.
* H2 only: New World and Old World monkeys
* H1/H2 polymorphic: orangutans, gorillas, chimpanzees, humans

Answer 20

From our H2 ancestor in western Africa, there was an H1-H2 divergence 2.3 million years ago, after which carriers migrated south within Africa. H1' carriers also migrated to east Africa. There was also an African-European H2 divergence with the out-of-Africa migration. Included in this were an H2'-H2D divergence 1.3 million years ago and an H1'-H1D divergence 250,000 years ago. H1' carriers migrated to Asia and the others migrated to Europe (?).

Answer 21

When comparing genetic sequences across the locus, we see that H2D individuals are extremely similar to each other (extremely low genetic diversity). This is suggestive of a recent bottleneck followed by a population expansion or selective sweep.
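"Low genetic diversity" is often summarised as nucleotide diversity: the average number of differences per site between all pairs of sequences. The sketch below assumes pre-aligned sequences of equal length, and the toy sequences are invented for illustration rather than taken from the locus.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical alignments (not real haplotype data).
similar_group = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
diverse_group = ["ACGTACGTAC", "ATGTACCTAC", "ACGAACGTTT"]

print(nucleotide_diversity(similar_group))
print(nucleotide_diversity(diverse_group))
```

A group that went through a recent bottleneck or sweep would be expected to look like the first alignment, with a value close to zero.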

Answer 22

Microdeletion syndrome is caused by loss of the gene KANSL1. Most of the symptoms of Koolen de Vries syndrome have been attributed to loss of KANSL1. KANSL1 is part of an epigenetic modifying complex, which influences gene expression via H4K16 acetylation. Koolen de Vries syndrome neurons show synaptic defects due to oxidative stress and proliferation defects.

Answer 23

17q21.31 is also linked to risk of neurodegenerative diseases. Genome-wide association studies (GWAS) look at single nucleotide polymorphisms (SNPs) throughout the genome and assess which of these are statistically linked to disease. Certain SNPs are found more often on H1 haplotypes, and others more often on H2. H2-associated SNPs have been linked to:
* Reduced incidence of Parkinson’s disease, Alzheimer’s disease, and progressive supranuclear palsy
* Increased fecundity (reproductive potential)
* Larger intracranial volume
Thus the direct (H1) orientation appears to carry increased risk compared with the inverted (H2) orientation.
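What "statistically linked" means for a single SNP can be illustrated with a 2x2 table of allele counts in cases and controls. This is a sketch of a basic allelic association test only. The counts are invented for illustration and are not results from any study, and real GWAS apply further corrections (multiple testing, population structure).

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds of carrying the alt allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical allele counts for an H2-tagging SNP (made-up numbers).
case_alt, case_ref = 150, 850   # disease cases
ctrl_alt, ctrl_ref = 220, 780   # healthy controls

or_value = odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref)
chi2 = chi_square_2x2(case_alt, case_ref, ctrl_alt, ctrl_ref)

print(round(or_value, 3))
print(round(chi2, 2))
```

An odds ratio below 1 with a large chi-square statistic is the pattern expected for an allele associated with reduced incidence.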

Answer 24

The forward (H1) and inverted (H2) orientations do not recombine with each other, which suggests they have been evolving separately. The individual SNPs are therefore probably not the underlying cause; more likely, they are inherited together with something else in the locus that affects risk.

Answer 25

It could have been due to segmental duplication: Most people with the protective H2 haplotype are H2D, i.e. have an extra segmental duplication (giving a partial KANSL1 duplication) not found in the reference genome. Perhaps this duplication contains some genetic material that affects neurodegenerative disease risk.

Answer 26

The proportion of H1 people that are H1D is much lower, which could explain why the H1 group has higher risk overall. 30% of H1 individuals have a partial KANSL1 duplication (H1D), while over 95% of H2 individuals have a partial KANSL1 duplication (H2D). SNPs in H1 individuals are associated with increased risk, while SNPs in H2 individuals are associated with reduced risk. Perhaps something in this duplication influences neurodegenerative disease risk.

Answer 27

1. There is one full-length copy of the KANSL1 gene in all haplotypes. The partial duplication in H1D covers KANSL1 exons 1, 2 and 3 before being truncated, while the one in H2D only covers exons 1 and 2.
No duplication: [g >] [< y] [b] [b]
H1D: [g >] [< y] [y] [b] [b]
H2D: [b] [y >] [< g] [y] [b]
2. KANSL1 duplications also create different fusion transcripts in H1D and H2D.

Answer 28

They searched RNA sequencing data for all reads containing the 3’ end of KANSL1 exon 3 and found that exon 3 was fused to an ARL17A/B exon, but out of frame. They also searched for all reads containing the 3’ end of KANSL1 exon 2 and found that exon 2 was fused to an alternative KANSL1 exon out of frame, but also to LRRC37A3 in frame. A small proportion of H1'-H1' individuals could possess any of these fusions. All H1D-H1D individuals possessed the exon 3 ARL17A/B fusion and none of the others. All H2D-H2D individuals possessed the alternative out-of-frame fusion, a large proportion possessed the in-frame LRRC37A3 fusion, and none possessed the ARL17A/B fusion. Therefore the KANSL1 fusion transcripts result from the fusion of KANSL1 exon 2 or 3 with exons of other nearby genes, or with novel exons, and they are strongly linked to specific 17q21.31 haplotypes (H1D and H2D).
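The read search described above can be sketched as a string search: find reads containing the last bases of an exon, then check what follows the junction. Everything here is a toy assumption. The sequences are invented (they are not real KANSL1 or ARL17A/B sequence), the function names are hypothetical, and a real analysis would use a splice-aware aligner rather than exact matching.

```python
from collections import Counter


def downstream_of_junction(reads, exon_end, flank=8):
    """Return the `flank` bases following `exon_end` in each read that contains it."""
    tails = []
    for read in reads:
        idx = read.find(exon_end)
        if idx == -1:
            continue  # read does not cover the exon's 3' end
        start = idx + len(exon_end)
        tail = read[start:start + flank]
        if len(tail) == flank:  # skip reads that stop too soon after the junction
            tails.append(tail)
    return tails


def classify_partners(tails, partner_starts):
    """Count which known exon start each downstream tail matches."""
    counts = Counter()
    for tail in tails:
        label = next(
            (name for name, start in partner_starts.items() if tail.startswith(start)),
            "unknown",
        )
        counts[label] += 1
    return counts


# Toy sequences, invented for illustration only.
EXON3_END = "GATTACAG"
PARTNER_STARTS = {"canonical_exon4": "CCGGTTAA", "ARL17A/B": "TTGACCGT"}
reads = [
    "AAAGATTACAGCCGGTTAACC",  # normal splicing to the next exon
    "TTGATTACAGTTGACCGTAA",   # fusion junction
    "GGGATTACAGTTGACCGTCC",   # fusion junction
    "ACGTACGTACGT",           # unrelated read
    "CCGATTACAGTT",           # covers the exon end but too short downstream
]

counts = classify_partners(downstream_of_junction(reads, EXON3_END), PARTNER_STARTS)
print(counts)
```

Counting the partners per individual in this way is what allows each fusion to be tied to a particular haplotype group.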

Answer 29

These fusion transcripts are highly expressed, sometimes as highly as the normal full-length KANSL1 mRNA (the out-of-frame ARL17A/B fusion in H1D-H1D individuals). In H2D-H2D individuals, however, the out-of-frame novel-exon fusion was expressed at about 50% of that level and the LRRC37A3 fusion at around 10%. The potential consequences of these transcripts for protein expression depend on whether the fusion is in-frame or out-of-frame.
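Whether a fusion is in-frame comes down to arithmetic modulo 3 at the junction. The sketch below uses the GFF-style definition of exon phase (the number of bases at the start of the downstream exon that complete a codon begun upstream). The function name and the example lengths are hypothetical and are not measurements from the KANSL1 fusions.

```python
def is_in_frame(upstream_cds_len, downstream_phase):
    """Check whether a fusion junction preserves the reading frame.

    upstream_cds_len: coding bases from the start codon up to the junction.
    downstream_phase: GFF-style phase (0, 1 or 2) of the acceptor exon.
    """
    if downstream_phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1 or 2")
    return (upstream_cds_len + downstream_phase) % 3 == 0


# Hypothetical junctions (lengths are illustrative only).
print(is_in_frame(9, 0))   # whole codons upstream, exon starts a codon: in frame
print(is_in_frame(10, 2))  # one base carried over, exon supplies two: in frame
print(is_in_frame(10, 0))  # one base carried over, exon starts a codon: out of frame
```

An out-of-frame junction shifts every downstream codon, so it usually leads to a premature stop rather than a functional fused protein.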

Answer 30

These KANSL1 fusion transcripts can also be validated by PCR amplification from RNA extracted from people with the segmental duplication. This was carried out via reverse transcriptase (RT)-PCR followed by Sanger sequencing.

Answer 31

Long-read sequencing shows us the full exon structure of the fusion transcripts. In the H2D alternate genome assembly, KANSL1 is upstream of LRRC37A3: segmental variation has placed part of KANSL1 upstream of this other gene, and transcriptional readthrough generates the fusion transcript.

Answer 32

The next step is expressing these fusion transcripts in neurons and/or organoids to establish their potential function. From there we can do RNA-seq, ChIP-seq, electrophysiology, etc.

Answer 33

Increasing genomic complexity can lead to segmental duplications, such as those which occurred repeatedly in the 17q21.31 locus, originating from a core duplicon.
This can further increase complexity and later give rise to new genes and functionality: partial KANSL1 duplication and fusion transcripts.
* Protection against neurodegenerative diseases?
* Another role in neurodevelopment?
This can also give rise to genetic deletions and duplications, due to the high level of sequence similarity in nearby segmental duplications:
* 17q21.31 microdeletion / Koolen de Vries syndrome


(58 cards)