L2: Comparative Genomics Flashcards
Is it correct to assert that more genes lead to more diversity/complexity?
Gene number is not correlated with organism complexity: humans and C. elegans have around the same number of genes (~20k). Sea urchins have ~23k while rice has ~50k. Some species have also undergone whole-genome duplications!
Describe an example of three or four genes with high sequence and functional similarity in mammalian genomes
Drosophila has one Notch receptor (dNotch), which is bound by two transmembrane DSL ligands (Delta and Serrate).
Mammals possess four Notch receptors (Notch1–4) and five ligands (Jagged1 and 2, which are homologous to Serrate, and Delta-like (Dll) 1, 3 and 4, which are homologous to Delta).
What is Notch important for and how conserved are the genes encoding it?
Notch is important for stemness and differentiation. Its domains are conserved in the distant homolog in Drosophila, and the genes have been highly conserved over millions of years of evolution in both lineages. There are nevertheless clear structural changes in the lengths of both the protein-binding domains and the ligand-binding domains.
How may these changes in Notch have arisen?
These changes can arise from segmental duplication followed by diversification. Such duplications are quite dramatic events in which the genome did not separate correctly during mitosis, and they are hypothesised to have occurred during times of great diversification.
What do the Notch, Serrate, Jagged and Delta genes encode? (More detail)
Notch receptors are expressed on the cell surface as heterodimeric proteins. Their extracellular portion contains 29–36 epidermal growth factor (EGF)-like repeats that are associated with ligand binding, followed by three cysteine-rich LIN repeats that prevent ligand-independent signalling, and a heterodimerization domain.
The intracellular portion of the receptor harbours two protein interaction domains, the RAM domain (R) and six ankyrin repeats (ANK), as well as two nuclear localization signals (NLS), a transactivation domain (TAD, which has not yet been defined for Notch3 and 4) and a PEST (P) sequence.
Notch ligands are also expressed as membrane-bound proteins. They all contain an amino-terminal DSL domain (Delta, Serrate and Lag2) followed by EGF-like repeats. Ligands of the Serrate family also harbour a cysteine-rich (CR) domain downstream of the EGF-like repeats.
What significant event happened twice to the vertebrate genome in the last 500 million years?
The vertebrate genome duplicated twice in the last 500 million years:
It has been proposed that more than 450 million years ago, two successive whole-genome duplications took place in a marine chordate lineage that led to the common ancestor of vertebrates. A pre-vertebrate genome composed of 17 chromosomes duplicated to 34 chromosomes and was then subject to seven chromosome fusions (leaving 27) before duplicating again into 54 chromosomes.
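The chromosome arithmetic in this answer can be checked with a small sketch. The function name and its parameters are illustrative, and it assumes that each fusion reduces the chromosome count by exactly one.

```python
def chromosome_counts(ancestral=17, fusions=7):
    """Follow the chromosome count through two whole-genome duplications.

    Assumption: each fusion merges two chromosomes into one,
    so the count drops by one per fusion.
    """
    after_first_duplication = ancestral * 2
    after_fusions = after_first_duplication - fusions
    after_second_duplication = after_fusions * 2
    return after_first_duplication, after_fusions, after_second_duplication


print(chromosome_counts())  # (34, 27, 54)
```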
What remnants of these duplication events exist in our own genome? What is the relevance of this for evolution?
- Most genes have multiple paralogs
- Gene paralogs could have specialised functions or specialised expression patterns
- The events of genome duplication are at the onset of explosions of diversity
What is the special case of Xenopus?
Species like Xenopus laevi (the African clawed frog) have undergone additional genome duplications.
How could this special case of Xenopus have arisen?
This could have arisen from two ancestral species, S and L, each of which had diploid homologous chromosomes, but whose chromosomes were not homologous with each other. The two species were therefore sterile with respect to each other and could not produce fertile offspring.
Mating between the two species would then result in fertilised eggs carrying a hybrid of the S and L haploid sets, which cannot produce fertile offspring. This could happen frequently when both species lived in the same pond. However, a rare event may have caused the haploid chromosome sets to replicate, producing homeologous chromosomes (duplicated genes or chromosomes that are derived from different parental species and are related by ancestry). The resulting allotetraploid offspring would then be fertile and able to produce sperm, eggs and offspring. Research has shown that chromosomes can sometimes duplicate sporadically, after which the hybrids are no longer sterile.
How is this chromosome replication relevant to us?
Duplication of the whole chromosome set can sometimes happen in cells of the body without much disruption. Duplicating the complete set may be better tolerated than duplicating only part of the genome: we know what happens with an extra chromosome 21 (trisomy 21, Down syndrome), whereas a complete new set may restore the balance.
What is often inferred from conserved regions of the genome?
Exons are particularly well conserved, but there are also highly conserved non-coding sequences that have persisted throughout the evolution of vertebrates. The idea is that if they are so well conserved, they must be important: conservation is a predictor of function.
What is meant by ultraconserved elements?
Ultraconserved elements (UCEs) are sequences of >200 bp which have not changed in >100 million years.
These ultraconserved elements of the human genome are most often located either overlapping exons in genes involved in RNA processing, or in introns of or near genes involved in the regulation of transcription and development. They are more highly conserved across the compared species than proteins are.
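The ">200 bp, unchanged" criterion can be sketched as a scan over a multi-species alignment. This is only an illustration: the function name, the toy sequences and the lowered `min_length` are assumptions, and the check covers sequence identity only (the >100 million years part depends on which species are compared).

```python
def find_ultraconserved_runs(aligned_seqs, min_length=200):
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same base, with no gaps, for at least min_length columns."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")

    runs = []
    run_start = None
    for i in range(length):
        column = {seq[i].upper() for seq in aligned_seqs}
        identical = len(column) == 1 and "-" not in column
        if identical and run_start is None:
            run_start = i  # a new identical stretch begins here
        elif not identical:
            if run_start is not None and i - run_start >= min_length:
                runs.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and length - run_start >= min_length:
        runs.append((run_start, length))
    return runs


# Toy alignment (made-up sequences) with a lowered threshold for illustration.
species_a = "ACGTACGTTTGACC"
species_b = "ACGTACGTATGACC"
print(find_ultraconserved_runs([species_a, species_b], min_length=5))  # [(0, 8), (9, 14)]
```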
What are many UCEs part of according to Jacobs? What is a surprising finding regarding UCEs?
- Many UCEs are part of (neuronal) enhancers
- Enhancers are modulators of the main switch, the promoter
- Many UCEs are not essential for normal development
What is meant by human accelerated regions?
In some cases these highly conserved regions have undergone changes in humans. These are known as human accelerated regions (HARs) and are often non-coding elements: elements that stayed conserved for a long time (which suggests function) but have changed rapidly between chimp and human. Hundreds of HARs have been identified.
What functions do many HARs play?
Many HARs are part of enhancers / gene regulatory elements. Some HARs show differential regulatory activity between the versions (configurations) found in different species.
Give an example of a human accelerated region (2)
HAR1 forms a brain-expressed non-coding RNA molecule with a human-specific secondary structure, yet with no known function.
HAR2 is a developmental enhancer near GBX2, a gene which is expressed in muscles of the hand, particularly the thumb. There are 16 human-specific mutations in HAR2, which regulates the expression of this thumb gene.
What have studies on HAR2 in chimps and monkeys shown?
Chimp and monkey versions of HAR2 do not regulate the thumb gene.
Give examples of what large-scale genomic variation between the human and chimpanzee genomes can look like.
There are deletions and insertions as well as double-breakpoint inversions, pericentric inversions and single-breakpoint inversions across the genome.
What is meant by inversions?
Inversions are generated when two double-stranded breaks are introduced into the chromosome and the ends are rejoined such that the order of the sequence (and of any genes) between the breakpoints is reversed.
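As a minimal sketch of what an inversion does to the sequence, the segment between the two breakpoints can be replaced by its reverse complement, because the flipped segment is rejoined to the same strand. The function names and the toy sequence are illustrative.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(chromosome, start, end):
    """Invert the half-open interval [start, end) of a chromosome string.

    start and end stand for the two double-stranded breakpoints.
    """
    if not 0 <= start <= end <= len(chromosome):
        raise ValueError("breakpoints must lie within the chromosome")
    return (
        chromosome[:start]
        + reverse_complement(chromosome[start:end])
        + chromosome[end:]
    )


print(invert_segment("AAACCGTTT", 3, 6))  # AAACGGTTT
```

Applying the same inversion twice restores the original sequence, which is a quick sanity check on the model.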
What are considered to be the two types of inversions?
There are two types of inversion, paracentric and pericentric, with the difference being whether the centromere is involved in the rearrangement. A pericentric inversion includes the centromere in the inverted segment, while a paracentric inversion does not.
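The distinction can be written as a simple coordinate check. This is a sketch with hypothetical coordinates; it assumes half-open intervals and that neither breakpoint falls inside the centromere itself.

```python
def classify_inversion(inv_start, inv_end, cen_start, cen_end):
    """Classify an inversion by whether the inverted segment contains the centromere.

    All coordinates are half-open [start, end) positions on one chromosome.
    Assumption: breakpoints never fall inside the centromere.
    """
    if inv_start >= inv_end or cen_start >= cen_end:
        raise ValueError("each interval must have start < end")
    contains_centromere = inv_start <= cen_start and cen_end <= inv_end
    return "pericentric" if contains_centromere else "paracentric"


# Hypothetical chromosome with its centromere at positions 40 to 50.
print(classify_inversion(10, 90, 40, 50))  # pericentric
print(classify_inversion(10, 30, 40, 50))  # paracentric
```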
What two processes can generate chromosomal inversions?
Ectopic recombination and staggered breaks.
Ectopic recombination generates inversions via a recombination event between two homologous sequences (often transposable elements) oriented head-to-head along a chromosome. The homologous sequences recombine, and the segment between them is reintegrated into the genome in the opposite orientation.
Staggered breaks can generate inversions via the complete detachment of a DNA segment from a chromosome and its subsequent reattachment in the opposite orientation at the same position in the chromosome.
What is meant by the term ‘staggered’ breaks?
Such breaks are usually staggered, meaning that they will result in fragments with stretches of single stranded DNA at their extremities. The DNA repair mechanism that synthesises the reverse complement of these single stranded stretches then joins the DNA sequences back together in a non-homologous way, sometimes reinserting the DNA segment in the opposite orientation, creating an inversion.
What happens when breakpoints are heavily staggered?
When the breaks are heavily staggered (i.e., with long single strand stretches), this results in inversions with duplicated sequences at both breakpoints in the derived sequence. In contrast, when the breaks are blunt or slightly staggered, cut-and-paste type breakpoints are created in the derived sequence, with no or small duplications, respectively.
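A deliberately simplified sketch of why staggered breaks leave duplications: it assumes that the single-stranded stretch at each break is filled in on both the flank and the excised segment, so that stretch ends up present in both pieces before the segment is reattached in reverse. The function names, parameters and toy sequence are illustrative and not a model taken from the source.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_with_staggered_breaks(chromosome, start, end, stagger_left=0, stagger_right=0):
    """Invert [start, end) where each break leaves a single-stranded stretch.

    Assumption: after fill-in repair, the staggered stretch is kept by BOTH
    the flank and the excised segment, so it appears twice in the result.
    With both staggers at 0 this reduces to a blunt cut-and-paste inversion.
    """
    if not 0 <= start <= end <= len(chromosome):
        raise ValueError("breakpoints must lie within the chromosome")
    if stagger_left < 0 or stagger_right < 0 or stagger_left + stagger_right > end - start:
        raise ValueError("staggered stretches must fit inside the segment")

    left_flank = chromosome[: start + stagger_left]   # flank plus filled-in left stretch
    segment = chromosome[start:end]                    # also contains both stretches
    right_flank = chromosome[end - stagger_right:]     # filled-in right stretch plus flank
    return left_flank + reverse_complement(segment) + right_flank


toy = "TTTT" + "ACG" + "GATTACA" + "CTG" + "AAAA"
blunt = invert_with_staggered_breaks(toy, 4, 17)
staggered = invert_with_staggered_breaks(toy, 4, 17, stagger_left=3, stagger_right=3)
# Each staggered stretch is duplicated once, so the sequence grows by their total length.
print(len(staggered) - len(blunt))  # 6
```

In this model the left stretch reappears, inverted, at the far end of the segment and the right stretch at the near end, which corresponds to duplicated sequences at both breakpoints.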
What are the functional implications of these inversions?
When a segment is flipped, this might not matter much for genes in the middle of the segment, but it might matter for those at the ends, which are brought into closer contact with other genes.
What is missing from these large scale variation maps?
Single nucleotide polymorphisms (SNPs) are not shown on these maps.
To what extent are SNPs present in human vs chimp genome?
In the ~6 million years since the split from the last common ancestor of human and chimp, about 1% of bases have been substituted, which amounts to roughly 35 million bp differences between human and chimp.
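The two figures in this answer can be related with a quick calculation. The genome size of about 3.1 billion bp is an assumption not stated in these notes, and dividing by twice the divergence time assumes that substitutions accumulated equally along the human and chimp lineages.

```python
DIFFERENCES_BP = 35_000_000       # single-base differences, from the notes
GENOME_SIZE_BP = 3_100_000_000    # assumed approximate human genome size
YEARS_SINCE_SPLIT = 6_000_000     # from the notes


def fraction_substituted(differences, genome_size):
    """Fraction of bases that differ between the two genomes."""
    return differences / genome_size


def substitutions_per_site_per_year(fraction, years_since_split):
    """Per-lineage rate, assuming both lineages changed at the same pace
    (hence the factor of 2 in the denominator)."""
    return fraction / (2 * years_since_split)


fraction = fraction_substituted(DIFFERENCES_BP, GENOME_SIZE_BP)
rate = substitutions_per_site_per_year(fraction, YEARS_SINCE_SPLIT)
print(f"{fraction:.2%} of bases differ")
print(f"{rate:.2e} substitutions per site per year per lineage")
```

Under these assumptions the fraction comes out slightly above 1%, consistent with the "about 1%" figure.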
How could these SNPs occur?
DNA damage repair can lead to nucleotide substitutions.