Variation in genome size/ Non-coding DNA Flashcards
The C value paradox
C value: amount of DNA in the haploid nucleus
The lack of correlation between biological complexity and genome size (DNA in the nucleus) in eukaryotes
Single-celled, simple amoebae can have a genome 100x the size of the human genome
Huge variation in eukaryotic genomes which cannot be explained by variation in the number of coding genes.
Eukaryotes also carry a lot of non-coding DNA
Why carry so much non-coding DNA?
1) Essential functions eg. regulation of gene expression
2) Junk linked to functional genes so can’t be lost
3) Structural or nucleoskeletal function
4) Functionless “parasite” / selfish DNA
Hypotheses for genome size variation (explanations for the C value paradox)
Genome size and phenotypic traits:
- Genome size correlated with a variety of phenotypic traits (e.g. size of nucleus, duration of mitosis/ meiosis, metabolic rate in birds)
- However, correlation does not mean causation (genome size also varies with body size)
Genome size and cell volume:
- Amount of genic DNA is related to complexity but amount of secondary DNA increases proportionally with cell volume.
- Genome size varies with cell volume as DNA mass + folding pattern determine nuclear volume.
- Genome size determines the size of the nucleus, and there must be a constant ratio of nucleus to cell volume to allow balanced growth during the cell cycle (-> maintains the balance between RNA synthesis in the nucleus and protein synthesis in the cytoplasm)
- Therefore to have larger cells you must have a larger genome.
- Evidence: algal nucleomorphs have undergone a 200-1000 fold reduction in genome size and have no secondary DNA, as they don’t need to maintain nuclear structure.
- Secondary DNA absent in prokaryotes that lack nuclear envelopes
- However, cell size varies within an organism even though genome size does not.
Effective population size:
- Study suggested that the effective population size determined whether a species could effectively remove non-coding DNA from the genome and therefore determines genome size.
- Small populations: Non-coding DNA cannot be purged leading to complex organisms (Eukaryotes). Fast replication not essential.
- Large populations: selection against non-coding DNA acts as a barrier for evolution of complexity. Selection is strong to eliminate it. Bacteria must replicate quickly.
- Reliability of this paper is questioned. -> paper treats species as independent data points
Parasitic DNA:
- Genome size varies as a result of the rate of acquisition of DNA and the ability to excise it.
- Microsatellites, minisatellites, TEs
Polyploidisation and TE:
- Genome size varies in plants due to polyploidization and TE.
- Bacteria reproduce asexually, therefore polyploidy is less common.
- Following whole genome duplication (WGD), selection acts on coding DNA but less on non-coding DNA
- TEs are more common in non-coding DNA
Types of Repetitive non-coding DNA
Tandemly repeated DNA
- Satellite DNA (60% of genome in some Drosophila species)
- Microsatellite DNA
- Minisatellite DNA
Transposable elements (42.5% of genome in humans)
- Class I: Retrotransposons (use reverse transcriptase)
-> LTR
-> Non-LTR (poly A tail)
----> LINEs
----> SINEs (use the RT of LINEs and are the most common)
- Class II: DNA transposons
Gene gain and loss due to transposable elements
Transposable elements can accumulate very quickly
Example: In maize, it has been estimated that 23 retrotransposons (~160 kb) all appeared within a 240 kb genome region around the adh1 gene within the last 6 million years. If adh1 is typical, then the maize genome could have increased in size by ~50% over this period.
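The local arithmetic behind the maize figures can be sketched as below. This is an illustration only: it uses the numbers quoted in the notes and assumes, as a simplification not stated in the source, that no DNA was deleted from the region over the same period. It describes the adh1 region alone; how far it extrapolates to the whole genome depends on how typical that region is.

```python
# Figures quoted in the maize adh1 example.
region_kb = 240      # present-day size of the region around adh1
retro_kb = 160       # approximate DNA contributed by the 23 retrotransposons
window_myr = 6       # time window for the insertions, in million years

# Assumption (not from the source): no DNA was lost from the region.
ancestral_kb = region_kb - retro_kb        # size before the insertions
retro_fraction = retro_kb / region_kb      # share of the region that is retrotransposon
fold_change = region_kb / ancestral_kb     # growth of this region only
gain_rate = retro_kb / window_myr          # average net gain per million years

print(f"ancestral region: {ancestral_kb} kb")
print(f"retrotransposon share of region: {retro_fraction:.1%}")
print(f"fold change of region: {fold_change:.1f}x")
print(f"average gain: {gain_rate:.1f} kb per million years")
```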
Example: one tunicate of the larvacean class lacks many TE classes. When compared with other species, its genome was found to be on average 12x smaller.
Species have mechanisms to excise selfish DNA and the rate at which they do this determines spread of TE
TE could increase by 20-100 copies in 1 generation
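The balance between TE gain and excision can be illustrated with a toy model. This is a minimal sketch, not a published model: it assumes a constant number of new insertions per generation and a fixed per-copy excision rate, and the function name and the 1% excision rate are hypothetical. Only the gain of 20 copies per generation is taken from the range quoted above.

```python
def simulate_copy_number(start, gain_per_gen, excision_rate, generations):
    """Expected TE copy number per generation under constant gain and proportional loss."""
    copies = [float(start)]
    for _ in range(generations):
        current = copies[-1]
        # Fixed number of new insertions; a fixed fraction of existing copies is excised.
        copies.append(current + gain_per_gen - excision_rate * current)
    return copies

# Hypothetical parameters: 20 new copies per generation, 1% of copies excised per generation.
gain, excision = 20, 0.01
trajectory = simulate_copy_number(start=0, gain_per_gen=gain,
                                  excision_rate=excision, generations=2000)

# In this model gain and loss balance when copy number equals gain / excision rate.
equilibrium = gain / excision
print(f"after 1 generation: {trajectory[1]:.0f} copies")
print(f"after 2000 generations: {trajectory[-1]:.0f} copies")
print(f"predicted equilibrium: {equilibrium:.0f} copies")
```

In this sketch a host that excises more efficiently settles at a lower copy number, which is the sense in which the excision rate limits the spread of TEs.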
Transposable element example: P element in Drosophila
P elements are class II elements (DNA transposons)
P elements entered populations of D. melanogaster after lab strains had already been established.
Lab strains: No P element
Wild strains: P element
The P element can cause infertility due to chromosome breakage when females from lab strains are crossed with males from the wild (hybrid dysgenesis).
This is an example of how TEs could contribute to speciation through reproductive isolation.
Transposable element example: Human endogenous retroviruses (HERVs)
HERVs have 2 core genes and an envelope gene -> HERVs are viruses that have found their way into the human germ line
Exogenous life cycle:
Envelope gene allows them to exit the cell and infect other cells
Endogenous life cycle:
They are able to infect germ line cells and be passed on through the generations
Consequences of ERV activity and other TEs
1) Disease:
- HERVs can cause disease due to the genes that they encode.
- There is only lab evidence that they cause disease in humans (HERV-K, which encodes Rec, can cause tumours)
2) Recombination
- Recombination between non-homologous sites (ectopic exchange) can occur between TEs, as their sequences are similar and act as anchor points.
- Most large-scale rearrangements / ectopic exchanges will be deleterious, so selection limits TEs and means TEs are concentrated in non-recombining regions. (Ectopic exchange hypothesis)
- However, the interactions are complex and the evidence does not show that TEs are concentrated in non-recombining regions.
3) Co-option
- HERVs can be co-opted to serve a host function
- Example: the syncytin gene, derived from a retroviral insertion, is involved in placental morphogenesis in mammals (stops rejection of the fetus)
Functional non-coding DNA
5% of human genome is conserved yet only 1.5% is coding.
This is evidence that it is not just coding DNA that has a function.
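The size of the conserved non-coding fraction follows directly from these two percentages. A quick sketch, assuming (as a simplification not stated in the notes) that all coding sequence falls within the conserved 5%:

```python
conserved_pct = 5.0   # share of the human genome that is conserved (from the notes)
coding_pct = 1.5      # share of the human genome that is coding (from the notes)

# Assumption: every coding base is counted among the conserved bases.
conserved_noncoding_pct = conserved_pct - coding_pct
noncoding_share_of_conserved = conserved_noncoding_pct / conserved_pct

print(f"conserved but non-coding: {conserved_noncoding_pct:.1f}% of the genome")
print(f"non-coding share of conserved sequence: {noncoding_share_of_conserved:.0%}")
```

Under that assumption, most of the conserved sequence is non-coding.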
By comparing genomes to see which regions have undergone accelerated evolution in 1 species relative to others, you can identify non-coding DNA with a function
HAR: human accelerated regions
- All but 2 lie outside coding regions
Example: HAR1 shows accelerated evolution. It is expressed in the developing human neocortex during a crucial period of neuron specification. HAR1 may be responsible for the unique cognitive ability of humans.
ENCODE project
“The ENCODE project aims to delineate all functional elements encoded in the human genome” (coding and non-coding)
Functional elements = discrete genome segments encoding defined product / biochemical signature
Results:
- 80.4% of the human genome is functional by this definition (participates in at least 1 biochemical event in at least 1 cell type)
Overview
Eukaryotic genomes vary greatly in size with no correlation to complexity, leading to the C value paradox.
Explanations
- Structural hypothesis (cell volume/ genome size)
- Parasitic DNA gain vs loss (TE)
- Cell size and phenotypic correlation
- Polyploidy and TE
- Varying effective population size
Why is there so much non-coding DNA?
- Essential functions (e.g. HAR1; ENCODE)
- Structural function
- Passively transferred junk
- Selective battle
Transposable elements
- The rates of TE accumulation and excision affect genome size (example: maize)
- TE examples
-> P elements in Drosophila
-> HERVS
Consequences of HERVs and other TE
1) Disease
2) Recombination
- As a result TEs are limited and found in regions of low meiotic recombination.
- Less ectopic recombination in mammals, therefore more TEs are possible as there is less selection against them.
3) Co-option (e.g. syncytin in the mammalian placenta)
Genome balance hypothesis
Non coding DNA is needed to maintain a balance between the levels of gene expression and gene regulation
Transposon position balances the expression of networked genes (transposons contain regulatory elements)
Transposon position is important for centromere movement (involved in microtubule formation)
Evidence: when maize, an allotetraploid, deletes redundant genes, nearby transposons are also lost, suggesting they have a function