Eukaryotic Genome Structure Flashcards
C Value Paradox
Genome size (C value) does not correlate with the number of genes or the complexity of the organism
Amount of non-coding DNA increases dramatically with organism complexity
C Value
Haploid DNA amount in genome
C0T Analysis
Before the genome sequencing era, genome complexity could be assessed using the renaturation kinetics of single-stranded eukaryotic DNA. Renaturation is a bimolecular reaction in which the reaction rate is directly proportional to the product of the concentrations c of the two complementary strands.
In other words, the product of the initial concentration of single-stranded DNA, c0, and the time required to renature 50% of the DNA, t1/2, is inversely proportional to the rate constant k2 of the renaturation reaction.
This analysis revealed that much of the extra DNA in large genomes consists of repetitive nucleotide sequences. The observed reassociation kinetics reveal the presence of three types of sequences.
The c0t1/2 value (usually simply called the Cot value) is directly proportional to the complexity of the genome (defined as the total length of unique sequence in the genome).
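The link between c0, t1/2 and k2 can be checked numerically. The sketch below assumes the ideal second-order rate law dc/dt = -k2 * c^2, which integrates to c/c0 = 1 / (1 + k2 * c0 * t); the function names and the value of k2 are made up for illustration.

```python
def fraction_single_stranded(c0t, k2):
    """Fraction of DNA still single-stranded under ideal second-order kinetics."""
    return 1.0 / (1.0 + k2 * c0t)


def cot_half(k2):
    """C0t value at which half of the DNA has renatured (c/c0 = 0.5)."""
    return 1.0 / k2


k2 = 0.5  # hypothetical rate constant, not a measured value
half = cot_half(k2)
print(half)                                # 2.0
print(fraction_single_stranded(half, k2))  # 0.5
```

A smaller k2 (slower renaturation) gives a larger c0t1/2, which is why a high Cot value indicates a more complex genome.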
Experimental Steps of C0T Analysis
- shear DNA to ~400 bp fragments
- denature DNA
- slowly cool and sample
- determine % ssDNA at time points
Plot % ssDNA against log C0T to determine the rate of reannealing, which reflects the genome size/complexity of the species
Repetitive DNA renatures at low C0T values and unique DNA renatures at high C0T values
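The last step, reading the half-renaturation point off the plot, can be sketched as a simple interpolation. The data points here are invented from an ideal single-component curve (with k2 = 1), and `estimate_cot_half` is a hypothetical helper, not part of any standard package.

```python
import math


def estimate_cot_half(cot_values, percent_ss):
    """Find where % ssDNA crosses 50, interpolating linearly in log10(C0t)."""
    points = list(zip(cot_values, percent_ss))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 >= 50.0 >= y1 and y0 != y1:
            frac = (y0 - 50.0) / (y0 - y1)
            log_x = math.log10(x0) + frac * (math.log10(x1) - math.log10(x0))
            return 10 ** log_x
    return None  # the curve never crosses 50% in the sampled range


# Invented sample points: ideal curve 100 / (1 + C0t), i.e. k2 = 1
cot = [0.01, 0.1, 0.5, 2.0, 10.0, 100.0]
ss = [100.0 / (1.0 + c) for c in cot]
print(estimate_cot_half(cot, ss))  # close to 1.0
```

Real eukaryotic curves have several components, so in practice each kinetic class (repetitive vs unique) would get its own half-renaturation point.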
Eukaryotic DNA Elements
- Single copy functional genes
- Repetitive DNA
- Spacer DNA
- coding DNA (only about 2% of the genome)
Eukaryotic Simple Transcription Unit
- control regions
- cap site
- introns and exons
- polyA site
Functional Repetitive Sequences
Families of coding genes (dispersed vs tandem gene families), pseudogenes, and non-coding functional sequences
Pseudogene
Once-functional gene ('zombie' gene); may play a role in regulation, e.g. transcribed into siRNA or providing transcription factor binding sites
Multi-gene Families
- Groups of identical or very similar sequences.
- can be tandemly arrayed (Head-to-tail fashion)
- Examples include the tRNA genes (at ~50 sites, each containing 10-100 genes) and histone genes in some species.
- The human genome has approx. 280 copies of a repeat unit containing the 28S, 5.8S, and 18S rRNA genes, grouped into five clusters of 50-70 repeats.
- another example is the β-globin gene family
Dispersed Multigene Family
- genes that have become dispersed to several locations in the genome via chromosomal rearrangements
- not tandemly repeated
- they may have different functions
Non-functional repetitive sequences
- 65% of the human genome comprises intergenic regions of unknown function
- some repetitive sequences (thousands of tandem repeats) are associated with heterochromatin (transcriptionally inactive)
Transposons
- repeat sequences which have increased in copy number through transposition
- transposable elements
- transposed via a DNA or RNA intermediate
Simple Sequence DNA
- microsatellites
- repeat units < 13 bp
- scattered throughout genome
- 3% of genome
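As an illustration of what a microsatellite looks like in sequence data, here is a minimal regex-based sketch. The function name, the test sequence, and the `min_repeats` threshold are assumptions for the example; `max_unit=12` mirrors the "< 13 bp" repeat unit size on this card.

```python
import re


def find_microsatellites(seq, max_unit=12, min_repeats=3):
    """Return (start, unit, copies) for tandem runs of a short repeat unit."""
    # Lazy unit match followed by at least (min_repeats - 1) further copies
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


print(find_microsatellites("TTGACACACACAGGT"))  # [(3, 'AC', 4)]
```

This only finds perfect repeats; dedicated tools also tolerate mismatches and indels within the run.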
Variable Number Tandem Repeats
- includes minisatellites
- repeat units up to 25 bp in length
- associated with telomeres
Retrotransposons
- resemble retroviruses but only move within a cell rather than between cells
LINEs (long interspersed nuclear elements)
- less frequent than SINEs but longer
- 1 million copies