ISALAN - genomic repeats Flashcards
Bacterial genome structure
- very well organized into operons – all genes for a particular function are lined up together (ex. lac operon or genes involved in RNA synthesis)
- highly evolved due to a short life cycle which allows it to go through many more generations (more cycles of optimization & natural selection) than a eukaryote
- size: ~4.6 × 10^6 bases in E. coli, with 4k-5k protein-coding genes
Eukaryotic genomes are:
- less organized
- larger size
- split and organized into chromosomes - studied by looking at chromosomes in metaphase which are condensed, duplicated chromosomes that have sister chromatids
> Can be arranged and looked at in karyograms
> G-banding (Tryptic digest followed by Giemsa stain on each chromosome) is performed. Results show:
-Dark bands = AT rich region
-Light bands = GC rich region
-Unique patterns of bands on each chromosome allow us to identify chromosomes & diagnose chromosomal rearrangements
Ex. Human genome
- size: 3 × 10^9 bases
BUT a small number of protein-coding genes: ~20k genes, and estimates keep falling
o can be because:
-small genes are hard to locate
-microRNAs = functional genes but do not code for proteins
-non-coding repeats = at least 50% of genome
o Still a complex organism because alternative splicing mechanism allows for protein diversity (many functional protein products from a single human gene)
organized into 23 pairs of chromosomes*
huge variety in organization of genomes
chromosome number = not correlated with genome size or complexity of an organism
* Amoeba 100,000 Mbp
* Salamander species: ~30,000 Mbp, yet only 14 chromosomes.
* Human: 3200 Mbp (haploid), 46 chromosomes (diploid)
* Yeast: Haploid S. cerevisiae has 12Mbp, 16 linear chromosomes (but size of genome usually correlates with cell size)
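The lack of correlation can be checked from the figures in the list above. This is a minimal Python sketch, not part of the original notes. It assumes the salamander figure of 14 is a haploid count (the notes do not say) and uses the haploid human count of 23. The amoeba is left out because no chromosome number is given for it.

```python
# Genome size (Mbp, haploid) and chromosome number, taken from the list above.
# Assumption: the salamander figure of 14 is treated as a haploid count.
genomes = {
    "salamander": (30_000, 14),
    "human": (3_200, 23),
    "S. cerevisiae": (12, 16),
}


def mbp_per_chromosome(size_mbp, n_chromosomes):
    """Average amount of DNA carried by one chromosome, in Mbp."""
    return size_mbp / n_chromosomes


# Rank the same organisms two ways: by genome size and by chromosome number.
by_size = sorted(genomes, key=lambda name: genomes[name][0], reverse=True)
by_count = sorted(genomes, key=lambda name: genomes[name][1], reverse=True)

for name, (size, n) in genomes.items():
    print(f"{name}: {mbp_per_chromosome(size, n):.2f} Mbp per chromosome")
print("ranked by genome size:      ", by_size)
print("ranked by chromosome number:", by_count)
```

The two rankings come out in different orders, which is the point of the flashcard: a large genome does not imply many chromosomes.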
Other structures of DNA:
- B chromosomes are extra small chromosomes that occur in many organisms. They can originate from autosomes and sex chromosomes in intra- and interspecies crosses.
- Holocentric chromosomes – the entire chromosome acts as a centromere. Best-known example: C. elegans
- Extrachromosomal DNA – DNA found outside the chromosomes ex. plasmids in yeast, organelle DNA in mitochondria & chloroplasts – carries out a limited amount of gene expression important for the function of the organelle
Organelle genomes
Mitochondria and Chloroplast have their own DNA due to endosymbiosis
* In ancient history, eukaryotes developed a mutualistic symbiosis with prokaryotes that were the precursors of mitochondria & chloroplasts. The relationship was essential for the survival of eukaryotes because these prokaryotes provide energy via ATP synthesis (mitochondria) and carry out photosynthesis (chloroplasts). Over time, most of their genes have migrated into the nuclear genome of the eukaryote, with some leftover DNA remaining in the organelles themselves.
Mitochondrial DNA (all eukaryotes)
* Usually closed circular DNA (rarely linear, ex. Chlamydomonas)
* In humans, mitochondrial DNA size = 16,569 bp, containing only 37 genes, 13 of which code for proteins (the rest code for tRNAs and rRNAs)
Chloroplast genome (oxygenic phototrophs)
* Single closed circular DNA (rare exceptions).
* Typically 120 – 170 kbp, codes around 100 proteins, mainly to do with maintenance and photosynthesis.
Gene distribution in eukaryotic chromosomes:
- Uneven – gene-rich regions alternating with gene deserts
o Genes are less dense around centromeres & telomeres (which mostly contain particular repeats)
- Overall organization of genes differs between eukaryotes, which can provide information reflecting the evolutionary histories of different organisms
- Gene Density is generally lower in more “complex” eukaryotes (subjective) which could be because:
o The existence of introns that split up the genes
-“simpler” eukaryotes have fewer introns ex. yeast, BUT no eukaryote is completely without introns. (Yeast’s higher gene density could also be because its shorter life cycle has allowed it to better optimize its genome structure)
o A bigger genome and more repeats which allow for more chance of recombination events, contributing to complex body plans
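Gene density can be made concrete as genes per Mbp. This is a minimal Python sketch with round figures: the genome sizes (12 Mbp for yeast, 3200 Mbp for human) and the human gene count (~20k) come from these notes, while the yeast gene count of ~6,000 is an assumed round figure that the notes do not give.

```python
def gene_density(n_genes, genome_size_mbp):
    """Protein-coding genes per Mbp of genome."""
    return n_genes / genome_size_mbp


# Yeast gene count (~6,000) is an assumed round figure, not from the notes.
yeast = gene_density(6_000, 12)
human = gene_density(20_000, 3_200)

print(f"S. cerevisiae: {yeast:.1f} genes per Mbp")
print(f"human:         {human:.2f} genes per Mbp")
print(f"yeast is about {yeast / human:.0f}x more gene-dense")
```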
Gene distribution in Human genome
- Some parts are Gene-Rich:
o Ex. the MHC region on chromosome 6: 60 genes and 1 pseudogene within a small 700 kb region, with a very high GC content (54%)
- Most parts are Gene-Deserts (defined as a region of at least 1 Mb with no genes)
o 82 deserts identified (occupying 144 Mb, 3% of genome)
o 25% of the genome also lies in regions of 500 kb (500,000 bp) or more with no genes.
o Gene deserts might be incorrectly identified because the functions of those sequences are unknown:
-could contain regulatory regions ex. distal enhancers or
-could contain larger genes ex. Dystrophin gene (spans 2.3 Mb)
-Slower synthesis of larger genes also makes it difficult to obtain a full-length cDNA, so the region can be incorrectly identified as having no gene present
- 62% of the genome consists of intergenic regions (regions in between genes)
o Can be classified into:
-Unique intergenic region
-Repeated intergenic region
Repeated intergenic regions
- Mostly GC rich
- discovered by performing cesium chloride density gradient centrifugation on DNA - separates DNA according to buoyant density (which depends on base composition, not size), resulting in a main genomic band with satellite bands
can be classified into:
* Tandemly repeated DNA
* Interspersed repeats (Genome-wide repeats)
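Satellite bands appear because buoyant density in CsCl depends on base composition. A commonly cited empirical approximation is density ≈ 1.660 + 0.098 × (GC fraction) g/cm^3. That relation and the two toy sequences below are not from these notes; they are assumptions used only to sketch why a repeat family with a different GC content forms its own band.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def buoyant_density(gc):
    """Approximate buoyant density in CsCl (g/cm^3) for a given GC fraction.

    Uses the commonly cited empirical relation 1.660 + 0.098 * GC,
    which is an assumption here, not something stated in the notes.
    """
    return 1.660 + 0.098 * gc


# Made-up sequences for illustration only.
bulk_dna = "ATGCGTACGTTAGCATGCAA"      # mixed composition
gc_rich_repeat = "GCGGCGCGGCGCGGCGCGGC"  # GC-rich tandem repeat

for label, seq in [("bulk", bulk_dna), ("repeat", gc_rich_repeat)]:
    gc = gc_fraction(seq)
    print(f"{label}: GC = {gc:.2f}, density = {buoyant_density(gc):.3f} g/cm^3")
```

Under this relation, the GC-rich repeat is denser than the bulk DNA and would settle as a separate band in the gradient.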
Tandemly repeated DNA
(mostly associated with structural features – centromeres, telomeres)
1. Satellite DNA
* Usually found in heterochromatin in centromeres
* Ex of families:
o α-satellite family (171 bp repeated unit)
o β-satellite family (68 bp units interspersed with 3.3 kb repeats)
Shorter tandem repeats:
2. Minisatellites
* 10-100 bp repeats that form clusters up to 20 kbp
ex. centromeres
3. Microsatellites
* <13 bp repeats that form clusters <150 bp
* can be called 'Simple Tandem Repeats'/'Simple Sequence DNA'
Ex. human telomeric repeats (5’-TTAGGG-3’)
* Common form: Dinucleotide repeats
o 140,000 versions of CACACACACACACACACACACACA on chromosome 12
o 120,000 copies of AAAAA repeats
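Short tandem repeats like the ones above can be located with a regular expression that uses a backreference. This is a minimal Python sketch: the function name `find_microsatellites`, its thresholds, and the test sequence are all hypothetical and chosen for illustration. It is not a substitute for a dedicated repeat finder.

```python
import re


def find_microsatellites(seq, max_unit=6, min_copies=4):
    """Return (start, unit, copies) for each tandem repeat found in seq.

    max_unit:   longest repeat unit considered, in bp (assumed cutoff)
    min_copies: minimum number of back-to-back copies to report
    """
    # ([ACGT]{1,n}?) lazily captures the shortest unit that works,
    # \1{m,} then requires at least m further identical copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Made-up sequence: a CA dinucleotide run followed by telomere-like TTAGGG units.
example = "GGAT" + "CA" * 6 + "A" + "TTAGGG" * 4 + "CC"
print(find_microsatellites(example))
# → [(4, 'CA', 6), (17, 'TTAGGG', 4)]
```

Because the unit is matched lazily, a run such as CACACACA is reported with the unit CA, not CACA.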
Modes of satellite DNAs variation:
These short repeats are unstable, so there can be:
- Unequal crossing over during homologous recombination
* HR occurs in prophase I of meiosis
* Due to the repeated pattern, the sequences might not line up perfectly (from end to end), resulting in an unequal crossing over, so one chromatid gains repeats (insertion) while the other loses them (deletion).
- DNA polymerase slippage
* During replication, the ss daughter DNA can slip back 1 repeating unit due to hairpin formation of repeated sequences.
* DNA polymerase will not be able to notice and will continue replicating, resulting in the addition of the repeat.
o Each slippage event leads to 1 unit added
* Usually insertion (deletion if the hairpin formation occurs on the template strand)
These all cause variation in the number of repeats between individuals, or between generations of cells within an individual, so these repeats are also known as variable number tandem repeats – VNTRs.
* VNTRs are useful in DNA fingerprinting to identify potential suspects at crime scenes.
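The two mechanisms above can be reduced to simple bookkeeping on repeat counts. This is a minimal Python sketch under simplifying assumptions: each tandem array is represented only by its number of units, and the function names and parameters are hypothetical.

```python
def unequal_crossover(a_units, b_units, break_a, break_b):
    """Repeat counts of the two recombinants after a crossover in a tandem array.

    a_units, b_units: repeat counts on the two paired chromatids
    break_a, break_b: units lying to the left of the crossover point on each
                      chromatid (equal values mean the arrays paired in register)
    """
    if not (0 <= break_a <= a_units and 0 <= break_b <= b_units):
        raise ValueError("crossover point must lie within both arrays")
    # Each product joins the left part of one chromatid to the right part of the other.
    product_1 = break_a + (b_units - break_b)
    product_2 = break_b + (a_units - break_a)
    return product_1, product_2


def slippage(units, hairpin_on="nascent"):
    """Repeat count after one slippage event during replication.

    A hairpin on the nascent (daughter) strand adds one unit;
    a hairpin on the template strand removes one unit.
    """
    if hairpin_on == "nascent":
        return units + 1
    if hairpin_on == "template":
        return units - 1
    raise ValueError("hairpin_on must be 'nascent' or 'template'")


print(unequal_crossover(10, 10, 6, 4))   # arrays paired two units out of register
print(slippage(10), slippage(10, "template"))
```

In the crossover example, two 10-unit arrays give one product with 12 units and one with 8: the total is conserved, but each chromatid now carries a different allele, which is the kind of variation VNTR typing relies on.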
Uncontrolled number of repeats can result in diseases ex:
- Cancer due to telomeric repeats
-The number of telomeric repeats sets the survival limit of a human cell line. As a human ages and their cells undergo more and more cell division/DNA replication, the telomeric repeats at the chromosome ends get shorter.
-In cancer cells, telomerase keeps extending the telomeric ends, making the cells “immortal”
- Huntington’s disease due to CAG repeats
-CAG codes for glutamine
-Too many CAG repeats = an abnormally long run of glutamines in the encoded protein = aggregation in the brain
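The repeat in Huntington's disease is the trinucleotide CAG, a glutamine (Q) codon in the standard genetic code, so every extra repeat adds one glutamine to the protein. This is a minimal Python sketch: the function name and the toy sequence are hypothetical, and the search ignores reading frame, which a real analysis would have to respect.

```python
import re


def longest_cag_run(seq):
    """Number of units in the longest uninterrupted CAG tract in seq.

    Note: this is a plain text search and does not check the reading frame.
    """
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Made-up coding fragment with 21 CAG repeats flanked by arbitrary codons.
fragment = "ATGGCG" + "CAG" * 21 + "CCGCCA"

n_repeats = longest_cag_run(fragment)
polyglutamine = "Q" * n_repeats  # each in-frame CAG is translated to one glutamine

print(n_repeats)
print(polyglutamine)
```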
Interspersed repeats (Genome-wide repeats)
- Transposable elements – able to move around the genome
Separated into:
* DNA transposons
* Retrotransposons (LINEs, SINEs, LTR retrotransposons (viral-like))
DNA transposons
- move around in the form of DNA (without an RNA intermediate)
- Not sequence specific (differ from recombinases, integrases) – randomly target DNA
- Mode of transposition:
-DNA transposon codes for transposases
-Transposase generates sticky-end cuts in the target DNA
-Transposase also mobilizes the DNA transposon to be inserted into target DNA
-Gaps are then filled by DNA polymerase to repair any damage
-Transposons are mutagenic – cause genetic variation and mutations within the gene they are inserted into – plays a part in evolution
-Because of the generation of sticky ends, the inserted DNA transposon is flanked by short direct repeats of the target sequence, which can drive its backward excision process
- Ex. the mariner transposon
-14,000 copies in Human genome (2.6 million base pairs)
-14% of all insect species carry mariner, which suggests that transposition could have been occurring since ~50 million years ago
- First discovered in maize by observing the difference in pigments across kernels within 1 plant, or even small spots of pigment within 1 kernel.
-The difference is caused by the presence of a transposon within the pigment gene in some cells, resulting in no expression of pigment (white kernels).
-If its transposase is activated, the transposon will be moved out of the pigment gene, resulting in expression of pigment (purple)
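The origin of the flanking direct repeats can be sketched with strings. This is a minimal Python model under simplifying assumptions: only one strand is shown, the staggered cut is represented by an `overhang` length, the transposon is a placeholder marker, and excision is idealized as restoring the original sequence exactly. The function names are hypothetical.

```python
def insert_transposon(target, cut, transposon, overhang):
    """Insert a transposon at a staggered cut and fill in the gaps.

    The `overhang` bases starting at `cut` end up single-stranded on both
    sides of the insert; gap filling by DNA polymerase copies them, so the
    same short sequence flanks the transposon as a direct repeat.
    """
    duplicated = target[cut:cut + overhang]
    return target[:cut] + duplicated + transposon + duplicated + target[cut + overhang:]


def excise_transposon(genome, transposon, overhang):
    """Idealized excision: remove the transposon and one copy of the direct repeat."""
    start = genome.index(transposon)
    end = start + len(transposon)
    if start < overhang:
        raise ValueError("no room for a flanking repeat on the left")
    left, right = genome[start - overhang:start], genome[end:end + overhang]
    if left != right:
        raise ValueError("transposon is not flanked by direct repeats")
    return genome[:start] + genome[end + overhang:]


pigment_gene = "AAACCGGTTT"  # made-up target sequence
disrupted = insert_transposon(pigment_gene, cut=3, transposon="[TE]", overhang=2)
print(disrupted)                                  # → AAACC[TE]CCGGTTT
print(excise_transposon(disrupted, "[TE]", 2))    # → AAACCGGTTT
```

In the maize example, the disrupted sequence corresponds to a pigment gene that is not expressed, and the restored sequence corresponds to cells where pigment expression returns.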
Retrotransposons
- move around the genome via an RNA intermediate