How genes and genomes evolve Flashcards
what are several different mechanisms that can alter genes and genomes?
Small mutations (mutation types)
Duplication (chromosomal rearrangements)
Exon shuffling
Rearrangements (chromosomal rearrangements)
Transposition of mobile genetic (transposable) elements
Horizontal transfer
How are most mutations categorized?
Most mutations are categorized into 3 classes
1. Point mutations: small-scale mutations
2. (chromosomal) rearrangements: large-scale mutations
3. Mobile genetic element (transposable element) - induced mutations
copy number variation CNV
large chunks of DNA around 10,000-5,000,000 bases long are inserted, repeated, or lost
how is genetic variation generated?
- In sexually reproducing organisms, only changes to the germ line are passed on to progeny
- Point mutations are caused by failures of the normal mechanisms for copying and repairing DNA
- Mutations can also change the regulation of a gene
- DNA duplications give rise to families of related genes
- Duplication and divergence produced the globin gene family
- Whole-genome duplications have shaped the evolutionary history of many species
- Novel genes can be created by exon shuffling
- The evolution of genomes has been profoundly influenced by mobile genetic elements
- Genes can be exchanged between organisms by horizontal gene transfer
what are zygotes and how do they form?
The gametes (eggs and sperm) contain only half the number of chromosomes found in the other cells of the body
When two gametes come together during fertilization, they form a fertilized egg (aka zygote)
–> the zygote gives rise to BOTH germ-line cells and to somatic cells
In sexually reproducing organisms, only changes to the ___ line are passed on to PROGENY (offspring)
germ (eggs and sperm)
A mutation that arises in a somatic cell affects only the progeny of that particular cell and will NOT be passed onto the organism’s offspring
- Somatic mutations are responsible for most human cancers
how are carbohydrates classified?
Monosaccharides
Disaccharides (2 monosaccharides)
Polysaccharides: compounds of many monosaccharides
what are 3 examples of monosaccharides?
Glucose
Fructose
Galactose
what are 3 examples of disaccharides and what are they composed of?
Maltose = glucose + glucose
Lactose = glucose + galactose
Sucrose = glucose + fructose
what are 3 examples of polysaccharides?
Starch
Glycogen
Fiber
How do point mutations in regulatory DNA sequences of lactase gene affect our ability to digest lactose?
- Our earliest ancestors were lactose intolerant:
- Lactase is made only during infancy
- Adults (no longer exposed to breast milk) do NOT need lactase
–> After around 5 years of age, most people (around 75% of the world population) stop producing the lactase enzyme
- Lactase gene (LCT) encodes the lactase enzyme
- Around 10,000 years ago, humans began to get milk from cattle
–> Point mutations in the regulatory DNA sequence of the lactase gene keep lactase expressed into adulthood → carriers can digest milk as adults
(aka people who HAVE one of these point mutations in the regulatory DNA of the lactase gene CAN digest milk as adults)
what are 2 point mutations in the regulatory DNA of the lactase gene LCT that allow adults to digest milk?
C → T point mutation: the 1st identified variant associated with lactase persistence
G → C point mutation: Northern Europe and Central Africa
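A point mutation of this kind is a single-base substitution. The following is a minimal sketch, assuming a made-up toy sequence and a hypothetical helper name (`apply_point_mutation`); it does not use the real LCT regulatory sequence or real coordinates.

```python
def apply_point_mutation(sequence, position, ref_base, alt_base):
    """Return a copy of `sequence` with one base substituted.

    `position` is 0-based. `ref_base` must match the base currently at
    that position, which guards against mutating the wrong site.
    """
    if sequence[position] != ref_base:
        raise ValueError(
            f"expected {ref_base} at position {position}, found {sequence[position]}"
        )
    return sequence[:position] + alt_base + sequence[position + 1:]


# Hypothetical toy stretch of regulatory DNA (illustration only)
toy_regulatory = "AAGACCATG"

# C -> T substitution at 0-based position 4
mutated = apply_point_mutation(toy_regulatory, 4, "C", "T")
print(toy_regulatory, "->", mutated)
```

Only one base changes and the sequence length stays the same, which is what separates a point mutation from an insertion, deletion, or larger rearrangement.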
what are regulatory DNA sequences and give 2 examples of them
Regulatory DNA sequences: regions of the genome that control the expression of genes (aka they determine when, where, and how much of a gene product, typically a protein, is produced)
- don’t encode proteins themselves but contain instructions for turning genes on or off or modulating the level of gene expression
- are crucial for ensuring that each gene is expressed at the right time, in the right cell type, and in the right amounts
promoters and enhancers
promoters
Promoter: a DNA sequence (the binding site) near the transcription start site of the gene at which the RNA polymerase binds to start transcription
Function: to initiate the process of transcription
TATA box: a type of promoter sequence which specifies to other molecules where transcription begins
- Non-coding DNA sequence
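- Is named for its conserved DNA sequence: the eukaryotic consensus is TATAAA; the related bacterial -10 element reads TATAAT
In E. coli: the recurring sequence TATAAT is centered on position -10
- In TATA-containing genes, the TATA box positions the transcription machinery, and transcription starts a short distance downstream of it (roughly 25-30 nucleotides in many eukaryotic genes)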
In Eukaryotes: TATA box is the most commonly recognized cis-acting element for genes transcribed by RNA polymerase II on the basis of its consensus sequence
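Locating a consensus element such as TATAAT in a stretch of DNA is a plain substring search. A minimal sketch, assuming exact matching only (real promoter elements tolerate mismatches) and a made-up toy sequence; `find_motif` is a hypothetical helper name.

```python
def find_motif(sequence, motif):
    """Return the 0-based start positions of every exact match of `motif`.

    Overlapping matches are reported, because the search resumes one base
    after each hit rather than after the end of the hit.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical toy sequence, not a real promoter
toy_promoter = "GGCTTGACATTTATGCTTCCGGCTCGTATAATGTGTGGA"
hits = find_motif(toy_promoter, "TATAAT")
print(hits)
```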
what is the most common form of gene control during development and in different cell types?
regulation of transcription (e.g. via promoters and enhancers)
enhancers
enhancers: cis-acting elements that have no promoter activity but can stimulate the effectiveness of promoters even when located thousands of nucleotides from the start site of transcription
- do not need to be close to the gene
- can be located upstream or downstream or even in the middle of a transcribed gene it regulates
- When bound by transcription factors, they enhance the transcription of an associated gene (stimulate transcription above basal levels)
- Enhancers operate in conjunction with specific enhancer-binding proteins
horizontal gene transfer
the process by which genes are transferred BETWEEN organisms, often across DIFFERENT species, rather than being passed down from parent to offspring (aka vertical gene transfer)
- So far we have considered genetic changes that take place WITHIN the genome of an individual organism
–> however, genes and other portions of genomes can be exchanged BETWEEN individuals of DIFFERENT species
horizontal gene transfer difference between eukaryotes and bacteria?
horizontal gene transfer is rare among eukaryotes but common among bacteria
why though??
The cellular complexity of eukaryotes, with their nuclear membrane and tightly regulated gene expression, makes HGT less likely
eukaryotic cells have robust immune and repair mechanisms to detect and remove foreign DNA
gene family
if several genes are STRUCTURALLY or FUNCTIONALLY SIMILAR (typically because they arose by duplication of a common ancestral gene), they collectively form a gene family
Gene family members should be designated by Arabic numbers placed immediately AFTER the gene stem symbol without any space
what is the level of organization for family, subfamily, and superfamily?
Superfamily: a broader grouping of genes
Subfamily: a narrower grouping of genes
Superfamily > family > subfamily
pseudogenes and their characteristics
pseudogenes: duplicated DNA sequences in the alpha-globin and beta-globin gene clusters that are NOT functional genes and do NOT produce a functional protein
- Generally untranscribed and untranslated
- Have a high level of homology to a functioning gene: their DNA sequences are similar to the functional globin genes
- This kind of gene duplication and divergence occurs in many other gene families in the human genome
aka a DNA sequence that closely resembles that of a functional gene but contains numerous mutations that prevent its proper expression –> most pseudogenes arise from the duplication of a functional gene, followed by the accumulation of damaging mutations in one copy
- Pseudogene symbols are suffixed with a “P” (or “PS” in specific cases)
e.g. OR2W5P ⇒ olfactory receptor family 2 subfamily W member 5 pseudogene
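The naming pattern in the OR2W5P example (stem, family number, subfamily letter, member number, optional P suffix) can be decoded mechanically. A rough sketch, assuming the symbol follows exactly that olfactory-receptor layout; `describe_or_symbol` is a hypothetical helper, not an official nomenclature tool, and it does not handle other gene stems.

```python
import re

# Assumed layout: "OR" + family digits + subfamily letters + member digits + optional "P"
OR_SYMBOL = re.compile(r"^OR(\d+)([A-Z]+)(\d+)(P?)$")


def describe_or_symbol(symbol):
    """Spell out an olfactory receptor gene symbol such as OR2W5P."""
    match = OR_SYMBOL.match(symbol)
    if match is None:
        raise ValueError(f"{symbol!r} does not follow the assumed OR symbol layout")
    family, subfamily, member, pseudo = match.groups()
    description = (
        f"olfactory receptor family {family} subfamily {subfamily} member {member}"
    )
    if pseudo:
        description += " pseudogene"
    return description


print(describe_or_symbol("OR2W5P"))
```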
exon shuffling
Exon shuffling: a process where exons from one gene are added to another gene
–> Leads to a new exon-intron structure → drives the evolution of new genes
Novel genes can be created by exon shuffling
How are duplications made from crossovers?
when a crossover occurs unequally, one chromosome may end up with an extra copy of a gene while the other chromosome has a corresponding deletion
It has been proposed that nearly all the proteins encoded by the human genome (around 19,000) arose from the ______ and _____ of a few thousand distinct exons
duplication & exon shuffling
This generates diversity of protein structures
what are crossovers and how do they occur?
crossovers occur when corresponding regions of homologous chromosomes align and swap DNA segments
- each crossover involves double-strand breaks in the DNA, which are then repaired by joining corresponding pieces from each chromosome
- for crossovers to occur, the DNA sequences involved must be highly similar or nearly identical
–> the result is a pair of hybrid chromosomes that each contain segments from the other homolog
- the chromosomes still retain the SAME ORDER of GENES they had initially
what are unequal crossovers and what do they result in?
when a crossover occurs between a pair of identical or very similar short DNA sequences that fall on either side of a gene BUT the short sequences are not aligned properly during recombination –> unequal crossovers
results in…
- 1 long chromosome that has an EXTRA copy of the gene (aka gene duplication)
- 1 shorter chromosome with NO copy of the gene –> this chromosome will eventually be lost
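The difference between an equal and an unequal crossover can be sketched with chromosomes modeled as lists of labeled segments. This is a toy model under stated assumptions: "rep" stands for the short repeated sequence flanking gene "G", the labels "A" and "B" are made-up neighboring segments, and `crossover` is a hypothetical helper.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Cut each chromosome at its own break index and swap the right-hand arms."""
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


# Two homologs with gene G flanked by the same short repeat on either side
homolog_1 = ["A", "rep", "G", "rep", "B"]
homolog_2 = ["A", "rep", "G", "rep", "B"]

# Equal crossover: both homologs are cut after the FIRST repeat
equal_1, equal_2 = crossover(homolog_1, homolog_2, 2, 2)

# Unequal crossover: the SECOND repeat of homolog_1 pairs with the
# FIRST repeat of homolog_2, so the break points differ
long_chrom, short_chrom = crossover(homolog_1, homolog_2, 4, 2)

print(equal_1, equal_2)
print(long_chrom)   # carries two copies of G
print(short_chrom)  # carries no copy of G
```

With matching break points the gene order is unchanged. With mismatched break points one product gains a copy of G and the other loses it.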
gene duplications via crossovers between homologous chromosomes characteristics?
- Many gene duplications can be generated by homologous recombination
- Homologous recombination can catalyze CROSSOVERS in which 2 chromosomes are broken and joined up to produce hybrid chromosomes
- Crossovers take place only between regions of chromosomes that have NEARLY IDENTICAL DNA sequences (usually occur between homologous chromosomes) and generate hybrid chromosomes in which the ORDER OF GENES is EXACTLY the same as on the original chromosomes
Give a real-life example of a disorder caused by unequal crossovers and describe it
red-green color blindness (aka Daltonism): example of chromosomal duplication
The OPN1LW (red) and OPN1MW (yellow, green) genes are located on the X chromosome
- Both genes have very similar DNA sequences and are closely located
→ Because both genes are very similar in their sequences, UNEQUAL CROSSING over between the two genes can result in different combinations of the genes (ie. duplication) or even hybrid genes
- OPN1LW is thought to have undergone a DUPLICATION event that leads to an extra copy of the gene, which then evolved independently to become OPN1MW
globin gene family
globin gene family: a group of related genes that encode globin proteins
globin protein function and examples
globin proteins: specialized proteins for binding and transporting oxygen in the blood and other tissues
- essential for cellular respiration, as they allow oxygen to be carried efficiently from the lungs to tissues and cells
- e.g. hemoglobin and myoglobin
globin gene superfamily in vertebrates
a superfamily of heme-containing globular proteins
what is the simplest globin protein? (amino acid length and found in what organisms)
around 150 amino acids and found in marine worms, insects, and primitive fish
globin proteins in vertebrates structure?
4 globin chains of 2 types (alpha-globin and beta-globin): α2β2 <– hemoglobin structure which consists of 4 globin chains arranged in a tetrameric structure of alpha2beta2
alpha-globin and beta-globin are the result of a gene DUPLICATION
how did the globin gene family originate and describe it
Duplication and divergence produced the globin gene family –> exemplifies how gene duplication and divergence can drive evolution
- The unmistakable similarities in amino acid sequence and structure among present-day globin proteins indicate that ALL the globin genes must derive from a SINGLE ancestral gene
- multiple rounds of gene duplication occurred and each duplicated globin gene then diverged in function –> giving rise to a variety of globin proteins (e.g. alpha globin and beta globin)
- this divergence enabled the specialization of globin proteins
how does exon shuffling affect proteins?
Exon shuffling during evolution can generate proteins with new combinations of protein domains
- These different domains were joined together by EXON SHUFFLING during evolution to create modern-day human proteins
describe the evolution of hemoglobin in vertebrates
- Single-Chain Globin: Early globin proteins were simple, single-chain proteins bound to a heme group (similar in structure to myoglobin)
- Single-chain globins are common in simpler organisms: e.g. marine worms and some primitive fish
- Gene Duplication and Mutation: Over time, gene DUPLICATION occurred, creating multiple copies of the globin gene
–> Mutations in these duplicated genes allowed each copy to DIVERGE in structure and function
Results: 2 different types of globin chains (alpha-globin and beta-globin) eventually co-evolved to form a cooperative structure
- Formation of Tetrameric Hemoglobin: The combination of two alpha (α) and two beta (β) globin chains enabled the formation of the tetrameric structure of hemoglobin
- This tetrameric configuration (α2β2) introduced cooperative binding, a feature that allows hemoglobin to efficiently load and unload oxygen depending on the partial pressure of oxygen (pO₂) in different tissues
Mammalian hemoglobin molecule is a complex of 2 alpha-globin and 2 beta-globin chains –> alpha2beta2
Each chain contains a tightly bound heme group with a central iron ion; the heme group is responsible for binding oxygen
describe “reconstructing life’s family tree” (aka how mutations and evolution work)
- Beneficial: on rare occasions, the mutation might cause a change for the better
- These mutations will tend to be perpetuated since the organism that inherits these mutations will have an increased likelihood of reproducing itself
- beneficial mutation –> gives selective advantage –> preserved by natural selection and is passed down
- Neutral: Mutations that are selectively neutral may or may not persist depending on factors such as population size and whether the individual carrying the neutral mutation also harbors a favorable mutation located nearby (aka hitchhiking)
- Harmful: Deleterious alterations in a gene that codes for an essential protein or RNA (e.g. DNA and RNA polymerases) CANNOT be accommodated so easily
deleterious alterations are harmful –> typically eliminated from the population through natural selection
- A segment of DNA that does NOT code for protein or RNA and has no significant regulatory role is free to change at a rate limited only by the random mutation frequency
difference in mutation rates in non-coding DNA vs highly conserved genes
Non-coding DNA: Segments of DNA that neither code for proteins nor have regulatory functions are free to accumulate mutations with few consequences
These regions evolve faster since mutations are not strongly selected for or against, allowing changes to accumulate at the natural rate of random mutation (aka random mutation frequency)
Highly Conserved Genes: For essential genes, mutations that disrupt function are usually deleterious and are removed by selection, so these regions change more slowly over time even though mutations arise in them at the same rate. Since any significant alterations are typically eliminated, the sequences of essential genes remain highly similar (conserved) across diverse species, indicating their crucial roles.
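Conservation is usually expressed as the percentage of identical positions between two aligned sequences. A minimal sketch, assuming the sequences are already aligned and of equal length; the sequences below are invented for illustration and `percent_identity` is a hypothetical helper.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Invented toy data: the same region compared between two species
essential_species_1 = "ATGGCCAAGTTC"
essential_species_2 = "ATGGCCAAGTTT"   # almost unchanged (conserved)
noncoding_species_1 = "ATGGCCAAGTTC"
noncoding_species_2 = "ACGTCAAAGCTA"   # many accumulated changes

conserved = percent_identity(essential_species_1, essential_species_2)
diverged = percent_identity(noncoding_species_1, noncoding_species_2)
print(f"essential gene: {conserved:.1f}% identical")
print(f"non-coding DNA: {diverged:.1f}% identical")
```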
what are essential genes and their conservation?
Genes that code for an essential protein or RNA (e.g. DNA and RNA polymerases)
These essential genes are highly conserved: the products they encode (RNA or protein) are very similar from organism to organism
deleterious alterations to essential genes cannot be accommodated so easily –> the faulty organisms will almost always be eliminated or fail to reproduce so that these harmful mutations are lost