Evolution Exam 3 Flashcards
Calculating mutation rate for neutral mutations
If u refers to rate of mutation per gene per generation, for a diploid population size Ne, the number of new mutations per generation is 2Neu
Probability of fixation
For a neutral allele, this probability is equal to its current frequency. This means for a newly arisen allele this probability is 1/(2Ne)
Number of mutations that arise and eventually get fixed is (2Neu)(1/[2Ne]) = u. Under the neutral theory the rate of fixation equals the mutation rate for neutral mutations
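The cancellation above can be checked numerically. This is a minimal sketch, assuming a diploid population; the function name and the example values of Ne and u are made up for illustration.

```python
def neutral_substitution_rate(ne: float, u: float) -> float:
    """Rate at which new neutral mutations arise and eventually fix."""
    # 2Ne gene copies in a diploid population, each mutating at rate u
    new_mutations_per_generation = 2 * ne * u
    # A brand-new neutral allele starts at frequency 1/(2Ne)
    fixation_probability = 1 / (2 * ne)
    return new_mutations_per_generation * fixation_probability


# Hypothetical population sizes: the result stays at u regardless of Ne
for ne in (100, 10_000, 1_000_000):
    print(ne, neutral_substitution_rate(ne, u=1e-6))
```

Whatever value of Ne is used, the result is (up to floating-point rounding) equal to u, which is the point of the card.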
Implications of the neutral theory of mutation
The rate of fixation of neutral mutations does not depend on population size. A larger population produces more new mutants but also gives each mutation a lower probability of fixation, and the two effects cancel out
Often expect to find polymorphisms floating neutrally within species since the time to fixation by drift is 4Ne generations
Problem with neutral theory
Neutral theory gives the rate of mutation per gamete per generation, not per unit of absolute time, but the molecular clock seems to follow absolute time even though species vary widely in generation time. This raises the question: why aren’t more mutations accumulating in lineages with short generation times?
How does the nearly neutral model resolve why more mutations do not accumulate in lineages with a shorter generation time?
If most fixed mutations are slightly deleterious rather than strictly neutral, the probability of drifting to fixation will depend on population size
In a small population, drift overrides weak selection, so most mutations evolve as if they were neutral and are therefore effectively neutral. Effectively neutral mutations behave as if their selection coefficient were 0, even though s is not exactly 0. Mathematically, mutations are effectively neutral if 2Ne is less than or equal to 1/s.
In a large population, drift is weaker than selection so most mutations are not neutral and are selected against
Species with short generation times tend to have a larger population size. This results in many mutations per year but fewer mutations that are effectively neutral
Species with long generation times tend to have smaller populations. These populations have fewer mutations per year, but a higher proportion of them are effectively neutral
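The threshold in this card (2Ne less than or equal to 1/s) can be sketched as a small check. The function name, the selection coefficient, and the population sizes below are hypothetical, chosen only to show the same mutation changing category as Ne grows.

```python
def is_effectively_neutral(ne: float, s: float) -> bool:
    """Card's rule of thumb: effectively neutral when 2Ne <= 1/|s|."""
    if s == 0:
        return True  # strictly neutral mutations always qualify
    return 2 * ne <= 1 / abs(s)


# Hypothetical slightly deleterious mutation with s = 0.0001
s = 1e-4
print(is_effectively_neutral(ne=1_000, s=s))      # small population
print(is_effectively_neutral(ne=1_000_000, s=s))  # large population
```

With these made-up numbers the mutation drifts as if neutral in the small population but is visible to selection in the large one.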
dN/dS ratios
dN refers to the rate of nonsynonymous substitutions per nonsynonymous site and dS refers to the rate of synonymous substitutions per synonymous site. Each is measured as the proportion of sites of that type that differ between the sequences being compared
dN/dS < 1 indicates natural selection is likely eliminating nonsynonymous substitutions, strong purifying selection
dN/dS = 1 indicates replacements are neutral and there is little functional constraint, pseudogenes or weaker purifying selection
dN/dS > 1 indicates replacements are advantageous and favored by selection, positive selection
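The three cases above can be written as a simple lookup. This is only a sketch: the `tolerance` band around 1 is an assumption added for illustration (real analyses use a statistical test rather than a fixed cutoff), and the function name is hypothetical.

```python
def classify_dn_ds(ratio: float, tolerance: float = 0.1) -> str:
    """Map a dN/dS ratio onto the three interpretations on the card."""
    if ratio < 0:
        raise ValueError("dN/dS cannot be negative")
    if ratio < 1 - tolerance:
        return "purifying selection"
    if ratio > 1 + tolerance:
        return "positive selection"
    return "approximately neutral"


# Ratios quoted elsewhere in these cards for the MHC protein
print(classify_dn_ds(3.8))   # antigen recognition site
print(classify_dn_ds(0.64))  # other protein domains
```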
MHC
plays an important role in antigen presentation. MHC proteins have an antigen recognition site (ARS), which is part of this process
At the antigen recognition site dN/dS: 3.8 showing positive selection where amino acid changes at the antigen recognition site are favored since there is an arms race with co-evolving pathogens
At other protein domains dN/dS: 0.64 which indicates purifying selection, typical
How can dN/dS be used to detect pseudogene evolution?
Regions with relatively higher dN/dS ratios are more likely to be pseudogenes and can be identified from loss of function in phenotypes
Pseudogene
segment of DNA that structurally resembles a gene but does not encode for a functional protein
McDonald-Kreitman (MK) test
Posits that the number of polymorphic sites within a species should be directly proportional to the number of differences that become fixed between that species and a sister species according to the neutral theory
Within a species, the ratio of nonsynonymous to synonymous polymorphisms is measured. Between species, the ratio of nonsynonymous to synonymous fixed differences is measured. Comparing the two ratios indicates whether or not selection is driving amino acid change
The Adh gene was found to have a high ratio of nonsynonymous fixed differences between D. melanogaster and D. simulans. This is interpreted to indicate that selection has favored amino acid changes that differ between those two species at that gene
Helpful for detecting past positive selection that is no longer detectable using dN and dS ratios within a species
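The comparison behind the MK test can be sketched as a 2x2 table of counts. The counts below are hypothetical, not the published Adh data. The neutrality index and the two-sided Fisher exact test are common ways to summarize such a table, but they are additions here and are not described on the card. The sketch assumes the synonymous counts are nonzero.

```python
from math import comb


def mk_test(dn: int, ds: int, pn: int, ps: int) -> dict:
    """dn, ds: fixed nonsynonymous/synonymous differences between species.
    pn, ps: nonsynonymous/synonymous polymorphisms within a species."""
    fixed_ratio = dn / ds
    poly_ratio = pn / ps
    row_fixed, row_poly = dn + ds, pn + ps
    col_nonsyn = dn + pn
    total = row_fixed + row_poly

    def table_prob(a: int) -> float:
        # Hypergeometric probability of a table with `a` fixed nonsynonymous sites
        return comb(row_fixed, a) * comb(row_poly, col_nonsyn - a) / comb(total, col_nonsyn)

    lowest = max(0, col_nonsyn - row_poly)
    highest = min(row_fixed, col_nonsyn)
    observed = table_prob(dn)
    # Two-sided Fisher exact test: sum tables no more likely than the observed one
    p_value = sum(p for p in map(table_prob, range(lowest, highest + 1))
                  if p <= observed * (1 + 1e-9))
    return {
        "fixed_ratio": fixed_ratio,
        "poly_ratio": poly_ratio,
        "neutrality_index": poly_ratio / fixed_ratio,
        "p_value": min(1.0, p_value),
    }


# Hypothetical counts with an excess of fixed nonsynonymous differences
print(mk_test(dn=10, ds=20, pn=3, ps=40))
```

Under neutrality the two ratios should be about equal (neutrality index near 1). A fixed ratio well above the polymorphism ratio is the pattern interpreted as past positive selection.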
Tests of selection on DNA sequences
Using neutral or nearly neutral evolution as a null hypothesis, we can assume most genes across a genome are evolving in accordance with the nearly neutral model
Look for specific loci that deviate from the neutral or nearly neutral null hypothesis, these are genes where natural selection is favoring or disfavoring specific alleles
Researchers overall look for patterns of evolution that do not fit evolution by genetic drift and also background purifying selection alone
Three methods of testing selection
dN/dS ratios, McDonald-Kreitman (MK) test, Haplotype tree shape
How can haplotype tree shape be used to test for selection?
They can test whether there is an excess of old or new polymorphisms compared to neutral expectations. This is because under neutral evolution, a mixture of closely related haplotypes are more likely to be recently diverged from an ancestral haplotype and those that are more distantly related are more diverged
How can positive selection and balancing selection be identified when haplotype trees cannot be constructed?
As a proxy, researchers can look at the frequencies of polymorphisms. Tip branches indicate low-frequency or rare polymorphisms (higher sigma) and internal branches indicate higher frequency polymorphisms (higher pi)
What does the S locus allele tell us about Negative FDS?
S alleles have been maintained by negative FDS for so long that the same allele may be present in different species and even completely different genera (tomatoes, potatoes, chiles, etc). Haplotype sharing can be caused by incomplete lineage sorting and gene flow, but in this case it is caused by balancing selection, and the shared haplotypes show no correspondence to the genus-level phylogeny
Ways to statistically compare the frequencies of rare vs high-frequency polymorphisms
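Pi = average number of sites that differ between two randomly chosen sequences, calculated as $\pi = \sum_{i<j} 2 p_i p_j \pi_{ij}$, where $p_i$ and $p_j$ are the frequencies of haplotypes i and j and $\pi_{ij}$ is the number of nucleotide differences between them. This value is dependent on the frequencies of polymorphisms as well as the nucleotide difference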
This is compared to sigma which refers to the total number of polymorphic sites in a gene, which is not sensitive to the frequencies of polymorphisms
Under neutrality pi = sigma , under positive selection pi < sigma since an excess of rare polymorphisms is present which indicates a shallower tree, and under balancing selection pi > sigma since there is an excess of high frequency polymorphisms
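The two quantities being compared can be computed directly from an alignment. This is a minimal sketch with a made-up four-sequence alignment. It averages over all pairs of sampled sequences, which is the sample version of the frequency-weighted formula, and it takes sigma to be the raw count of polymorphic sites.

```python
from itertools import combinations


def pairwise_pi(seqs: list[str]) -> float:
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = [sum(a != b for a, b in zip(x, y)) for x, y in pairs]
    return sum(diffs) / len(pairs)


def polymorphic_sites(seqs: list[str]) -> int:
    """Number of alignment columns with more than one nucleotide (sigma)."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


# Hypothetical aligned sequences of equal length
alignment = ["AAAA", "AAAT", "AATT", "AATT"]
print(pairwise_pi(alignment))        # depends on polymorphism frequencies
print(polymorphic_sites(alignment))  # ignores polymorphism frequencies
```

Changing how many sequences carry each variant changes pi but leaves the count of polymorphic sites untouched, which is why the two respond differently to rare versus common polymorphisms.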
Tajima’s D
refers to the statistic that compares the values of pi and sigma. It is based on the difference pi - sigma (scaled by its standard deviation), so D = 0 indicates neutral evolution, D < 0 indicates positive selection, and D > 0 indicates balancing selection
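For reference, here is a sketch of the statistic as it is usually defined in the published literature. It makes an assumption beyond the card: the count of polymorphic sites S (the card's sigma) is first rescaled by a sample-size constant a1 so that it is comparable to pi, and the difference is divided by its estimated standard deviation. Input values in the example are hypothetical.

```python
from math import sqrt


def tajimas_d(n: int, pi: float, s: int) -> float:
    """n: number of sequences, pi: mean pairwise differences,
    s: number of polymorphic (segregating) sites."""
    if n < 4:
        raise ValueError("need at least 4 sequences for a nonzero variance")
    if s == 0:
        raise ValueError("D is undefined when there are no polymorphic sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    scaled_s = s / a1  # expected to equal pi under neutrality
    return (pi - scaled_s) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical sample of 10 sequences with 16 polymorphic sites
print(tajimas_d(n=10, pi=2.0, s=16))  # excess of rare variants
print(tajimas_d(n=10, pi=9.0, s=16))  # excess of common variants
```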
Problems with Tajima’s D
Sensitive to population size fluctuations (demographic effects) and population structure. Recent population expansion creates D < 0 since new mutations can arise on different copies of a haplotype. In this case, a neutrally evolving gene can appear to be under positive selection because it carries an excess of low-frequency polymorphisms
Population structure has the opposite effect: long internal branches between isolated and diverging populations create D > 0 for a neutrally evolving gene, since genetic drift can lead to loci becoming differentiated as populations become isolated.
To identify selection, Tajima’s D should be calculated for multiple loci since demographic effects affect all genes but selection is gene specific
Detecting positive selection in the genome
For a favored mutation, positive selection will lead to a selective sweep, which leaves a genomic region around the favored mutation with low nucleotide diversity and a large block of linkage disequilibrium, because linked variants are carried to high frequency along with it. One example is rice domestication, where humans selected for mutations in the Waxy gene that controls starch synthesis
AZT
Drug treatment that was used early in the HIV/AIDS epidemic. It mimics T and blocks reverse transcriptase by stopping polymerization. However, there was a rapid evolution of resistance since it developed over time and viral growth in turn steadily decreased over 20 months
Resistance develops rapidly since amino acid changes that confer resistance occur in the binding domain of reverse transcriptase and this altered conformation prevents reverse transcriptase from binding AZT effectively
HIV Life Cycle Steps
HIV first has a virion or extracellular stage
The gp120 protein on the surface binds to CD4 along with the coreceptor on the host cell, one of these coreceptors is the CCR5 protein
HIV’s RNA genome, reverse transcriptase, integrase, and protease then enter the host cell
Reverse transcriptase synthesizes HIV DNA from the HIV RNA template
Integrase splices HIV DNA into host genomes and the HIV DNA is transcribed into HIV mRNA by the RNA polymerase of the host cell
HIV mRNA is translated into HIV precursor proteins by the ribosomes of the host cell, and the protease cleaves the precursors into mature viral proteins
A new generation of virions assembles in the host cell
New virions then bud from the host cell membrane
Evolution of HIV
Refers to the immune system driven rapid evolution within patients where long branches indicate positive selection
Each shaded color corresponds to a different patient along with the non-neutral evolution within patients
Transmission between hosts also correlates with founder events and genetic drift
Why are drug cocktails more effective than single drug treatments?
Drug cocktails target multiple mechanisms and there are multiple classes (fusion inhibitors that block virus surface proteins, reverse transcriptase inhibitors, protease inhibitors, integrase inhibitors)
They are more effective since HIV resistance would require simultaneous mutations that confer resistance against multiple drugs which is a lot more unlikely
But even multi-drug treatments can lose their effectiveness after around 3 years, because side effects lead people to go off the drugs and because a dormant virion reservoir persists within the body
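The "simultaneous mutations" argument is a multiplication of probabilities. This is a rough sketch that assumes resistance to each drug arises independently, and the per-drug probability used is a made-up number, not a measured HIV value.

```python
from math import prod


def joint_resistance_probability(per_drug_probs: list[float]) -> float:
    """Chance that one viral genome carries resistance to every drug at once,
    assuming resistance to each drug arises independently (an assumption)."""
    return prod(per_drug_probs)


# Hypothetical chance of resistance to any single drug
p = 1e-5
print(joint_resistance_probability([p]))        # single-drug treatment
print(joint_resistance_probability([p, p, p]))  # three-drug cocktail
```

Each added drug multiplies the probability by another small number, so resistance to the whole cocktail is far less likely than resistance to one drug.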
Evolutionary origins of HIV
Major strains are HIV-1 (the epidemic form, transferred from chimps via bushmeat, where wild animals are hunted) and HIV-2, which is found primarily in West Africa, is less virulent, and was transferred from the sooty mangabey
Gp120 provides better phylogenetic resolution for a chimp and human clade
Zoonotic disease
Transmissible to humans from other animals
Molecular clock estimate
To estimate this, an unrooted tree was first built to show the genetic distances among HIV-1 M strains collected in the 1980s-90s. Divergence from the inferred common ancestor was then plotted against collection date and extrapolated back in time, and the 95% confidence interval placed the common ancestor between 1915 and 1941
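The extrapolation step can be sketched as a straight-line fit of divergence against sampling year, solved for the year at which divergence is zero. The data points below are invented and lie exactly on a line. Real estimates use many sequences and report a confidence interval, which this sketch does not compute.

```python
def fit_clock(years: list[float], divergences: list[float]) -> tuple[float, float]:
    """Ordinary least-squares line: divergence = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, divergences))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ancestor_year(slope: float, intercept: float) -> float:
    """Year at which the fitted divergence from the ancestor reaches zero."""
    return -intercept / slope


# Hypothetical sampling years and divergences (substitutions per site)
years = [1983, 1988, 1993, 1998]
divergences = [0.106, 0.116, 0.126, 0.136]
slope, intercept = fit_clock(years, divergences)
print(slope, ancestor_year(slope, intercept))
```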
Natural resistance to HIV
A 32 bp deletion in the gene for the CCR5 coreceptor confers resistance to HIV and underwent positive selection. Because the deletion precedes the HIV epidemic, it was possibly favored for resistance against bubonic plague or smallpox. It is common in European populations and mostly absent in Asia and Africa. The CCR5 32 bp deletion is also linked to COVID resistance, but homozygotes have an increased susceptibility to West Nile
Timothy Brown
Patient with leukemia who was also infected with HIV. He had 2 bone marrow transplants in 2008 and 2009 from a donor who was homozygous for the CCR5 32 bp deletion, and his HIV infection was cleared after the transplants. Another patient was also cured by the same method
HIV-controllers
patients who become infected but remain essentially asymptomatic. HIV controllers have an MHC protein configuration that allows only low-fitness HIV to escape MHC detection, and this results in HIV viral DNA being present in non-transcribed (heterochromatic) regions
Influenza A
contains 8 RNA strands and encodes for 11 proteins, many of these strands can be exchanged with one another which contributes to evolution. Two major coat proteins are neuraminidase and hemagglutinin (major protein recognized by the immune system)
Antigenic sites
protein regions that are recognized by the human immune system
Strain numbers
broad groupings of protein regions based on human antibody recognition, example is H1N1
How does the ARS domain of human MHC (the HLA locus) show evidence of positive selection?
The antigen recognition site at the MHC locus had a significantly higher dN/dS ratio than codons in other protein domains. This indicates that an arms race with co-evolving pathogens favored amino acid changes at the ARS domain, while the rest of the protein showed purifying selection
Hemagglutinin gene evolution
fits the molecular clock and shows neutral evolution overall, but the phylogeny indicates there is differential survival among strains, which can be explained by natural selection.
What evidence can be used to show the differential survival among influenza strains?
The phylogenetic tree indicates there is differential survival among strains even when certain genes, such as the hemagglutinin gene, fit the molecular clock. This is indicative of a cactus phylogeny, where most strains go extinct and then a single strain founds the next epidemic. Positive selection at 18 antigenic codons is linked to differential survival, and the dN/dS ratio is > 10 at those codon sites compared to the rest of the genome
How can the antigenic codons be used to predict which strains will have descendants for the flu vaccine design?
Mutations at the 18 key antigenic codons can be used to predict which strains will have descendants for a flu vaccine design since the flu virus survivor is predicted to be the strain with the most replacements at the 18 key antigenic sites. However, this method is not predictive of novel pandemic strains since those strains involve major changes in different groups (H and N groups)
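The prediction rule described here (pick the strain with the most replacements at the key antigenic sites) can be sketched as a count followed by a maximum. The sequences, strain names, and site positions below are invented for illustration and are not real hemagglutinin data.

```python
def count_replacements(ancestor: str, strain: str, sites: list[int]) -> int:
    """Number of listed sites where the strain differs from the ancestor."""
    return sum(ancestor[i] != strain[i] for i in sites)


def predict_survivor(ancestor: str, strains: dict[str, str], sites: list[int]) -> str:
    """Name of the strain with the most replacements at the given sites."""
    return max(strains, key=lambda name: count_replacements(ancestor, strains[name], sites))


# Hypothetical amino acid sequences and 0-based antigenic positions
ancestor = "MKTIIALSYI"
strains = {
    "strain_A": "MKTIIALSYI",
    "strain_B": "MRTIVALSYI",
    "strain_C": "MRTLVAKSYI",
}
antigenic_sites = [1, 3, 4, 6]
print(predict_survivor(ancestor, strains, antigenic_sites))
```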
Speciation
Process by which a single cohesive gene pool, or evolutionary lineage, can diverge into independent evolutionary lineages
Species concepts
Attempts to define criteria for recognizing a separate species, three common types are morphological, phylogenetic, and biological
Morphological species concept
trait-based. This is simple and the default criterion in most cases, but some disadvantages are that it is non-evolutionary, it can fail to recognize cryptic species, and it can incorrectly split a species based on intraspecific polymorphisms
Phylogenetic species concept
When distinct species are recognized by the reciprocal monophyly of haplotypes. Some advantages are that phylogeny can provide proof of distinct evolutionary lineages but some disadvantages are that since phylogeny is used, this can be confounded by incomplete lineage sorting, homoplasies, or other types of phylogenetic ambiguities
Biological species concept
when species are defined by reproductive isolation between populations. Some strengths are that independent lineages are defined by evolutionary processes rather than trait patterns (interbreeding within a species, a lack of gene flow between species, reproductive isolating barriers), but one big weakness is that interbreeding cannot always be used as a criterion for species since not all organisms reproduce sexually
How can Prezygotic Premating RIBS arise (mating does not occur and gametes do not come together)?
Premating barriers can be temporal, where the phenology or mating season differs (for example, flowering time in plants). They can also be spatial, where there is no overlap in species distribution. Behavioral premating barriers, such as sexually selected traits in animals and types of pollinators in plants, can also lead to mating not occurring.
How can Prezygotic Postmating RIBS arise (mating occurs but gametes do not come together)?
Morphological incompatibility of genitalia can arise (especially in insects). Gametes can also be incompatible: for example, the structure of the binding protein on sperm and of the proteins on the egg determines compatibility, and these can differ among species
Pollen and sperm precedence can develop where germinating pollen from the same species are more efficient and reach the ovaries first so zygotes between species are not produced, interspecies mating at the gamete level is selected against
How can Postzygotic Intrinsic RIBS develop?
Hybrid inviability and sterility that reflects different genetic makeup (aborted embryos, inviable or sterile offspring)
How can Postzygotic Extrinsic RIBS develop?
Hybrids have low viability and/or mating success due to selection against them in their environment. Documented in Heliconius butterflies that have low fitness due to high predator attacks (warning coloration not recognized by predators)
Geographic modes of speciation
Sympatric speciation where reproductive barriers arise between populations that have not been geographically separated
Parapatric where neighboring populations between which there is a moderate gene flow undergo adaptive divergence and become reproductively isolated
Allopatric where geographical distance drives reproductive isolation
Allopatric speciation
first an initial geographic separation occurs, then divergence develops in the different areas (largely through genetic drift) while there is limited or no gene flow between them, and finally reproductive isolation is in place when the species become sympatric again. There may be a narrow area where hybridization can occur, but hybrids tend to have lower fitness and eventually get selected against
Peripatric speciation (perimeter)
where a founder event from a widespread species allows the species to be present in the new habitat and speciation occurs due to geographic isolation/genetic drift and new selective pressures in the new environment lead to new haplotypes (common in island species, also has been documented in fruit fly species, haplotype trees can reveal this)
It is controversial whether or not the genetic drift from a founder event is alone sufficient for speciation. Lab experiments on 50 bottlenecked populations of fruit flies over several generations did not lead to assortative mating within bottlenecked populations, indicating genetic drift alone does not contribute to speciation
Parapatric speciation
Neighboring populations where there is moderate gene flow diverge through natural selection and become reproductively isolated. Because of different selective pressures in different environments, migrants are maladaptive and selected against contributing to speciation. This means that natural selection overrides the diluting effects of gene flow and divergent selective pressures therefore lead to reproductive isolation
Evidence for incipient or early stage parapatric speciation
Reproductive isolation has already arisen between metal tolerant and intolerant grass populations through shifts in flowering time and increased self-fertilization. This leads to reinforcement
Reinforcement
where natural selection favors the evolution of reproductive barriers between populations that are adaptively differentiated in order to prevent wasting reproductive effort on low-fitness offspring
Sympatric speciation
the evolution of reproductive barriers in a single population that is initially random mating, this requires the evolution of reproductive isolation without spatial isolation. It can occur from assortative mating (commonly based from differential resource use) or genome doubling (polyploidization)
How can assortative mating develop?
If natural selection favors divergence in resource use (disruptive selection) and if mating is directly correlated with resource use, then disruptive selection that results from differential resource use will lead to assortative mating based on resource. This form of reproductive isolation has no gene flow barrier between diverging populations
Sympatric speciation in Rhagoletis pomonella
Hawthorn is a native host for Rhagoletis flies, which lay their eggs on fruits where the larvae develop. However, when apples were introduced to the US this led to reproductive isolation, where populations that infested apples inherited a preference for apples. A big reason for this reproductive isolation was the difference in timing of maturation of hawthorn vs apple fruits. Because populations that infest apples infest fruit earlier and mature earlier, this factors into reproductive isolation
Sympatric speciation vs polyploidization
15% of speciation events in flowering plants occur through polyploidization, and most flowering plants have polyploidy in their ancestry. Polyploids have more than 2 sets of homologous chromosomes, which arise through the generation and fusion of unreduced gametes
Autopolyploid
When all chromosomes are from one species
Allopolyploid
When chromosomes in an individual are derived from different species, commonly involves hybridization
Characteristics of polyploids
Polyploids are commonly a case of “instant speciation” since they tend to be reproductively isolated from diploid relatives (tetraploids produce diploid gametes and diploids produce haploid gametes) and they are often phenotypically distinct which allows for divergence via natural selection, in plants this can lead to shifts to larger flower sizes, different pollinators, shifts in flowering time, and shifts in reproduction
Testing whether speciation is allopatric or sympatric
The extent of range overlap as a function of time since speciation can be observed in a clade of related species. If allopatric speciation occurred, the degree of range overlap only increases with time and if sympatric speciation occurs, range overlap will likely remain the same or decrease over time
Adaptationist Paradigm
Assumption that all traits exist because they serve some type of adaptive function, this generally involves coming up with storytelling or untested adaptive explanations
Methods to test whether a trait is actually adaptive
On a microevolutionary timescale, this is a straightforward process where phenotypic variation and its effects on fitness can be directly examined
But when looking at macroevolutionary relationships, this is often not possible since traits may have evolved millions of years ago
Comparative approach
Looking for evidence that specific environmental conditions repeatedly favor the evolution of specific traits in different species, straightforward if we look at convergent evolution in distantly related species. However if this approach is applied to related species, the possibility that similar traits reflect shared ancestry rather than convergent evolution has to be factored out
Limitations to comparative approaches
in tropical bats there is a correlation between social group size and testes size where individuals in larger social groups have larger testes. Correlation may suggest that larger testes have evolved as an adaptation for sperm competition in larger social groups where there is more of an opportunity for mate competition, however, the phylogenetic relatedness of the species also needs to be taken into account
One risk is that the data may show a correlation even though the data points are not independent, since similar traits can reflect common ancestry rather than independent evolutionary divergence. In bats, when the group size and testes size of 6 independent populations were mapped they were shown to be correlated, but it is possible that the species with small group sizes and small testes, and vice versa, simply share common ancestry
Phylogenetically independent contrasts
method of comparing differences between populations that explicitly takes phylogenetic relatedness into account. This method looks at independent traits in sister species that represent phylogenetically independent divergence from a common ancestor. Method involves comparing the difference in one set of trait values vs the difference in the other set of trait values. Ancestral species are assumed to have a trait value made up of the average of two daughter species
When this method was used to map out the correlation of social group size with testes size, the data showed that bat species that evolve larger social group sizes relative to their sister species also evolve larger testes
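The contrast calculation can be sketched for the simplest case of sister-species pairs. This is a simplified illustration: the full method also standardizes each contrast by branch length, which is omitted here, and the trait values are invented rather than taken from the bat study.

```python
Pair = tuple[tuple[float, float], tuple[float, float]]


def sister_contrasts(pairs: list[Pair]) -> list[tuple[float, float]]:
    """Difference in each trait between the two species of every sister pair."""
    return [(xa - xb, ya - yb) for (xa, ya), (xb, yb) in pairs]


def slope_through_origin(contrasts: list[tuple[float, float]]) -> float:
    """Regression of trait-2 contrasts on trait-1 contrasts, forced through 0."""
    numerator = sum(dx * dy for dx, dy in contrasts)
    denominator = sum(dx * dx for dx, _ in contrasts)
    return numerator / denominator


def ancestral_value(a: float, b: float) -> float:
    """Ancestor's trait value taken as the average of its two daughter species."""
    return (a + b) / 2


# Hypothetical (group size, testes size) values for three sister pairs
pairs = [((10, 1.0), (4, 0.6)), ((30, 2.0), (12, 1.1)), ((8, 0.9), (15, 1.4))]
contrasts = sister_contrasts(pairs)
print(contrasts)
print(slope_through_origin(contrasts))
```

A positive slope means that, within each sister pair, the species with the larger group size also tends to have the larger testes, independent of which clade the pair belongs to.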
Developmental Biology
How does the spatial/temporal expression of genes during development lead to adult morphology
Evo-devo
How does the evolution of developmental genes/pathways lead to morphological evolution and diversification among species?
DNA binding transcription factors
bind regulatory regions of other genes that control gene expression, they can act as the “master switch” regulators of gene networks
Since humans and chimps share 98.5% of DNA yet are very different, this suggests that many major phenotypic differences arise through changes in a few regulatory genes rather than many structural genes
Homeotic genes
transcription factors where a gene’s products provide positional information in an embryo. Mutations in these genes can lead to the development of a structure in a position where another structure would normally develop
Hox genes
occur in clusters on the chromosome and reflect past gene duplications, duplications are associated with morphological complexity and differences in expression are associated with morphological divergence between species over time
Contain highly conserved 180-bp homeobox that encode protein domains that bind the promoter regions of genes that they control
Show spatial, temporal, and quantitative colinearity of expression, so the order of the genes along the chromosome tends to be linearly reflected in where, when, and how strongly they are expressed
Disruption of hox genes can lead to homeotic mutations
Do Hox genes function as homeotic genes?
Yes, but their evolution has been found to predate their function as homeotic genes, and related genes have also been shown to occur in organisms without a differentiated body axis (such as sponges, plants, fungi)
Increased morphological complexity is associated with increases in the number of Hox genes as well, arthropod lineages also have complex and unique patterns of gene expression
Two tempos of morphological evolution that can be distinguished
Gradualism which describes small incremental changes over time, this was one of Darwin’s 5 Theories of Evolution
Punctuated evolution which describes occasional bursts of major morphological change followed by periods of stasis, where there is no morphological evolution apparent in the fossil record. This is found more commonly than gradualism
Punctuated Equilibrium
Punctuated evolution coupled with bursts of new species appearing simultaneously, S. J Gould argued that this accounts for most macroevolutionary patterns and patterns of speciation
Potential explanations of punctuated evolution in the fossil record
Artifact of the incomplete fossil record. There is a lack of good temporal, geographical sampling for most taxa
Stasis is real and reflects stabilizing selection, meaning there is long-term selective maintenance of optimal traits. One hypothesis is that some species shift their ranges on a macroevolutionary time scale to stay in their optimal habitat, which is known as habitat tracking. It is therefore easier to move locations than to evolve morphologically
Stasis is real and reflects evolutionary constraints. Genetic, physical, and developmental processes all constrain which phenotypes can be produced (not likely for amphibians to have more than 5 toes since there are developmental constraints on vertebrate limb development)
Bursts of change are real and reflect evolutionary saltation which is a sudden major phenotypic change or “jump,” Richard Goldschmidt argued that most speciation occurs through macromutations which generate “hopeful monsters” which are new species that arise in a single generation as mutants, this can occur via polyploidy or by homeotic mutations (Termitoxenia has a wing morphology that suggests an origin by homeotic mutation)
Coevolution
interactions between species lineages over time leading to reciprocal adaptation, can be antagonistic between predators/prey, parasites/hosts, pathogens/hosts, or mutualistic where both species benefit through mutual exploitation
Coevolution and interspecific specialization
Can be specific (pairwise coevolution) where evolution is limited to two species/lineages or diffuse (guild coevolution) where there are multiple interacting species
Mirror image phylogenies
co-speciation of interacting species lineages. An example is aphids and Buchnera, bacterial endosymbionts that are transmitted from one aphid generation to the next via vertical transmission
Host shifts
when insects change their plant host, this can lead to incongruent phylogenies between co-evolving lineages since host species can be chemically similar but different species
Factors that favor virulence
Horizontal transmission, which is transmission among contemporary hosts in a population (as opposed to vertical transmission from parents to offspring)
High pathogen transmission rate where there is a favoring of maximal host exploitation (low risk of getting stuck in a dead host)
Vector-borne transmission, where it doesn’t matter if the host is too sick to co-mingle with others (cholera, malaria). These pathogens have higher deaths per infection
What would a haplotype tree indicate about positive selection?
When positive selection is present for an advantageous mutation, fewer older alleles are expected to be prevalent in the population than with neutrality since newer alleles that are positively selected for will become more widespread so most alleles will be recent descendants of a favored allele, which is a shallow haplotype tree. Under this model, haplotypes are closely related and no long internal branches are present but there are many short tip branches
What would a haplotype tree indicate about balancing selection?
Balancing selection is likely present when selection favors the maintenance of two allele lineages that would otherwise go extinct, so long internal branches are likely present. Some processes that can lead to balancing selection are heterozygote advantage, negative frequency dependent selection, disruptive selection, and environmental heterogeneity. All of these will lead to balancing selection at the molecular level