Genetic Drift Flashcards
What is genetic drift/random sampling?
changes in relative allele frequencies due to random events (ex. disasters like hurricanes)
Is genetic drift more influential to variation in small or large populations? why?
small - there’s a higher chance of losing alleles (genetic diversity) if there’s fewer individuals
T or F: genetic drift is an important driver of evolution
true
T or F: direction of genetic drift is unpredictable
true
Why is the direction of genetic drift unpredictable?
genetic drift is a result of RANDOM events, so whether or not, and which alleles will be lost or retained and in what frequencies is totally random
How does genetic drift affect variation within a population (increase, decrease, maintain)?
it reduces variation within a population because alleles can be lost
how does genetic drift reduce variation within a population?
results in loss of alleles –> increases homozygosity
What phenotypes did the 2018 (first) study by Donihue and Losos look at?
they measured toe pad size against body size of lizards before and after (surviving) a major hurricane (genetic drift event) to determine whether natural selection is acting on, and selecting for, larger toe pad size
Why would increased toe pad size and shortened femurs in the island lizards result in increased fitness?
larger toe pads and shorter femurs compared to body size results in a better ability to grip onto branches during a hurricane = increased survival
What did the 2018 (first) study Donihue and Losos find?
their figure showed a clear difference between the average toe pad size (compared to body size) of lizards before and after the two hurricanes. they found that afterwards, the lizards had larger average toe pads
How does the 2018 (first) Donihue and Losos study provide evidence for natural selection?
their study shows a fitness difference between phenotypes (larger vs smaller toe pad size) by finding the average toe pad size after a hurricane event was larger than before the event (those with larger had higher fitness)
Why does the 2018 (first) Donihue and Losos study not provide evidence for evolution?
even though there’s a phenotypic difference, there’s no evidence that the larger toe pad size is genetic/heritable, and for evolution to occur, the phenotype with increased fitness must be heritable
What could Donihue and Losos do to determine the heritability of the toe pad phenotype?
- study the same experiment in the next generation for those traits
- analyze information on the genetic basis of toe pad size
What did Donihue and Losos do in their follow-up study to gain more evidence for evolution of toe pad size?
they looked at the next generation in the same lizard population and found the same results: after the hurricane in 2019, the toe pad sizes were larger
this provides evidence that there’s a connection between survival and the next generation (heritability)
What was the overall major finding of the work by Donihue and Losos?
they provided strong evidence supporting that hurricanes are causing the evolution of toe pad sizes in these island lizards
= on islands where hurricanes were more frequent, the lizards had larger toe pads (across different species and in different locations)
How does genetic drift affect the relative frequencies of genotypes?
it increases homozygosity and decreases heterozygosity because it causes the loss of alleles
What is the expected result in terms of genetic drift from a pop G simulation if the population size is large (10,000 individuals), both alleles are at 0.5 frequency, over 100 generations and no mutations, migrations, or natural selection?
the allele frequencies stay about the same over time and hover around the zero genetic drift line, P(A) = 0.5
How can we use pop G to simulate the effects of genetic drift?
by lowering the population size, keeping allele frequency at 50%, no introduced mutations, migrations, and no natural selection (relative fitness = 1 for all) to compare to the zero genetic drift line
What happens in pop G when the population size is decreased?
the frequencies of A and a fluctuate more and more as the population size decreases = more genetic drift occurring
some of the populations even become fixed for A or a (ie., one of the alleles is lost) = reduction in variation
T or F: the reduction in variation (fluctuation of A and a frequency) in small populations is due to fitness differences between the alleles
false = when genetic drift is a strong influence (when pops are small), the loss of alleles or change in allele frequency is due to chance
What is the evolutionary result of genetic drift (ie., across multiple populations)?
divergent evolution of populations (populations of the same species that have different allele frequencies)
What is a relevant example of genetic drift in humans?
microcephalin - this was not due to fitness differences (no natural selection), just random chance
Explain how bowling can be a metaphor for genetic drift
the width of the bowling lane can be considered the population size = the narrower the lane, the easier it is for the ball to drop into the gutter = an allele to be lost
the proximity to the edge of the lane can be considered the allele frequency = the closer the ball is to the gutter, the more likely it is to go into the gutter = the lower the frequency of an allele, the more likely it is to be lost
What is a genetic bottleneck?
a reduction in population size to a small size
T or F: genetic bottleneck events in a population increase the effects of genetic drift and founder effects
true
what are founder effects?
when genetic variation is dependent on the allele frequencies present in surviving individuals after an event which reduced population size (ie., genetic bottlenecks)
What can cause genetic bottlenecks?
population crashes due to sampling events such as environmental disasters
colonizing a new population from a small number of founder individuals
How long can genetic drift/founder effects continue influencing variation in a population?
a long time, regardless of population growth rate
Explain how elephant seals are a good example of a genetic bottleneck and resulting, persistent genetic drift/founder effects
Dramatic crash in population size caused by over-hunting, down to ~40 individuals in 1850
since, the population has been allowed to rebound back up to ~100,000 but the genetic variation present in that population today is all based on what alleles were present in the 40 survivors in 1850
= genetic drift and founder effects still persist more than 150 years later, regardless of the rebounded pop size
Why might we want to quantify the effects of random sampling/genetic drift?
in order to predict how population sizes can affect the frequency of A and a alleles
How can we quantify the effects of random sampling/genetic drift?
Use a Hardy-Weinberg setup to determine what allele and genotype frequencies look like in a large population prior to a sampling event
then use the binomial distribution-based Wright-Fisher model of genetic drift to compare the probability of the genotype frequencies after a sampling event
What are the steps of quantifying the effects of random sampling?
- large population size with 2 alleles (A, a) with equal frequency (p = q = 0.5) and genotype frequencies = 1/4 AA, 1/2 Aa, 1/4 aa (HWE: p^2, 2pq, q^2)
- random sampling event occurs causing a population crash and only a very small number of individuals survive, ex. n = 4. What is the probability that just by chance, all 4 have the AA genotype?
- based on binomial distribution, if AA has a 1/4 frequency in the original population size and there’s 4 individuals in the population –> (1/4)^4 = 1/256 chance that the 4 surviving individuals are all AA
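The arithmetic in these steps can be checked with a short Python sketch. It assumes, as the card does, that each survivor is an independent draw from a large population in Hardy-Weinberg equilibrium; the function name is made up for illustration.

```python
from fractions import Fraction

def prob_all_same_genotype(genotype_freq, n_survivors):
    """Chance that every one of n independent survivors has the given genotype."""
    return genotype_freq ** n_survivors

p = Fraction(1, 2)        # frequency of allele A before the crash
freq_AA = p ** 2          # Hardy-Weinberg: p^2 = 1/4
result = prob_all_same_genotype(freq_AA, 4)
print(result)             # 1/256
```

Using exact fractions keeps the answer in the same form as the card (1/256) instead of a rounded decimal.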
Explain the Wright-Fisher model of genetic drift and binomial distribution
Binomial distribution assumes that each individual in the experiment has an equally likely chance of being selected = this is why the Wright-Fisher model uses this probability distribution method because individuals in a population have an equally likely chance of surviving a random sampling event
Wright-Fisher model can be used to predict the probability (frequency) of the different genotypes with the number of individuals in the population and their allele frequencies
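A rough way to see the model in action is to simulate it. The sketch below is a minimal Python version, assuming a diploid population of constant size N, non-overlapping generations, and no mutation, migration, or selection. The function name and the parameters (16 individuals, 19 generations, 107 populations, chosen to match the Drosophila experiment described later in these cards) are illustrative, not taken from any particular program.

```python
import random

def wright_fisher(n_individuals, p0, generations, rng):
    """Return the frequency of allele A in each generation under pure drift.

    Each generation, 2N allele copies are drawn at random (binomial
    sampling) using the previous generation's frequency of A.
    """
    copies = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each of the 2N copies is A with probability p.
        count_A = sum(1 for _ in range(copies) if rng.random() < p)
        p = count_A / copies
        trajectory.append(p)
    return trajectory

rng = random.Random(1)  # fixed seed so a run is repeatable
final = [wright_fisher(16, 0.5, 19, rng)[-1] for _ in range(107)]
fixed_A = sum(1 for p in final if p == 1.0)
lost_A = sum(1 for p in final if p == 0.0)
print(f"fixed for A: {fixed_A}, lost A: {lost_A}, "
      f"both alleles kept: {107 - fixed_A - lost_A}")
```

Once a population reaches a frequency of 0 or 1 it stays there, which is the loss of variation the cards describe. Rerunning with a larger `n_individuals` should leave more populations with both alleles.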
What was the classic genetic drift experiment that was basically a real life pop G simulation?
Peter Buri assessed the effects of genetic drift in 1956 - he looked at the frequency of alleles (started at 50/50) in 107 small populations (subject to genetic drift; n = 16, including both males and females) of Drosophila flies over 19 generations
What did Peter Buri find from his Drosophila experiment?
even just from the initial population to the first generation, there was already spread of allele frequencies
after just 5 generations, some populations started to show a huge decrease and near losses of alleles
After 19 generations, he found that 30 populations became fixed for one allele and 30 became fixed for the other
How did Peter Buri’s experiment compare to the Wright-Fisher model of genetic drift?
Buri’s results showed that over a shorter period of time, more populations lost one of the two alleles and there are fewer populations retaining both A and a (on a graph, larger bars on the ends - like the gutters)
whereas,
the WF model shows more populations retaining both A and a alleles (slower loss of variation)
lower peaks on the ends (gutters) = over 19 generations, there were fewer populations fixed for one allele and more populations with both
Why are Peter Buri’s results different from the Wright-Fisher model of genetic drift? Given that they both use n = 16 and initial frequency of alleles = 0.5
the WF model is THEORETICAL which means it doesn’t account for the actual number of reproductively active individuals in a population and assumes all 16 individuals are contributing to reproduction
In real life, just because there are 16 individuals doesn’t mean all 16 are reproducing, and if there are fewer reproducing individuals, the population size is effectively EVEN SMALLER, so we would see an even greater genetic drift effect (faster loss of alleles)
How could the Wright-Fisher model of genetic drift be adjusted to more accurately match the results observed by Peter Buri?
if the population size was decreased to less than 16, it might resemble Buri’s results more due to the effective population size (number of individuals contributing to reproduction) usually being smaller than the actual population size in real life populations
What is the effective population size?
Ne
the number of individuals within a population contributing to reproduction - not every individual within a population is contributing to reproduction
it is the size corresponding to genetic drift (not the actual population size)
Will Ne be smaller or larger than N?
smaller
How does the effective population size affect genetic variation within a population?
the Ne will usually be smaller than the population size, sometimes it can be much smaller
genetic drift has a stronger effect on smaller populations, and since the effective population size is the true representation of the population size, if that value is small, alleles will be lost at a faster rate = genetic variation will be reduced
How can bottleneck/random sampling events affect the effective population size?
after a population crash/in a small population, the ratio of male to female individuals and the proportion of individuals capable of reproducing is RANDOM and likely not equal
ex. a population may now only consist of males in which case there’s no possibility of reproduction
ex. there may be no or very few reproductively mature individuals = alleles not present in the reproducing individuals will be lost
Explain how elephant seals are a good example of effective population size affecting genetic variation
after their numbers were dramatically reduced to ~40, there was asymmetry in the number of males present and the number of males able to contribute to reproduction
only the reproducing males would have been able to pass on their genes, and alleles in non-reproducing males would have been lost
so the founder effects are even stronger in this population = the current ~100,000 individuals have alleles based on probably even fewer than 40 individuals
results in very low genetic variation
What is another example of effective population size contributing to genetic variation?
In bee colonies, only the Queen bee (a single female) contributes to reproduction = very small effective population size compared to actual population size
What is the effective population size in humans? how does it compare to the actual population size?
Ne = 10,400 (estimated)
N = ~8 billion (estimated)
Effective population size is difficult to measure, so how do we do it?
we estimate it using the Wright-Fisher model
What happens in a small population when a new neutral (no effect to fitness = no natural selection) mutation arises?
the smaller the population, the higher probability that the new neutral mutation will become fixed
= the new mutation will replace other alleles just by chance because of the small pop size
How does a small population size affect the probability of a new neutral mutation becoming fixed?
the smaller the pop, the more likely a new neutral mutation will become fixed
What happens in pop G when:
pop size = 100
pop number = 100
generation = 500
neutral mutation = fitnesses = 1
new mutation introduced, frequency of A = 0.99
just by chance, at least one population (ex. 99 of the 100 populations) becomes fixed for the new neutral mutation (A allele)
= the introduced neutral mutation (A allele), not at all due to fitness differences (no natural selection), is expected to replace the a allele in at least one (ex. 99 of the 100) populations
What happens in pop G when:
pop number = 100
generation = 500
neutral mutation = fitnesses = 1
new mutation introduced, frequency of A = 0.99
but you increase the population size?
As population size increases, fewer populations become fixed for the new mutation = the new mutation is less likely to replace the a allele just by chance
ex. at N = 10,000, only 50 of the 100 populations were fixed (compared to the 99 fixed when N = 100)
What happens in a small population when a deleterious mutation arises?
it is more likely to become fixed
What happens in pop G when:
population size is small (10-100)
pop number = 100
generation = 500
deleterious mutation: fitness of AA = 0.75, Aa and aa = 1
new mutation introduced: frequency of A = 0.99
When the population is small and the new mutation causes decreased relative fitness (AA = 0.75), more populations become fixed for the A allele and few lose it
ex. N = 10, 89 of 100 populations become fixed for the deleterious mutation A
ex. N = 100, 47 of 100 pops become fixed for A
will genetic drift or natural selection have a stronger effect on allele frequencies?
it depends on how strongly the processes are acting on a population
How can we determine the strength of genetic drift and natural selection to compare them?
by comparing s with 1/Ne
s = strength of natural selection (fitness & differential fitness)
1/Ne = strength of genetic drift (pop & Ne size)
What does it mean if s > 1/Ne?
natural selection will have a stronger influence on the allele frequencies
What does it mean if s < 1/Ne?
genetic drift will have a stronger influence on the allele frequencies
If s > 1/Ne, will Ne be large or small?
If s < 1/Ne, will Ne be large or small?
s > 1/Ne = Ne will be large (larger pop size, less genetic drift)
s < 1/Ne = Ne will be small (smaller pop size, more genetic drift)
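The comparison in the last few cards can be written as a tiny Python helper. This is only a sketch of the rule of thumb on the cards (s versus 1/Ne); the function name and the example values of s and Ne are hypothetical.

```python
def stronger_force(s, ne):
    """Compare the selection coefficient s with the drift strength 1/Ne."""
    drift = 1 / ne
    if s > drift:
        return "natural selection"
    if s < drift:
        return "genetic drift"
    return "comparable"

# Same selection coefficient, three different effective population sizes.
for ne in (10, 1_000, 100_000):
    print(ne, 1 / ne, stronger_force(0.01, ne))
```

With s held at 0.01, drift wins at Ne = 10 (1/Ne = 0.1) and selection wins once Ne is large enough that 1/Ne drops below 0.01.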
T or F: if genetic drift is the prevailing force, natural selection cannot also be occurring
false, natural selection can still be acting, it just won’t be as strong as the effects of genetic drift
in what population sizes are new slightly advantageous mutations (ex. fitness of AA = 0.99, Aa and aa = 1) more likely to become fixed? what does this show?
large populations where genetic drift is weak
this takes a VERY long time (ex. Pop G = 500,000 generations) and very large Ne
this shows the balance between chance and fitness (ie., genetic drift and natural selection)
in what population sizes are new neutral mutations more likely to become fixed?
small populations where genetic drift is strong
in what population sizes are new deleterious mutations more likely to become fixed?
small populations where genetic drift is strong
Describe the study about inbred wolves on Isle Royale
the wolf population was founded on Isle Royale (an island) in 1950
population size remained small - fluctuated around ~25, max was ~50 but eventually dwindled down to 2 wolves
small population and isolation led to strong genetic drift effects and inbreeding = decreased fitness –> losing variation and mating with each other caused deformities and population crash
genetic rescue was proposed to introduce gene flow from mainland wolves
What were the major results of the Isle Royale wolf study?
the population size has increased due to new individuals but also, more importantly, due to higher survival rates in offspring (less inbreeding = more diversity in alleles = higher fitness)
What is gene flow?
movement of individuals (or gametes) between populations
How can gene flow affect population structure?
the introduction of new genes into a population can change the genetic composition (the allele frequencies) of the population = increases genetic diversity
What is the major benefit of gene flow?
it increases genetic diversity in a population
What evolutionary effect does gene flow have on multiple populations?
by introducing genetic diversity within populations, it counteracts divergent evolution, caused by genetic drift, of populations = homogeneity across populations
How are gene flow and genetic structure measured?
using F statistics - developed by Sewall Wright
What are F statistics?
F = fixation (F is a fixation index)
a way to measure variation between populations by comparing observed heterozygosity with expected heterozygosity - specifically measures the loss of heterozygosity
What was Sewall Wright’s major study organism?
guinea pigs
What is FST?
a measure of population differentiation = the different allele composition in different populations of a species = a measure for how genetic variation is organized between populations
What is the range of FST values? What are the basic interpretations of the value?
0 to 1
where <0.05 is very little structure and >0.25 is a lot of structure
What does it mean for a population to have little structure? How much gene flow is there?
there’s a lot of gene flow and the distribution of A and a alleles between populations is similar = both populations have similar composition
What does it mean for a population to have a lot of structure? How much gene flow is there?
there’s not much gene flow and the distribution of A and a alleles between populations is very different = the populations look very different
What does it mean when FST = 0?
populations have the same alleles in the same frequencies
lots of gene flow
What does it mean when FST = 1?
populations are fixed for different alleles = there is only homozygosity
no gene flow
What is the formula used in F Statistics?
F = (expected H - observed H) / expected H
where H = heterozygosity
What is the formula used to calculate FST?
FST = (total H - subpopulation H) / total H
where H = heterozygosity
What are the challenges for F statistics?
it’s difficult to describe what defines a population or subpopulation
it can be difficult to describe what may be preventing gene flow or if gene flow is even being prevented
Calculate FST for this example:
2 alleles that determine coat colour in wolves, A and a. Sample 5 subpopulations and find the following frequencies of the A allele (p):
0.2, 0.25, 0.6, 0.8, 0.00
interpret the FST value
- calculate heterozygosity (2pq) for each subpopulation
- take the average of the 2pq values to find the average expected heterozygosity within subpopulations (HS)
HS = [2(0.2)(0.8) + 2(0.25)(0.75) + 2(0.6)(0.4) + 2(0.8)(0.2) + 2(0)(1)] / 5 = 0.299
- find overall frequency of A allele from the average frequencies:
overall p = (0.2 + 0.25 + 0.6 + 0.8 + 0)/5 = 0.37
- use total p to calculate total expected heterozygosity HT (2pq)
HT = 2pq = 2(0.37)(0.63) = 0.4662
- calculate FST with formula:
FST = (HT - HS) / HT
FST = (0.4662 - 0.299) / 0.4662 = 0.3586
interpretation: FST is relatively high (>0.25) - more reduction in heterozygosity = populations have very different frequencies of A and a alleles = lots of structure, more homozygosity in populations –> less gene flow and pops look different
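The worked example above can be reproduced with a few lines of Python. The sketch assumes one locus with two alleles and equally sized subpopulations (so a simple average is used for both HS and the overall p, as on the card); the function name is illustrative.

```python
def fst_from_allele_freqs(freqs):
    """Return (HS, HT, FST) for one biallelic locus.

    freqs: frequency of allele A in each subpopulation.
    Assumes equal subpopulation sizes and that the locus is not fixed
    for the same allele everywhere (otherwise HT is 0 and FST is undefined).
    """
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean 2pq within subpopulations
    p_bar = sum(freqs) / len(freqs)                          # overall frequency of A
    h_t = 2 * p_bar * (1 - p_bar)                            # total expected heterozygosity
    return h_s, h_t, (h_t - h_s) / h_t

h_s, h_t, fst = fst_from_allele_freqs([0.2, 0.25, 0.6, 0.8, 0.0])
print(round(h_s, 4), round(h_t, 4), round(fst, 4))  # 0.299 0.4662 0.3586
```

As a sanity check against the earlier cards, identical frequencies in every subpopulation give FST = 0, and two subpopulations fixed for different alleles (0.0 and 1.0) give FST = 1.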
Can FST be high for some genes and also low for others in the same organism or population? why?
yes
When the FST is very different for one allele (ex. c allele found in 1 population out of 4, but A found in all 4), it is not being transferred into the other populations via gene flow even if gene flow is occurring between populations because
maybe there’s some ecological niche difference between the populations where that allele is favoured by natural selection in one environment but not the other - it is not as advantageous in the other pop so it will be selected against
basically because positive selection does not affect genomes in the same way in all populations because there’s other factors contributing to what is advantageous (ex. melanic moths in polluted vs. non polluted environments)
How is gene flow affected by positive natural selection?
positive natural selection will act differently on the genomes of individuals in different populations because there are other contributing factors influencing fitness
How are melanic moths a good example of how gene flow is affected by natural selection and how FST can vary?
the melanic A allele is favoured in polluted environments whereas in non-polluted, the wild type a allele is favoured
gene flow could still be occurring between these populations but the transfer of the A or a allele into the opposite population is not advantageous
How would genetic drift be expected to act across different populations of the same size? how does this compare to gene flow?
genetic drift would be expected to act in the same random way in all small populations of a species
whereas
gene flow does not act randomly because it is influenced by natural selection not random events
What can genes with outlier FST values (when compared to the rest of the genome) tell us?
these outlier FST values can indicate strong positive selection (and adaptation)
where the other genes in the genome show the baseline amount of genetic structure and gene flow to use as a comparison for any deviations
How does artificial selection affect gene flow?
it prevents gene flow by artificially selecting for traits to get the same product every time (ex. dog breeds)
What was the goal of the study by Akey et al., about artificial selection in the dog genome?
they were looking for any significant deviations of FST values in the dog genome across different breeds to determine signs of positive selection (ie., mutations associated with particular dog breeds)
What did Akey et al. do in their study of the dog genome?
they genotyped ~21,000 autosomal SNPs in 10 dog breeds such as Shar-Peis, German shepherds, poodles
When looking specifically at the Shar-pei genome, what did Akey et al. find?
there were 9 regions within the genome that had FST outlier values
When looking at the dog genomes, what does di mean?
FST deviation for breed i (compared to the average FST value for the entire genome)
Looking at a graphical representation of the Shar-pei genome, how can we tell which regions had FST values that were outliers?
they reached above the dashed red line
Why did we look more closely at a section of chromosome 13 from the Sharpei genome?
it had an FST value with significant deviation from the baseline for the rest of the genome
What 3 genes are in chromosome 13 of the sharpei genome? which one did we look at in more detail?
HAS2, SNTB1, FTSJ1
HAS2 = hyaluronan synthase 2
What is hyaluronan?
aka hyaluronic acid
found throughout the body (ex. joints, eyes, skin)
produced by hyaluronan synthases, including and especially HAS2
function: stretches skin, increases wound healing
How is hyaluronan related to Sharpeis?
sharpeis have unique skin morphology that is super stretchy = hyaluronan functions in helping skin stretch
What did Akey et al. find in relation to sharpeis and the HAS2 gene?
that HAS2 amount was increased and expression was elevated in the Sharpei genome
How does the HAS2 gene in sharpeis compare to that in other dog breeds?
All dogs have hyaluronan and the HAS2 gene, and it is not a different type of hyaluronan being produced, but there’s just a higher baseline in the sharpei genome than in other dog breeds
which makes sense because they have way stretchier skin than other dog breeds
What other animal examples are there of increased HAS2 gene expression?
naked mole rats: produce very high levels of hyaluronan
- they have a nonsynonymous mutation in their HAS2 gene (so it is slightly different than sharpei’s)
humans:
rare disease, cutaneous mucinosis, caused by very high levels of hyaluronan production
without isolation experiments, and using only sequence analysis, how could you determine whether a gene (or part of a genome) is essential?
look at population samples or across species to see whether the gene or genome section is present in all of them = if it’s highly conserved it’s a strong indicator that it’s essential
ie., look at genetic variation across different groups = low variation means strong purifying selection is acting to conserve the gene or segment
also can compare amount of nonsynonymous vs synonymous mutations
What does it mean if a gene or section of a genome is highly conserved across populations or species?
it is most likely essential
and there’s strong purifying selection acting to keep it
What is an example of two divergent lineages sharing highly conserved genes?
Human and yeast ubiquitin proteins
the amino acid sequence is 96% similar
the nucleotide sequence is 75% similar
How can we determine the amount of variation in a species?
use dN and dS to compare the amount of nonsynonymous and synonymous mutations
What is dN?
What is dS?
dN = a measure of nonsynonymous mutations within a species
dS = a measure of synonymous mutations within a species
In general, is dS > dN? or dN > dS?
dS > dN = there’s more synonymous mutations than nonsynonymous
How can we determine the amount of variation between species?
use KA and KS to describe nonsynonymous and synonymous divergence
What is KA? What is KS?
KA = a measure of amino acid changes in the amino acid sequence caused by nonsynonymous mutations
KS = a measure of substitutions in the nucleotide sequence caused by synonymous mutations
When there is strong functional constraint (strong purifying selection), what is the ratio of KA to KS? dS and dN?
KA / KS = close to 0
dS > dN
there’s more synonymous mutations present and purifying selection is acting to remove the nonsynonymous mutations
What did the study by Kachroo et al. about humanization of yeast genes do?
they replaced 414 essential yeast genes with their human orthologs (the human version of the same gene) to determine how many could be replaced and have the culture still survive
using a temperature-sensitive allele inactivation assay
When did humans and yeast diverge from a common ancestor?
~1 billion ya
What are orthologous genes?
genes with a shared ancestry (found in both lineages from a shared ancestor)
How many orthologous genes do humans and yeast share?
thousands (~1/3 of the genes in the yeast genome are shared with humans)
What did the study by Kachroo et al. about humanization of yeast genes find?
they found that ~half (47%) of the yeast genes could be ‘humanized’ (replaced) by the human ortholog and the yeast culture would still survive
What can be interpreted from the results of Kachroo et al.?
the orthologous genes and their functions are highly conserved across both the human and yeast genome
What selection can we expect when KA/KS or dN/dS < 1 (~0)?
strong purifying selection is acting to remove the nonsynonymous mutations
ex. ubiquitin in humans and yeast
What selection can we expect when KA/KS or dN/dS = ~1?
no strong purifying selection
there are few fitness differences, so little selection is acting = neutral evolution
ex. pseudogene (nonsense mutations) or recent gene duplication
What selection can we expect when KA/KS or dN/dS > 1?
strong positive selection
lots of nonsynonymous mutations and few synonymous
natural selection is favouring mutations that cause amino acid changes (nonsynonymous mutations) = more variation
ex. MHC
genes involved in immunity will have more variation to increase effectiveness
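The three cases on the last few cards can be summarized in a short Python sketch. The function name, the `tolerance` band around 1, and the example rates are all hypothetical, chosen only to illustrate the rule; real analyses use statistical tests rather than a fixed cutoff.

```python
def interpret_ratio(nonsyn_rate, syn_rate, tolerance=0.1):
    """Classify KA/KS (or dN/dS).

    `tolerance` is an arbitrary illustrative band around 1 and is
    not a standard threshold.
    """
    ratio = nonsyn_rate / syn_rate
    if ratio < 1 - tolerance:
        return ratio, "purifying selection"
    if ratio > 1 + tolerance:
        return ratio, "positive selection"
    return ratio, "approximately neutral"

# Made-up rates for a conserved gene, a pseudogene-like case, and an immune-like gene.
print(interpret_ratio(0.02, 0.80))
print(interpret_ratio(0.50, 0.50))
print(interpret_ratio(1.50, 0.60))
```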