Population Genetics - Drift Flashcards
When does genetic drift occur?
Genetic drift occurs whenever a population is not infinite in size
- Occurs when the infinite population size assumption is violated
MEANS it can be going on in every population
How important is genetic drift?
Very important: on par with natural selection (almost if not just as important as natural selection)
Where is drift important?
Important in small populations BUT it is not absent in large populations
- Can still have drift in large populations
Important for variation within and between species
What does H-W mean by infinite population size?
Means that the gamete pool is sampled exhaustively
- Every single copy of an allele in the gamete pool is represented exactly once in the next generation
Example of infinite populations
If we sample a subset of the gamete pool, the larger our sample size, the closer we will be to the actual allele frequency in the underlying population (larger subset/sample size = closer to the actual allele frequency)
What happens when we subset a population?
If we subset a population = we open ourselves up to error
If we sample 50 alleles out of the total population -> we might get the actual allele frequency but we might not = drift
If we are not sampling fully = open to drift
How do we get drift?
Comes from sampling error -> if we do not sample the population fully, we are open to sampling error and to the drift that sampling error causes
- Sampling error in the gamete pool = open to change in allele frequency = drift -> a continuous mechanism in populations
How can you be sure you get the actual population allele frequency?
The only way to be sure that we land exactly on the population allele frequency (when sampling with replacement) is to take an infinite number of samples
If we take a fixed number of samples = have error UNLESS we sample exhaustively; with replacement, the only way to be sure of getting the same allele frequency is to sample an infinite number of times
Parts of genetic drift (things that influence it)
- Random mortality
- Sampling errors in zygote formation
Random mortality
Part of genetic drift BUT not required for drift to occur
Why do we need "with replacement"?
Equal reproductive potential doesn't mean everyone reproduces the same amount -> even if everyone gives 2 alleles with the same probability, we still don't know which allele goes into each zygote -> the second generation can mismatch the first when we sample with replacement
Larger sample size + drift
Larger sample size = more sure we are to get back the initial underlying frequency
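The sample-size point can be checked with a small simulation. This is only a sketch, not part of the original notes: the pool frequency of 0.6, the two sample sizes, the number of trials, and the function names are all illustrative assumptions.

```python
import random

def sample_frequency(p, n_gametes, rng):
    """Draw n_gametes from a pool with allele frequency p; return the sampled frequency."""
    hits = sum(1 for _ in range(n_gametes) if rng.random() < p)
    return hits / n_gametes

def mean_abs_error(p, n_gametes, n_trials, rng):
    """Average distance between the sampled frequency and the true frequency p."""
    total = sum(abs(sample_frequency(p, n_gametes, rng) - p) for _ in range(n_trials))
    return total / n_trials

rng = random.Random(1)  # fixed seed so the run is repeatable
small = mean_abs_error(0.6, 20, 500, rng)     # small sample of the gamete pool
large = mean_abs_error(0.6, 2000, 500, rng)   # much larger sample
print(f"mean error with 20 gametes:   {small:.3f}")
print(f"mean error with 2000 gametes: {large:.3f}")
```

The larger sample should land much closer to 0.6 on average, but neither sample size makes the error exactly zero.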
What happens if we lose infinite population size?
As soon as we ease off this assumption, chance events can start to influence allele frequencies
True even in very large populations
Large vs. small populations + drift
Can have drift in large populations BUT it is more pronounced in small populations
Genetic drift
Random changes in allele frequencies in a population
What causes genetic drift?
- Sampling errors in zygote formation: the main mechanism behind drift
- Also includes random death: individuals not surviving to reproduction in a way that has nothing to do with genotype/phenotype
- Mortality that differs at the individual level, NOT at the genotype level
- Also can be due to random death/survival events where the probability does NOT vary as a function of a trait or genotype
Overall: If not everybody gets a chance to contribute to the next generation, that imposes allele frequency changes even if every genotype has the same fitness
- Randomness in reproduction/death and mis-sampling of the gamete pool = change in allele frequency = evolution
- Changes allele frequency but in a different way than NS
Why does random death influence genetic drift?
If not everybody gets a chance to contribute to the next generation, that imposes allele frequency changes even if every genotype has the same fitness
- Randomness in reproduction/death and mis-sampling of the gamete pool = change in allele frequency = evolution
Selection vs. drift
BIG difference = predictability
Selection = deterministic (if you know the start, you know how it ends)
Drift = stochastic (probabilistic): we don't know exactly what will happen because it is based on random sampling
- NOW using a different, probabilistic perspective
- Can look at which allele frequencies are more likely
BOTH change allele frequencies but in different ways
Drift is...
Stochastic: we can't predict the outcome
- Even if we know the starting point of the system, we can't know where it will end up
At a given point we might be able to calculate the probability of ending up at a particular state, but we won't know what will actually happen in that instance
- We can look at the most likely outcome from one generation to the next
Some aspects of drift are...
Some general aspects of the outcome of drift are inevitable BUT we can't know the end state (nor even the state of the next generation) for a particular population
- Has some inevitable features
How does drift work in a model?
Start: 60% A1 and 40% A2
THEN sample an infinite number of times: get back 60% A1 and 40% A2
THEN if we only sample 10 times, maybe we will get 60% A1 and 40% A2
- Instead of infinite zygotes we just make 10
If we make 10 zygotes we can get 60% A1 and 40% A2, but we can also get 6 A1A1, 2 A1A2, 2 A2A2 (very plausible when only 20 allele copies are drawn) - NOW 70% A1 NOT 60% -> the allele frequencies are different = evolution is taking place
IF we sample 100 times, maybe we will get 60% A1 and 40% A2 OR maybe we will get 61% A1 and 39% A2
THEN the change in allele frequency sticks around
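The 70% figure comes from counting allele copies in the 10 zygotes. A minimal sketch of that arithmetic (the function name is just for illustration):

```python
def allele_frequency(n_a1a1, n_a1a2, n_a2a2):
    """Frequency of A1 given counts of the three genotypes."""
    total_alleles = 2 * (n_a1a1 + n_a1a2 + n_a2a2)   # every zygote carries 2 copies
    a1_copies = 2 * n_a1a1 + n_a1a2                   # homozygotes give 2, heterozygotes 1
    return a1_copies / total_alleles

# 6 A1A1, 2 A1A2, 2 A2A2 -> 14 A1 copies out of 20
p_new = allele_frequency(6, 2, 2)
print(p_new)
```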
Example #2: how drift leads to change in allele frequency (sampling red and blue gametes)
IF we start with 50% A and 50% a (50% red and 50% blue) with 20 alleles in the gamete pool (10 red and 10 blue)
IF we sample a fixed number of times (pick 20 gametes) -> we can get p = 0.5 and q = 0.5
- Each draw, every copy has a 1 in 20 chance of getting chosen BUT some might get chosen twice and some might not get picked at all
Here = sampling 20 times with replacement = some might get chosen twice and some might not get chosen at all
- Means that you can start with 10 and 10 BUT end with 7 and 13 = have evolution
BUT what if we did this many, many times (sampled 20 in many rounds, each with replacement)?
Result = stabilize around 50:50
What happens with repeated sampling (samples of 20 many times)?
If start with 50:50 -> end up stabilizing around 50:50
- If we keep sampling 20 again and again from the same starting pool -> end up stabilizing around 50:50
- The highest probability outcome = 50:50
Average end outcome = 50:50 BUT you do not get 50:50 in each generation
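A sketch of the repeated-sampling idea, assuming the pool is reset to 10 red and 10 blue before every round (the number of rounds and the seed are arbitrary choices, not from the notes):

```python
import random

rng = random.Random(7)
pool = ["red"] * 10 + ["blue"] * 10   # reset to 50:50 before every round

red_counts = []
for _ in range(5000):
    draw = [rng.choice(pool) for _ in range(20)]  # 20 draws with replacement
    red_counts.append(draw.count("red"))

mean_red = sum(red_counts) / len(red_counts)
exactly_even = sum(1 for c in red_counts if c == 10) / len(red_counts)
print(f"average red per round: {mean_red:.2f} of 20")
print(f"rounds that landed exactly on 10:10: {exactly_even:.0%}")
```

The average sits near 10 red, yet most individual rounds miss 10:10, which is the gap between the "on average" result and any single generation.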
If we end at 50:50 -> why do populations change?
If drawing 20 samples with replacement again and again averages out at 50:50 (the starting point) -> why do populations change?
Because in real populations you are not resetting back to the starting amount each generation
If start with 50:50, each generation you are not resetting back to 50:50 (in the example you put all the red/blue back = restart at 50:50 each round) -> since there is no reset, next time you sample from the changed population with its changed allele frequencies
- The end result changes because the change sticks around = it changes the probabilities that we are sampling from
Ex. Go from 10 and 10 -> 7 and 13. NOW we are sampling from 7 and 13 (not going back to 10 and 10): sampling a new ratio
- The change sticks around
- Maybe now you will get 7:13 again or maybe you will get 8:12: now going p = 0.5 -> p = 0.35 -> p = 0.4
- Each time we mis-sample, something changes and the change sticks
THIS IS THE BASIS FOR GENETIC DRIFT
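The "change sticks" idea can be sketched by feeding each generation's result back in as the next pool. The pool of 20 and start of 10 red come from the example above; the 15 generations, the seed, and the function name are assumptions for illustration.

```python
import random

def next_generation(red, pool_size, rng):
    """Sample pool_size gametes with replacement from a pool holding `red` red copies."""
    p = red / pool_size
    return sum(1 for _ in range(pool_size) if rng.random() < p)

rng = random.Random(3)
red = 10                 # start at 10 red : 10 blue
history = [red]
for _ in range(15):
    red = next_generation(red, 20, rng)   # no reset: sample from the NEW pool
    history.append(red)

print(history)           # red copies per generation, wandering away from 10
```

If red ever reaches 0 or 20 it stays there, because the pool it is sampled from no longer contains the other color.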
If our highest probability outcome is to get the same allele frequency, why does this result in evolution?
(If we end up stabilizing around 50:50, why do we get evolution?)
- In a single generation, the "on average" part doesn't matter: there's only one round of zygote selection
- Any sampling errors (deviation of a subset from the true population values) stick
- Even if we started at p = 0.5 -> the population is restocked at the new value of p = 0.35, THEN you sample another 20 gametes BUT you are sampling from p = 0.35 and then might get p = 0.4
Direction of genetic drift
It's not directional: p can go up or down, and we can't know the direction or magnitude of the change from one generation to the next -> there is no analytical ∆p equation for drift!
Experiment: most likely outcome from each sample
If start at p = 0.6, when sampling gametes we most often get back p = 0.6
- Getting back p = 0.6 is the most likely outcome BUT in the experiment it only occurred 18% of the time. MEANS that 82% of the time we get something other than p = 0.6 -> the likelihood of staying at 0.6 is low even though it is the most common outcome
- 82% of the time = get a change in allele frequency
- Standard error = the average distance between the sampled and true frequency for a given sample size
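The probability of landing exactly on the starting frequency can be computed from the binomial distribution. The notes do not say how many gametes were drawn in the experiment; the sketch below assumes 20, which happens to give a value close to the 18% quoted above.

```python
from math import comb

def prob_same_frequency(p, n_gametes):
    """Binomial probability that n_gametes draws reproduce frequency p exactly."""
    k = round(p * n_gametes)            # copies needed to land exactly on p
    return comb(n_gametes, k) * p**k * (1 - p)**(n_gametes - k)

stay = prob_same_frequency(0.6, 20)     # assumed sample of 20 gametes
print(f"P(stay at 0.6) = {stay:.3f}")
print(f"P(any change)  = {1 - stay:.3f}")
```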
How do we talk about genetic drift?
You can't know whether p will increase or decrease, or the magnitude of the change, BUT you can get the probability that p will change
***There is no ∆p equation for drift
Inevitable aspects of drift: cards experiment
If we sample cards (52 cards) with replacement (can choose a card more than once), it is inevitable that you will eventually go to fixation for one card
END = inevitably you will only have one card
- In the end you get a single card all 52 times
- You don't know which card but you know you will fix for one card
- Sampling with replacement (so that you restock the deck each round): you will inevitably end up with a single card at the end
- This happens because each time one is missed from sampling error, it's gone for good
- Variation is lost through time
We know that we will end up with a single card but we can't know for sure which card that will be -> Means that drift tends towards fixation in the long run but we don't know which allele that will be
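A sketch of the card experiment, assuming each round's deck is rebuilt from 52 draws (with replacement) out of the previous round's deck. The cards are just numbered 0 to 51, and the seed is arbitrary.

```python
import random

def rounds_until_one_card(deck_size, rng):
    """Resample the deck from itself until only one kind of card is left."""
    deck = list(range(deck_size))          # every card starts as a unique copy
    rounds = 0
    while len(set(deck)) > 1:
        # cards that are not drawn this round are gone for good
        deck = [rng.choice(deck) for _ in range(deck_size)]
        rounds += 1
    return deck[0], rounds

rng = random.Random(52)
winner, rounds = rounds_until_one_card(52, rng)
print(f"card {winner} fixed after {rounds} rounds")
```

Changing the seed changes which card wins and how long it takes, but every run ends with a single card.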
Inevitable aspects of drift
When sampling pools a finite number of times = eventually get fixation for one allele
- Means sampling errors are important; critical effect = when you lose something, you lose it for good
- Each time one is missed from sampling error, it's gone for good
Means that drift tends towards fixation in the long run but we don't know which allele that will be
Effect of drift = genetic variation decreases: every time we mis-sample and a copy is not represented, that copy is lost = lose variation
Effect of drift
When you lose something = lose it for good == when you lose that variation (when an allele goes to fixation) you lose it for good
- If variation gets lost = variation decreases
END = fix for one allele
Likelihood of picking certain alleles + likelihood of fixing for alleles
Every copy of each allele has an equal probability of going to fixation (any one copy of A or any one copy of a is equally likely to reach fixation)
- In a given population, each individual allele copy segregating in the population has an equal probability of being the one that goes to fixation eventually (doesn't mean that the p and q alleles are equally likely to go to fixation; means individual copies of alleles in the gamete pool are equally likely to go to fixation)
- If there are 20 allele copies in a population, then the probability of any one copy going to fixation is 1/20
IF p = 0.6 and q = 0.4 -> NOT equally likely for p or q to be fixed BUT each individual copy has the same probability
Each copy in the gamete pool has the same probability
Example: if there are 20 allele copies, each copy has a 1/20 chance of going to fixation
IF p = 0.6 and q = 0.4 ->
The events of p OR q going to fixation are mutually exclusive (ME)
- One copy of p or one copy of q
Also ME: one copy of p will go to fixation or a different copy of p will go to fixation
- P(one copy of p going to fixation) OR P(a different copy of p going to fixation) = 1/20 + 1/20. Since there are 12 copies of p (and the events are mutually exclusive), add all twelve 1/20 terms = 12/20 = 0.6
Probability of p OR q going to fixation
p going to fixation and q going to fixation are mutually exclusive
ALSO ME for one copy of p OR a different copy of p to go to fixation
Example of probability of p or q going to fixation
p = 0.6 and q = 0.4 -> probability of p going to fixation = 60% and q going to fixation = 40% (because the probability of any one copy going to fixation is 1/20 and ME = can add the individual probabilities together)
- The probability being 0.4 is ONLY for the starting generation -> as soon as the allele frequency changes due to drift, that probability resets
THEN if p = 0.45 and q = 0.55
NOW the probability of p going to fixation is 45% and the probability of q going to fixation is 55%
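The claim that the fixation probability equals the current frequency can be checked by simulation. This is a rough sketch: the pool of 20 copies with 12 copies of p follows the example, while the 2000 replicates, the seed, and the function name are assumptions.

```python
import random

def fixes_for_p(copies_p, pool_size, rng):
    """Drift one pool until it fixes; return True if the p allele is the one that fixes."""
    while 0 < copies_p < pool_size:
        freq = copies_p / pool_size
        # next generation: pool_size draws with replacement from the current pool
        copies_p = sum(1 for _ in range(pool_size) if rng.random() < freq)
    return copies_p == pool_size

rng = random.Random(11)
replicates = 2000
fixed = sum(fixes_for_p(12, 20, rng) for _ in range(replicates))
estimate = fixed / replicates
print(f"simulated P(p fixes) = {estimate:.2f}")
print(f"12 copies x 1/20     = {12 / 20}")
```

Starting the same function from 9 copies instead of 12 should give an estimate near 0.45, matching the "probability resets" point.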
Probability of p or q going to fixation
Probability of p or q going to fixation = equal to the frequency of p or q in the population BUT only at that starting point. IF the frequency changes, that change sticks around and changes the probability
What happens to the probability of an allele going to fixation once the allele frequency changes?
As soon as the allele frequency changes due to drift, that probability resets
Example: if start with p = 0.4
IF the sampling error going into the next generation causes the frequency to change to p = 0.55
- NOW the probability of blue going to fixation has gone from 0.4 to 0.55
NOTE: There was nothing intrinsic about blue that made it less likely to go to fixation in the last generation
Drift + population size
Larger populations, or more specifically larger numbers of zygotes being produced between generations in those populations, produce less sampling error = less drift
- Larger sample size = smaller sampling error
- Larger population = approach getting back the same allele frequency
- As population size gets bigger = get closer to the starting allele frequency
- If you lose a little bit of variation = negligible drift in a large population BUT in a small population this can cause a large amount of change
(Just as a larger sample size gives us more accurate results in statistics)
Have an inversely proportional relationship between population size and the effect of drift (smaller = larger effect of drift)
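One way to put numbers on this is the standard binomial result for sampling 2N gametes: the variance of p after one generation is pq/(2N), so it is inversely proportional to N, and the standard error is its square root. The formula is a textbook result rather than something stated in these notes, and the population sizes below are arbitrary.

```python
from math import sqrt

def drift_standard_error(p, n_individuals):
    """One-generation standard error of p when 2N gametes are sampled (binomial)."""
    return sqrt(p * (1 - p) / (2 * n_individuals))

for n in (10, 100, 1000, 10000):
    se = drift_standard_error(0.5, n)
    print(f"N = {n:>5}: one-generation SE of p = {se:.4f}")
```

The error never reaches zero for any finite N, which matches the point that drift is weaker but not absent in large populations.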
Drift in small populations
Drift is a stronger evolutionary force in small populations (harder for NS to reckon with)
- True for small populations that stay small for many generations (like rare or endangered species) AND also true for events in large populations that cause temporary population reductions
- True in populations that are small and stay small AND true in otherwise large populations that go through a period of population contraction
Effect of drift after a contracted population increases back in size
Effect of drift stays around even after the population gets larger again
2 terms describing events that lead to strong drift in otherwise large populations
- Founder events
- Bottlenecks
Cause an otherwise large population to go through a contraction
BOTH have a short-lived stage of small population size
Founder events
A new population is derived from a small number of individuals drawn from a large ancestral population
- Starting population derived from a small number of individuals
Ex. A small number of people go to colonize islands
- The new population experiences lots of drift = its allele frequencies can be very different from the ancestral population's
Bottleneck effect
A population's history is marked by one or more generations of very small population size before regrowth
Bottleneck vs. founder
Bottleneck = not starting a new population: a population that already exists in a fixed place THEN goes through a collapse and bounces back
FOR BOTH: the effect of drift while at small population size carries over even if the population goes back to a larger size
Example of drift in humans: Salinas
Village of Salinas in a remote part of the Dominican Republic
- Population of 4300 in the 1970s
- Founded 7 generations ago by a much smaller number of people
- At founding = very little immigration into the area; the gene pool came mostly from the families who settled there
- In the founding = had disproportionate representation: traced back to one man who had kids with 4 women = became over-represented in the gene pool -> He happened to be a carrier for a mutation in 5-alpha-reductase-2
- Phenotype in homozygotes = a malfunctioning copy of the enzyme that converts testosterone -> DHT in utero (DHT is the form of testosterone that the fetus responds to). They are not making the right version of testosterone -> without the masculinization cue the fetus develops as female -> an XY individual with female genitalia who presents as female even though XY
- They stay presenting as female until the body begins responding to testosterone at puberty and they switch: born anatomically female, then switch
- Everyone was used to it: 1% of girls switch to male at puberty
Male vs. female in utero
Make testosterone differently/respond to testosterone differently -> difference in detecting testosterone
DHT is the version of the hormone that really triggers external masculinization during development
Mutation in the Dominican Republic population
Allele frequency in the population = 10% -> 1% of XY individuals have this phenotype
- 10% maintained: how is this maintained?
How is the mutation maintained in the DR population?
Is this unique allele maintained solely because of random over-representation in a founder event? YES
NOT maintained due to mutation-selection balance -> selection had nothing to do with the initial rise in frequency (the man didn't even have the phenotype)
- The high prevalence of this gene in Salinas (over 1/10 men are carriers) is just due to historical happenstance
He had kids with 4 women in a population that was small to start with -> that over-representation is maintained even though the population grew to several thousand people
Founding events in DR
Because the founding population was small, each person had a large chance of contributing disproportionately to future generations -> One of these people, Altagracia Carrasco, did have a disproportionate effect and it had consequences later on. Carrasco had children with at least four women and his genes rose in frequency in the population
Carrasco: he was also a carrier for a mutation in the gene for 5-alpha-reductase-2
Carrier for a mutation in the gene for 5-alpha-reductase-2
- This version of the enzyme functions poorly
- Its role is to convert testosterone to dihydrotestosterone (DHT)
- DHT is the version of the hormone that really triggers external masculinization during development
- Result: XY individuals that are homozygous for this allele are born with female external anatomy but switch and develop male genitalia at puberty
- The allele frequency in Salinas is >0.1
- Over 1% of genetic males exhibit this condition
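The link between the two numbers (allele frequency above 0.1, condition in over 1% of genetic males) is the Hardy-Weinberg expectation for homozygotes, q squared. A minimal sketch, assuming HWE holds and using 0.1 as the lower bound for q:

```python
q = 0.1                      # allele frequency reported for Salinas (lower bound)
homozygous = q ** 2          # expected frequency of homozygotes under HWE
print(f"expected homozygous frequency: {homozygous:.2%}")
```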
Potential effect of drift
Drift can lead to increased frequencies of deleterious alleles; in small populations this effect can be dramatic
Negative effects of founder events
Founder events can have negative consequences for populations
Use of founder events
Founder effects can be useful in studying human genetics -> The effects of rare alleles are hard to study in larger populations even if they have large effects BUT rare alleles can be studied in smaller populations where founder effects have raised their frequency
- Reason populations in remote areas = hotbeds for genetic studies -> rare alleles that rose randomly to high frequency in a founder event are now frequent enough to track
- Can now track the genetic basis for rare traits
***Genetic mechanisms behind a number of human conditions have been identified because the alleles behind them have drifted to high frequencies in small isolated populations
- Can study mutations because the phenotypic effects that are usually uncommon occur more often because of founder effects
***Remote islands with known founding histories like Tristan da Cunha have been hotbeds of human genetic research for that reason
Why study populations in remote areas?
Reason populations in remote areas = hotbeds for genetic studies -> rare alleles that rose randomly to high frequency in a founder event are now frequent enough to track
- Can now track the genetic basis for rare traits
***Remote islands with known founding histories like Tristan da Cunha have been hotbeds of human genetic research for that reason
Effects of bottlenecks
Bottlenecks have similar effects to founder events: they capture random events that have long-term consequences in the population long after the rebound
- Even after a complete rebound the effects still stick around
BUT NOW not based on a newly started population: NOW have an existing population
Time to build variation
It takes a lot of time to build up variation -> If that is wiped out (like from a bottleneck or founder event) = wipe out millions of years of variation, and it will take millions of years to get that variation again
***Strong drift in historical events can instantly undo the work of natural selection and thousands of generations' worth of mutation accumulation
Example bottleneck
The populations of many marine mammals have very low genetic diversity, even if current populations seem large and stable
Many marine mammals went through a really big bottleneck: now they are OK but they had a bottleneck
Northern elephant seals
- People drove them almost to extinction: in the 1880s there were fewer than 30 left -> most of the breeding population was wiped out (almost went extinct)
- NOW they are back to 100,000 but the strength of drift at low population size sticks around -> NOW only 2 alleles (haplotypes) in the mitochondrial genome = low variation because of the population crash
- Only the variation from the small population gets passed down = ONLY get variation from the 28 individuals left at small population size
SLIDES ON SEALS:
* In the 1880s (about 12 generations ago) there were only 20-30 left
* That bottleneck locked that random variation in place from there forward
* Now, those 100,000 individuals only carry two mitochondrial haplotypes
Effect of bottleneck or founder event
Low genetic diversity: that effect stays even after the population increases in size
- Effect of drift at small population size sticks around
***Strong drift in historical events can instantly undo the work of natural selection and thousands of generations' worth of mutation accumulation
- Loss of variation at low population size sticks around = low variation because of the population crash
- Only the variation from the small population gets passed down = ONLY get variation from the 28 individuals left at small population size
***The populations of many marine mammals have very low genetic diversity, even if current populations seem large and stable
Subpopulations
Many (most) species do not exist as a single continuous gene pool: they are subdivided into separate pools across space
- There are cases of species NOT subdivided into populations BUT most are
- They are variable across space = doesn't make sense to treat as one gamete pool
Ex. Black bears in upstate NY do not form a cohesive reproductive population with the black bears in Florida
- It doesn't make sense to make one gamete pool across all of North America because bears in Florida are not likely to mate with bears in the Adirondacks == species have subdivided populations
Differences in subdivided populations
Subdivided = different = important to treat them separately
- Subdivided populations can differ due to differences in selection pressures and the mutations that occur in one but not the others
- Difference in selection pressures
- Difference in mutations -> have different amounts of genetic variation; it would be hard to get mutations from one subpopulation to another
- But they are also bound to differ from each other due to drift
- In subdivided populations, random changes are inevitably going to accumulate between them
- They are bound to have differences due to drift: sampling error in one population is bound to differ from the error in another = the subdivided populations become different even if they started at the same frequency
***They are variable across space = doesn't make sense to treat as one gamete pool
Ex. Black bears in upstate NY do not form a cohesive reproductive population with the black bears in Florida
- It doesn't make sense to make one gamete pool across all of North America because bears in Florida are not likely to mate with bears in the Adirondacks: species have subdivided populations
Subpopulations + drift
But they are also bound to differ from each other due to drift
- In subdivided populations, random changes are inevitably going to accumulate between them
- They are bound to have differences due to drift: sampling error in one population is bound to differ from the error in another = the subdivided populations become different even if they started at the same frequency
Experiment on drift + population subdivision
Looking at the number of populations that have each allele frequency
ALL start at p = 0.5; q = 0.5
- Founding population of heterozygous Drosophila subdivided into 107 different populations and maintained at constant population size for 19 generations (each generation they cut back down to 16 individuals)
- The frequency of the allele (which coded for eye color, a Mendelian trait) was monitored at each step along the way (looked at phenotype and could get genotype)
As they go through time the distribution changes
- Looked at frequency classes over each generation
END = get fixed for p = 0 or p = 1
- Inevitable outcome = lose variation and get fixation
- When a population hits the point of fixation = the other allele is gone for good
- By Gen 19, over 1/2 of the populations had become fixed for one allele or the other.
- Variation within individual populations is decreasing
Fixation was close to perfectly even between p = 0 and p = 1
- There was a 50/50 chance at the start to fix for either allele (50% chance to fix for p and 50% chance to fix for q), and we get a roughly 50/50 split between fixing for p and fixing for q = average allele frequency across all of the populations stays close to p = 0.5
- Averaged over all populations the frequency comes back to p = 0.5, even though variation within each population is lost
Variation within populations is lost BUT variation across populations is maintained
Variation within subpopulations vs. across all subpopulations
Variation within populations is lost BUT variation across populations is maintained: a by-product of drift being probability based from the start
- Within-population variation decreases BUT variation is maintained across the subdivided populations
- Because drift is stochastic = variation is maintained across populations
Allele frequency across all populations = still around 0.5: drift decreased variation within populations but maintained variation across populations
But the probabilities of which allele goes to fixation work out such that genetic variation is maintained among populations at the starting allele frequency
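The within-vs-across pattern can be sketched with a simple simulation loosely modeled on the Drosophila setup (107 populations, 16 individuals, 19 generations, start at p = 0.5). It assumes each generation is 32 gametes drawn with replacement, which is an idealization and will not reproduce the real experiment's exact counts.

```python
import random

def drift_one_population(p, n_alleles, generations, rng):
    """Return the allele frequency after drifting one population for some generations."""
    copies = round(p * n_alleles)
    for _ in range(generations):
        freq = copies / n_alleles
        copies = sum(1 for _ in range(n_alleles) if rng.random() < freq)
    return copies / n_alleles

rng = random.Random(107)
finals = [drift_one_population(0.5, 32, 19, rng) for _ in range(107)]

mean_p = sum(finals) / len(finals)                               # across populations
mean_h_within = sum(2 * p * (1 - p) for p in finals) / len(finals)
fixed = sum(1 for p in finals if p in (0.0, 1.0))

print(f"mean p across populations:   {mean_p:.2f}")
print(f"mean H within populations:   {mean_h_within:.2f} (started at 0.50)")
print(f"populations fixed (p=0 or 1): {fixed} of {len(finals)}")
```

Individual populations spread out toward 0 or 1 while the average across all of them should stay in the neighborhood of the starting 0.5.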
What happens when an allele goes to fixation?
The other allele is lost from the population for good
Fixing for alleles in the Drosophila experiment
We have 30 populations fixed for p = 0 and 28 fixed for p = 1 (about 50/50). The rest of the populations are uniformly distributed across the rest of the range
- Was close to perfectly even between p = 0 and p = 1
- There was a 50/50 chance at the start to fix for either allele (50% chance to fix for p and 50% chance to fix for q), and we get a roughly 50/50 split between fixing for p and fixing for q = average allele frequency across all of the populations stays close to p = 0.5
Allele frequency across all populations = still around 0.5: drift decreased variation within populations but maintained variation across populations
- But the probabilities of which allele goes to fixation work out such that genetic variation is maintained among populations at the starting allele frequency
Drift + subpopulation example: blue and red
Start with p = 0.4 (40% red, 60% blue) -> and subdivide into 10 smaller populations
- Frequencies in each population = 60% blue and 40% red, the same for all populations. MEANS on average 40% end fixed for red and 60% end fixed for blue = IN THE END the average variation will stay the same
- Variation across populations is maintained BUT variation within populations is lost
- Across subdivided populations = have variation
- Within = lose variation (fix for one allele)
- When we divide into subpopulations, each subpopulation starts the same as the ancestral population: 40% red and 60% blue -> Each subpopulation has a 40% chance of fixing for red and a 60% chance of fixing for blue -> Each will lose variation because each will fix for an allele BUT they will fix for different alleles (because stochastic), so variation is maintained across populations
- Since each has the same 0.4 chance and 0.6 chance -> the most likely outcome is that 40% fix for red and 60% fix for blue = maintain variation across populations BUT within populations fix for one allele (p is either 1 or 0) = lose variation within populations BUT among populations you still have p = 0.4 and q = 0.6 (still 40% red and 60% blue, because we expect 40% of subpopulations to fix for red and 60% to fix for blue)
They will all fix for one allele = lose variation BUT they will fix for different alleles = across populations they keep the variation
- Each population has the same probability of fixing for the red or blue allele at the start (0.4 chance of red and 0.6 chance of blue): the most likely outcome is that 40% of them fix for red and 60% fix for blue -> within populations p will be 0 or 1 BUT among the populations p will stay 0.4
Quantifying loss of variation within populations
Heterozygosity
- Use the expected # of heterozygotes NOT the observed number
Heterozygosity (H)
Metric for diversity -> Genotypic frequency of heterozygotes in the population. It is the expected frequency of heterozygotes given the allele frequency and assuming HWE
- Use the expected # of heterozygotes NOT the observed number
- Allows us to quantify the loss of variation within populations
- Quantify variation in terms of heterozygosity
***Drift = stochastic -> the probability of a change in allele frequencies depends on the starting frequencies in the population. FIRST = need to quantify the loss of variation
H = 2pq: heterozygosity = expected frequency of heterozygotes given the allele frequencies (measure of variation in populations)
Expected heterozygosity
2pq
H = 2pq
IF p = 0.4 -> Heterozygosity = 2 x 0.4 x 0.6 = 0.48
H = 0.48
IF p = 0.15 -> q = 0.85
H = 2 x 0.15 x 0.85 = 0.255
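The same calculation as a small helper, handy for checking other values of p. The function name and the extra test frequencies (0.5 and 0.0) are illustrative additions.

```python
def heterozygosity(p):
    """Expected heterozygosity H = 2pq for a two-allele locus under HWE."""
    q = 1 - p
    return 2 * p * q

for p in (0.5, 0.4, 0.15, 0.0):
    print(f"p = {p:<4} -> H = {heterozygosity(p):.3f}")
```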
Variation at p = 0.4 vs. p = 0.15
There is more variation at p = 0.4 than at p = 0.15: a more even mix (closer to 0.5 and 0.5) = more heterozygotes
- As a population goes toward fixation = heterozygosity decreases = less variation
p = 0.4 -> H = 0.48
q = 0.85; p = 0.15 -> H = 0.255
Calculating heterozygosity
H = 2pq: use the expected number of heterozygotes
Example: if p = 0.4
Heterozygosity (H) = 0.48
^^ (2 x 0.4 x 0.6)
0.48 = heterozygosity: gives allelic variation in the population -> expected frequency of heterozygotes based on allele frequency
Where is drift inevitable?
Drift = inevitable in any population less than infinite in size; in any real population drift is always there
***Drift = based on standard error in the gamete pool -> Smaller sample = more error
- Change then sticks -> get new allele frequency
What is drift based on?
Drift = based on standard error in the gamete pool: the error that means you don't get back the same allele frequency
Inevitable outcome of drift
Inevitable outcome = loss of variation over time -> that variation is gone from the gene pool for good
Lose within-subpopulation variation -> variation decreases within each subpopulation
Maintain variation across the subdivided population: because drift is stochastic = variation is maintained across populations
Example Calculating H – q = 0.85
q = 0.85 –> p = 0.15
H = 2 X 0.85 X 0.15 = 0.255
As allele frequency is more polarized (closer to 1 and 0) = heterozygosity decreases
Change in H as p/q changes
As allele frequencies become more polarized (as p/q get closer to 0 and 1) = heterozygosity decreases
- Peak of variation = when have 50:50 (p = 0.5; q = 0.5)
When closer to even amount (closer to 50:50) = increased variation (because increase in H and H is a measure of variation)
p = 0.4 (more 50:50 split) = higher H = more variation
p = 0.15 (more polarized) = lower H = less variation
H for 3 alleles
Sum of the expected frequency of each individual type of heterozygote – add up the expected frequencies of all the heterozygote types
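A sketch of the multi-allele sum in Python. The three frequencies (0.5, 0.3, 0.2) are made-up illustration values, not from the notes, and the function name is hypothetical.

```python
from itertools import combinations

def expected_heterozygosity_multi(freqs):
    """Sum 2 * p_i * p_j over every pair of different alleles."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return sum(2.0 * pi * pj for pi, pj in combinations(freqs, 2))

# Hypothetical three-allele locus: heterozygote types are 1/2, 1/3 and 2/3.
h3 = expected_heterozygosity_multi([0.5, 0.3, 0.2])
print(round(h3, 3))
```

With only two alleles the same function reduces to H = 2pq.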
What can we do with Heterozygosity
We can look at how heterozygosity is distributed
Peak Heterozygosity
p = 0.5
How is Heterozygosity Distributed
Peak = 0.5 –> Highest H value when we have 2 alleles in the population
- IF we start with a population of all heterozygotes –> expected frequency stays the same
In a two allele system expected heterozygosity is highest (amount of variation is highest) at p = 0.5
Expected Vs. Observed for H value
If we start with a population of all heterozygotes –> expected frequency stays the same
- If have all heterozygotes the observed H might be 1 BUT we use the expected number of heterozygotes = H will still be 0.5 (even if observed is 1 – if we start with all Aa then p = 0.5 –> then expected is 0.5 – H is 0.5 NOT 1)
- Use expected values for H NOT observed
Relationship Between Population size and Decrease in H
Relationship is based on average decline in H between generations at a given population size – not deterministic
- We can calculate the average expected decline in H between generations – calculating the most likely H (probabilistic)
- Shows how population size affects variation –> Knowing that in the end we will always lose variation
Hg+1 = Hg [1 - 1/(2N)] –> Gives change in H in next generation
- Hg = Heterozygosity of current generation
- Hg+1 = Heterozygosity in next generation
- N = population size
***Only parameter is population size
Results:
As N increases = average change in H decreases
- As N increases = 1/(2N) gets small = less change in H over time because the multiplier is closer to 1 (if 1/(2N) is smaller, 1 - 1/(2N) stays closer to 1 because you are subtracting a smaller number from one) – multiplying the current H by a number close to 1 = stays close to current H
***We are likely to decrease in H from one generation to the next
IF go from p = 0.6 –> p = 0.5
- H is increasing BUT average trend will be to lose variation over time – on average = lose variation –> Losing variation is the most likely outcome for each generation
- Doesn’t mean that losing variation is what will happen in every generation
- We know we will lose variation eventually BUT we don’t know what the outcome of any one generation will be; we can only predict what is most likely
Looking at Change in variation (Change in H over time)
IF go from p = 0.6 –> p = 0.5
- H is increasing BUT average trend will be to lose variation over time – on average = lose variation –> Losing variation is the most likely outcome for each generation
- Doesn’t mean that losing variation is what will happen in every generation
- We know we will lose variation eventually BUT we don’t know what the outcome of any one generation will be; we can only predict what is most likely
Why 2N when looking at change in H
Because we are thinking about alleles NOT individuals –> The individuals in populations are diploid = 2N allele copies in the gamete pool
How does Population size affect H
As N increases = average change in H decreases
- As N increases = 1/(2N) gets small = less change in H over time because the multiplier is closer to 1 (if 1/(2N) is smaller, 1 - 1/(2N) stays closer to 1 because you are subtracting a smaller number from one) – multiplying the current H by a number close to 1 = stays close to current H
As population size increases, the average change in heterozygosity decreases (1 minus a smaller fraction)
Example - Calculating H in the next generation
Start with H = 0.5
N = 1000
Hg+1 = Hg [1 - 1/(2N)]
Hg+1 = 0.5 [1 - 1/(2 X 1000)] = 0.49975
- HERE = we get a number that is slightly less than 0.5 – expect small sampling error making 1000 zygotes = lose a little genetic variation but not much
We are likely to decrease in H from one generation to the next
SHOWS – as you decrease N = larger average change in H
- At N = 1000 –> have small change in H
THESE = NOT showing the actual outcome – it is showing the average outcome across all possible outcomes
- Take every possible set of N zygotes, weighted by its probability –> average their H
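The one-generation formula can be sketched in Python to see how the average loss shrinks as N grows. The function name and the extra values N = 10 and N = 100 are illustration choices, not from the notes.

```python
def next_generation_h(h_now, n):
    """Average expected H one generation later: Hg+1 = Hg * (1 - 1/(2N))."""
    return h_now * (1.0 - 1.0 / (2.0 * n))

# Same starting H, different population sizes.
for n in (10, 100, 1000):
    print(n, next_generation_h(0.5, n))
```

This is the average over all possible outcomes, not a prediction of what any single population will do.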
What does Change in H equation actually show
NOT showing the actual outcome – it is showing the average outcome across all possible outcomes
- Take every possible set of N zygotes, weighted by its probability –> average their H
Using Change in H equation for N = 1 – Can we have N = 1 AND can you have H = 0.25 in 1 individual if have 2 alleles
Can’t in humans BUT we can in other organisms – plants do it (have self-fertilization)
Can you have H = 0.25 in 1 individual if have 2 alleles –> NO – BUT the H we calculate is the average outcome NOT the actual outcome
- Since diploid –> a single Aa individual has H = 0.5 –> after selfing, the offspring is either Aa (H = 0.5) or a homozygote (H = 0) – THEN have (0.5 + 0)/2 = 0.25 –> half will be 0.5 and half will be 0, and we take the average
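The N = 1 case can be checked by listing the offspring of one selfing Aa individual. This sketch assumes simple Mendelian ratios (1/4 AA, 1/2 Aa, 1/4 aa); the names are made up for illustration.

```python
from fractions import Fraction

# Offspring genotypes from selfing one Aa individual, with Mendelian probabilities.
offspring = {"AA": Fraction(1, 4), "Aa": Fraction(1, 2), "aa": Fraction(1, 4)}

def h_of_single_individual(genotype):
    """Expected H (2pq) from the allele frequency inside one diploid individual."""
    p = Fraction(genotype.count("A"), 2)
    return 2 * p * (1 - p)

# Average H across the possible outcomes, weighted by probability.
average_h = sum(prob * h_of_single_individual(g) for g, prob in offspring.items())

# The formula Hg+1 = Hg[1 - 1/(2N)] with Hg = 1/2 and N = 1.
formula_h = Fraction(1, 2) * (1 - Fraction(1, 2 * 1))
print(average_h, formula_h)
```

No single offspring has H = 0.25; the value only appears as the average of the 0.5 and 0 outcomes.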
Extending Chnage in H across generations
We can look at the effect of drift through time
- Before = we were just looking at one generation
- Heterozygosity is lost after many generations (allelic shake-up due to drift over time)
To make it work across generations = add a t exponent – t counts the generations
HERE = we assume that the population size is the same across multiple generations
Hg+t = Hg [1 - 1/(2N)]^t
t = number of generations
Example – Starting H = 0.5
Hg+t = 0.5 [1 - 1/(2 X 100)]^10
Hg+t ≈ 0.4756
We expect a >0.1 shift across time – average change in allele frequencies = >0.1 across 10 generations due to drift itself
- Effect of mutations = needs more generations to have allele frequency change BUT here = have substantial frequency change
HERE even with a population of 100, over 10 generations, we’re expecting an average allele frequency shift of over 0.1
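The multi-generation formula as a Python sketch, using the numbers from this example (starting H = 0.5, N = 100, t = 10). The function name is made up for illustration, and constant population size is assumed.

```python
def h_after_t_generations(h_start, n, t):
    """Average expected H after t generations: Hg+t = Hg * (1 - 1/(2N))**t."""
    return h_start * (1.0 - 1.0 / (2.0 * n)) ** t

h10 = h_after_t_generations(0.5, 100, 10)
print(round(h10, 4))
```

Setting t = 1 gives back the one-generation formula.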
Effect of mutation vs. Effect of drift
Effect of mutations = needs more generations to have allele frequency change BUT with drift = have substantial frequency change in a shorter amount of time
With drift – even with a population of 100, over 10 generations, we’re expecting an average allele frequency shift of over 0.1
Predicting how genetic variation should decay within populations over time
We can graph how a population’s genetic variation should decay over time –> use the graph to make predictions
GRAPH –
RED = actual data
Dotted line = prediction for average H change over 19 generations for a population of 16
- Prediction for 16 – expect loss in H at the rate in the graph
THEN confront the prediction with data from the experiment
These two lines are slightly different – because drift is probabilistic
- The prediction for 16 doesn’t hold up that well
Results: H decreased faster than expected given the population size of 16
- Some behaved as expected (fixed for p = 0 or p = 1) –> overall p = 0.5 across the generations
- BUT the loss of variation happened faster than expected – the data match a theoretical prediction based on a population size of 9 rather than a population size of 16
- Know that there were actually 16 individuals in each generation but drift across generations occurred faster than expected
DRIFT = faster than the census population size predicts –> This is very common
Science
Confronting models with data from the real world
How did they get “expected line” in Drosophila experiment (Looking at change in H over time)
We can make a plot that shows the expected decline in genetic diversity through time for a given population size
Gives expected line in experiments – when look at real data vs. expected data for change in H over time
To get the line = need to have N the same (keep the same population size BUT change t)
Plots the decline in H over time
Keep N the same – only change t
Once find data = can plot the points against the expected line
- Do prediction for how population will behave then test the hypothesis
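A sketch of how such an expected line could be generated: hold N fixed and step t from 0 to 19. The starting H = 0.5 is an assumption (it corresponds to starting at p = 0.5), the function name is made up, and no experimental data are included here.

```python
def expected_h_curve(h_start, n, generations):
    """Expected H at t = 0, 1, ..., generations for a constant population size n."""
    return [h_start * (1.0 - 1.0 / (2.0 * n)) ** t for t in range(generations + 1)]

curve_n16 = expected_h_curve(0.5, 16, 19)   # prediction from the census size
curve_n9 = expected_h_curve(0.5, 9, 19)     # prediction from a smaller population

for t in (0, 5, 10, 19):
    print(t, round(curve_n16[t], 3), round(curve_n9[t], 3))
```

The N = 9 curve drops faster than the N = 16 curve at every generation after the start, which is the kind of gap the cards describe between the prediction and the data.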
Rate of drift
Often occurs much faster than expected (like in the Drosophila experiment)
- Populations often drift faster than we would expect based on their true population size – instead they drift at the rate of a smaller population
Drift = often occurs faster than the census population size predicts – common
- Because drift is often based on the effective population size rather than the true/census size
Census vs. effective population size
Census – actual population size – true population size (Ex. in Drosophila experiment = 16)
Effective population size – The population size expected to match the realized rate of drift
Drift = based on Ne (Ne is usually smaller than N)
Effective population size
The population size expected to match the realized rate of drift
- Ne = # of individuals actually reproducing in population
- Actual individuals contributing to the gene pool that we are sampling (with error) from
Drift = based on Ne (Ne is usually smaller than N)
Example –
Effective population size in Drosophila experiment = 9 – only 9 breeding individuals on average –> 9 individuals on average reproducing (7 do not)
- 16 = census size
- 9 = Ne
Why don’t populations actually drift at the census population size
Because of 3rd postulate of NS –> There is variation in the population in survival/reproduction (S/R) even if it is unrelated to phenotypes/genes
- Just have variation in S/R (doesn’t need to vary as a function of a trait)
If drifting at N –> every individual’s probability of S/R = 1.0 – drifting at N = drifting at the actual count of individuals (census population size)
- Drifting at N = everyone has probability of 1
- Phenotype is not affecting probability
Vs.
Drifting at Ne (effective population size)
- HERE = 60% chance of reproduction –> NOT per phenotype (still uniform distribution across all phenotypes) BUT all have probability of 60% = not all S/R
- Actual breeding population = NOW a subset of the overall population –> NOT all reproduce = # of individuals reproducing = Ne
Drift = based on Ne (Ne is smaller than N = more drift = drift occurs faster in most populations than expected)
- Because of variation in S/R = drift occurs more powerfully than if it was just based on census size alone
N vs. Ne
Comes back to our nearly universal postulate for natural selection: not everyone gets to survive and reproduce (even when that’s unrelated to genotype)
Not everyone participates in reproduction (without regard for their phenotypes or genotypes)
- Different than selection – because in selection not everyone contributes to the next generation BUT it is due to their traits – HERE = the probability of participation is uniformly distributed, just at a level lower than the actual population size
- Here = not varying as a function of genotype/phenotype
Because of variation in S/R = drift occurs more powerfully than if it was just based on census size alone
Calculating Ne
We will just look at sex ratio –> Sex ratio makes a big difference in effective population size
Equation – Ne = 4NmNf/(Nm + Nf)
What affects Ne
The sex ratio of individuals participating in reproduction makes a big difference in effective population size = affects the rate of genetic drift
- Big skew in sex ratio = affects N vs. Ne
- N vs. Ne = very affected by a big difference in the number of males vs. females = affects the rate of drift
Things that affect sex ratio
- Polygyny (a form of polygamy) – One male + many females
- Very common in nature
- Polyandry – One female + multiple males
- Less common in nature but sometimes is still very important
Throws off the sex ratio = affects the rate of genetic drift
Example of calculating Ne
Example – rate of drift in population starts the same
In the mating population 1 bull = mates with 25 females
If have 500 in the population –> 250 females (50:50 sex ratio) – if have 1:25 mating ratio –> only 10 males mate
- In the mating population only have 10 males and 250 females – sampling across 250 females + sampling across only 10 males = very little variation
Equation:
Ne = 4NmNf/(Nm + Nf)
Nm = # of mating males
Nf = # of mating females
Ne = 4 X 10 X 250/(10 + 250) = 38.5
- 38.5 = much smaller than 500 –> Sex ratio throws off Ne a lot
***Sex ratio throws off Ne a lot
N vs. Ne affect on H
If N = 500 (much bigger) = variation stays for a while
If N = 38.5 (Ne = 38.5) = much more drift occurs = lose variation much faster
Example of drift on Ne (affected by sex ratio)
If we change ratio to 1:5 –> NOW have 250 females and 50 males
Ne = 4 X 50 X 250/(50 + 250) = 166.67 – Still much less than the census size of 500
***Still a serious concern for conservation
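The sex-ratio formula can be sketched in Python with the two herd examples from these cards. The function name is made up for illustration.

```python
def effective_size_sex_ratio(n_males, n_females):
    """Ne = 4 * Nm * Nf / (Nm + Nf), using the numbers of MATING males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)

print(round(effective_size_sex_ratio(10, 250), 1))   # 1:25 mating ratio
print(round(effective_size_sex_ratio(50, 250), 2))   # 1:5 mating ratio
print(effective_size_sex_ratio(250, 250))            # even sex ratio, everyone mates
```

With an even sex ratio and all 500 individuals mating, the formula returns the census size, so any skew can only pull Ne below N.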
Effect of drift across subdivided popultions
We expect genetic variation to decrease within
populations, but not among populations – we can compare those
two levels
How do we quantify the effect of drift across subpopulations
Compare the within population variation to the between subpopulation variation
- Look at variation within populations vs. variation between populations to see how much drift has gone on
- We can quantify the degree of genetic differentiation that drift has caused between these populations by examining heterozygosity within and between them
Compare observed vs. expected value to see how effective drift is
Example – Effect of drift across subdivided populations
Example – easiest to look at 2 subpopulations of the same species
- Look at the variation loss within the populations and compare that to the variation expected across the populations combined
- Look at variation within populations vs. variation between populations to see how much drift has gone on
We can quantify the degree of genetic differentiation that drift has caused between these populations by examining heterozygosity within and between them
Start = find H within each population and take the average of H of the two populations
- Start by calculating the expected heterozygosity within both populations and taking the average
Population 1
H = 2pq
P = 0.35
H = 0.455
Population 2
P = 0.8
H = 2pq = 0.32
THEN average variation within the populations – take average of both H
Hs = (H pop 1 + H pop 2)/2 = (0.455 + 0.32)/2 –> Hs = 0.3875
Hs = Average H across the two populations
THEN – We need to turn to finding the expected H if this was one big connected population –> Do so by combining the allele frequencies and calculating H as before
- Now we need H if one population –> Expected H based on the average allele frequencies across the populations
- Find H if you take an average of the allele frequencies
- If we assume equal population sizes = combined allele frequency is just the average of the two
Find average p = (0.35 + 0.8)/2 = 0.575 –> combine the allele frequencies as if in one population
- Average allele frequency (assuming that they have the same population size) – get allele frequency if they were one population
- Means that if all of the individuals belonged to the same big population – the allele frequency would be p = 0.575 –> now we can use this to calculate the expected H across all populations
THEN find H with the average p –> Ht = 0.489 (2 X 0.575 X 0.425)
- Gives you the expected H across all populations
THEN – we can use these numbers to quantify the magnitude of genetic difference between the populations using the FST index
- Now look at the difference between the two
- Divide by the total amount of variation in the population
Hs = should always be equal to or smaller than Ht
FST = (Ht - Hs)/Ht
FST = (0.489 - 0.388)/0.489
FST = 0.207
Hs Vs. Ht
Hs = for subpopulations (find H for each population and take the average of H)
Ht = H in total population (using average allele frequency)
- Find average of p or q and then find H based on average p or q
Example calculating FST
Populations A
- Have no variation within the populations BUT compare to the variation between populations
- HERE – no variation within populations –> ALL variation is between populations
P1 = 0
P2 = 1
Hs = 0
Ht = 0.5
- All of the variation is between the populations – none of the variation is within the populations
FST = 1
Populations B
P1 = 0.52
P2 = 0.48
HERE – Most of the variation is within populations –> there is little variation between populations
- Here there is almost as much variation within each population as in the total population
Hs = 0.4992
Ht = 0.5
FST = 0.0016
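The Hs, Ht, FST steps can be sketched in Python for two subpopulations. This assumes equal population sizes (so the combined allele frequency is the plain average) and a two-allele locus; the function names are made up for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity H = 2pq for a two-allele locus."""
    return 2.0 * p * (1.0 - p)

def fst_two_populations(p1, p2):
    """Return (Hs, Ht, FST) for two equally sized subpopulations."""
    hs = (heterozygosity(p1) + heterozygosity(p2)) / 2.0   # average within-population H
    ht = heterozygosity((p1 + p2) / 2.0)                    # H from the pooled allele frequency
    if ht == 0.0:
        raise ValueError("no variation anywhere, FST is undefined")
    return hs, ht, (ht - hs) / ht

for p1, p2 in ((0.35, 0.8), (0.0, 1.0), (0.52, 0.48)):
    hs, ht, fst = fst_two_populations(p1, p2)
    print(p1, p2, round(hs, 4), round(ht, 4), round(fst, 4))
```

The three calls correspond to the worked example (FST about 0.207), Populations A (FST = 1) and Populations B (FST = 0.0016).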
What does FST show
FST = indicates how close the populations are to having drifted to fixed differences from each other
FST = between 0-1
Meaning of FST
FST = 0 –> No difference in allele frequency between populations
- All of the variation is within populations –> there is no variation between populations
- Can treat as one population – little variation between subpopulations
FST = 1 – Populations are fixed for different alleles = ALL of the variation is between populations
- All of the variation is between the populations – none of the variation is within the populations
Why find FST
FST = used very often –> Why calculate it
Example 1 – bear survivorship population
- If we are hunting bears in Florida = we might want to know how connected they are to bear populations in other places –> if have differences = might have to treat them as different populations
BUT if FST = 0 –> Then you can treat them as one population
- Little genetic variation between them = treat as one population
Example 2 – Used in epidemiology –> Have a certain parasite that causes disease
- Have a parasite that causes disease + a vector (parasite gets into mosquitoes and mosquitoes infect people) –> If want to control the disease = might need to think about the parasite population + the vector population
- Might have 2 populations and a new variant of drug resistant nematodes in one of them –> need to know how worried you should be that it will spread to the other population – might look at genetic differences between the populations + look for the same thing for mosquitoes
- Question = are they separate or is the FST LOW – Look to see IF the two populations are genetically different
IF FST is low = the populations are more in contact with each other
Drift in Large vs. small populations
Drift = very strong evolutionary force in small populations – BUT in large populations it may still be important over long time scales, driving a constant change at neutral loci
***Drift = largely driven by population size
- Different population size = affects the rate of drift
- Smaller populations – drift occurs more rapidly
Can we discuss drift as a mechanism in larger populations
For a long time biologists thought no, because the genetic effect in large populations is subtle, but that is not the case –> In any non-infinite population drift occurs in the background and we see its effect in the genome –> We know because we see change at neutral loci
How do we know drift acts in large populations
We know because we can see change in neutral loci
***Drift in large populations drives constant change at neutral loci
If only selection = change should only have to do with things that affect fitness but since neutral loci don’t affect fitness = can’t be selection = we know drift causes change in large populations
What do we mean by Neutral
Neutral means selectively neutral – not associated with fitness = no selection
- By neutral we mean allelic changes with no fitness consequences – not everything has an effect on fitness –> Many phenotypes (even genetic ones) = don’t affect lifetime reproductive success = close to neutral in the population
- Not everything matters for survival and reproduction – many phenotypes (even ones with strong genetic components) are neutral – especially in modern humans
Example – Height –> does the number of kids any one person is going to have vary as a function of height – NO = neutral trait
- If only selection = change should only have to do with things that affect fitness but since neutral traits don't affect fitness = can't be selection = we know drift causes change in large populations
How do we know variation is Neutral?
Changes to DNA sequence that do not affect peptide structure in coding region – have synonymous codon change
- In codon table –> Most amino acids (except Trp and Met) = are redundant (multiple codons specify the same amino acid)
- Know it is neutral when there is a change in DNA but have the same amino acid = same primary structure = same protein = no effect on phenotype
There are a lot of variants like this –> Can’t affect phenotype = NS can’t act = change in frequency MUST be due to drift
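A sketch of checking whether a codon change is synonymous. It assumes the standard genetic code, written as DNA codons with the bases ordered T, C, A, G; the names are made up for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def is_synonymous(codon_before, codon_after):
    """True if both codons specify the same amino acid."""
    return CODON_TABLE[codon_before] == CODON_TABLE[codon_after]

print(is_synonymous("CTT", "CTC"))   # Leu -> Leu
print(is_synonymous("GAA", "GTA"))   # Glu -> Val

# Count how many codons specify each amino acid (redundancy of the code).
codons_per_amino_acid = {aa: AMINO_ACIDS.count(aa) for aa in set(AMINO_ACIDS) if aa != "*"}
print(codons_per_amino_acid["W"], codons_per_amino_acid["M"], codons_per_amino_acid["L"])
```

Trp (W) and Met (M) are the only amino acids with a single codon, so every change to those codons is nonsynonymous.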
What types of variants do we know are neutral
Variants that are neutral at the molecular level – with drift = focus on neutrality at the molecular level
They found this in the 1960s – realized once they looked at variation on the molecular level –> Sequenced proteins and then got DNA variants
HERE = there is no dominant or recessive because no effect on phenotype
What did they find with Neutral variants
They found this in the 1960s – realized once they looked at variation on the molecular level –> Sequenced proteins and then got DNA variants –> Found that there was a lot of variation = At odds with the idea that NS drove all the variation because the bulk of the variation is neutral = NS can’t act on it
How did they realize that drift is important in large populations
The idea of long-term drift in large populations being an important evolutionary force came about when we were able to start observing variation at the molecular level – and some of it seemed odd if differences were driven by selection
What is drift a function of?
Drift = Function of population size
A larger population = smaller expected allele frequency change = weak force in larger population –> Why should we care in a large population
Why should we care about drift in large populations
For a long time they didn’t think drift was important in a large population THEN when looking at the molecular level = they found contradictions to the idea that drift isn’t acting in large populations
When looking at the molecular level they found that most variation within/between species = selectively neutral –> no effect on phenotype = can’t affect survival/reproduction = not changing due to natural selection = something else is causing neutral variation to change
What did they find about drift when looking at the molecular level
When looking at the molecular level they found that most variation within/between species = selectively neutral –> no effect on phenotype = can’t affect survival/reproduction = not changing due to natural selection = something else is causing neutral variation to change
What do we focus on with drift
With drift we focus on neutrality at the molecular level
How do we know that drift is important in large populations
The idea of long-term drift in large populations being an important evolutionary force came about when we were able to start observing variation at the molecular level and some of it seemed odd if differences were driven by selection
What kind of mutations are neutral
Synonymous mutations – neutral mutation –> means you get the same amino acid
Mutation where if you change the nucleotide sequence = can still specify the same amino acid = the secondary and tertiary structure is the same = function is the same = change to DNA that can’t affect phenotype
- Still a hereditary change in the coding region
Genetic code
Genetic code = redundant –> can specify an amino acid with more than 1 codon –> if you change the nucleotide sequence = can still specify the same amino acid = the secondary and tertiary structure is the same = function is the same = change to DNA that can’t affect phenotype
Most variation within and between species
Sequence genome – variation = mostly changes that don’t affect phenotype –> changes that don’t affect amino acid = don’t affect phenotype
A lot of variation in protein coding regions = synonymous mutations (within and among species)
Issue with neutral variation
How would this variation build if natural selection can’t see it (no difference in S/R if can’t make a phenotype)
ANSWER: due to drift
How does neutral variation build
Issue with variation = how would neutral variation build if NS can’t see it
ANSWER – can build due to drift
- Change is due to drift NOT due to natural selection
Idea for how neutral variation can build in large populations
Thought that variation can build due to drift in large populations if all populations had bottleneck events over time
ISSUE = this doesn’t match what we see in nature (not seen in fossil record)
Nonsynonymous mutations
Change in code that leads to a different amino acid
Rate that neutral variation builds
Constant rate – Neutral variation appears to build at a constant rate
Variation builds at a clocklike, repetitive rate
Evidence for drift building neutral variation
Variation builds at a clocklike, repetitive rate –> Variation builds in regular constant fashion = suggests it is NOT bottlenecks
How do we know neutral variation is not due to bottlenecks
- Not seen in fossil record
- Does not match the clocklike fashion that we see
How do we get clockwork constant rate of change in neutral variation
Done through nucleotide substitution (mutation + substitution) AND drift
Some mutations go to fixation because of drift
Mutation and Substitution
New variation enters a population as mutations but it’s converted to variation between populations or species by substitution
- Some mutations go to fixation because of drift = nucleotide substitution
- Some new mutations randomly drift to fixation
Substitution = fixation of one allele in place of another
How does mutation go to fixation
Because of drift (in mutation and substitution)
Mutation and substitution example
Start = all yellow –> THEN have a mutation that creates a new allele = get green
Green = new variation
- When you get the green, each copy of each allele has an equal probability of going to fixation due to drift
THE green = can go to fixation
HERE = have mutation + substitution –> When get all green DUE TO DRIFT = have substitution event – no longer have the ancestral yellow, only have the new green
- No one allele is favored, just have random change –> can get fixation of the new mutation
Substitution
Fixation of one allele in place of another
If drift is only strong in small populations why should this process occur constantly in large populations
Strength of drift = varies as a function of population size BUT mutation rate doesn’t vary BUT the number of mutations does vary with population size
- Number of new neutral alleles in each population per generation = 2Nv
Number of mutations increases as N increases
Probability of any one allele copy going to fixation = 1/(2N) (each copy has equal probability)
- 2Nv = increases with N
- 1/(2N) = decreases with N
As N increases (population size increases) = probability of any one allele going to fixation decreases
- As N increases = have more mutations but each one is less likely to drift to fixation (probability of going to fixation decreases as N increases)
MEANS that the number of new mutations TIMES the probability of each mutation going to fixation:
2Nv X 1/(2N) = v – 2N cancels out = just get v as the driver of neutral substitution per generation
- Each generation the number of new mutations that are destined to become substitutions –> 2Nv X 1/(2N) = v – population size cancels out completely
Population size affects the strength of drift + the number of mutations –> the decrease in the effect of drift as N increases = counteracted by the number of new mutations (2Nv), which increases as N increases – the two effects of population size counteract each other, so there is no net effect
In small populations = each allele has a higher chance of going to fixation VS. In large populations more alleles come into existence –> The effect of population size balances out perfectly = only based on neutral mutation rate = can occur in larger populations
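The cancellation of population size can be checked with exact fractions. The mutation rate v = 1/1,000,000 and the three population sizes are made-up illustration values, and the function name is hypothetical.

```python
from fractions import Fraction

def neutral_substitution_rate(n, v):
    """New neutral mutations per generation (2Nv) times fixation probability (1/(2N))."""
    new_mutations = 2 * n * v
    fixation_probability = Fraction(1, 2 * n)
    return new_mutations * fixation_probability

v = Fraction(1, 1_000_000)   # hypothetical neutral mutation rate per gene copy
rates = {n: neutral_substitution_rate(n, v) for n in (10, 1000, 1_000_000)}
for n, rate in rates.items():
    print(n, rate, rate == v)
```

Whatever N is used, the result is exactly v, which is why the expected substitution rate depends only on the neutral mutation rate.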
Number of new neutral alleles in each population
2Nv – get the number of new mutations per generation (mutation rate times the number of gene copies in the population)
v = rate of mutation
N = population size (2N = number of allele copies in a diploid population)
What is the driver of neutral substitution in a population
v = driver of neutral substitution per generation
What does population size affect
Population size affects the strength of drift + the number of mutations –> the decrease in the effect of drift as N increases = counteracted by the number of new mutations (2Nv), which increases as N increases – the two effects of population size counteract each other, so there is no net effect
What is neutral substitution driven by?
- New mutations
- Drift
BOTH vary as a function of N
What drives clocklike rate
Small populations = make fewer new alleles but each has a higher probability of going to fixation – large populations = make more new alleles but each has a lower probability –> the two balance out
Expected rate of neutral substitutions
The expected rate of neutral substitution by drift is equal to the mutation rate
In small populations = each allele has a higher chance of going to fixation VS. In large populations more alleles come into existence
The effect of population size on the rate of neutral substitution by drift
In small populations = each allele has a higher chance of going to fixation VS. In large populations more alleles come into existence –> The effect of population size balances out perfectly
What creates clocklike rate
The effect of population size balances out perfectly = rate is ONLY based on neutral mutation rate –> if we assume that the mutation rate stays constant over time then this should result in a constant clocklike rate of neutral evolutionary change
- Mutation rate = the same through time = expect to build constantly over time = get constant rate
What is the basis for clocklike rate
Relationship between time since common ancestor + rate at which mutations build
Looking at clocklike rate
We can look at time of divergence using more data –> look at time based on mutation –> drift change builds in constant clocklike way
Building of nonsynonymous mutation rate
Because nonsynonymous mutations often have an effect on fitness = we expect far fewer of them to be neutral BUT some are neutral and we expect these substitutions to accumulate too, just at a slower rate
Rate = gives us a null expectation for how coding region differences should accumulate in the genome
Synonymous vs. nonsynonymous rates
Synonymous = neutral
Nonsynonymous = can be effectively neutral (might not change the protein in an important way = doesn’t affect phenotype) BUT the neutral ones occur at a lower rate than synonymous
Expect synonymous and nonsynonymous to build at different rates
- Nonsynonymous rate = lower – get more synonymous per unit of time than nonsynonymous because all synonymous are neutral
Looking at the rates = allows us to build a null model for how we expect coding differences to accumulate in the genome
- Can see if the change is consistent with being driven by drift alone – deviations from the expected ratio give evidence that NS is driving change
Rate of neutral nonsynonymous
Expect to have a lower rate of neutral nonsynonymous mutations than synonymous mutations because some of the nonsynonymous have an effect
Why look at rates of synonymous vs. nonsynonymous
Looking at the rates = allows us to build a null model for how we expect coding differences to accumulate in the genome
- Can see if the change is consistent with being driven by drift alone – deviations from the expected ratio give evidence that NS is driving change
- At a given amount of accumulated neutral differentiation (amount of time separating species) - we should have an expected ratio of synonymous to nonsynonymous substitutions across the genome –> Selection favoring nonsynonymous mutations should throw things off this expectation (we can infer that patterns are the result of positive selection by identifying deviations from this ratio, with higher than expected nonsynonymous substitution rates being driven by NS)
- We expect this ratio to be pretty similar across the genome for sites that are evolving neutrally (with synonymous substitutions outnumbering nonsynonymous ones)
IF the ratio of synonymous to nonsynonymous is not what drift predicts = gives us evidence that NS is driving change
When looking at the rates = can get the ratio of synonymous vs. nonsynonymous substitutions as species diverge over time = deviations from the ratio = indicative that natural selection is driving change
- Deviations from ratio = change is not due to drift alone = NS can be in process
- (we can infer that patterns are the result of positive selection by identifying deviations from this ratio, with higher than expected nonsynonymous substitution rates being driven by NS)
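A rough sketch of the ratio check in Python. It assumes the nonsynonymous and synonymous substitution rates have already been estimated per site; the input numbers are made up for illustration, both function names are hypothetical, and the cutoff of 1.0 follows the "ratio greater than 1" rule of thumb used in these cards.

```python
def nonsyn_to_syn_ratio(nonsyn_rate, syn_rate):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if syn_rate <= 0:
        raise ValueError("synonymous rate must be positive")
    return nonsyn_rate / syn_rate

def suggests_positive_selection(nonsyn_rate, syn_rate):
    """More nonsynonymous change than the neutral expectation (ratio above 1)."""
    return nonsyn_to_syn_ratio(nonsyn_rate, syn_rate) > 1.0

# Hypothetical per-site rates for two genes.
print(nonsyn_to_syn_ratio(0.02, 0.04), suggests_positive_selection(0.02, 0.04))
print(nonsyn_to_syn_ratio(0.06, 0.03), suggests_positive_selection(0.06, 0.03))
```

A ratio above 1 is only a flag that drift alone is an unlikely explanation; it does not by itself say which trait selection acted on.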
Example – looking at synonymous vs. nonsynonymous mutation ratio
Example – BRCA1 gene (gene heavily implicated in breast cancer) –> Appears to have been under strong positive selection relatively recently – because the ratio of nonsynonymous:synonymous is greater than 1.0
- Looking at the ratio – shows that the ratios are below 1 for most species until humans and chimps
- Humans and chimps = have many nonsynonymous mutations compared to the expectation under drift –> change = due to natural selection in the common ancestor of humans and chimps
Can see natural selection is occurring based on deviations from the ratio expected under drift alone = NS acted to shape variation
Example 2 – looking at synonymous vs. nonsynonymous mutation ratio
MC1R gene in plumage color –> see 1:1 correlation between degree of sexual dimorphism and degree of natural selection acting on the gene
Why do we use years and not generation time for mutation
Because of population size and body size of organisms