Phylogenetics Final Flashcards
systematics
the inference of phylogenies (the genealogy of species); the focus is on the species tree, i.e. reconstructing lineage history
coalescence
point of common ancestry of two alleles
they both come from the same parental allele
lines of descent in diploid sex pop
the most recent generation is at the top; time goes back toward the bottom
the graph follows one section of the genome
each pair of circles is one diploid individual's two copies of the gene
each circle is an allele of the gene
Assumptions of diploid sex pop allele descent
- equal probabilities of alleles being passed from one gen to the next (no selection, random mating)
- population size is constant over time
- alleles (from gen t) are drawn randomly with replacement from the previous generation (t-1)
probability of coalescence
given N diploid individuals in each generation (2N allele copies), the probability that 2 alleles coalesce in the previous generation (t-1) is 1/(2N).
this is just the chance that two random draws with replacement pick the same parental copy.
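A minimal simulation sketch of this card, assuming the random-draw model from the assumptions card. The function name, trial count, and seed are illustrative choices, not part of the course material.

```python
import random

def estimate_coalescence_probability(n_diploid, trials=200_000, seed=1):
    """Estimate the chance that two alleles share a parental copy in generation t-1."""
    rng = random.Random(seed)
    n_copies = 2 * n_diploid  # 2N allele copies in the parental generation
    hits = 0
    for _ in range(trials):
        # Each allele picks its parental copy independently, with replacement.
        if rng.randrange(n_copies) == rng.randrange(n_copies):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    N = 50
    print("simulated:", estimate_coalescence_probability(N))
    print("1/(2N):   ", 1 / (2 * N))
```

The simulated frequency should land close to 1/(2N); it will not match exactly because it is a random estimate.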
time to coalescence
the time to coalescence for two alleles is geometrically distributed with a mean of E(t) = 2N generations, where N is the number of diploid individuals in a generation.
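A short sketch that checks the 2N mean numerically, assuming a per-generation coalescence probability of 1/(2N). The function names and the truncation point of the sum are illustrative assumptions.

```python
def coalescence_time_pmf(t, n_diploid):
    """P(T = t): no coalescence for t-1 generations, then coalescence in generation t."""
    p = 1 / (2 * n_diploid)
    return (1 - p) ** (t - 1) * p

def mean_coalescence_time(n_diploid, max_t=None):
    """Approximate E(T) by summing t * P(T = t) over a long but finite range."""
    if max_t is None:
        max_t = 200 * n_diploid  # far enough out that the remaining tail is negligible
    return sum(t * coalescence_time_pmf(t, n_diploid) for t in range(1, max_t + 1))

if __name__ == "__main__":
    for N in (10, 50, 200):
        print(N, mean_coalescence_time(N), 2 * N)
```

Doubling N doubles the mean waiting time, which is why coalescence is slower in larger populations.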
coalescence is slower in…
larger populations compared to smaller populations
expected time to coalescence for many alleles
about 4N generations (the expectation for n alleles is 4N(1 - 1/n), which approaches 4N as n grows)
properties of coalescent trees in a constant population
- coalescence is rapid while there are many alleles; going back in time the rate decreases as the number of remaining lineages n decreases
- trees have long trunks (deep branches) and short terminal branches
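The two properties above can be sketched with the standard approximation that, while k lineages remain, the expected wait for the next coalescence is 4N / (k(k-1)) generations. This assumes the sample is much smaller than 2N; the function names are illustrative.

```python
def expected_interval(k, n_diploid):
    """Expected generations spent with k lineages remaining: 4N / (k(k-1))."""
    return 4 * n_diploid / (k * (k - 1))

def expected_tmrca(n_alleles, n_diploid):
    """Expected time for all sampled alleles to reach one common ancestor."""
    return sum(expected_interval(k, n_diploid) for k in range(2, n_alleles + 1))

if __name__ == "__main__":
    N, n = 100, 10
    for k in range(n, 1, -1):
        print(f"{k} lineages: {expected_interval(k, N):.1f} generations")
    print("expected total:", expected_tmrca(n, N), "vs 4N =", 4 * N)
```

The intervals near the tips are short and the last interval (2 lineages) is 2N generations on its own, which is where the long trunk comes from.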
recombination causes
different genes in the same individuals to have different gene tree topologies
recombination in two ways
- independent assortment: each chromosome pair segregates independently at meiosis, so a chromosome can be inherited together with either homolog of any other pair
- crossing over: two homologous chromosomes swap segments, which is how recombination can happen within a single chromosome
recombinational gene
a block of adjacent nucleotides that share the same gene tree
this is the ideal locus for phylogenetic inference
branching or splitting
the subdivision of an ancestral population by barriers to gene flow
population tree
a tree that contains many branching populations. all of the gene trees are embedded in this tree
reasons gene trees don't match pop tree
- deep coalescence
deep coalescence
also called incomplete lineage sorting
going back in time, alleles fail to coalesce in the population just before their species split and instead coalesce deeper in the past, in a more ancestral population, so ancestral polymorphism is carried across the split.
ancestral polymorphism
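the presence of two or more alleles at a locus in the ancestral population before the species split; these alleles can persist into the descendant species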
pop tree length and width
length = generations (time)
width = effective population size (idealized population with same coal. props)
combining both:
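coalescent unit: 1 unit = 2N generations, the expected time for two alleles to coalesce (branch length in coalescent units = generations / 2N)
long and narrow branch: coalescence more likely
short and wide branch: coalescence less likely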
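A small sketch of how length and width combine, assuming the usual continuous-time approximation in which two lineages fail to coalesce along a branch of T coalescent units with probability exp(-T). Function names and example numbers are illustrative.

```python
import math

def coalescent_units(generations, n_effective):
    """Branch length in coalescent units: generations / (2N)."""
    return generations / (2 * n_effective)

def prob_no_coalescence(generations, n_effective):
    """Chance that two lineages pass through the whole branch without coalescing."""
    return math.exp(-coalescent_units(generations, n_effective))

if __name__ == "__main__":
    # Long and narrow branch: many generations, small effective population size.
    print("long, narrow:", prob_no_coalescence(10_000, 500))
    # Short and wide branch: few generations, large effective population size.
    print("short, wide: ", prob_no_coalescence(500, 10_000))
```

The long, narrow branch gives a value near 0 (coalescence almost certain) and the short, wide branch gives a value near 1.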
deep coalescence frequency of gene trees
the major (most frequent) topology of gene trees will match the population tree, and the minor (less frequent) topologies are randomly discordant and equal in frequency.
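For three taxa this pattern has a standard closed form, sketched below. It assumes one gene copy per species and an internal branch of length T in coalescent units; the function and key names are illustrative.

```python
import math

def three_taxon_gene_tree_probs(internal_branch):
    """Gene tree topology probabilities for a rooted 3-taxon population tree."""
    # Chance the two sister lineages fail to coalesce along the internal branch.
    p_deep = math.exp(-internal_branch)
    # If they fail, all three topologies are equally likely in the root population.
    discordant = p_deep / 3
    return {
        "concordant": 1 - 2 * discordant,
        "discordant_1": discordant,
        "discordant_2": discordant,
    }

if __name__ == "__main__":
    for T in (0.1, 1.0, 3.0):
        print(T, three_taxon_gene_tree_probs(T))
```

The concordant topology is never rarer than either discordant one, and the two discordant topologies always tie.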
what is a phylogeny debate
- phylogeny as a cloud: phylogeny is a statistical distribution with a central tendency but also variance, because it includes the full diversity of gene trees
anomaly zone
where the population tree does not match the most probable gene tree
in a pectinate population tree with very short internal branches, all 4 lineages tend to stay uncoalesced until the population ancestral to the first split; each of the 3 symmetric gene tree topologies is then more probable than the pectinate gene tree that matches the population tree
this is because a pectinate topology has only one possible sequence of coalescences while a symmetric topology has 2 (in a rooted 4-taxon tree)
pectinate tree (unbalanced)
a ladder-like tree in which each successive taxon is sister to all of the remaining taxa
only one possible sequence of coalescence
symmetric tree (balanced)
each split divides the taxa into two equal-sized clades
has two possible sequences of coalescence:
- one sister group coalesces and then the other
- vice versa
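The counts on the pectinate and symmetric cards can be checked by brute force. The sketch below enumerates every order in which 4 lineages can coalesce and tallies how many orders produce each rooted topology. The names `count_histories` and `is_symmetric` and the nested-frozenset tree encoding are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def count_histories(lineages):
    """Map each rooted topology to the number of coalescence orders that produce it."""
    if len(lineages) == 1:
        return Counter({lineages[0]: 1})
    counts = Counter()
    for i, j in combinations(range(len(lineages)), 2):
        # Coalesce lineages i and j into one ancestral lineage.
        merged = frozenset([lineages[i], lineages[j]])
        rest = [x for k, x in enumerate(lineages) if k not in (i, j)]
        counts.update(count_histories(rest + [merged]))
    return counts

def is_symmetric(topology):
    """For 4 taxa: symmetric if both children of the root are 2-taxon clades."""
    return all(isinstance(child, frozenset) for child in topology)

if __name__ == "__main__":
    counts = count_histories(list("ABCD"))
    total = sum(counts.values())
    for topology, n in counts.items():
        shape = "symmetric" if is_symmetric(topology) else "pectinate"
        print(shape, n, "of", total)
```

If all 4 lineages reach the root population uncoalesced, every coalescence order is equally likely, so a topology with 2 orders is twice as probable as a topology with 1.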
anomalous gene tree (AGT)
a gene tree topology that is more probable than the gene tree topology matching the pop tree