Module 3 Flashcards

Question

What is an autapomorphy?

Answer 1

Singletons: changes in the sequence of a single species. They can't help resolve relationships, but they can help estimate the rate of change.

Answer 2

They use a measure of similarity to group different OTUs together and then pair those individual groups with each other to form a hierarchy.
- Fast and computationally easy
- Reduce character states down to distances
- Assume a constant rate of change/evolution
- Based on the observed distance ("p"), which MAY NOT REFLECT THE ACTUAL GENETIC DISTANCE ("d")
- BLAST uses this approach
- Often a good first approximation of the data

Answer 3

Within clustering algorithms:
1. UPGMA, Unweighted Pair Group Method with Arithmetic Mean: clusters OTUs, then clusters those new groups
2. Neighbour Joining (NJ), shortest tree: sequentially finds pairs of neighbours connected by a single node; aims to reduce the overall length of the tree
Within optimality criteria:
3. Minimum Evolution (ME), shortest branch length: reconstructs the tree with the shortest total branch length (minimum distance)

Answer 4

The observed distance can underestimate the true distance, especially when the degree of divergence is high (e.g. old splits). Multiple substitutions accumulate at the same sites, and the sequences become saturated (effectively random with respect to each other).
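
A minimal sketch of the gap between the observed distance p and a corrected distance d. It assumes the Jukes-Cantor (JC69) correction, d = -3/4 ln(1 - 4p/3), as one possible correction; the two short sequences and the function names are made up for illustration.

```python
import math

def p_distance(seq_a, seq_b):
    """Observed distance p: the proportion of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)

def jc69_distance(p):
    """JC69-corrected distance d; not defined once p reaches 0.75 (saturation)."""
    if p >= 0.75:
        raise ValueError("p >= 0.75: sequences are saturated under JC69")
    return -0.75 * math.log(1 - 4 * p / 3)

# Made-up aligned sequences (3 differences over 20 sites).
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTTCGTACGAACGTACCT"

p = p_distance(seq_a, seq_b)
d = jc69_distance(p)
print(f"p = {p:.3f}, d = {d:.3f}")
```

The corrected d is always at least as large as p, and the gap widens as p grows, which is the underestimation described above.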

Answer 5

The 3rd codon position has less impact on the outcome of mutations (it often doesn't change the amino acid) and therefore it changes faster than the 1st and 2nd codon positions and should be treated differently when modelling evolutionary change

Answer 6

Transitions (changes within the same class of base: purine to purine, A/G, or pyrimidine to pyrimidine, C/T) are more frequent. Transversions (changes between a purine and a pyrimidine) are less frequent.
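
A small sketch of how transitions (Ts) and transversions (Tv) could be counted between two aligned sequences. The function name and the example sequences are hypothetical; sites with gaps or ambiguous bases are simply skipped here, which is an assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    ts = tv = 0
    for a, b in zip(seq_a, seq_b):
        if a == b or a not in BASES or b not in BASES:
            continue  # identical site, gap, or ambiguous base
        if (a in PURINES) == (b in PURINES):
            ts += 1  # both purines or both pyrimidines
        else:
            tv += 1  # one purine, one pyrimidine
    return ts, tv

# A->G, G->A and C->T are transitions; T->A is a transversion.
ts, tv = count_ts_tv("AGCT", "GATA")
print(ts, tv)
```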

Answer 7

Which model you use will affect the outcome of your analysis.
1. Jukes-Cantor model (JC69): all changes are equally likely, and we assume equal frequencies of all nucleotides
2. Kimura model (K80): the transition rate (between similar bases) differs from the transversion rate (between different bases), and we assume equal frequencies
3. Felsenstein (F81) and Hasegawa-Kishino-Yano (HKY85) models: we assume unequal frequencies; in HKY85 the Ts and Tv rates also differ, whereas F81 keeps a single substitution rate
4. Tamura-Nei model (TN93): Ts and Tv rates differ, we assume unequal frequencies, AND the transition rate among purines (A/G) differs from that among pyrimidines (C/T)
5. Generalised Time Reversible model (GTR), or T86 after Simon Tavaré: "whole hog", a separate rate for every pair of nucleotides

Answer 8

1. Maximum Parsimony 2. Maximum Likelihood (including Bayesian methods)

Answer 9

It evaluates alternative trees based on the character data, compares the number of changes and selects the tree with the fewest changes
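
A rough sketch of counting changes on alternative trees. It assumes Fitch's algorithm for rooted binary trees with unordered character states, which is one standard way to count the minimum number of changes and is not necessarily what the course used. The nested-tuple tree format, the taxa and the sequences are made up.

```python
def fitch(tree, states):
    """Return (possible_states, changes) for one site on a rooted binary tree.

    A tree is either a taxon name (leaf) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_changes + right_changes
    # No state in common: one extra change is needed at this node.
    return left_set | right_set, left_changes + right_changes + 1

def parsimony_score(tree, alignment):
    """Total number of changes over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

score_1 = parsimony_score(tree_1, alignment)
score_2 = parsimony_score(tree_2, alignment)
print(score_1, score_2)  # the tree with the fewer changes is preferred
```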

Answer 10

(Mostly applied to non-molecular datasets; Pete does not usually apply this to larger datasets)
- Complex morphologies must reflect homology
- Evolution is rare
This means that parsimony does not tell us which tree is more likely to be true, but which tree is the simplest.
Ignores multiple hits.
Generally not used beyond initial distance trees!

Answer 11

Instead of comparing the tree to the data, we're comparing the data to the tree. Ideally, this method would take your data and your evolutionary model (the one you pick) and tell you the probability of each possible tree given the data: Pr(H | D), where H is the tree (and the model) and D is the sequences. Because there are so many possible trees, this is not what actually happens. INSTEAD, what is calculated is Pr(D | H), the probability of the data given the tree (and model). We then prefer the tree with the highest value/likelihood.

Answer 12

The probability of the data given a particular model of evolution. Reported as the natural log likelihood or log-likelihood score (lnL is a negative value, often written as -lnL). An lnL closer to zero means a better fit to the data.
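
A toy illustration of "closer to zero is better". It assumes the simplest possible case: two aligned sequences under the JC69 model, where the "tree" is just one branch of length d. The function name, the counts (20 sites, 3 differences) and the candidate distances are invented for the example.

```python
import math

def jc69_pair_lnl(n_sites, n_diff, d):
    """Log-likelihood of two aligned sequences separated by distance d under JC69."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * decay)  # a site showing the same base in both
    p_diff = 0.25 * (0.25 - 0.25 * decay)  # a site showing one specific pair of different bases
    return (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)

# Compare three candidate branch lengths for the same data.
candidates = [0.05, 0.167, 0.5]
scores = {d: jc69_pair_lnl(20, 3, d) for d in candidates}
best = max(scores, key=scores.get)

for d, lnl in scores.items():
    print(f"d = {d}: lnL = {lnl:.3f}")
print("preferred d:", best)
```

All three scores are negative, and the candidate whose lnL is closest to zero is the one that is preferred.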

Answer 13

1. Align the data
2. Generate a tree and compare the aligned data to it (we're also asking which model of evolution is best in this step)
3. Optimise the likelihood of this tree
4. Rearrange the tree to generate a new tree
5. Compare: does the new tree have a higher likelihood than the old tree?
6. Yes: keep the new tree. No: keep the old tree
7. Keep going until you can't find a tree with a better likelihood
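
The search loop in the steps above can be sketched as a simple hill climb. This is a toy, not a real tree search: "tree space" is replaced by a made-up list of lnL scores, and a "rearrangement" is just a step to a neighbouring index. It also shows why the starting point matters, since the climb can stop on a local optimum.

```python
def hill_climb(scores, start):
    """Move to the better-scoring neighbour until no neighbour improves the score."""
    current = start
    while True:
        neighbours = [i for i in (current - 1, current + 1) if 0 <= i < len(scores)]
        best = max(neighbours, key=lambda i: scores[i])
        if scores[best] <= scores[current]:
            return current  # no better neighbour: stop here
        current = best

# Hypothetical lnL values for eight "trees" laid out in a line.
lnl_scores = [-120.0, -110.0, -105.0, -112.0, -118.0, -101.0, -95.0, -99.0]

from_left = hill_climb(lnl_scores, start=0)    # climbs to index 2 and gets stuck
from_middle = hill_climb(lnl_scores, start=4)  # climbs to index 6, the best score
print(from_left, from_middle)
```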

Answer 14

Advantages:
- Reliance on an explicit model of molecular evolution
- Adaptable
- Results are conditional on the model
- Test different models
- Likelihood ratio test
Disadvantages:
- Takes a long time, iterative (but CPUs are getting faster)
- More sequences, more problems

Answer 15

Bootstrap support tests the strength of the phylogenetic signal in an alignment by resampling your data to see if you get the same result. It resamples the alignment columns (sites) with replacement to make alternative datasets known as pseudoreplicates, so each has the same number of sites but a different mix of them. If you get a different topology, the first tree has marginal support. The bootstrap value tells you the number of times out of 100 or 1000 the same node was recovered. Above 90 and close to 100 is what we want, but above 70 is acceptable.
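
A minimal sketch of making one pseudoreplicate, assuming the alignment is held as a dictionary of equal-length strings. The taxon names, sequences, seed and helper names are hypothetical, and the tree-building step for each pseudoreplicate is left out.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement to make one pseudoreplicate."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in alignment.items()}

def support_percent(clade_recovered):
    """Percentage of pseudoreplicates in which a given node was recovered."""
    return 100.0 * sum(clade_recovered) / len(clade_recovered)

alignment = {"sp1": "ACGTACGT", "sp2": "ACGTTCGT", "sp3": "ACCTTCGA"}
rng = random.Random(42)  # fixed seed so the example is repeatable
replicate = bootstrap_replicate(alignment, rng)

for taxon, seq in replicate.items():
    print(taxon, seq)
```

Each pseudoreplicate keeps whole columns together, so the taxa stay aligned; only which sites are included (and how often) changes.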

Answer 16

1. Nearest Neighbour interchange (NNI) 2. Subtree Pruning + Regrafting (SPR) 3. Tree-Bisection + Reconnection (TBR)

Answer 17

"Hill-climbing"/heuristic approach, it keeps climbing the likelihood "hill", can't step down and can get stuck on a local optimum in tree space.

Answer 18

Metropolis-coupled Markov Chain Monte Carlo (MC^3)
Markov Chain Monte Carlo (MCMC): a set of algorithms that walk randomly through tree space
- Markov Chain = movement through states, i.e. trees (the next state depends only on the current state, not on earlier ones)
- Monte Carlo = random sampling of numbers
Metropolis-coupled:
- Cold chain robot = the one that's actually looking for the best tree
- Hot chain robots = helpers that scout through tree space; they can jump downhill and inform the cold chain if they find a better area

Answer 19

The main feature of Bayesian statistics/phylogenetics is that it takes into account prior knowledge of the hypothesis (the tree).
Prior information:
- Tree topologies
- Each branch length
- Substitution rates/model of evolution
- Rate heterogeneity parameter
- Nucleotide frequencies
We can either give it realistic/informative prior information (stuff we know before investigating the data) OR flat prior information (which is kind of like giving it no information: all parameter values within a bound, like 0-1, are equally likely).
Bayes' theorem: describes the probability of an event, based on prior knowledge of conditions that might be related to the event.

Answer 20

The beginning of the MCMC analysis, where it's trying to find a good area in tree space. You have to let the analysis run for a number of generations so "the robots" can converge on the same area in tree space. The final results then discard the first X generations as "burn-in" (e.g. the first million steps).

Answer 21

After the Bayesian phylogenetic analysis we end up with a posterior distribution of tree topologies and branch lengths. We can pick between the following options:
1. MAP (maximum a posteriori) tree: the topology with the maximum posterior probability (similar to the ML tree)
2. Majority-rule consensus tree: the tree constructed so that it contains all of the clades that occur in at least X % of the trees in the posterior distribution
3. 95 % credible set: the set of all tree topologies that accounts for 95 % of the posterior probability
Node support is displayed as the posterior probability (PP); 1.0 or 100 = full PP support for the node.

Answer 22

A type of conditional probability that results from updating the prior probability with information summarised by the likelihood, through an application of Bayes' theorem. Bayes' theorem: describes the probability of an event, based on prior knowledge of conditions that might be related to the event.
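
Written out, using H for the tree (and model) and D for the sequence data, Bayes' theorem takes the following standard form. The labels under each term are added here for orientation only.

```latex
\[
\underbrace{\Pr(H \mid D)}_{\text{posterior}}
  = \frac{\overbrace{\Pr(D \mid H)}^{\text{likelihood}} \;
          \overbrace{\Pr(H)}^{\text{prior}}}
         {\underbrace{\Pr(D)}_{\text{probability of the data}}}
\]
```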

Answer 23

Advantages:
- You can pick between a range of models with variable rates of genetic change
- It accounts for uncertainties in the tree topologies
- Gives you not one tree but a set of most probable trees
- Good for dating lineages, can be calibrated with the fossil record
Disadvantages:
- Takes a long time
- The quality depends on how well you sample tree space
- Can be really complex and complicated to set up
- Based on prior beliefs that could be wrong

Answer 24

For a given gene region, the rate of molecular sequence evolution (amino acid replacement, nucleotide substitution, etc.) is stochastically constant through time and across lineages. This means that sequence divergence is proportional to time, giving us a uniform mutation/substitution rate, so we can calibrate a clock on it and infer evolutionary history from molecular data.
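
A back-of-the-envelope sketch of a strict clock, assuming the usual relation t = d / (2r): two lineages each accumulate substitutions at rate r since they split, so their pairwise distance d grows at 2r. All numbers and function names are invented for illustration.

```python
def calibrate_rate(distance, known_split_time):
    """Per-lineage rate r (substitutions per site per Myr) from a dated split."""
    return distance / (2.0 * known_split_time)

def divergence_time(distance, rate_per_lineage):
    """Time since the split (Myr) under a strict clock: t = d / (2r)."""
    return distance / (2.0 * rate_per_lineage)

# Hypothetical calibration: a pair with d = 0.10 is known to have split 5 Myr ago.
rate = calibrate_rate(distance=0.10, known_split_time=5.0)

# Apply that rate to another pair with d = 0.06.
age = divergence_time(distance=0.06, rate_per_lineage=rate)
print(f"rate = {rate:.3f} per site per Myr, estimated split = {age:.1f} Myr ago")
```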

Answer 25

Assumption that all sequences in an alignment have the same underlying rate of substitution. Some datasets are "clock-like", e.g. closely related species, but we should expect rates of evolution to vary between lineages, i.e. rates of evolution speed up/slow down across lineages and genes over time, also known as rate heterogeneity

Answer 26

Rates of evolution differ between lineages and genes over time. For lineages, this could be due to differences in DNA repair efficiency, metabolic rates, generation times, and population sizes; for genes, it could be variation in gene and protein structure and function.

Answer 27

"Local clock": we allow lineages within a phylogeny to have different substitution rates and can assign lineages (branches) to different rate categories.
"Relaxed clock": every branch can have its own rate. It's impossible to estimate a rate for every branch, so rates are predicted along the phylogeny based on a model of molecular evolution. This is what the BEAST 2 program does.

Answer 28

Legacy rates from similar taxa/genes
Legacy calibrations from previous studies
Biogeographic events
- Barrier formations (closure of the Isthmus of Panama, closure of the Tethys seaway)
- Climatic events (mass extinctions)
Fossil data (strata, fragments, full fossil specimens)

Answer 29

1. Point calibration (if we found one fossil specimen at 50 mya, we set the node there)
2. Max/min age constraint (the fossil was found in a deposit dated between X and Y mya, so you set the node age as a range)
3. Parametric distribution (with multiple fossils relating to a node, we can create a probability distribution for the age of the node: exponential, lognormal, normal)

Answer 30

- Population and quantitative genetics
- Paleobiology
- Phylogenetics
PCM is the analytical study of species, populations, and individuals in a historical framework to elucidate the mechanisms at the origin of the diversity of life.

Answer 31

The bigger the genome, the lower the mutation rate. Smaller genomes have higher mutation rates.

Answer 32

Shorter generation time = Faster rate of molecular evolution. Genomes get copied more frequently for shorter generation times and therefore collect more errors per unit time

Answer 33

A paper published in Science looked at 5 clades of flowering plants to see whether there was a link between the rate of molecular evolution and life history (generation time). They built phylogenetic trees (ML, 100 bootstraps). For each branch they calculated the number of substitutions per nucleotide per million years. They found that herbs (short generation times) had 2.7-10 times higher rates of molecular evolution than trees and shrubs.

Answer 34

Phylogenetic non-independence: Related species may share the same traits, so e.g. one single heritable trait that arose one time can be inherited by many descendants, and this can show up as a bunch of data points, even though it should only technically count as ONE data point. I.e. we risk counting one instance of change multiple times. This gives us an artificially high level of certainty about the relationship we see. Making sure to have phylogenetically independent sister pairs means that the analysis is NOT artificially robust, but more likely to show a true relationship

Answer 35

143 species of invertebrate, 14 genes (mt and n), phylogenetically independent sister pairs. Found that shorter generation times were associated with higher rates of molecular evolution. Bear in mind that longer generation time is related to larger body size.

Answer 36

Martin and Palumbi confirmed that larger body size is related to lower rate of molecular evolution. BUT ALSO they found that homeotherms (i.e. warm blooded animals that can regulate their own body temperature) have higher rates of evolution compared to poikilotherms (i.e. cold blooded animals that rely on their environment to heat them up)

Answer 37

It is assumed that having a large body with more cells requires higher DNA fidelity and repair. HOWEVER, correlation between DNA repair efficiency and body size has not yet been established. In invertebrates, there is no evidence for influence of body size on substitution rate. Metabolic rate might be the cause because of the mutagenic oxygen radicals produced

Answer 38

1. Metabolism, i.e. aerobic respiration, produces oxygen radicals that are mutagenic. We would expect mitochondrial DNA to be the most impacted because that is where 90 % of the oxygen in a cell is used. 2. Higher metabolic rates mean that we have more energy to do things with, e.g. DNA synthesis and nucleotide replacement. Lower metabolic rate = lower turnover, less frequent repair

Answer 39

Mearnsia, a type of plant, was sequenced from many locations (from the Philippines down to New Zealand) and an ML phylogenetic tree was built. Temperature correlated with the branch lengths of the tree, indicating that temperature affects the rate of molecular evolution. Linked to greater biologically available energy and the correlated productivity.

Answer 40

Molecular evolution is caused primarily by neutral mutations that randomly drift to fixation in a population resulting in nucleotide substitutions

Answer 41

Apparently, the rate of molecular evolution is slower in birds living on islands/in small areas compared to birds living on the mainland or in large areas. This is taken to mean that population size impacts the rates of both population-level mutagenesis and gene fixation. More reproducing individuals means more DNA replication error. THIS HAS IMPORTANT IMPLICATIONS for biodiversity conservation: If we put species in limited refugia we slow the tempo of their microevolution which can limit the potential for adaptive shifts in response to changing environments

Answer 42

Species richness = Speciation - Extinction

Answer 43

Latitudinal gradient hypothesis: There are more species in the tropics than in the polar regions even when compensating for reduced area. However, Pete contributed to a paper on marine fishes that concluded that speciation rates were higher in the polar regions.

Answer 44

There are fewer species higher up, i.e. species richness decreases with increasing altitude. Maybe partially due to the energy available to the different ecosystems, warmer lower down = more energy.

Answer 45

Differences in:
- Population size
- Generation time
- Mechanisms of pollination and seed dispersal
- Strength of sexual selection
- Climatic effects
- Landscape heterogeneity
- etc.

Answer 46

The particular branching pattern of a tree

Answer 47

If the tree is very balanced, i.e. has a balanced topology, it can indicate competition between close relatives: more competition, and less diversification in large clades.
If the tree is very imbalanced, it can indicate that there are heritable characters that affect diversification:
- e.g. key innovations that expand the available niche space and thereby increase diversification
- e.g. island colonisers with no competition (these are often more diverse than mainland relatives)
- e.g. sexual selection leads to increased speciation (broadcast spawners show low diversification)

Answer 48

Early diversification (long branch lengths) can indicate adaptive radiation, e.g. the Cambrian explosion: the evolution of ecological and phenotypic diversity within a rapidly multiplying lineage (e.g. eyes, armor or an increase in oxygen levels). Late diversification can be a case of island clades being more diverse than their closest mainland relatives, just diversifying like mad because they have no competition.

Answer 49

How many new lineages have arisen over time in a phylogeny, plotted with time on the x-axis and ln(number of lineages) on the y-axis. This allows us to see whether the diversification rate has been constant over time or not. Abrupt changes could indicate e.g. a climatic event or a key innovation.
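
A small sketch of turning a dated tree into the points of such a plot. It assumes a fully bifurcating, ultrametric tree summarised only by the ages of its internal nodes (time before present); the ages and the function name are made up.

```python
import math

def ltt_points(node_ages):
    """Return (time since root, ln(number of lineages)) for each split."""
    ages = sorted(node_ages, reverse=True)  # oldest node (the root) first
    root_age = ages[0]
    points = []
    for k, age in enumerate(ages):
        lineages = k + 2  # the root split gives 2 lineages; each later split adds 1
        points.append((root_age - age, math.log(lineages)))
    return points

# Hypothetical internal node ages in Myr before present (a 5-tip tree).
points = ltt_points([10.0, 6.0, 3.0, 1.0])
for time, ln_lineages in points:
    print(f"t = {time:4.1f}  ln(lineages) = {ln_lineages:.3f}")
```

Under a constant rate of diversification with no extinction, these points are expected to fall roughly on a straight line.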

Answer 50

A method of measuring whether per-lineage speciation and extinction rates have remained constant through time.
- Rejection might provide evidence of adaptive radiations or key adaptations
- Assumes that you have a complete phylogeny of the clade you are interested in
- Null hypothesis = constant rate of diversification
- Null is rejected at the 5 % level if gamma is less than -1.645
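
A hedged sketch of how gamma could be computed from the internode intervals of a complete, ultrametric tree. It assumes the commonly cited formulation of the statistic (Pybus and Harvey), written out here from the standard formula rather than taken from the cards; the interval values and the function name are made up.

```python
import math

def gamma_statistic(intervals):
    """Gamma from internode intervals g_2 ... g_n.

    intervals[i] is the length of time during which the tree had i + 2 lineages.
    """
    n = len(intervals) + 1  # number of tips
    if n < 3:
        raise ValueError("need at least 3 tips")
    weighted = [(i + 2) * g for i, g in enumerate(intervals)]  # k * g_k
    total = sum(weighted)  # T, the sum of k * g_k over all intervals
    running = 0.0
    sum_of_partial_sums = 0.0
    for w in weighted[:-1]:  # partial sums up to i = 2 ... n - 1
        running += w
        sum_of_partial_sums += running
    mean_partial = sum_of_partial_sums / (n - 2)
    return (mean_partial - total / 2.0) / (total * math.sqrt(1.0 / (12.0 * (n - 2))))

# Splits packed near the root (short early intervals, long final interval).
gamma_early = gamma_statistic([0.1, 0.1, 1.0])
# Splits packed near the tips.
gamma_late = gamma_statistic([1.0, 0.1, 0.1])
print(f"early burst: {gamma_early:.3f}, late burst: {gamma_late:.3f}")
```

With the splits concentrated near the root, gamma comes out negative and, in this made-up case, below the -1.645 cutoff; with the splits near the tips it is positive.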

Answer 51

The assumption of the constant rate test (gamma statistic) is complete taxon sampling, where you have every single species within a clade represented in your phylogenetic tree. This is very hard to do. Yet it's so important because it may change the outcome of your analysis

Answer 52

We can address it with a simulation. Make 1000 trees with 18 taxa under a constant rate of diversification (if you believe that 18 is the number of species in your clade) and then randomly prune 7 taxa from each of the trees. This allows us to build confidence intervals around our constant-rate LTT plot and see if the observed LTT plot falls within the confidence interval.

Answer 53

The authors tried to collect as many species of leaf beetle as possible, but could only find 83 out of 202 known extant species, which is 41 %. They generated sequences from both mtDNA and nDNA, chose a substitution model, and built a lot of trees using different methods (parsimony, ML, and Bayesian inference). They used penalised likelihood (a relaxed clock method) to generate an ultrametric tree (time tree), used fossils to calibrate the tree, and estimated gamma statistics. They generated LTT plots, then generated 1000 replicate trees with 202 taxa and pruned 119 from each to get a mean LTT + 95 % confidence interval. They identified points of significant diversification rate shifts and tried to account for these changes by using high-latitude sea surface temperatures as a proxy for global climate. They conclude that the KT boundary opened up a lot of niches, which made it possible for the leaf beetles to diversify (adaptive radiation). Furthermore, global warming made it possible for the beetles to expand latitudinally, making them more diverse, because of the subsequent taxonomic diversification of the tropical plant lineages that they use as hosts. A slowing of diversification was observed after the warming period, as niches had been saturated with leaf beetles; also, tropical plants retreated back to lower latitudes due to global cooling.

(77 cards)