Phylogenetic reconstruction Flashcards

Question

Jukes-Cantor DNA substitution model

Answer 1

* Most parsimonious (simplest) model
* Assumes an equal probability of 25% for each base
* Simple model (1 parameter)

Answer 2

* α: transitions
* β: transversions
* α is different from β (if α = β, the model reduces to JC)
* 2 parameters

Answer 3

* Unequal base frequencies (π)
* All substitutions equally likely
* 2 parameters

Answer 4

* Transversions and transitions have different substitution rates
* 3 parameters

Answer 5

* 6 parameters
* Takes into account that not all transversions are equally probable
* α is different from β
* All 6 pairs of substitutions have different rates
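The nesting of these models can be made concrete with a small rate matrix. The following is a minimal sketch, assuming the common parameterization in which each off-diagonal rate is an exchangeability multiplied by the frequency of the target base; the function names `gtr_rate_matrix` and `two_rate_exchange` and all numeric values are illustrative, not taken from any particular program.

```python
from itertools import combinations

BASES = "ACGT"

def gtr_rate_matrix(exchange, freqs):
    """Build a 4x4 rate matrix Q whose rows sum to 0.

    exchange: dict mapping the 6 unordered base pairs ("AC", "AG", ...) to a relative rate
    freqs: dict mapping each base to its equilibrium frequency
    """
    q = {x: {y: 0.0 for y in BASES} for x in BASES}
    for x, y in combinations(BASES, 2):
        rate = exchange[x + y]
        q[x][y] = rate * freqs[y]  # rate of change x -> y
        q[y][x] = rate * freqs[x]  # rate of change y -> x
    for x in BASES:
        q[x][x] = -sum(q[x][y] for y in BASES if y != x)
    return q

def two_rate_exchange(alpha, beta):
    """Exchangeabilities with one rate for transitions (alpha) and one for transversions (beta)."""
    transitions = {"AG", "CT"}
    return {x + y: (alpha if x + y in transitions else beta)
            for x, y in combinations(BASES, 2)}

equal_freqs = {b: 0.25 for b in BASES}

# alpha == beta with equal frequencies gives the Jukes-Cantor special case.
jc = gtr_rate_matrix(two_rate_exchange(1.0, 1.0), equal_freqs)

# alpha != beta with equal frequencies gives the two-parameter case.
k2p = gtr_rate_matrix(two_rate_exchange(2.0, 1.0), equal_freqs)
```

Passing six different exchangeabilities and unequal frequencies to `gtr_rate_matrix` gives the general case in which every pair of substitutions has its own rate.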

Answer 6

* Proportion of invariant sites
* Gamma distribution

Answer 7

* Sequences that evolve fast may show less divergence than sequences that evolve more slowly
* Not all nucleotides evolve freely: GTR+I

Answer 8

* Nucleotides vary differently: some vary more freely than others, so rates are not equally distributed across sites
* Allows more than 2 rate categories (not only zero and non-zero rates): GTR+I+G
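As a rough illustration of what +I and +G mean for individual sites, the sketch below draws one rate per site: a fraction of sites is invariant (rate 0), and the rest get a gamma-distributed rate with mean 1. This is an assumption-laden toy: real programs usually approximate the gamma distribution with a few discrete rate categories rather than sampling it, and the function name and parameter values here are hypothetical.

```python
import random

def sample_site_rates(n_sites, p_invariant, shape, seed=0):
    """Draw one relative rate per site under an invariant-sites + gamma scheme."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_sites):
        if rng.random() < p_invariant:
            rates.append(0.0)  # invariant site: never changes
        else:
            # gammavariate(shape, scale) has mean shape * scale, so scale = 1/shape gives mean 1
            rates.append(rng.gammavariate(shape, 1.0 / shape))
    return rates

rates = sample_site_rates(n_sites=10000, p_invariant=0.3, shape=0.5)
variable = [r for r in rates if r > 0.0]
```

A small shape value (below 1) concentrates most variable sites at low rates with a few fast sites; a large shape value makes the rates nearly equal.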

Answer 9

* Problem: the more complex a model, the more computationally expensive it is.
* But: if a model is too general (oversimplified), the inferred phylogeny can be wrong.
* Therefore: choose a model that is significantly better than the others but does not require more parameters than necessary.
* Run a model test.

Answer 10

* hLRT (hierarchical likelihood ratio test) for nested models
* Compares the likelihoods of the different models
* Continues until there are no significant differences between the models
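A likelihood ratio test between two nested models can be sketched in a few lines. The example below assumes the usual test statistic 2 × (lnL of the complex model - lnL of the simpler model), compared against a chi-square critical value at the 5% level with degrees of freedom equal to the number of extra parameters. The model names, log-likelihoods, and parameter counts are made up for illustration, and the hierarchy is simplified to a single chain from simple to complex.

```python
# Chi-square critical values at the 5% significance level, indexed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def hlrt_select(models):
    """Walk a chain of nested models and return the name of the selected one.

    models: list of (name, log_likelihood, n_params), ordered from simple to
    complex, each model nested in the next.
    """
    best = models[0]
    for candidate in models[1:]:
        stat = 2.0 * (candidate[1] - best[1])  # likelihood ratio statistic
        df = candidate[2] - best[2]            # number of extra parameters
        if stat > CHI2_CRIT_05[df]:
            best = candidate                   # significantly better: keep going
        else:
            break                              # no significant difference: stop
    return best[0]

# Hypothetical values, for illustration only.
models = [
    ("simple", -2500.0, 1),
    ("medium", -2490.0, 2),
    ("complex", -2488.0, 5),
]
selected = hlrt_select(models)
```

Here the step from "simple" to "medium" is significant (statistic 20 with 1 degree of freedom), but the step to "complex" is not (statistic 4 with 3 degrees of freedom), so the procedure stops at "medium".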

Answer 11

* Maximum likelihood
* Bayesian inference
* Both have an explicit probabilistic model
* Both have a statistical basis / support
* Both search the parameter space for the most likely answer

Answer 12

* A priori assumptions
* The probability of the event of interest under certain conditions (conditional probability)
* Likelihood and prior probability

Answer 13

1. The robot is programmed to walk a predefined number of steps (also called generations), e.g., 2,000,000.
2. The robot evaluates every step, which varies in length and direction:
   - If the step is uphill (higher likelihood): it always takes the step.
   - If the step is downhill:
     1. The robot calculates the height ratio between the two positions.
     2. It generates a random number between 0 and 1.
     3. If the number is lower than the ratio, it takes the step; if the number is higher than the ratio, it stays in the same place.
3. The robot evaluates the following step, and so on.
4. The position (tree topology) of, e.g., every 100th step is sampled.
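The robot's rules can be sketched as a small Metropolis sampler. This is a toy under stated assumptions: the "position" is a single number rather than a tree topology, the landscape `hill` is an arbitrary one-peak function chosen for illustration, and heights are handled on a log scale so that the height ratio becomes `exp(log_ratio)`.

```python
import math
import random

def robot_walk(log_height, start, n_generations, sample_every, step_size=1.0, seed=0):
    """Walk a landscape with the uphill/downhill rule and sample every n-th position."""
    rng = random.Random(seed)
    position = start
    samples = []
    for generation in range(1, n_generations + 1):
        # Propose a step of random length and direction.
        proposal = position + rng.uniform(-step_size, step_size)
        log_ratio = log_height(proposal) - log_height(position)
        # Uphill: always accept. Downhill: accept if a random number in [0, 1)
        # is lower than the height ratio; otherwise stay in place.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            position = proposal
        if generation % sample_every == 0:
            samples.append(position)
    return samples

def hill(x):
    """Log-height of a single smooth peak centered at 0 (illustrative only)."""
    return -0.5 * x * x

samples = robot_walk(hill, start=5.0, n_generations=20000, sample_every=100)
```

The early samples still reflect the arbitrary starting point at 5.0, which is why the first part of a run is usually discarded as burn-in.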

Answer 14

* Bootstrap: an index of support given the data, not a true statistic.
* (Split) posterior probabilities: the tree that best supports the data.

Answer 15

- Multiple runs (time-intensive)
- Very long runs (time-intensive)
- Multiple Markov chains (robots) run simultaneously (Metropolis-coupled Markov chain Monte Carlo = MCMCMC = MC3):
  - One chain runs as usual (cold chain)
  - The other chains can make larger steps (heated chains)
  - The chain with the highest probability at each step automatically becomes the cold chain and is sampled

Answer 16

**ML**

* Statistical knowledge
* No priors
* Unpredictable running time
* Branch support can take ages
* Heuristic search: can get stuck in local optima

**Bayes**

* No statistical knowledge
* Priors
* T = linear computational complexity
* Branch support is available immediately
* Convergence at burn-in

Answer 17

* Bootstrap
* Jackknife
* Bremer support (decay index)

Answer 18

1) Characters are resampled with replacement to produce many (100, 1,000, 10,000, ...) bootstrap replicate data sets
2) A tree is reconstructed from each bootstrap replicate
3) Majority-rule consensus of all trees: visualization of agreement in topologies
4) Majority-rule consensus indices = measure of support for those groups = bootstrap proportions (BPs)
   - Tells the support, but not the quality of the tree
   - Can be wrong if the wrong kind of data was sampled
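Step 1, resampling characters with replacement, can be sketched as follows. The alignment, taxon names, and function name are made up for illustration; tree reconstruction and the consensus step are left out because they depend on the phylogenetics software used.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns (characters) with replacement.

    alignment: dict mapping taxon name to an aligned sequence string
    Returns a new alignment with the same number of characters.
    """
    n_chars = len(next(iter(alignment.values())))
    # Draw n_chars column indices with replacement: some columns repeat, some drop out.
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: "".join(seq[c] for c in columns) for taxon, seq in alignment.items()}

# Hypothetical toy alignment.
alignment = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ACCTTA"}
rng = random.Random(42)
replicates = [bootstrap_replicate(alignment, rng) for _ in range(100)]
```

Each replicate keeps whole columns intact across taxa, so every resampled character is a real character from the original data set.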

Answer 19

* Jackknifing is very similar to bootstrapping
  - Differs only in the resampling strategy
  - A proportion of the characters (e.g. 50%) is deleted
* Results are summarized with a majority-rule consensus tree
* Majority-rule indices = jackknife probabilities
* Jackknifing and bootstrapping tend to produce:
  - broadly similar results
  - similar interpretations
* Works by cutting off characters
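The jackknife resampling strategy, deleting a proportion of characters without replacement, might look like this. It is a minimal sketch with a made-up alignment and a hypothetical function name; the trees and the consensus would still come from a separate program.

```python
import random

def jackknife_replicate(alignment, rng, delete_fraction=0.5):
    """Delete a fraction of the alignment columns (characters) at random.

    alignment: dict mapping taxon name to an aligned sequence string
    Returns a shorter alignment containing only the retained columns.
    """
    n_chars = len(next(iter(alignment.values())))
    n_keep = n_chars - int(round(n_chars * delete_fraction))
    # Sample without replacement: each retained column appears exactly once.
    kept = sorted(rng.sample(range(n_chars), n_keep))
    return {taxon: "".join(seq[c] for c in kept) for taxon, seq in alignment.items()}

# Hypothetical toy alignment.
alignment = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ACCTTA"}
rng = random.Random(7)
replicate = jackknife_replicate(alignment, rng)
```

Unlike a bootstrap replicate, a jackknife replicate is shorter than the original data set and never contains the same character twice.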

Answer 20

* The number of extra steps it takes to collapse a group
* Add additional steps to see if the topology remains
* The higher the number, the higher the support

Answer 21

(conditional probability \* Prior probability) / probability of the data given a specific model
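Written as a formula, this is Bayes' theorem. The symbols are introduced here for illustration and are not from the card: $\tau$ stands for a tree (with its parameters) and $D$ for the data.

$$
P(\tau \mid D) = \frac{P(D \mid \tau)\, P(\tau)}{P(D)}
$$

Here $P(D \mid \tau)$ is the likelihood (the conditional probability of the data), $P(\tau)$ is the prior probability, $P(D)$ is the probability of the data under the chosen model, and $P(\tau \mid D)$ is the posterior probability.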

Answer 22

* Every step is called a generation
* Cloud: sampling a large number of potential trees
* Programmed to always go up; goes down only under certain conditions

Answer 23

**Consistency index (CI)** (deals with apomorphies)

CI = (minimum total sum of character changes expected / actual number of steps) × 100 when expressed as a percentage

* Also useful for comparing trees and checking the amount of homoplasy
* The higher the CI, the better (how good the data are, and how well the characters fit the tree)
* With binary characters (0/1), each character is expected to change only once in the tree (parsimony)
* CI = 1 if there is no homoplasy
* Negatively correlated with the number of species sampled

**Homoplasy index (HI)**

The amount of homoplasy: HI = 1 - CI, e.g. 1 - 0.85 = 0.15

**Retention index (RI)**

RI = (max. number of steps - steps observed) / (max. number of steps - min. number of steps)

* Defined to be 0 for parsimony-uninformative characters
* RI = 1 if the character fits perfectly
* RI = 0 if the tree fits the character as poorly as possible
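The three indices are simple ratios of step counts, so a short worked example may help. This sketch uses the proportion form (0 to 1) rather than percentages; the function names and the step counts (minimum 17, observed 20, maximum 40) are invented for illustration.

```python
def consistency_index(min_steps, observed_steps):
    """CI: minimum possible number of changes divided by the steps observed on the tree."""
    return min_steps / observed_steps

def homoplasy_index(min_steps, observed_steps):
    """HI: the complement of CI."""
    return 1.0 - consistency_index(min_steps, observed_steps)

def retention_index(min_steps, max_steps, observed_steps):
    """RI: (max - observed) / (max - min); defined as 0 when max equals min."""
    if max_steps == min_steps:
        return 0.0  # parsimony-uninformative character
    return (max_steps - observed_steps) / (max_steps - min_steps)

# Hypothetical step counts.
ci = consistency_index(17, 20)    # 17 / 20
hi = homoplasy_index(17, 20)      # 1 - 17 / 20
ri = retention_index(17, 40, 20)  # (40 - 20) / (40 - 17)
```

With these numbers CI is 0.85 and HI is 0.15, and RI reaches 1 only when the observed number of steps equals the minimum.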