Lecture 8 Flashcards
What happens if you have more than one “best tree”?
create a consensus tree
majority-rule consensus tree
includes only clades present in at least a specified proportion (more than 50%) of the “best” trees, with the % of trees containing each clade shown at its node
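A minimal sketch of how the % scores on a majority-rule consensus could be tallied. The representation is an assumption for illustration: each tree is a set of clades and each clade is a frozenset of taxon names. The function name `majority_rule_clades` and the toy trees are hypothetical, not from the lecture.

```python
from collections import Counter

def majority_rule_clades(trees, threshold=0.5):
    """Return {clade: % of trees containing it} for clades found in more than
    `threshold` of the trees. Each tree is a set of clades; each clade is a
    frozenset of taxon names (an assumed, simplified representation)."""
    counts = Counter(clade for tree in trees for clade in set(tree))
    n = len(trees)
    return {clade: 100.0 * c / n for clade, c in counts.items() if c / n > threshold}

# Three equally good trees for taxa A-D (toy data).
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AB"), frozenset("ABC")},
]
consensus = majority_rule_clades(trees)
for clade, pct in sorted(consensus.items(), key=lambda kv: -kv[1]):
    print(sorted(clade), round(pct, 1))
```

Here the clade A+B appears in all three trees, A+B+C in two of three, and A+B+D in only one, so A+B+D is dropped from the consensus.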
Confidence in phylogenetic inferences can be thought of in 2 ways:
1) determine if there is meaningful signal in the dataset
2) assess confidence in particular clades/topological conclusions
detecting nonrandomness in a set:
look at extent to which characters within a matrix contradict each other
Permutation Tail Probability (PTP) Test
- uses any method that assigns a score to an individual tree (parsimony, minimum evolution, maximum likelihood)
- compares the score of the optimal tree for the real data with the scores of the optimal trees found for randomly permuted data sets
parsimony PTP test
if the shortest tree for the real data is shorter than all/nearly all of the trees from permuted data -> the data have more phylogenetic structure than would be expected from random data
permutation
the states of each character are shuffled among the taxa, independently for each character
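The permutation step and the PTP comparison can be sketched as follows. This is an illustration under stated assumptions: the matrix is a list of rows (one per taxon), the tree lengths are supplied by whatever tree search is in use (not implemented here), and the helper names `permute_characters` and `ptp` are hypothetical. The PTP value is computed as the proportion of data sets, original included, whose best tree is as short as or shorter than the original.

```python
import random

def permute_characters(matrix, rng):
    """Shuffle each character (column) independently among taxa (rows)."""
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]

def ptp(observed_length, permuted_lengths):
    """Proportion of data sets (original included) whose best tree is
    as short as or shorter than the tree for the original data."""
    as_short = sum(1 for length in permuted_lengths if length <= observed_length)
    return (as_short + 1) / (len(permuted_lengths) + 1)

# Toy matrix: 4 taxa x 4 binary characters.
matrix = [list("0011"), list("0011"), list("1100"), list("1100")]
rng = random.Random(1)
shuffled = permute_characters(matrix, rng)

# Hypothetical tree lengths: original data vs. four permuted data sets.
print(ptp(10, [12, 13, 14, 15]))
```

Each column keeps the same states it had before; only their assignment to taxa changes, which destroys any congruence among characters.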
point estimate of phylogeny
parsimony, distance, and likelihood
decay index (Bremer Support)
- difference in tree length between the optimal tree and the shortest tree that lacks the clade in question
- higher number means stronger support
- for likelihood: difference in log-likelihood scores (equivalent to a log likelihood ratio)
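A small worked example of the arithmetic, with made-up scores. The function names and all numbers are hypothetical; the two input scores would come from an unconstrained search and a search constrained to exclude the clade.

```python
def decay_index(best_length, best_length_without_clade):
    """Parsimony: extra steps needed before the clade is lost."""
    return best_length_without_clade - best_length

def likelihood_decay(best_lnl, best_lnl_without_clade):
    """Likelihood: drop in log-likelihood when the clade is excluded."""
    return best_lnl - best_lnl_without_clade

# Hypothetical scores for one clade.
steps = decay_index(120, 123)                # optimal tree: 120 steps; without clade: 123
lnl_diff = likelihood_decay(-2345.6, -2350.1)
print(steps, round(lnl_diff, 1))
```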
bootstrap (general)
- assesses the chances of recovering a particular clade again if we were able to sample from a new set of characters
- simulates other possible datasets by randomly drawing from data
- informs on consistency of branching patterns
nonparametric bootstrap
- characters are sampled with replacement to build each pseudoreplicate data set (same size as the original)
- a tree search is performed on each pseudoreplicate data set and the resulting optimal tree(s) are saved
- proportion of bootstrap trees with a given clade is the score
- usually presented in a bootstrap consensus tree
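The steps above can be sketched as below. The tree search is not implemented: `toy_infer_clades` is a hypothetical stand-in that just returns the most similar pair of taxa as a "clade", so that the resampling and scoring logic can run. All function names and the toy matrix are assumptions for illustration.

```python
import random
from collections import Counter

def bootstrap_pseudoreplicate(matrix, rng):
    """Resample characters (columns) with replacement, keeping the original count."""
    n_chars = len(matrix[0])
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return [[row[i] for i in picks] for row in matrix]

def toy_infer_clades(matrix):
    """Hypothetical stand-in for a tree search: the most similar pair of taxa."""
    n = len(matrix)
    best = min((sum(a != b for a, b in zip(matrix[i], matrix[j])), i, j)
               for i in range(n) for j in range(i + 1, n))
    return {frozenset((best[1], best[2]))}

def bootstrap_support(matrix, infer_clades, n_reps=100, seed=0):
    """Proportion of pseudoreplicate trees that contain each clade."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_reps):
        counts.update(set(infer_clades(bootstrap_pseudoreplicate(matrix, rng))))
    return {clade: c / n_reps for clade, c in counts.items()}

# Toy matrix: 4 taxa (rows 0-3) x 5 sites.
matrix = [list("AAGTC"), list("AAGTT"), list("CCGAC"), list("CCTAC")]
support = bootstrap_support(matrix, toy_infer_clades, n_reps=50, seed=1)
for clade, score in support.items():
    print(sorted(clade), score)
```

In a real analysis `infer_clades` would be replaced by the full tree search used on the original data.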
jackknife
same as bootstrap but sampling WITHOUT replacement, so each pseudoreplicate is a subset of the original characters
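A minimal sketch of the resampling difference. The fraction of characters kept (`keep_fraction=0.5`) is an assumption chosen for illustration, and the function name is hypothetical.

```python
import random

def jackknife_pseudoreplicate(matrix, rng, keep_fraction=0.5):
    """Keep a random subset of characters (columns); no column is drawn twice."""
    n_chars = len(matrix[0])
    n_keep = max(1, int(n_chars * keep_fraction))
    picks = sorted(rng.sample(range(n_chars), n_keep))  # sampling WITHOUT replacement
    return [[row[i] for i in picks] for row in matrix]

# One toy taxon with 8 distinct column labels, so repeats would be visible.
matrix = [list("ABCDEFGH")]
rep = jackknife_pseudoreplicate(matrix, random.Random(2))
print("".join(rep[0]))
```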
parametric bootstrapping
- generates new data sets by simulating them with an evolutionary model
- used mostly in ML analyses to test specific hypotheses, not clade support (e.g., a controversial placement of sister clades)
- a random sequence conforming to the model's assumptions is placed at the base of the tree and then allowed to evolve along the branches -> repeat for all positions in the sequence and all branches in the tree
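The simulation step can be sketched as follows. The Jukes-Cantor model is an assumption chosen because it is the simplest nucleotide model; the lecture does not name one. The tree format, a `(name, branch_length, children)` tuple with branch lengths in expected substitutions per site, and the function names are likewise hypothetical.

```python
import math
import random

BASES = "ACGT"

def evolve_base(base, branch_length, rng):
    """Jukes-Cantor: chance the base differs at the end of the branch."""
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    if rng.random() < p_change:
        return rng.choice([b for b in BASES if b != base])
    return base

def simulate(node, parent_seq, rng, out):
    """Evolve every site along this branch, then recurse into the children."""
    name, branch_length, children = node
    seq = [evolve_base(b, branch_length, rng) for b in parent_seq]
    if not children:
        out[name] = "".join(seq)  # tips become rows of the simulated data set
    for child in children:
        simulate(child, seq, rng, out)
    return out

def simulate_dataset(tree, n_sites, rng):
    """Place a random sequence at the base of the tree and let it evolve."""
    root_seq = [rng.choice(BASES) for _ in range(n_sites)]
    return simulate(tree, root_seq, rng, {})

# Toy tree ((A,B),C) with made-up branch lengths.
tree = ("root", 0.0, [
    ("AB", 0.1, [("A", 0.05, []), ("B", 0.05, [])]),
    ("C", 0.2, []),
])
data = simulate_dataset(tree, 20, random.Random(3))
for taxon, seq in sorted(data.items()):
    print(taxon, seq)
```

Repeating `simulate_dataset` many times gives the set of simulated data sets against which a test statistic from the real data can be compared.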
Bayesian posterior distribution
- Bayesian analysis does not give a point estimate
- the distribution is a sample of trees ranked by the probability that each is the true tree
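A minimal sketch of ranking trees in a posterior sample. It assumes each sampled topology has already been written as a canonical string, so identical topologies compare equal; the sample itself is made up and `rank_topologies` is a hypothetical helper.

```python
from collections import Counter

def rank_topologies(sampled_trees):
    """Rank topologies by how often they occur in the posterior sample."""
    counts = Counter(sampled_trees)
    n = len(sampled_trees)
    return [(topology, c / n) for topology, c in counts.most_common()]

# Toy posterior sample of five trees for taxa A, B, C.
sample = ["((A,B),C)", "((A,C),B)", "((A,B),C)", "((B,C),A)", "((A,B),C)"]
ranked = rank_topologies(sample)
for topology, prob in ranked:
    print(topology, prob)
```

The frequency of a topology in the sample serves as the estimate of its posterior probability.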
Bayesian posterior probability (BPP)
- read from a majority-rule consensus of the sampled topologies
- probability that the tree (or clade) is correct, given the data and assuming the model is correct
- clade-credibility values (0.0-1.0)
- can be sensitive to model misspecification, so use the most complex model; faster to obtain than bootstrap values