Lecture 7 Flashcards by arghavan kassraie

How do we search the tree space for the maximum likelihood tree?

We need to propose different unrooted trees using NNI, SPR and TBR moves.
We also need to propose different branch lengths, for example by multiplying each branch length by some factor.
We can use hill-climbing strategies, taking small steps in tree space to find the optimum tree.
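
The branch-length part of this search can be illustrated with a minimal sketch. It assumes the simplest possible case, which is not from the lecture: two sequences separated by a single branch under the JC69 model, with the log-likelihood written up to an additive constant. The function names and the shrinking-factor rule are illustrative choices, not a prescribed algorithm.

```python
import math

def jc69_log_likelihood(d, n_sites, n_diff):
    """JC69 log-likelihood (up to a constant) of branch length d,
    given n_diff differing sites out of n_sites."""
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    return n_diff * math.log(p_diff) + (n_sites - n_diff) * math.log(1.0 - p_diff)

def hill_climb_branch_length(n_sites, n_diff, start=0.5, factor=1.5, tol=1e-6):
    """Propose d * factor and d / factor; keep a proposal only if it
    increases the likelihood, otherwise shrink the step factor."""
    d = start
    best = jc69_log_likelihood(d, n_sites, n_diff)
    while factor - 1.0 > tol:
        improved = False
        for candidate in (d * factor, d / factor):
            ll = jc69_log_likelihood(candidate, n_sites, n_diff)
            if ll > best:
                d, best, improved = candidate, ll, True
                break
        if not improved:
            factor = math.sqrt(factor)  # smaller steps near the optimum
    return d, best

# Made-up data: 10 differences in 100 aligned sites.
d_hat, ll_hat = hill_climb_branch_length(n_sites=100, n_diff=10)
print(round(d_hat, 4))
```

For this two-sequence case the optimum can be checked against the closed-form JC69 distance, -3/4 * ln(1 - 4p/3), where p is the proportion of differing sites.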

How does the NNI move work? draw

Each internal branch of the tree connects subtrees that are nearest neighbours. Interchanging a subtree on one side of the branch with a subtree on the other side is an NNI (nearest-neighbour interchange). Such a rearrangement is possible for each internal branch.
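
As a small sketch of the interchange described above, assume an internal branch with subtrees a and b on one side and c and d on the other. The tuple representation and the function name are hypothetical, chosen only for illustration.

```python
def nni_neighbours(a, b, c, d):
    """Internal branch separates (a, b) from (c, d).
    Swapping b with c, or b with d, gives the two NNI rearrangements."""
    return [((a, c), (b, d)), ((a, d), (b, c))]

# Subtrees can be leaf names or nested tuples standing for larger subtrees.
for tree in nni_neighbours("A", "B", "C", "D"):
    print(tree)
```

Each internal branch gives two alternative arrangements, so an unrooted binary tree with n leaves (n - 3 internal branches) has 2(n - 3) NNI neighbours.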

How does SPR work? draw

It works by branch swapping through subtree pruning and regrafting: a subtree is pruned and then reattached at a different location on the tree. The steps are repeated until the optimal tree is reached.

How does TBR work? draw

It works by branch swapping through tree bisection and reconnection. The tree is broken into two subtrees by cutting an internal branch. Two branches, one from each subtree, are then chosen and rejoined to form a new tree.

Which methods can we use for testing evolutionary models?

likelihood ratio test and AIC

How can we assess confidence in the inferred parameters (phylogeny and substitution rates)?

likelihood ratios and bootstrapping

Does bootstrapping require ML?

How do we formulate our null hypothesis in model testing?

We ask if we can reject model H0 in favour of model H1

How do we formulate the likelihood ratio?

We assume the data evolved under a model H0, and consider a model H1 within which H0 is nested. Under mild conditions: refer to slide 18.

How does the likelihood ratio test work?

We consider two models: H1, the general model, and H0, the null model. We then compute 2 * (log L(theta_1) - log L(theta_0)), where theta_1 and theta_0 are the parameter estimates under H1 and H0.
If this value lies in the upper alpha tail of the chi-squared distribution, we can reject the null model.
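
A minimal sketch of this test, using only the standard library. It assumes the degrees of freedom equal the difference in the number of free parameters between H1 and H0; the log-likelihood values in the example are made up, and `chi2_sf` is a helper written here, not a library function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-squared distribution
    with integer degrees of freedom df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 0.5   # exact for df = 1
    else:
        q, k = math.exp(-half), 1.0              # exact for df = 2
    # Recurrence: Q(df + 2) = Q(df) + half^(df/2) * e^(-half) / Gamma(df/2 + 1)
    while 2 * k < df:
        q += half ** k * math.exp(-half) / math.gamma(k + 1.0)
        k += 1.0
    return q

def likelihood_ratio_test(loglik_null, loglik_general, df, alpha=0.05):
    """Return the statistic 2 * (l1 - l0), its p-value, and whether H0 is rejected."""
    stat = 2.0 * (loglik_general - loglik_null)
    p_value = chi2_sf(stat, df)
    return stat, p_value, p_value < alpha

# Hypothetical maximised log-likelihoods; H1 has one extra free parameter.
stat, p_value, reject = likelihood_ratio_test(-1005.0, -1000.0, df=1)
print(stat, p_value, reject)
```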

If the null model is the true model, in what proportion of the tests would we falsely reject it?

A proportion alpha of the tests

If the null model is the false model, we expect the null model to have --- likelihood than the true model, and we could --- the null model only in a very --- proportion of the tests

much lower, accept, low

What specific fit are we assessing here?

We assess the fit of H0 relative to H1. Even though H1 may be a very bad model, we may reject H0 in favour of H1, since H0 is even worse.

When comparing nested models, the simple model is obtained by —- the parameters of the general model.

restricting

Could the simple model have free parameters here?

yes, the simple model may include free parameters.

What are the type I and type II errors?

Type I: the simple model is true but we reject it. Type II: the simple model is false but we accept it.

Accuracy =

1 - type I error

Type I error is the —–, and is controlled by setting —-.

significance, alpha

Power =

1 - type II error

How do we generally assess the power?

By simulating under the general model and counting the number of times that H0 is accepted; the power is one minus the proportion of acceptances.

If the null model is false and is rejected in all experiments, the power is estimated to be what?

The power --- with an increasing difference between the true model and the null model.

increases

AIC is for what type of models?

non-nested

AIC = ?

AIC_i = -2 log L_i(theta) + 2 p_i, where p_i is the number of parameters and L_i is the likelihood function of model i

How is AIC used? What is the rationale behind it?

We calculate the AIC for each model, then choose the model with the lowest AIC. Rationale: AIC essentially picks the model with the smallest expected Kullback-Leibler distance to the true model.

Models having AIC within 1-2 of the minimum:

substantial support; should receive consideration in inference

Models having AIC within 4-7 of the minimum:

considerably less support

Models having AIC>10 above the minimum:

essentially no support
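
The AIC cards above can be sketched in a few lines. The log-likelihood values are made up for illustration, and the parameter counts assume that only substitution-model parameters are counted because all models share the same tree (adding the same number of branch-length parameters to every model would not change the differences).

```python
def aic(log_likelihood, n_params):
    """AIC = -2 log L + 2 p."""
    return -2.0 * log_likelihood + 2.0 * n_params

def rank_by_aic(models):
    """models maps a name to (maximised log-likelihood, number of free parameters).
    Returns (name, AIC, difference from the minimum AIC), best model first."""
    scores = {name: aic(ll, p) for name, (ll, p) in models.items()}
    best = min(scores.values())
    return sorted(((name, s, s - best) for name, s in scores.items()),
                  key=lambda item: item[1])

# Hypothetical fits of three substitution models on the same tree.
ranked = rank_by_aic({
    "JC": (-1050.0, 0),
    "K80": (-1020.0, 1),
    "GTR": (-1018.5, 8),
})
for name, score, delta in ranked:
    print(name, score, delta)
```

The third column is the difference from the minimum AIC, which is the quantity the support thresholds above refer to.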

Slide 21**

How many branches can a rooted tree with 12 species have?

If we want to test JC against GTR, can we do a likelihood ratio test?

Yes, if we perform the test on the same tree with fixed branch lengths. No, if we perform it on different trees: each tree is a different parameter, so the models are no longer nested and we need to use AIC instead.

How do we determine the confidence interval for the evolutionary parameters on a fixed tree?

We evaluate the log-likelihood function at the parameter estimate, l(Theta; x), and subtract half the chi-squared value to get the threshold l(Theta; x) - 0.5 * chi-squared. The confidence interval consists of the parameter values whose log-likelihood is at least this threshold. See slide 51.
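
A minimal sketch of this interval, under assumptions that are not from the lecture: a single parameter (the JC69 distance between two sequences), a grid search, and 3.841 as the 95% chi-squared value with one degree of freedom. The data are made up.

```python
import math

def jc69_loglik(d, n_sites, n_diff):
    """JC69 log-likelihood (up to a constant) of distance d."""
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    return n_diff * math.log(p_diff) + (n_sites - n_diff) * math.log(1.0 - p_diff)

def likelihood_interval(n_sites, n_diff, chi2_crit=3.841, step=1e-4, d_max=2.0):
    """All d with l(d) >= l(d_hat) - 0.5 * chi2_crit, found on a grid.
    Requires 0 < n_diff / n_sites < 0.75."""
    p_hat = n_diff / n_sites
    d_hat = -0.75 * math.log(1.0 - 4.0 * p_hat / 3.0)   # closed-form JC69 estimate
    threshold = jc69_loglik(d_hat, n_sites, n_diff) - 0.5 * chi2_crit
    grid = (i * step for i in range(1, int(d_max / step) + 1))
    inside = [d for d in grid if jc69_loglik(d, n_sites, n_diff) >= threshold]
    return d_hat, min(inside), max(inside)

# Made-up data: 10 differences in 100 aligned sites.
d_hat, lower, upper = likelihood_interval(100, 10)
print(d_hat, lower, upper)
```

The interval is not symmetric around the estimate, which is expected because the log-likelihood curve is not symmetric in d.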

Calculating the confidence interval for complex objects such as tree topologies isn't possible using the previous method. So what can we do? What is the limitation of this?

We can do more experiments, ignore the smallest and largest 2.5% of the outcomes, and take the minimum and maximum of the rest. However, for many questions we can't repeat the experiment; for instance, we can't repeat plant speciation.

So then what else can we do to find the CI?

We can mimic more experiments by bootstrapping, i.e. artificially creating new datasets.

How does bootstrapping work?

We run tests that rely on random sampling with replacement. If we have enough data initially, the resampled datasets should give the same results as the original data.

How do we use bootstrapping for phylogenies based on a sequence alignment of length m?

We sample m sites at random with replacement, infer a phylogeny from the new data, and repeat the procedure many times.
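
The resampling step can be sketched as follows, assuming the alignment is stored as a list of equal-length strings (one per taxon). The function name and the toy alignment are illustrative only.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Draw m column indices with replacement (m = alignment length)
    and build a new alignment from those columns."""
    m = len(alignment[0])
    columns = [rng.randrange(m) for _ in range(m)]
    return ["".join(seq[c] for c in columns) for seq in alignment]

alignment = ["ACGTAC", "ACGAAC", "TCGAAT"]   # toy data
rng = random.Random(0)                       # seeded for reproducibility
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

Each replicate would then be passed to the same tree-inference method as the original alignment, and the support for a grouping is the proportion of replicate trees that contain it.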

Explain each step of maximum likelihood inference.

1- Infer a maximum likelihood tree:
   - employ Felsenstein's pruning algorithm for each tree and set of branch lengths
   - choose the tree and branch lengths that maximise the likelihood
   - do this for each substitution model and calculate its AIC
2- Determine the model and tree with the highest support using AIC
3- Determine the confidence interval for the substitution model parameters based on the likelihood ratios
4- Determine the confidence in the maximum likelihood tree using the bootstrap

bootstrap is used to determine the confidence in ---, and likelihood ratio is used to determine the confidence interval for the ----.

maximum likelihood tree, substitution model

Is there a way to test how to best root a maximum likelihood tree without employing any extra information?

No. Only the unrooted form has a meaning in a likelihood context.

Can you use the bootstrapping idea for assessing confidence in UPGMA?

Yes, if we have a very large sequence alignment, since we expect to see some variation between resampled datasets. We can usually use it with different phylogeny methods.

What is required to infer a direction of transmission from a phylogeny?

We need to include more sequences to get a more detailed tree. If the sequences from one source are contained within those of the other, we could tell the direction of transmission, although it is difficult to say so with 100% certainty.
