Lecture 12 - Phylogenetic Models Flashcards

1
Q

is the mutation rate constant across all branches of a tree

A

no

2
Q

are multiple changes seen at individual sites

A

yes

3
Q

what are possible factors in models of molecular evolution

A
  • different substitution preferences
  • different rates at different sequence positions
  • different rates on different branches of the tree
4
Q

what are some measures of evolutionary distance

A
  • fractional alignment/p-distance
  • poisson distance
5
Q

what is the calculation of the fractional alignment difference (p-distance)

A

p = D/L
- D is the number of sites that differ between the two aligned sequences
- L is the number of aligned sites compared (the alignment length)
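
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `p_distance` is made up for this example, and it assumes the two sequences are already aligned to the same length and that gap characters are treated like any other symbol.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which two sequences differ (p = D / L)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # D: number of sites where the two sequences differ.
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    # L: number of aligned sites.
    return differences / len(seq_a)


# One differing site out of eight aligned sites.
print(p_distance("ACGTACGT", "ACGTACGA"))
```

Because p only counts sites that look different today, it cannot see a site that changed more than once, which is why corrected distances are introduced next.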

6
Q

what does Poisson distance account for

A

multiple substitutions at individual sites

7
Q

what is the probability that at least one of two aligned positions has changed [Poisson distance]

A

p = 1 - e^{-2rt}
- r is the substitution rate per site
- t is the time since the two sequences diverged

8
Q

what is the calculation for the Poisson distance (d_p)

A

d_p = -ln(1 - p)
- p is the p-distance (observed fraction of differing sites)
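
Rearranging p = 1 - e^{-2rt} gives 2rt = -ln(1 - p), so the Poisson distance is an estimate of the expected number of substitutions per site separating the two sequences. A minimal Python sketch follows; the function name `poisson_distance` is an assumption made for this example.

```python
import math


def poisson_distance(p):
    """Poisson-corrected distance d_p = -ln(1 - p) for a p-distance p."""
    if not 0.0 <= p < 1.0:
        # At p = 1 the logarithm is undefined (the correction diverges).
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


# The correction is small for similar sequences and grows as p increases.
for p in (0.05, 0.25, 0.50):
    print(p, round(poisson_distance(p), 4))
```

The corrected distance is always at least as large as p, reflecting substitutions hidden by multiple changes at the same site.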

9
Q

what is the goal of nucleotide models

A

to effectively represent nucleotide changes within a set of sequences

10
Q

what are the assumptions of the Jukes-Cantor Model

A
  • all sites are independent
  • rates of evolution are the same at all sites
  • all substitutions are equally likely, and occur at rate α
11
Q

what is the chance of a nucleotide not changing in the Jukes-Cantor model

A

1 - 3α
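
The value 1 - 3α follows because each nucleotide can change to any of the three others, each at rate α. One way to picture this is as a one-step substitution probability matrix. The sketch below is illustrative only: the name `jukes_cantor_matrix` and the nucleotide order A, C, G, T are assumptions for this example, and α is treated as a per-step probability.

```python
NUCLEOTIDES = "ACGT"


def jukes_cantor_matrix(alpha):
    """One-step Jukes-Cantor matrix: alpha off the diagonal, 1 - 3*alpha on it."""
    if not 0.0 <= alpha <= 1.0 / 3.0:
        raise ValueError("alpha must lie between 0 and 1/3")
    matrix = []
    for i in range(len(NUCLEOTIDES)):
        row = []
        for j in range(len(NUCLEOTIDES)):
            # Staying the same has probability 1 - 3*alpha; every change has alpha.
            row.append(1.0 - 3.0 * alpha if i == j else alpha)
        matrix.append(row)
    return matrix


for nucleotide, row in zip(NUCLEOTIDES, jukes_cantor_matrix(0.1)):
    print(nucleotide, row)
```

Every row sums to 1, since a nucleotide either stays the same or becomes one of the three alternatives.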

12
Q

what does the Kimura Two-Parameter (K2P) model account for

A

different rates for transitions (α) and transversions (β)
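
A transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T); a transversion swaps a purine for a pyrimidine or the reverse. Each nucleotide therefore has one possible transition and two possible transversions, so the chance of no change in one step is 1 - α - 2β. The Python sketch below illustrates this; the names `substitution_type` and `k2p_matrix` and the A, C, G, T ordering are assumptions for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES
NUCLEOTIDES = "ACGT"


def substitution_type(a, b):
    """Classify a change from nucleotide a to b as none/transition/transversion."""
    a, b = a.upper(), b.upper()
    if a not in VALID or b not in VALID:
        raise ValueError("expected nucleotides A, C, G or T")
    if a == b:
        return "none"
    # Same chemical class (both purines or both pyrimidines) means transition.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def k2p_matrix(alpha, beta):
    """One-step K2P matrix: alpha for transitions, beta for transversions."""
    if alpha < 0 or beta < 0 or alpha + 2 * beta > 1:
        raise ValueError("need alpha, beta >= 0 and alpha + 2*beta <= 1")
    rates = {"none": 1.0 - alpha - 2.0 * beta,
             "transition": alpha,
             "transversion": beta}
    return [[rates[substitution_type(x, y)] for y in NUCLEOTIDES]
            for x in NUCLEOTIDES]


print(substitution_type("A", "G"))
print(substitution_type("A", "C"))
```

Setting α equal to β recovers the Jukes-Cantor case, in which all substitutions are equally likely.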

13
Q

which occur at a lower rate: transitions or transversions

A

transversions

14
Q

what does the HKY85 model account for

A

unequal nucleotide composition (base frequencies), in addition to different rates for transitions and transversions

15
Q

what does the Generalized Time-Reversible (GTR) model account for

A

unequal nucleotide composition and a separate rate for each of the six reversible substitution types (both transitions and all four transversions)

16
Q

what can be used to correct for different rates at different positions

A

the Gamma distribution
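
In practice each site is given a relative rate drawn from a Gamma distribution, so some sites evolve slowly and others quickly. The sketch below is only an illustration of that idea: the function name `sample_site_rates` is made up, and it assumes the common convention of fixing the mean rate at 1 by setting the scale to 1/shape, which the card itself does not specify.

```python
import random


def sample_site_rates(n_sites, shape, seed=None):
    """Draw one relative rate per site from a Gamma distribution with mean 1."""
    if n_sites < 0 or shape <= 0:
        raise ValueError("need n_sites >= 0 and shape > 0")
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta) has mean alpha * beta,
    # so scale = 1 / shape keeps the average rate at 1 (an assumption here).
    return [rng.gammavariate(shape, 1.0 / shape) for _ in range(n_sites)]


rates = sample_site_rates(10, shape=0.5, seed=42)
print([round(r, 3) for r in rates])
```

A small shape value gives strong rate variation (many near-invariant sites and a few fast ones), while a large shape value makes all sites evolve at nearly the same rate.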

17
Q

how are protein models commonly derived

A

using empirically derived substitution matrices

18
Q

how does parameter number affect choosing a nucleotide model

A

too few -> an inaccurate model that can converge on the wrong tree
too many -> reduced statistical power (the ability to reject a hypothesis)

19
Q

what is overfitting

A

fitting a model with so many parameters that it captures the natural statistical variation (noise) in the data rather than the underlying signal

20
Q

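what are ModelTest and ProtTest

A

programs that compare candidate substitution models and select the best-fitting one (ModelTest for nucleotide data, ProtTest for protein data)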

21
Q

what are the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) used for

A

measures of model quality used to compare candidate models; both reward a good fit and penalize extra parameters

22
Q

how does AIC/BIC inform which model to choose

A

the model with the lowest AIC/BIC is selected
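
The standard definitions are AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of free parameters, ln L is the maximized log-likelihood, and n is the sample size. The sketch below shows the selection rule on made-up numbers: the log-likelihood values are hypothetical and not taken from any real analysis, the parameter counts cover only the substitution model (branch lengths are ignored), and using the number of alignment sites as n is an assumption.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def best_model(scores):
    """Return the model name with the lowest score."""
    return min(scores, key=scores.get)


# Hypothetical (log-likelihood, free parameters) pairs, for illustration only.
candidates = {
    "JC": (-2050.0, 0),
    "K2P": (-2010.0, 1),
    "HKY85": (-2005.0, 4),
    "GTR": (-2003.0, 8),
}
n_sites = 1000  # assumed alignment length

aic_scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
bic_scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}

print("AIC picks", best_model(aic_scores))
print("BIC picks", best_model(bic_scores))
```

With these illustrative numbers the two criteria disagree, because BIC penalizes each extra parameter more heavily than AIC once n is moderately large.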