Lecture 3 – Molecular phylogenetics Flashcards

1
Q

what is phylogenetics?

A

reconstructing patterns of shared ancestry between organisms, either among or within a species

2
Q

what is taxonomy?

A

describing, naming, and classifying species- the organisation of organisms based on phylogenetic/other info

3
Q

orthologous sequence

A

homologous sequences from different species, separated by a speciation event- used to look at speciation, extinction etc

4
Q

homologous sequences

A

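sequences that share a common ancestor (orthologous and paralogous sequences are both homologous)- homologous sequences from the same species can be used to look at population genetics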

5
Q

paralogous sequences

A

different genes within the same genome that arose by gene duplication- can be used to look at gene duplication, deletion etc

6
Q

advantages of using molecular characters rather than morphological ones

A

-they are more objective and therefore easier to quantify
-available even when morphology is uninformative, e.g. for microorganisms
-cheap and fast
-don’t require specialist training to obtain

7
Q

disadvantage of using molecular characters rather than morphological ones

A

can’t be used to look at extinct species, most of the time- leaves gaps in phylogenies etc

8
Q

different types of SNP

A

transition- purine to purine (A<->G) or pyrimidine to pyrimidine (C<->T)
transversion- purine to pyrimidine or vice versa
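
A minimal sketch of how the two SNP types could be told apart in code. The function name `classify_substitution` is made up for illustration, and only the four DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a: str, b: str) -> str:
    """Classify a change from base a to base b (DNA bases only)."""
    a, b = a.upper(), b.upper()
    for base in (a, b):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if a == b:
        return "none"
    # Both bases in the same chemical class: transition.
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    # One purine and one pyrimidine: transversion.
    return "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("A", "C"))  # transversion
```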

9
Q

what is sequence alignment

A

alignment of multiple genetic sequences based on positional homology- the idea that the residues in each column descend from the same ancestral position, with gaps inserted where indels have occurred

10
Q

how computer programs are useful in sequence alignment

A

there are many possible alignments for any 2 sequences, so algorithms are used to score the candidate alignments and find the one most likely to be correct
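
A rough sketch of how an algorithm can pick between alignments: a Needleman-Wunsch-style dynamic programme that returns the best global alignment score for two sequences. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions, and real alignment programs use more elaborate scoring and heuristics.

```python
def global_alignment_score(s, t, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of s and t under a simple scoring scheme."""
    # prev[j] = best score for aligning the first i-1 letters of s
    # with the first j letters of t.
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            up = prev[j] + gap        # gap in t
            left = curr[j - 1] + gap  # gap in s
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "ACGT"))  # 4
print(global_alignment_score("ACGT", "AGT"))   # 1 (three matches, one gap)
```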

11
Q

things that can complicate alignment

A

long indels, and a lot of genetic diversity (highly divergent sequences)

12
Q

examples of programs used for alignment

A

Clustal and MUSCLE

13
Q

what is p-distance

A

proportion of mismatched sites, very simple measure of genetic difference
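
A minimal sketch of the calculation, assuming the two sequences are already aligned. Skipping gapped sites is one possible convention, not the only one, and `p_distance` is a made-up name.

```python
def p_distance(seq1, seq2):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatched = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue  # assumption: sites with a gap are ignored
        compared += 1
        if a != b:
            mismatched += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return mismatched / compared


# 2 mismatches out of 10 sites
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))  # 0.2
```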

14
Q

what is the multiple hits problem

A

once the observed genetic distance gets high, the actual number of changes is probably higher still- sites where there have been multiple substitutions (including changes back to the original base) are counted once or not at all, so they go undetected

15
Q

things that help solve the multiple hits problem (MHP)

A

using nucleotide substitution models, so you can correct the observed distance to the likely actual distance

16
Q

tools useful in nucleotide substitution models

A

Jukes-Cantor model- the simplest nucleotide substitution model, which assumes equal base frequencies and the same rate for every type of substitution
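
A short sketch of the Jukes-Cantor correction, which turns an observed p-distance p into an estimated number of substitutions per site: d = -(3/4) ln(1 - 4p/3). The function name is made up for illustration.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed p-distance p."""
    # The formula is undefined at p >= 0.75, the value expected
    # for two random (saturated) sequences.
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# The corrected distance is larger than the observed one.
print(round(jukes_cantor_distance(0.3), 4))  # 0.3831
```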

17
Q

amino acid substitution models- useful things

A

JTT (Jones-Taylor-Thornton) matrix- an empirical model that uses the actual frequencies of amino acid substitutions, with rates obtained from a large survey of protein variation

18
Q

assumptions within JTT matrix

A

evolution at each site occurs at the same rate
base (or amino acid) frequencies are the same for all species
evolution at each site is independent- can’t really avoid this one, but sometimes it isn’t true, e.g. if there are secondary nucleic acid structures

19
Q

how can among-site variation be accounted for?

A

gamma distribution model- site rates are drawn from a gamma distribution, which models the heterogeneity in site evolution fairly accurately and gives a more accurate estimate of the amount of change- genetic distances tend to be higher using these models
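
A sketch showing why distances come out higher with a gamma model, assuming the underlying substitution model is Jukes-Cantor. With gamma shape parameter alpha, the corrected distance is d = (3/4) alpha [(1 - 4p/3)^(-1/alpha) - 1]. The function name and the alpha values are illustrative only.

```python
def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates among sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1 - 4 * p / 3) ** (-1 / alpha) - 1)


# A small alpha means strong rate variation among sites and a larger correction.
print(round(jc_gamma_distance(0.3, 0.5), 4))  # 0.6667
# A very large alpha means almost no rate variation, so the result
# approaches the plain Jukes-Cantor distance (about 0.3831 for p = 0.3).
print(round(jc_gamma_distance(0.3, 1e6), 4))  # 0.3831
```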

20
Q

what are bootstrap values on a phylogenetic tree

A

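measure of phylogenetic uncertainty- the percentage of trees built from resampled versions of the alignment (sites drawn with replacement) that recover a given branch; higher values mean stronger support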
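
A sketch of the resampling step behind bootstrap values: alignment columns are drawn with replacement to make a pseudo-replicate of the same length. Rebuilding a tree from each replicate and counting how often each branch reappears is left out here. The function name and the toy alignment are assumptions.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: columns sampled with replacement."""
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    # Pick n_sites column indices at random; repeats are allowed.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column choice to every sequence so sites stay aligned.
    return ["".join(seq[c] for c in columns) for seq in alignment]


toy_alignment = ["ACGT", "ACGA", "TCGA"]
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
print(replicate)
```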

21
Q

rooted vs unrooted tree

A

rooted has an evolutionary direction, and only horizontal lines represent genetic distance
unrooted tree- no direction, and all lines represent genetic distance

22
Q

algorithmic methods- how it works, example

A

pairwise genetic distances are ‘clustered’ step by step to build a single tree- e.g. neighbour-joining
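
A sketch of the first clustering step in neighbour-joining. For n taxa, each pair (i, j) gets Q(i, j) = (n - 2) d(i, j) - sum of row i - sum of row j, and the pair with the lowest Q is joined first. The distance matrix is made-up illustrative data, and the later steps (computing branch lengths, updating the matrix, repeating) are omitted.

```python
def nj_first_pair(dist):
    """Return the pair of taxa joined first by neighbour-joining, with its Q value."""
    n = len(dist)
    if n < 3:
        raise ValueError("need at least 3 taxa")
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q


# Illustrative symmetric distances between five taxa (indices 0..4).
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_first_pair(distances))  # ((0, 1), -50)
```

Note that taxa 3 and 4 have the smallest raw distance (3), but neighbour-joining joins 0 and 1 first because Q also accounts for how far each taxon is from all the others.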

23
Q

optimality methods-how it works, example

A

a score is assigned to each possible tree based on the data, and an optimisation algorithm searches for the best-scoring tree. e.g. maximum parsimony, maximum likelihood, bayesian inference

24
Q

statistical methods- how it works, example

A

a probability is calculated for each possible tree- more of a formal statistical problem. e.g. maximum likelihood, bayesian inference

25
Q

maximum parsimony tree- principle

A

tree which requires the fewest evolutionary changes is the best one, fast but not good for high divergence

26
Q

maximum likelihood tree- principle

A

finds the tree under which the observed sequences are most probable, given a nucleotide substitution model

27
Q

bayesian inference- principle

A

estimates the probability distribution over trees (the posterior), rather than finding a single best tree- similar to max likelihood, but combines the likelihood with prior probabilities

28
Q

what is a parsimony score

A

minimum number of evolutionary changes required to explain observed characters- the scores can be added together on a tree
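
A sketch of one standard way to compute a parsimony score, Fitch's algorithm, on a rooted binary tree written as nested pairs of leaf names. The tree encoding, function names, and example sequences are all assumptions made for illustration.

```python
def fitch_site_score(tree, states):
    """Minimum number of changes at one site (Fitch's algorithm)."""
    def visit(node):
        if isinstance(node, str):  # leaf: its observed state, no changes yet
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:  # children agree on at least one state: no extra change
            return common, left_cost + right_cost
        # children disagree: one change is needed on this part of the tree
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]


def tree_parsimony_score(tree, seqs):
    """Add up the per-site scores over an alignment (dict: name -> sequence)."""
    n_sites = len(next(iter(seqs.values())))
    return sum(
        fitch_site_score(tree, {name: seq[i] for name, seq in seqs.items()})
        for i in range(n_sites)
    )


seqs = {"A": "GA", "B": "GA", "C": "TA", "D": "TC"}
print(tree_parsimony_score((("A", "B"), ("C", "D")), seqs))  # 2
print(tree_parsimony_score((("A", "C"), ("B", "D")), seqs))  # 3
```

The first tree needs fewer changes, so maximum parsimony would prefer it.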

29
Q

what is a ‘hill climbing’ method?

A

searches through trees using trial and error, but doesn’t check through all trees- just ones that may get closer to the optimum
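
A generic sketch of hill climbing on a toy problem rather than on real trees. In a tree search the `neighbours` function would propose small rearrangements of the current tree (for example nearest-neighbour interchange) and `score` would be its likelihood or parsimony score; here both are simple stand-ins chosen for illustration.

```python
def hill_climb(start, score, neighbours):
    """Move to a better neighbour until no neighbour improves the score."""
    current = start
    current_score = score(current)
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(current):
            candidate_score = score(candidate)
            if candidate_score > current_score:
                # Accept the first improvement and search again from there.
                current, current_score = candidate, candidate_score
                improved = True
                break
    return current, current_score


# Toy stand-ins: the "trees" are integers and the best score is at 7.
def toy_score(x):
    return -(x - 7) ** 2


def toy_neighbours(x):
    return [x - 1, x + 1]


print(hill_climb(0, toy_score, toy_neighbours))  # (7, 0)
```

Because only nearby candidates are tried, the search can stop at a local optimum that is not the best tree overall, which is why it is often restarted from several starting points.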