phylogenetics Flashcards
1
Q
phylogenetics
A
- aims to infer ancestral relationships among sets of species
- requires some representation of uncertainty
- statistical problem
- mathematical and computational techniques involved
- trees
2
Q
phylogenetic trees
A
- align similar sequences first
- rooted or unrooted
- rooted shows a basal ancestral root
- each node represents most recent common ancestor
- direction of ancestral relationship implied
- unrooted shows similarity without assuming ancestry
- for n ≥ 3 species, there are always fewer unrooted trees than rooted trees
- number of possible trees grows rapidly as species number increases
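The growth in tree numbers can be made concrete with the standard counting formulas for fully resolved (binary) trees on n labelled leaves: (2n-5)!! unrooted and (2n-3)!! rooted. The sketch below assumes binary trees only; the function names are illustrative and not from the cards.

```python
def double_factorial(k: int) -> int:
    """Product k * (k-2) * (k-4) * ... down to 1 (taken as 1 when k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def unrooted_trees(n: int) -> int:
    """Unrooted binary trees on n >= 3 labelled leaves: (2n-5)!!"""
    return double_factorial(2 * n - 5)


def rooted_trees(n: int) -> int:
    """Rooted binary trees on n >= 2 labelled leaves: (2n-3)!!"""
    return double_factorial(2 * n - 3)


for n in (3, 4, 5, 10):
    print(n, unrooted_trees(n), rooted_trees(n))
```

For 10 species this already gives 2,027,025 unrooted and 34,459,425 rooted trees, which is why exhaustive search quickly becomes impractical.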
3
Q
distance
A
- distance between 2 sequences has to fulfil:
- d(i,j) > 0 when i ≠ j
- d(i,i) = 0
- d(i,j) = d(j,i)
- d(i,j) ≤ d(i,k) + d(k,j)
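As an illustration of the four conditions above, the sketch below checks them by brute force for the Hamming distance (number of mismatched positions) on a few aligned sequences. The sequences and the helper name `is_metric` are made up for the example, and it assumes the items passed in are pairwise distinct.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def is_metric(items, d) -> bool:
    """Brute-force check of the four distance conditions on distinct items."""
    for i, j in product(items, repeat=2):
        if i == j and d(i, j) != 0:      # d(i,i) = 0
            return False
        if i != j and d(i, j) <= 0:      # d(i,j) > 0 when i != j
            return False
        if d(i, j) != d(j, i):           # symmetry
            return False
    for i, j, k in product(items, repeat=3):
        if d(i, j) > d(i, k) + d(k, j):  # triangle inequality
            return False
    return True


seqs = ["ACGT", "ACGA", "TCGA"]
print(is_metric(seqs, hamming))
```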
4
Q
UPGMA
A
- unweighted pair group method with arithmetic mean
- clustering method using arithmetic averages from a distance matrix
- each species assigned its own cluster
- 2 clusters amalgamated at each stage
- combine minimally separated clusters
- creates new node
- continue until 1 cluster remains
- molecular clock tree
- assumes constant evolutionary rate
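The steps above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the input format (a dict of pairwise distances keyed by name pairs), the nested-tuple tree representation, and the example distances are all assumptions made for the example. Node height is taken as half the distance between the merged clusters, which is what the molecular clock assumption implies.

```python
from itertools import combinations


def upgma(names, dist):
    """Return a list of (merged_cluster, node_height) in merge order.

    names: leaf labels; dist: {(a, b): distance} for every pair of leaves.
    Clusters are represented as nested tuples of leaf labels.
    """
    sizes = {name: 1 for name in names}           # cluster -> number of leaves
    d = {frozenset(pair): v for pair, v in dist.items()}
    merges = []
    while len(sizes) > 1:
        # combine the minimally separated pair of clusters
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2
        new = (a, b)
        # distance to every other cluster: size-weighted arithmetic average
        for c in sizes:
            if c != a and c != b:
                d[frozenset((new, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[new] = sizes[a] + sizes[b]
        del sizes[a], sizes[b]
        merges.append((new, height))
    return merges


example = {("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
           ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10}
for cluster, height in upgma(["A", "B", "C", "D"], example):
    print(cluster, height)
```

The example distances are ultrametric (consistent with a constant rate), so the merge heights 1, 3 and 5 reproduce them exactly; with non-clock-like data UPGMA can return the wrong topology.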
5
Q
neighbour joining
A
- agglomerative clustering method
- uses distance matrix
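- connect the 2 leaves that minimise the neighbour-joining criterion (pairwise distance corrected for each leaf's total distance to all other leaves) with a new internal node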
- new node is common ancestral node
- connect new node to central node
- calculate distances and repeat
- recursive algorithm
- difficult when pairs of leaves have identical sequences
- with high numbers of sequences
- need heuristic method
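One joining step can be sketched as follows, using the standard neighbour-joining criterion Q(i,j) = (n-2)·d(i,j) - Σd(i,·) - Σd(j,·) to choose the pair and the usual branch-length formula for the new node. The function name and the five-taxon distance matrix are illustrative assumptions; a full implementation would then shrink the matrix and repeat.

```python
def nj_pick_pair(names, D):
    """Pick the pair to join in one neighbour-joining step.

    D is a symmetric distance matrix (list of lists) with n >= 3 rows.
    Returns (name_i, name_j, Q value, branch length i->new, branch length j->new).
    """
    n = len(names)
    totals = [sum(row) for row in D]              # total distance of each leaf
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * D[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    q, i, j = best
    # branch lengths from the joined leaves to the new internal node
    len_i = D[i][j] / 2 + (totals[i] - totals[j]) / (2 * (n - 2))
    len_j = D[i][j] - len_i
    return names[i], names[j], q, len_i, len_j


names = ["a", "b", "c", "d", "e"]
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_pick_pair(names, D))
```

In this example the pair chosen is (a, b), even though d and e have the smallest raw distance: the criterion also rewards pairs that are far from everything else.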
6
Q
maximum parsimony
A
- if i and j differ by n nucleotides:
- at least n changes must have occurred since they separated
- find tree that involves the minimum number of evolutionary events to have occurred
- strong assumption that this is the best tree
- involves evaluating the evolutionary cost of each tree and searching through all possible trees
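Evaluating the cost of a single tree can be illustrated with Fitch's small-parsimony algorithm, one common way to count the minimum number of changes on a fixed topology. This is a sketch under assumptions: trees are written as nested tuples of leaf names, all substitutions cost 1, and the four sequences are invented for the example.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):                     # leaf: its observed state
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:                                    # children agree: no change
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Sum the per-site minimum number of changes over an alignment."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(length)
    )


seqs = {"A": "AAG", "B": "AAA", "C": "GGA", "D": "GGG"}
candidates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
for tree in candidates:
    print(tree, parsimony_score(tree, seqs))
```

With four leaves there are only three unrooted topologies, so all of them can be scored; for larger data sets the search over trees is the expensive part.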
7
Q
evolutionary models
A
- probability models that specify either:
- probability of a given sequence
- probabilities of given mutations
- Markov models
- used to calculate likelihood of a tree
- e.g. Jukes-Cantor, Kimura
8
Q
Jukes-Cantor
A
- continuous-time Markov process used to define the transition probability matrix
- assumes equal base frequencies (1/4 each); each row of the transition probability matrix sums to 1
- assumes equal mutation rates
- unrealistic
- time-reversible
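A small sketch of the Jukes-Cantor transition probabilities and the corresponding distance correction. It assumes the common parameterisation in which each base changes to each of the other three at rate alpha, giving P(same) = 1/4 + 3/4·e^(-4·alpha·t) and P(different) = 1/4 - 1/4·e^(-4·alpha·t); the function names are illustrative.

```python
import math


def jc69_matrix(alpha: float, t: float):
    """4x4 transition probability matrix P(t) under Jukes-Cantor."""
    e = math.exp(-4 * alpha * t)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def jc69_distance(p: float) -> float:
    """Expected substitutions per site given observed mismatch fraction p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


P = jc69_matrix(alpha=0.1, t=1.0)
print([round(sum(row), 12) for row in P])   # every row sums to 1
print(jc69_distance(0.1))                   # slightly larger than 0.1
```

The matrix is symmetric, which together with the uniform base frequencies is what makes the model time-reversible, and the corrected distance exceeds the raw mismatch fraction because it accounts for multiple hits at the same site.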
9
Q
Kimura model
A
- distinguishes between transition and transversion
- transition: purine to purine or pyrimidine to pyrimidine; transversion: purine to pyrimidine or vice versa
- assumes equal base frequencies
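The transition/transversion distinction can be sketched in code, together with the Kimura two-parameter distance d = -1/2 · ln((1 - 2P - Q) · sqrt(1 - 2Q)), where P and Q are the observed fractions of sites showing a transition and a transversion. The sketch assumes gap-free aligned sequences over A, C, G, T only; the function names and example sequences are illustrative.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(x: str, y: str) -> str:
    """Classify an aligned pair of bases."""
    if x == y:
        return "same"
    if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    kinds = [classify(x, y) for x, y in zip(a, b)]
    p = kinds.count("transition") / len(a)      # fraction of transitions
    q = kinds.count("transversion") / len(a)    # fraction of transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


print(classify("A", "G"), classify("A", "C"))
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))  # one transition in ten sites
```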
10
Q
phylogenetic footprinting
A
- using phylogenetic information to estimate how evolutionary rate varies along a sequence
- slowly evolving elements indicate conserved regulatory regions
- involves comparing a small number of distantly related species
- shadowing is a variation comparing panels of closely related species
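As a very rough illustration of looking for slowly evolving stretches, the sketch below scores each alignment column as fully conserved or not and averages that over a sliding window. This is only a crude proxy: real footprinting methods estimate rates on a phylogeny, whereas this ignores the tree entirely. The alignment, window size and function names are invented for the example.

```python
def column_identity(alignment):
    """1.0 for columns where every sequence has the same base, else 0.0."""
    return [1.0 if len(set(col)) == 1 else 0.0 for col in zip(*alignment)]


def windowed_conservation(alignment, window: int):
    """Fraction of fully conserved columns in each sliding window."""
    ident = column_identity(alignment)
    return [
        sum(ident[i:i + window]) / window
        for i in range(len(ident) - window + 1)
    ]


alignment = ["ACGTACGT",
             "ACGTTCGA",
             "ACGTGCAT"]
print(windowed_conservation(alignment, window=4))
```

Windows with scores near 1 are candidates for conserved elements; in this toy alignment the first four columns stand out.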