Lecture 5 Flashcards

1
Q

What is a tree in context of phylogeny?

A

It's a connected graph consisting of nodes and branches, without loops (cycles)

2
Q

What is an unrooted phylogenetic tree? Draw one.

A

a tree with two types of nodes:
* tip/leaf: node with 1 branch attached
* internal node: node with 3 branches attached

3
Q

What is a rooted phylogenetic tree? Draw one.

A

A tree in which one branch is subdivided by a new node (the root)

4
Q

Can unrooted trees be rooted? If yes, how?

A

Yes, with an outgroup: an organism chosen to be very distantly related to the remaining organisms in the tree. The branch ending in the outgroup is subdivided by the root node.

5
Q

Each branch may have a length of ?

A

≥ 0

6
Q

What is a pendant branch?

A

branch attached to a tip

7
Q

What is a cherry?

A

A pair of tips separated by only one internal node

8
Q

what is a caterpillar tree?

A

a tree with only one cherry
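
The cherry and caterpillar definitions can be checked on small examples. A minimal sketch, assuming a rooted binary tree written as nested Python 2-tuples with tips as strings (this encoding and the function name are illustrative, not from the lecture):

```python
def count_cherries(tree):
    """Count cherries in a rooted binary tree given as nested 2-tuples; tips are strings."""
    if isinstance(tree, str):
        return 0                       # a single tip contains no cherry
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return 1                       # two tips joined by one internal node
    return count_cherries(left) + count_cherries(right)


caterpillar = ((("A", "B"), "C"), "D")   # tips are added one at a time
balanced = (("A", "B"), ("C", "D"))

print(count_cherries(caterpillar))  # 1
print(count_cherries(balanced))     # 2
```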

9
Q

what is a monophyletic group or clade?

A

It's a group consisting of all descendants of a common ancestor

10
Q

What is an ultrametric tree? Show on a tree.

A

A rooted tree in which the sum of branch lengths from the root to any tip is the same
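
The ultrametric property can be tested by summing branch lengths from the root to each tip. A minimal sketch, assuming a node is either a tip name or a list of (child, branch_length) pairs (an illustrative encoding, not from the lecture):

```python
def tip_depths(node, depth=0.0):
    """Return {tip: summed branch length from the root to that tip}."""
    if isinstance(node, str):
        return {node: depth}
    depths = {}
    for child, length in node:
        depths.update(tip_depths(child, depth + length))
    return depths


def is_ultrametric(root, tol=1e-9):
    """True if all root-to-tip distances agree within a small tolerance."""
    values = list(tip_depths(root).values())
    return max(values) - min(values) <= tol


clock_like = [([("B", 1.0), ("C", 1.0)], 1.0), ("A", 2.0)]
not_clock_like = [([("B", 1.0), ("C", 3.0)], 1.0), ("A", 2.0)]

print(is_ultrametric(clock_like))      # True: every tip is at distance 2
print(is_ultrametric(not_clock_like))  # False: C is at distance 4
```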

11
Q

What is a polytomy?

A

The definition of a phylogenetic tree is extended so that internal nodes may have more than 3 branches attached. Such a node is called a polytomy.

12
Q

What is the string representation of this tree? refer to slide 8

A

(((B:1, C:1):1, A:2):1, D:3)
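
A string like this can be read back into a tree to check it. A minimal sketch of a Newick reader, assuming the balanced form (((B:1, C:1):1, A:2):1, D:3) of the string and handling only names, branch lengths, and parentheses (no quoting or comments); the function names are illustrative. Standard Newick files also end the string with a semicolon, which the reader tolerates.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    s = text.replace(" ", "").rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if s[pos:pos + 1] == "(":
            pos += 1                                  # skip "("
            children.append(node())
            while s[pos:pos + 1] == ",":
                pos += 1                              # skip ","
                children.append(node())
            if s[pos:pos + 1] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1                                  # skip ")"
        start = pos
        while pos < len(s) and s[pos] not in ",():":
            pos += 1
        name = s[start:pos]
        length = 0.0                                  # the root has no branch above it
        if s[pos:pos + 1] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return (name, length, children)

    root = node()
    if pos != len(s):
        raise ValueError("unbalanced parentheses or trailing text")
    return root


def tip_depths(node, depth=0.0):
    """Return {tip: summed branch length from the root to that tip}."""
    name, length, children = node
    depth += length
    if not children:
        return {name: depth}
    out = {}
    for child in children:
        out.update(tip_depths(child, depth))
    return out


tree = parse_newick("(((B:1, C:1):1, A:2):1, D:3)")
print(tip_depths(tree))  # {'B': 3.0, 'C': 3.0, 'A': 3.0, 'D': 3.0}
```

All four tips sit at distance 3 from the root, so this tree is ultrametric.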

13
Q

What is the string representation of this tree? refer to slide 9

A

((A:2, D:4):1, B:1, C:1)

14
Q

Can a tree have multiple Newick representations?

A

Yes, but they are all equivalent (they describe the same tree)

15
Q

Only ? contributes to branch lengths

A

Vertical distances along the evolutionary time axis

16
Q

In Charles Darwin’s representation what are the tips and what are the branches?

A

tips are the species, and the branches are the ancestry

17
Q

In a phylogeny of species of simians, what are the branching events and branch lengths?

A

Branching events are speciation events
Branch lengths are the times between speciation events

18
Q

In a pathogen phylogeny of HIV epidemic, what are the tips, branching events and branch lengths?

A
  • tips: different infected hosts
  • branching events: transmission events
  • branch lengths: time between transmission events
19
Q

What was the data used for measuring similarity between species previously and currently?

A

previously: morphology
currently: typically sequencing data for species or pathogens or B-cells, etc

20
Q

Name 3 ways of defining similarity between species

A

Phenetic, cladistic, mechanistic

21
Q

What is phenetic based on?

A
  • it's based on overall similarity
  • pairwise distance based
22
Q

What methods does the phenetic approach use?

A

UPGMA, least-squares algorithm

23
Q

What is cladistic based on?

A

shared characteristics
character based

24
Q

What methods does the cladistic approach use?

25
Q

What is mechanistic based on?

A

-evolutionary model
-character-based

26
Q

What methods does the mechanistic approach use?

A

maximum likelihood, Bayesian inference

27
Q

In an alignment each site is a ?

28
Q

How is each alignment obtained?

A

From raw sequencing reads, by arranging the reads such that the number of mutations, insertions and deletions is minimised

29
Q

What is the basic idea of distance based methods?

A

1. Define how to measure the distance between sequences (JC69, etc.)
2. Calculate the distance between all pairs of sequences
3. Find a tree whose distances follow the sequence distance matrix most closely
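
Steps 1 and 2 can be sketched in a few lines. A minimal sketch, assuming aligned sequences of equal length and the standard Jukes-Cantor (JC69) correction d = -3/4 ln(1 - 4p/3), where p is the proportion of differing sites; the function names are illustrative:

```python
import math


def p_distance(a, b):
    """Proportion of sites at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jc69_distance(a, b):
    """JC69 distance: expected substitutions per site, d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf            # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(seqs, distance=jc69_distance):
    """Step 2: distances between all pairs of sequences, as a nested list."""
    return [[distance(a, b) for b in seqs] for a in seqs]


print(p_distance("ACGT", "ACGA"))  # 0.25
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site.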

30
Q

What are two strategies of distance based methods?

A

algorithmic and optimality

31
Q

How does the algorithmic method work?

A

It's a sequence of steps in which the smallest distances are iteratively clustered into a tree

32
Q

How does the optimality approach work?

A

Using a cost function, it minimises the difference between the sequence distance matrix and the inferred tree distances

33
Q

UPGMA assumes evolution according to what?

A

A strict molecular clock, in which the rate of DNA/RNA/protein sequence evolution is constant over time

34
Q

What is the output and input of the UPGMA?

A

input is the distance matrix, output is the ultrametric phylogenetic tree

35
Q

Find the UPGMA tree for the sequences below, based on the Hamming distance matrix:
s1: TCACACCT
s2: ACAGACTT
s3: AAAGACTT
s4: ACACACCC
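
One way to work through this exercise is to run UPGMA directly. This is a minimal sketch, not the lecture's worked answer: it assumes average-linkage clustering with node heights set to half the cluster distance, and all function names are illustrative.

```python
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def upgma(names, dist):
    """dist maps frozenset({a, b}) to the distance between tips a and b.
    Returns (newick_string, root_height)."""
    # Each cluster is keyed by its Newick text and stores (text, height, size).
    clusters = {name: (name, 0.0, 1) for name in names}
    d = dict(dist)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        (ta, ha, na), (tb, hb, nb) = clusters.pop(a), clusters.pop(b)
        height = d[frozenset((a, b))] / 2
        new = f"({ta}:{height - ha:g},{tb}:{height - hb:g})"
        # Average-linkage update, weighted by cluster sizes.
        for c in clusters:
            d[frozenset((new, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[new] = (new, height, na + nb)
    (text, height, _), = clusters.values()
    return text + ";", height


seqs = {"s1": "TCACACCT", "s2": "ACAGACTT", "s3": "AAAGACTT", "s4": "ACACACCC"}
dist = {frozenset((a, b)): hamming(seqs[a], seqs[b]) for a, b in combinations(seqs, 2)}
tree, root_height = upgma(list(seqs), dist)
print(tree)         # ((s2:0.5,s3:0.5):1.25,(s1:1,s4:1):0.75);
print(root_height)  # 1.75
```

The Hamming distances are d(s1,s2)=3, d(s1,s3)=4, d(s1,s4)=2, d(s2,s3)=1, d(s2,s4)=3, d(s3,s4)=4, so s2 and s3 are joined first, then s1 and s4, and the two cherries are joined at average distance 3.5.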

36
Q

How does the least-squares method work?

A

It defines a cost function, the sum of squared differences between the distance matrix and the tree distance matrix for a proposed tree, and minimises it

37
Q

What is the runtime of UPGMA? How did you find that?

A

O(n^3) for n sequences.
n: number of pruning steps (each replaces a cherry with a new node)
n^2: work per step to search and update the distance matrix
therefore n × n^2 = n^3 in total

38
Q

How many trees can n = 1, 2, 3 tips make?

A

n = 1 and n = 2: 1 tree each; n = 3: 3 (rooted) trees

39
Q

What do we need to work out to find the runtime of least-squares methods?

A

We need to optimise the cost function, so we need to visit each tree in the space of trees. We therefore need to find how many trees on n tips exist: the number of rooted and unrooted trees on n tips.
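
The size of tree space can be computed with double factorials. A minimal sketch, assuming binary trees with labelled tips and the standard counts (2n-3)!! for rooted and (2n-5)!! for unrooted trees; these formulas are a well-known result rather than something stated on the cards, and the function names are illustrative.

```python
def double_factorial(k):
    """k!! = k * (k-2) * (k-4) * ... down to 1; taken as 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_rooted_trees(n):
    """Rooted binary trees on n labelled tips: (2n-3)!!."""
    return double_factorial(2 * n - 3)


def num_unrooted_trees(n):
    """Unrooted binary trees on n labelled tips: (2n-5)!!."""
    return double_factorial(2 * n - 5)


for n in (3, 4, 10):
    print(n, num_rooted_trees(n), num_unrooted_trees(n))
# 3 3 1
# 4 15 3
# 10 34459425 2027025
```

The count grows faster than exponentially in n, which is why visiting every tree quickly becomes infeasible.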

40
Q

if we have n tips, how many branches do we have?

41
Q

How many unrooted trees with n tips exist?

42
Q

How many rooted trees on n tips exist?

43
Q

The least-squares decision problem is an ___ problem, thus there is no ___-time algorithm unless ___, so we have to check ___ trees with ___.

A

NP-complete, polynomial, P=NP, all, n tips

44
Q

Is UPGMA consistent? Explain further

A

Yes, it is (under the strict molecular clock it assumes). As the amount of sequence data grows, the distance matrix tends towards the tree distances, therefore we recover the true tree

45
Q

Is the least-squares method consistent? Explain further

A

Yes. The squared difference between the calculated distance matrix and the true tree distances tends towards 0, therefore the true tree is a least-squares tree.

46
Q

UPGMA and neighbour-joining algorithms have a running time of:

A

polynomial time

47
Q

Running time of least-squares methods is ?

A

Not polynomial unless P=NP: the least-squares decision problem is NP-complete

48
Q

What are two problems of phenetic approaches?

A
  • they disregard information beyond pairwise distances
  • large distances come with large variances which are typically ignored.
49
Q

What are the minimum and maximum numbers of cherries in a phylogenetic tree with 99 tips?

A

Minimum: 1 cherry (a caterpillar tree). Maximum: 49 cherries, with one tip left over.

50
Q

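In how many ways can you write a Newick string for a rooted tree with species A, B, C? In how many ways can you write it for n species?

A

4 ways for A, B, C (answer in slides); 2^(n-1) ways for n species, since the two children of each of the n-1 internal nodes can be written in either order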
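
The count can be confirmed by enumeration. A minimal sketch, assuming a rooted binary tree written as nested 2-tuples with tips as strings and ignoring branch lengths (an illustrative encoding, not from the slides):

```python
def newick_orderings(tree):
    """All Newick strings (without branch lengths) that describe the same rooted binary tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    out = []
    for l in newick_orderings(left):
        for r in newick_orderings(right):
            out.append(f"({l},{r})")
            out.append(f"({r},{l})")   # swapping the two children gives the same tree
    return out


strings = newick_orderings((("A", "B"), "C"))
print(strings)       # ['((A,B),C)', '(C,(A,B))', '((B,A),C)', '(C,(B,A))']
print(len(strings))  # 4 = 2^(3-1)
```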

51
Q

Consider the least-squares method. Why would we use weights w_ij which aren't equal to 1?

A

We don't have infinite amounts of sequence data, so we need to down-weight the contribution of distance-matrix entries with a lot of noise, e.g. w_ij = 1/D_ij, where D is the estimated pairwise distance matrix
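
The effect of the weights can be seen on a toy example. A minimal sketch, assuming the cost is the sum over pairs i < j of w_ij (D_ij - d_ij)^2, where D is the estimated distance matrix and d holds the distances on a proposed tree; the matrices below are made up for illustration and the function name is hypothetical.

```python
def weighted_least_squares(D, d, weight=lambda Dij: 1.0 / Dij):
    """Sum over pairs i < j of w_ij * (D_ij - d_ij)^2, with w_ij = weight(D_ij).

    Assumes D_ij > 0 for i != j when the default weight 1/D_ij is used.
    """
    n = len(D)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += weight(D[i][j]) * (D[i][j] - d[i][j]) ** 2
    return total


D = [[0, 2, 10], [2, 0, 10], [10, 10, 0]]   # estimated distances (illustrative)
d = [[0, 3, 10], [3, 0, 12], [10, 12, 0]]   # tree distances (illustrative)

print(weighted_least_squares(D, d, weight=lambda Dij: 1.0))  # 5.0 (unweighted)
print(round(weighted_least_squares(D, d), 6))                # 0.9 (w_ij = 1/D_ij)
```

With w_ij = 1/D_ij, the misfit on the large, noisier distance of 10 counts for less than the misfit on the small distance of 2.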