12 | Phylogenetics I Flashcards
What do we need / need to ask in order to get from sequences to a phylogeny?
Sequences
–>
MSA:
- which sequences, which MSA method?
- alignment/data appropriate for question?
- use the entire alignment?
–>
Algorithm/software to infer phylogeny from MSA
- which method?
- can we use entire alignment or need to remove or mask something?
–>
Phylogeny
- gene trees / species trees
- statistical support?
- biological interpretation?
What is an optimal alignment for phylogenetics?
And what does this mean in more detail?
What is an optimal alignment?
Evolutionarily optimal!
= aligned residues are homologous,
share a common ancestry
–> positional homology
MSA, in the context of evolutionary analysis:
a hypothesis about the positional homology of
residues in homologous sequences
Define positional homology and phylogenetic signal
Positional homology
- aligned residues share a common ancestral residue in the ancestral sequences
- changes in the columns correspond to mutations
- these contain the phylogenetic signal
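To make "changes in the columns" concrete, here is a minimal sketch, assuming a toy alignment given as equal-length strings. The sequences and the helper name `variable_columns` are made up for illustration. It only flags columns that show any change among non-gap residues, which is a rough proxy and not how any particular program measures phylogenetic signal.

```python
def variable_columns(alignment):
    """Return indices of columns with more than one distinct non-gap residue."""
    length = len(alignment[0])
    assert all(len(seq) == length for seq in alignment), "rows must be aligned"
    variable = []
    for i in range(length):
        # Collect the residues seen in this column, ignoring gap characters.
        residues = {seq[i] for seq in alignment if seq[i] != "-"}
        if len(residues) > 1:
            variable.append(i)
    return variable


# Hypothetical toy alignment (4 sequences, 6 columns).
toy_alignment = [
    "ACGTAC",
    "ACGTTC",
    "ACCTTC",
    "AC-TTC",
]
print(variable_columns(toy_alignment))  # [2, 4]
```

Constant columns (here 0, 1, 3, 5) carry no information about branching order in this simple view.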
In what three ways can alignment regions be described with regard to how they influence a phylogeny, and how should each be treated?
positionally homologous
–> contain the phylogenetic signal
uninformative
- highly divergent, many gaps
- correctly or incorrectly aligned
- contain no/little phylogenetic signal
–> not necessary to exclude
incorrectly aligned
- positional homology violated
- e.g., non-homologous sequences, misalignment
- leads to incorrect result
–> should be excluded for best results
Removing / masking sequences:
What are the criteria for this?
What are the advantages?
Disadvantages?
trimming non-phylogenetic signal from alignments
criteria:
- gaps
- BLOSUM score per region?
–> different approaches
advantages:
assumed to improve accuracy of:
- tree topology
- branch lengths
- tests for selection, …
disadvantages:
- might also inadvertently remove phylogenetic signal
- can also lead to decreased accuracy
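A gap-based criterion can be sketched as follows. This is a minimal illustration, assuming a toy alignment of equal-length strings. The function name `trim_gappy_columns` and the 0.5 threshold are assumptions chosen for the example, not the defaults of any real trimming tool.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop columns whose fraction of gaps exceeds max_gap_fraction.

    Returns the trimmed alignment and the indices of the kept columns.
    """
    n_seqs = len(alignment)
    length = len(alignment[0])
    keep = [
        i
        for i in range(length)
        if sum(seq[i] == "-" for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    trimmed = ["".join(seq[i] for i in keep) for seq in alignment]
    return trimmed, keep


# Hypothetical toy alignment: column 3 is 75% gaps and gets removed.
toy = ["AC--GT", "ACA-GT", "A---GT", "ACATGT"]
trimmed, kept = trim_gappy_columns(toy)
print(kept)     # [0, 1, 2, 4, 5]
print(trimmed)  # ['AC-GT', 'ACAGT', 'A--GT', 'ACAGT']
```

Note the trade-off from the card: the removed column also contained a real residue (the T in the last sequence), so trimming can discard signal along with noise.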
Anatomy of a phylogeny
What is the end of a branch called?
tip, leaf, terminal node/vertex
Anatomy of a phylogeny
Name the 4 parts
- tip (leaf, terminal node/vertex)
- branch (edge)
- internal node
- clade
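The four parts can be read off a small tree data structure. The sketch below assumes a rooted tree written as nested Python tuples, where a string is a tip and a tuple is an internal node; this representation and the function names are illustrative assumptions.

```python
def tips(node):
    """List the tip (leaf) names below a node."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node:
        result.extend(tips(child))
    return result


def clades(node):
    """Return the tip set of every internal node; each one defines a clade."""
    if isinstance(node, str):
        return []
    found = [frozenset(tips(node))]
    for child in node:
        found.extend(clades(child))
    return found


def count_branches(node):
    """Count branches (edges): one per parent-child connection."""
    if isinstance(node, str):
        return 0
    return sum(1 + count_branches(child) for child in node)


# Hypothetical toy tree with five tips.
toy_tree = (("A", "B"), ("C", ("D", "E")))
print(tips(toy_tree))            # ['A', 'B', 'C', 'D', 'E']
print(len(clades(toy_tree)))     # 4 internal nodes, hence 4 clades
print(count_branches(toy_tree))  # 8
```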
Cladogram vs phylogram?
cladogram: branch lengths meaningless
phylogram: branch lengths proportional to amount of inferred evolutionary change
What is an unrooted tree?
Unrooted trees illustrate the relatedness of the leaf nodes without making assumptions about ancestry.
How can you root a tree?
using an outgroup
(also possible in a similar way with paralog(s))
using “midpoint rooting”
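Midpoint rooting places the root halfway along the longest tip-to-tip path. The following is a minimal sketch, assuming an unrooted tree stored as a dictionary of neighbors with branch lengths. The node names, branch lengths, and function names are hypothetical, and the code only locates the root position rather than rebuilding the tree.

```python
def path_lengths_from(tree, start):
    """Walk the tree from start, returning {node: (distance, path)}."""
    seen = {start: (0.0, [start])}
    stack = [start]
    while stack:
        node = stack.pop()
        dist, path = seen[node]
        for neighbor, length in tree[node].items():
            if neighbor not in seen:
                seen[neighbor] = (dist + length, path + [neighbor])
                stack.append(neighbor)
    return seen


def midpoint(tree):
    """Return (node_a, node_b, offset): root lies on branch a-b, offset from a."""
    tip_names = [n for n, nbrs in tree.items() if len(nbrs) == 1]
    best = None
    for tip in tip_names:
        for other, (dist, path) in path_lengths_from(tree, tip).items():
            if other in tip_names and (best is None or dist > best[0]):
                best = (dist, path)
    total, path = best
    half = total / 2
    walked = 0.0
    # Step along the longest path until the halfway point falls on a branch.
    for a, b in zip(path, path[1:]):
        length = tree[a][b]
        if walked + length >= half:
            return a, b, half - walked
        walked += length


# Hypothetical unrooted tree: tips A-D, internal nodes X and Y.
toy = {
    "A": {"X": 1.0},
    "B": {"X": 3.0},
    "C": {"Y": 1.0},
    "D": {"Y": 6.0},
    "X": {"A": 1.0, "B": 3.0, "Y": 2.0},
    "Y": {"C": 1.0, "D": 6.0, "X": 2.0},
}
print(midpoint(toy))  # ('Y', 'D', 0.5)
```

Here the longest path is B to D (length 11), so the root sits 5.5 from each of those tips, on the branch leading to D. Midpoint rooting assumes roughly equal rates of change across lineages; an outgroup does not need that assumption.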
What is an unresolved tree?
- we don’t know the relationship of all branches
- multifurcating / non-binary (a polytomy)
due to networks or incompatible gene trees
What is a polytomy?
hard/soft?
polytomy:
unresolved node
- hard polytomy: actual (near-)simultaneous divergence into more than two lineages, e.g., rapid divergence
- soft polytomy: binary branching pattern not known, due to insufficient or conflicting data
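Finding unresolved nodes in a tree is straightforward once the tree is in a data structure. This sketch assumes a rooted tree written as nested tuples (strings are tips); the representation and the name `polytomies` are illustrative assumptions. Topology alone cannot tell a hard polytomy from a soft one, so the code only reports where they are.

```python
def polytomies(node):
    """Return every internal node with more than two children."""
    if isinstance(node, str):
        return []
    found = []
    if len(node) > 2:
        found.append(node)
    for child in node:
        found.extend(polytomies(child))
    return found


# Hypothetical toy tree: A, B and C form an unresolved (multifurcating) node.
toy = (("A", "B", "C"), ("D", "E"))
print(polytomies(toy))  # [('A', 'B', 'C')]
```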
What is a gene tree?
What does it depict? Which events?
Phylogeny depicting the evolution of homologous sequences
events:
- speciation
- duplication
- loss
- horizontal transfer
- hybridization
- introgression
- incomplete lineage sorting
- …
phylogeny: a hypothesis that depicts the historical relationships among entities in a branching diagram –> for a gene tree, those entities are functional domains, gene sequences, or genomic regions (not genomes or organisms!)
Define ortholog
homologous sequences that diverged after a speciation event
(last common ancestor is a speciation node)
Define paralog
homologous sequences that diverged after a duplication event
(last common ancestor is a duplication node)
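The two definitions translate directly into a lookup of the event at the last common ancestor. Below is a minimal sketch, assuming a gene tree whose internal nodes are already labeled with their event. The dictionary layout, gene names, and function names are hypothetical, and both genes are assumed to be distinct tips of the tree.

```python
def tip_names(node):
    """Set of tip names below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return {node}
    names = set()
    for child in node["children"]:
        names |= tip_names(child)
    return names


def lca_event(node, a, b):
    """Event label at the last common ancestor of tips a and b."""
    for child in node["children"]:
        # Descend while a single child subtree still contains both tips.
        if not isinstance(child, str) and {a, b} <= tip_names(child):
            return lca_event(child, a, b)
    return node["event"]


def relationship(tree, a, b):
    """Classify two genes by the event at their last common ancestor."""
    return {"speciation": "orthologs", "duplication": "paralogs"}[lca_event(tree, a, b)]


# Hypothetical gene tree: an ancient duplication, then speciation in each copy.
gene_tree = {
    "event": "duplication",
    "children": [
        {"event": "speciation", "children": ["human_a", "mouse_a"]},
        {"event": "speciation", "children": ["human_b", "mouse_b"]},
    ],
}
print(relationship(gene_tree, "human_a", "mouse_a"))  # orthologs
print(relationship(gene_tree, "human_a", "human_b"))  # paralogs
print(relationship(gene_tree, "human_a", "mouse_b"))  # paralogs
```

The last case shows that genes from different species can still be paralogs if their last common ancestor is a duplication node.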