Maximum Parsimony Flashcards

1
Q

According to evolutionary theory, speciation (the evolution of new organisms) is driven by

A

Mutation and Selection bias

2
Q

A change to the DNA sequence through single-base changes or the deletion/insertion of DNA segments

A

Mutation

3
Q

Quantifies the factor by which a mutation with a given effect is more or less likely to be chosen during population sampling after it first occurs

A

Selection Bias

4
Q

A _____ event leads to the creation of different species

A

Speciation

5
Q

True or false: Speciation is caused by physical separation into groups in which different genetic variants become dominant

A

True (read it again)

6
Q

Define the theory of evolution

A

Any two species share a (possibly distant) common ancestor

7
Q

DNA and protein sequences evolve at a rate that is relatively constant over time and among different organisms

A

Molecular clock hypothesis

8
Q

What is stated by Michael Lynch, Jeff Palmer, Matt Hahn et al. (Indiana University) in Lynch: The Origins of Genome Complexity?

A

According to this model, much of the restructuring of eukaryotic genomes was initiated by nonadaptive processes, and this in turn provided novel substrates for the secondary evolution of phenotypic complexity by natural selection. (Hopefully you read this far.)

9
Q

A graph reflecting the approximate distances between a set of objects (species, genes, proteins, families) in a hierarchical fashion

A

Phylogenetic Tree

10
Q

Current species; sequences in current species

A

Leaves

11
Q

Hypothetical common ancestor

A

Internal Nodes

12
Q

“Time” from one speciation to the next (branching represents speciation into new species)

A

Branch (edge) length

13
Q

A tree that satisfies the molecular clock hypothesis: all leaves are at the same distance from the root

A

Rooted Tree

14
Q

Branches are also called

A

edges

15
Q

What do edges reflect?

A

Evolutionary distances

16
Q

Classical phylogenetic analysis

A

Morphological features: presence or absence of fins

17
Q

Modern biological methods allow the use of molecular features such as

A

Gene sequences and Protein Sequences

18
Q

A phylogenetic tree that represents the evolutionary pathways of a group of species

A

Species tree

19
Q

A phylogenetic tree constructed from a single gene from each of the species under study

A

Gene tree

20
Q

We can get different trees depending on the choice of

A
  • Input sequences
  • Multiple alignment programs
  • Substitution models
  • Phylogenetic tree reconstruction methods
21
Q

Display one sequence above another with spaces (termed gaps) inserted in both to reveal similarity of nucleotides or amino acids

A

Sequence alignment

22
Q

Gaps represent ____

A

Indels

23
Q

Mismatches represent

A

Mutations

24
Q

Insertions and deletions are collectively called

A

Indels

25
Q

Aligns two or more sequences to highlight their similarity, inserting a small number of gaps into each sequence (usually denoted by dashes) to align identical or similar characters wherever possible

A

Basic Sequence Alignment Algorithm

26
Q

Aligns two sequences to identify similarities/differences.

A

Pairwise Alignment

27
Q

Its challenges include handling large datasets and optimizing alignments for highly divergent sequences.

A

Multiple Sequence alignment

28
Q

Aligns the most similar subsequence within the sequences.

A

Local Alignment

29
Q

Aligns the entire length of both sequences from start to end.

A

Global Alignment

30
Q

Useful when searching for similar regions within sequences that might differ significantly overall.

A

Local Alignment
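
A minimal sketch of local alignment, assuming the classic Smith-Waterman dynamic-programming recurrence. The function name and the scoring values (match=2, mismatch=-1, gap=-2) are illustrative assumptions, not taken from these cards.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b)."""
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets an alignment restart anywhere: this is what makes it local
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to 0
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTTGATTACATTTT", "CCCGATTACACCC"))
```

The two example sequences differ everywhere except for a shared GATTACA region, which is the part a local alignment picks out.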

31
Q

Best for comparing sequences of similar length to assess their overall similarity and evolutionary relationship.

A

Global Alignment
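
A minimal sketch of global alignment, assuming the classic Needleman-Wunsch recurrence with a linear gap penalty. The function name and the scoring values (match=1, mismatch=-1, gap=-1) are illustrative assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned a, aligned b) for an end-to-end alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against nothing costs one gap per character
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # match or mismatch
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to the top-left corner
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

Unlike the local case, every character of both sequences appears in the output, so a length difference must show up as gaps (indels).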

32
Q

Why compare biological sequences

A

* To obtain functional or mechanistic insight about a sequence by inference from another, potentially better characterized sequence
* To find whether two (or more) genes or proteins are evolutionarily related
* To find structurally or functionally similar regions within sequences (e.g. catalytic sites, binding sites for other molecules, etc.)

33
Q

Distance based tree methods

A

UPGMA and NJ
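
A minimal sketch of UPGMA, assuming the usual size-weighted average linkage on a pairwise distance matrix. The function name, the input layout (a dict keyed by label pairs), and the example distances are hypothetical. Only the topology is returned; branch heights are left out to keep the sketch short.

```python
from itertools import combinations


def upgma(dist):
    """dist maps (label_a, label_b) -> distance. Returns the tree as nested tuples."""
    d = {frozenset(pair): value for pair, value in dist.items()}
    size = {}  # cluster -> number of leaves it contains
    for a, b in dist:
        size[a] = 1
        size[b] = 1
    while len(size) > 1:
        # Join the two closest clusters
        x, y = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (x, y)
        # Distance from the new cluster to every other cluster is the
        # average over its leaves, hence the weighting by cluster size
        for z in size:
            if z != x and z != y:
                d[frozenset((merged, z))] = (
                    size[x] * d[frozenset((x, z))] + size[y] * d[frozenset((y, z))]
                ) / (size[x] + size[y])
        size[merged] = size.pop(x) + size.pop(y)
    return next(iter(size))


example = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
           ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
print(upgma(example))
```

The example distances are ultrametric, which is the molecular-clock situation UPGMA assumes; NJ drops that assumption.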

34
Q

Character based (discrete) tree methods

A

Maximum Parsimony, Maximum Likelihood, Bayesian Methods

35
Q

Distance methods are

A

Relationships based upon sequence similarity

36
Q

Advantages of Distance method

A
  • Computationally fast
  • Single Best tree found
37
Q

Disadvantages of Distance methods

A

* Assumptions: additive distances (always), molecular clock (sometimes)
* Information loss occurs due to data transformation
* Uninterpretable branch lengths
* Only a single “best tree” is found

38
Q

These methods attempt to map the history of gene sequences onto a tree and decide what the tree looks like

A

Character based methods

39
Q

How to choose the best tree

A

To decide which tree is best we can use an optimality criterion.
* Parsimony is one such criterion (other criteria: maximum likelihood, minimum evolution, Bayesian)
* It chooses the tree which requires the fewest mutations to explain the data.
* The Principle of Parsimony is the general scientific principle that accepts the simplest of two explanations as preferable.

40
Q

Principle of Parsimony

A

Looks for a tree with the minimum total number of substitutions of symbols between species and their ancestors in the phylogenetic tree

41
Q

The preferred evolutionary tree is the one that requires “the minimum net amount of evolution”

A

Principle of Parsimony

42
Q

Maximum Parsimony: because character conflict (homoplasy) is common, we need a method to resolve it. What are the options?

A

* We can brush aside the problem and use an algorithmic method, like neighbor-joining, which builds one tree from distance data [NOT recommended]
* Or we can use an optimality criterion that allows us to rank alternative trees from best to worst

43
Q

Maximum Parsimony is an ____

A

Parsimony is an optimality criterion

44
Q

Maximum Parsimony Prefer

A

the tree or trees that minimizes the amount of evolutionary change required to explain the data

45
Q

Based on ______: “shave away all that is unnecessary”. Plurality should not be posited without necessity; when there are multiple hypotheses that explain the data equally well, choose the simplest one

A

Ockham's razor

46
Q

Maximum Parsimony

A

When there are multiple hypotheses that explain the data equally well, choose the simplest one
- Choice of simplest hypothesis is a good rule of thumb (but remember, the data matter far more than the method!)

47
Q

Parsimony will allow one to find the tree that minimizes homoplasy, aka the

A

Shortest tree

48
Q

Parsimony (just read this answer)

A

* But if you have made careless homology decisions (e.g. poorly aligned your data), even the most parsimonious tree may be horribly wrong
* Thus, some cladists emphasize that we don't use parsimony because it is the method most likely to find the true tree; we use it because it provides the “least falsified” hypothesis (truth is unknowable)

49
Q

Assumption of character-based parsimony

A
  • Each taxon is described by a set of characters
  • Each character can be in one of finite number of states
  • In one step certain changes are allowed in character states
  • Goal: find evolutionary tree that explains the states of the taxa with minimal number of changes
50
Q

In parsimony, the score is simply the minimum number of mutations that could possibly produce the data.
* Pro: ?
* Con: ?

A
  • Pro: There are fast algorithms that guarantee that any tree can be scored correctly
  • Con: There are lots of possible trees to choose between…
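
A minimal sketch of scoring a tree under parsimony, assuming Fitch's algorithm on a rooted binary tree with equal cost for every change. The nested-tuple tree encoding, the function names, and the four example sequences are hypothetical.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one character."""
    if isinstance(tree, str):               # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:                              # children agree: no extra change
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Sum the per-site minimum number of changes over all alignment columns."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in seqs.items()})[1]
        for i in range(length)
    )


seqs = {"A": "AAG", "B": "AAA", "C": "GGA", "D": "AGA"}
# With four taxa there are only three unrooted topologies to compare
for tree in [(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D")), (("A", "D"), ("B", "C"))]:
    print(tree, parsimony_score(tree, seqs))
```

Scoring one tree is cheap (one pass per site); the hard part is that the number of candidate trees grows explosively with the number of taxa.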
51
Q

Drawbacks of Maximum Parsimony

A

* The score of a tree is completely determined by the minimum number of mutations among all of the reconstructions of ancestral sequences.
* It fails to account for the fact that the number of changes is unlikely to be equal on all branches in the tree.
  o As a result, it is susceptible to “long-branch attraction”, in which two long branches that are not adjacent on the true tree are inferred to be closest relatives
* In practice this is still pretty good, but ML/Bayesian methods are better. (Hopefully you read this far.)

52
Q

any test or metric that uses random sampling with replacement and falls under the broader class of resampling methods.

A

Bootstrapping

53
Q

uses sampling with replacement to estimate the sampling distribution for the desired estimator.

A

Bootstrapping

54
Q

used to assess the reliability of sequence based phylogeny.

A

Bootstrapping

55
Q

Define bootstrapping

A

Bootstrap values in a phylogenetic tree indicate how many times out of 100 the same branch is observed when the tree is rebuilt from a resampled set of data.
* If we get this observation 100 times out of 100, then this supports your result.
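
A minimal sketch of the bootstrap idea, assuming alignment columns are the units that get resampled with replacement. All names here are hypothetical, and `closest_pair` is only a toy stand-in for a real tree-building step: it reports which two sequences are most similar, so that "the same branch is observed" becomes "the same pair comes out closest".

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Draw as many columns as the alignment has, with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Toy 'tree builder': the pair of sequences with the fewest mismatches."""
    def mismatches(x, y):
        return sum(a != b for a, b in zip(x, y))
    return min(combinations(sorted(alignment), 2),
               key=lambda p: mismatches(alignment[p[0]], alignment[p[1]]))


def bootstrap_support(alignment, grouping, replicates=100, seed=0):
    """Percentage of replicates that reproduce the grouping found on the full data."""
    rng = random.Random(seed)
    observed = grouping(alignment)
    hits = sum(grouping(resample_columns(alignment, rng)) == observed
               for _ in range(replicates))
    return observed, 100 * hits / replicates


alignment = {"A": "AAAAAAAA", "B": "AAAAAAAA", "C": "CCCCCCCC", "D": "GGGGGGGG"}
print(bootstrap_support(alignment, closest_pair))
```

In a real analysis the grouping function would build a full tree from each replicate and the support would be tallied per branch.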

56
Q

The result of multiple substitutions at the same site in a sequence, or identical substitutions in different sequences, such that the apparent sequence divergence is lower than the actual divergence that has occurred

A

Genetic Saturation

57
Q

Saturation results in _____, where the most distant lineages have misleadingly short branch lengths, which also decreases the phylogenetic information contained in the sequences

A

Long Branch Attraction (LBA)

58
Q

A process where genetic material moves between organisms in a way other than traditional parent-to-offspring inheritance (vertical transfer).

A

Horizontal Gene Transfer

59
Q

Sequences diverged after a speciation event

A

Orthologs

60
Q

Sequences diverged after a duplication event

A

Paralogs

61
Q

Sequences Diverged after a horizontal transfer

A

Xenologs

62
Q

Maximum Parsimony
Optimality criterion:

A

The ‘most-parsimonious’ tree is the one that requires the fewest evolutionary events (e.g., nucleotide substitutions, amino acid replacements) to explain the sequences.

63
Q

Advantage of Maximum Parsimony

A

* Are simple, intuitive, and logical (many possible by pencil-and-paper).
* Can be used on molecular and non-molecular (e.g., morphological) data.
* Can tease apart types of similarity (shared-derived, shared-ancestral, homoplasy).
* Can be used for character (can infer the exact substitutions) and rate analysis.
* Can be used to infer the sequences of the extinct (hypothetical) ancestors.

64
Q

Disadvantages of Maximum Parsimony

A

* Are simple, intuitive, and logical (derived from “Medieval logic”, not statistics!)
* Can be fooled by high levels of homoplasy (“same” events).
* Can become positively misleading in the “Felsenstein Zone”

65
Q

Phylogeny (phylogenetic tree) reconstruction:
overview

A
  • Tree topology & branch lengths
  • Computational challenge
  • Huge number of tree topologies
    3 sequences: 1 (unrooted)
    4 sequences: 3
    5 sequences: 15
    10 sequences: 2,027,025
    20 sequences: 221,643,095,476,699,771,875
    n sequences (unrooted & rooted): ??
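
The counts above follow the standard formulas for fully resolved (bifurcating) labeled trees: (2n-5)!! unrooted topologies for n >= 3 and (2n-3)!! rooted topologies for n >= 2, where !! is the product of odd numbers. A minimal sketch with hypothetical function names:

```python
def odd_double_factorial(k):
    """Product 1 * 3 * 5 * ... * k for odd k (equals 1 when k is 1)."""
    result = 1
    for factor in range(3, k + 1, 2):
        result *= factor
    return result


def num_unrooted_trees(n):
    """Number of unrooted bifurcating topologies for n labeled sequences."""
    if n < 3:
        raise ValueError("need at least 3 sequences")
    return odd_double_factorial(2 * n - 5)


def num_rooted_trees(n):
    """Number of rooted bifurcating topologies for n labeled sequences."""
    if n < 2:
        raise ValueError("need at least 2 sequences")
    return odd_double_factorial(2 * n - 3)


for n in (3, 4, 5, 10, 20):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

Adding a root can be done on any branch of an unrooted tree, which is why the rooted count for n sequences equals the unrooted count for n + 1.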
66
Q

Phylogeny (phylogenetic tree) reconstruction:
most methods are

A

Heuristic: a practical shortcut used to solve problems or make decisions quickly, without a guarantee of finding the best tree

67
Q

Phylogeny (phylogenetic tree) reconstruction: Two types of methods

A

Distance based (input: distance matrix; UPGMA, NJ)
Character based (input: multiple alignment)

68
Q

Models of evolutionary distance

A

Many

69
Q

Model of evolutionary distance: equal probability of change to any nucleotide

A

Jukes-Cantor Model (Simplest case)
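
A minimal sketch of the Jukes-Cantor distance, using the standard correction d = -(3/4) ln(1 - (4/3) p), where p is the observed fraction of differing sites. The function names are hypothetical, and the sequences are assumed to be aligned, of equal length, and free of gaps.

```python
import math


def p_distance(a, b):
    """Observed fraction of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p):
    """Estimated substitutions per site, corrected for multiple hits."""
    if p >= 0.75:
        # At 75% differences the sequences look random (saturation),
        # so the corrected distance is undefined; report infinity here.
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


p = p_distance("ACGTACGTAC", "ACGTACGTAA")
print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as p, because some substitutions are hidden by later changes at the same site.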

70
Q

Different probabilities for transitions, transversions

A

Kimura two-parameter model (K2P)

71
Q

Different probabilities for transitions, transversions, & takes into account genomic nucleotide bases