quiz 3/13 Flashcards

1
Q

What are the three main types of distance measures covered?

A
  1. Distances for numerical data
  2. Distances for proportions
  3. Distances for presence-absence data
2
Q

When should you use distance measures for presence-absence data?

A
  • When data is organized as a set of features for different individuals/samples/populations.
  • Examples include:
    • Whether species are present (1) or absent (0) in different locations.
    • Whether viruses express certain genetic markers.
    • Whether bodies of text contain specific words.
3
Q

How is presence-absence data summarized?

A
  • In a contingency table with counts:
    • a = Features present in both samples
    • b = Features present in Sample A but absent in Sample B
    • c = Features present in Sample B but absent in Sample A
    • d = Features absent in both samples
    • n = Total features (a + b + c + d)
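A minimal sketch of how the four counts could be tallied from two presence-absence vectors. The function name `contingency_counts` and the example data are illustrative assumptions, not part of the course material.

```python
def contingency_counts(sample_a, sample_b):
    """Return (a, b, c, d) for two equal-length sequences of 0/1 values."""
    if len(sample_a) != len(sample_b):
        raise ValueError("both samples must cover the same features")
    a = b = c = d = 0
    for in_a, in_b in zip(sample_a, sample_b):
        if in_a and in_b:
            a += 1  # present in both samples
        elif in_a:
            b += 1  # present in Sample A only
        elif in_b:
            c += 1  # present in Sample B only
        else:
            d += 1  # absent in both samples
    return a, b, c, d


# Hypothetical data: six features scored in two samples.
sample_a = [1, 1, 0, 0, 1, 0]
sample_b = [1, 0, 1, 0, 1, 0]
print(contingency_counts(sample_a, sample_b))  # (2, 1, 1, 2)
```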
4
Q

How can similarity be derived from a distance measure?

A
  • Similarity is often computed as: Similarity = 1 - Distance
  • This works for presence-absence data and proportions, where distances lie between 0 and 1.
5
Q

What does the Mantel Randomization Test do?

A
  • It compares two distance matrices to see if their distances are correlated.
  • Tests if one type of distance (e.g., economic indicators) relates to another (e.g., health outcomes).
6
Q

What are some real-world questions the Mantel Test can answer?

A
  • Do economic indicators impact health outcomes?
  • Does geographic distance affect genetic similarity?
  • Do news announcements influence stock market fluctuations?
7
Q

How does the Mantel Test determine significance?

A
  • Compute the similarity index for a large number of random permutations of one of the matrices (rows and columns shuffled together).
  • Compare the actual similarity to the distribution of randomized similarities.
  • If the observed similarity is much higher than expected by chance, we reject the null hypothesis that the two sets of distances are unrelated.
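The randomization steps above can be sketched in plain Python. This is only an illustration under stated assumptions: it uses the Pearson correlation of the above-diagonal entries as the similarity index, a one-sided test, and 999 random permutations; the names `mantel_test`, `upper_triangle`, and `pearson` are hypothetical, and the data are made up.

```python
import math
import random


def upper_triangle(matrix, order):
    """Above-diagonal entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    syy = sum((q - mean_y) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(dist_x, dist_y, n_permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(dist_x)))
    x_flat = upper_triangle(dist_x, identity)
    observed = pearson(x_flat, upper_triangle(dist_y, identity))
    at_least_as_large = 0
    for _ in range(n_permutations):
        order = identity[:]
        rng.shuffle(order)  # shuffle rows and columns of dist_y together
        if pearson(x_flat, upper_triangle(dist_y, order)) >= observed:
            at_least_as_large += 1
    # Count the observed arrangement itself as one of the permutations.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up example: the second matrix is the first one scaled by 2,
# so the two sets of distances are perfectly correlated.
points = [0, 1, 2, 5, 9]
dist_x = [[abs(p - q) for q in points] for p in points]
dist_y = [[2 * value for value in row] for row in dist_x]
observed, p_value = mantel_test(dist_x, dist_y)
print(observed, p_value)
```

A small p-value here means few shuffled versions of the second matrix matched the first as well as the real one did.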
8
Q

What do the Mantel Test results tell us?

A
  • If two distance matrices are significantly correlated, their distances are related.
  • If not, the patterns in one matrix do not predict patterns in the other.
9
Q

What is the formula for the Simple Matching Coefficient (SMC)?

A

(a + d) / (a + b + c + d)

Counts both presence (a) and absence (d) as agreement.
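As a quick worked example, the coefficient can be computed directly from the four counts. The function name `simple_matching` and the counts used are illustrative assumptions.

```python
def simple_matching(a, b, c, d):
    """Simple Matching Coefficient: share of features on which both samples agree."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("no features to compare")
    return (a + d) / n


# Hypothetical counts: a=2 shared presences, b=1, c=1, d=2 shared absences.
print(simple_matching(2, 1, 1, 2))  # 4/6, about 0.667
```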

10
Q

What is the formula for Jaccard Similarity?

A

a/(a+b+c)

Does not count shared absences (d) as similarity.
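A short sketch of the same calculation in code. The function name `jaccard` and the counts are assumptions for illustration; note that d never enters the formula, so adding shared absences leaves the result unchanged.

```python
def jaccard(a, b, c):
    """Jaccard similarity: shared presences over features present in either sample."""
    if a + b + c == 0:
        raise ValueError("undefined when neither sample has any feature present")
    return a / (a + b + c)


# Hypothetical counts: a=2 shared presences, b=1, c=1.
print(jaccard(2, 1, 1))  # 0.5
```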

11
Q

What is the formula for Sørensen-Dice Similarity?

A

2a / (2a + b + c)

Gives twice the weight to shared presences (a) compared to Jaccard.
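A minimal sketch of the formula, using the hypothetical name `sorensen_dice` and made-up counts. For the same counts, doubling the weight on a gives a higher value than the Jaccard ratio a / (a + b + c).

```python
def sorensen_dice(a, b, c):
    """Sørensen-Dice similarity: shared presences counted twice."""
    if a + b + c == 0:
        raise ValueError("undefined when neither sample has any feature present")
    return 2 * a / (2 * a + b + c)


# Hypothetical counts: a=2 shared presences, b=1, c=1.
print(sorensen_dice(2, 1, 1))  # 4/6, about 0.667
print(2 / (2 + 1 + 1))         # Jaccard for the same counts: 0.5
```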

12
Q

What is the formula for Ochiai Similarity?

A

a / sqrt((a + b)(a + c))

Similar to Jaccard, but divides the shared presences by the geometric mean of the two sample sizes, (a + b) and (a + c).
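A small sketch of the calculation, with the hypothetical function name `ochiai` and made-up counts. The second call uses two samples of very different sizes to show how the square root softens the penalty compared with the Jaccard ratio a / (a + b + c).

```python
import math


def ochiai(a, b, c):
    """Ochiai similarity: shared presences over the geometric mean of the sample sizes."""
    size_a, size_b = a + b, a + c
    if size_a == 0 or size_b == 0:
        raise ValueError("undefined when a sample has no features present")
    return a / math.sqrt(size_a * size_b)


# Hypothetical counts with similar sample sizes: a=2, b=1, c=1.
print(ochiai(2, 1, 1))   # 2/3, about 0.667

# Very unequal sample sizes: Sample A has 2 features, Sample B has 20.
print(ochiai(2, 0, 18))  # about 0.316
print(2 / (2 + 0 + 18))  # Jaccard for the same counts: 0.1
```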

13
Q

When should I use the Ochiai similarity equation?

A

When the samples or groups being compared differ greatly in size.

14
Q

When do you want to use Sørensen-Dice?

A

When a is small (the samples share few presences).

15
Q

When do you want to use Jaccard?

A

When a is not small (the samples share many presences).

16
Q

When do you want to use simple matching?

A

When you think the shared absences (d) mean something.