Midterm 1 Pt.2 Flashcards

1
Q

What is a sampling distribution?

A

The probability distribution of all values for an estimate that we might obtain when we sample a population.

2
Q

What does the lowercase p with a hat stand for?

A

The sample-based estimate of the population proportion p.

3
Q

What is the relationship between a random individual having an attribute and the fraction of the population having that attribute?

A

The probability that a randomly selected individual will have that attribute is the SAME as the fraction (or relative frequency) of the population having said attribute.

4
Q

Is there a difference between showing the frequency on a histogram and the proportion on a histogram?

A

No, the shape of the histogram is the same.
The only difference is the vertical scale: frequencies are counts that add up to the total sample size, while proportions are on a scale of 0-1.

5
Q

What is binomial distribution?

A
  • Applies when members of the population can be categorized into one of two categories (one of which we’ll arbitrarily consider a “success”).
  • Describes the probability of a given number of “successes” from a fixed number of independent trials with constant probability of success in each trial.
6
Q

What do X, n, and p stand for?

A

X - number of successes
n - number of trials
p - probability of success in each trial

7
Q

Equation for Pr[X]:

A

Pr[X] = (n!/(X!(n-X)!))p^X(1-p)^(n-X)

EXAMPLE CHAPTER 7 SLIDE 24
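
The formula can be checked with a few lines of Python. This is a minimal sketch, not the slide example: the function name and the numbers (3 successes in 5 trials, p = 0.5) are made up for illustration.

```python
from math import comb

def binomial_probability(x: int, n: int, p: float) -> float:
    """Pr[X = x]: probability of exactly x successes in n independent
    trials, each with probability of success p."""
    # comb(n, x) is n! / (x! (n - x)!)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical numbers: exactly 3 successes in 5 trials with p = 0.5
print(binomial_probability(3, 5, 0.5))  # 0.3125
```

Summing the function over every X from 0 to n should give 1, which is a quick sanity check.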

8
Q

What is a binomial test?

A

A hypothesis test in which the null distribution is provided by the binomial distribution.
OR
Uses data to test whether a population proportion matches a null expectation for the proportion.

9
Q

What is H null in terms of the population proportion null?

A

The relative frequency of successes in the population/the proportion of the population with the attribute of interest IS p null.

10
Q

What is H alternative in terms of the population proportion null?

A

The relative frequency of successes in the population/the proportion of the population with the attribute of interest is NOT p null.

11
Q

What can we use the null distribution to calculate?

A

The probability of having observed a result as extreme as, or more extreme than, ours, under the working assumption that the null hypothesis is true.

12
Q

What is the probability that we calculate called?

A

The “P-value”

13
Q

How do you calculate the P-value from the null distribution?

A

Add up the probabilities of all outcomes as extreme as or more extreme than the observed result, in both tails of the null distribution.

14
Q

What shortcut can you use to calculate the P-value when the null distribution is symmetrical?

A

Add up the probabilities in one tail and multiply the sum by 2.
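
A rough sketch of this shortcut for a binomial test follows. It doubles the smaller tail, which matches adding both tails when the null distribution is symmetrical (p null = 0.5). The function names and the numbers (14 successes in 18 trials) are assumptions chosen for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr[X = x] under the binomial distribution."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def two_sided_p_value(observed, n, p_null):
    """Two-sided P-value: twice the smaller one-tailed probability,
    capped at 1."""
    upper_tail = sum(binomial_pmf(x, n, p_null) for x in range(observed, n + 1))
    lower_tail = sum(binomial_pmf(x, n, p_null) for x in range(0, observed + 1))
    return min(1.0, 2 * min(upper_tail, lower_tail))

# Hypothetical data: 14 successes in 18 trials, null proportion 0.5
print(round(two_sided_p_value(14, 18, 0.5), 4))  # 0.0309
```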

15
Q

What do little p and big P stand for?

A

Little p is the population proportion.
Big P is the P-value.

16
Q

READ PAGES 183-185 CAREFULLY

A
17
Q

What is the Law of Large Numbers?

A

Larger samples yield more precise estimates.
As sample size increases, the sample estimate tends to fall closer to the true population value.

18
Q

What is the binomial distribution?

A

A probability model for a fixed number of independent random trials in which the probability of “success” is the same in every trial.

19
Q

What does the form of a binomial distribution depend on?

A

Binomial distributions are discrete probability distributions that take on different forms depending on n and p.

20
Q

What are the hypotheses when testing the fit of a discrete probability distribution?

A

H null: the data come from a particular discrete probability distribution.
H alternative: the data do NOT come from that distribution.

21
Q

What is the proportional null model?

A

The proportional model describes a probability distribution in which the frequency of occurrence of events is proportional to the number of opportunities.

22
Q

What are the steps to a hypothesis test?

A
  • State the null and alternative hypotheses
  • Set your alpha level
  • Decide on appropriate statistical test (along with appropriate test statistic), and provide rationale
  • Check assumptions of the test
  • Look up the “critical value” of the test statistic, using the appropriate value of alpha and “degrees of freedom”
  • Calculate observed value of test statistic, and compare to critical value
  • Draw conclusion, referring back to hypothesis, and report the type of test used, the value of the calculated test statistic, the degrees of freedom, and the P-value (and a confidence interval if appropriate)
23
Q

What does the goodness-of-fit test (X^2) do?

A

Compares counts to those expected under a particular discrete probability distribution.

24
Q

How does X^2 change with the discrepancy between observed and expected frequencies?

A

The value of X^2 gets larger with increasing discrepancy between the observed and expected frequencies.

25
Q

What are degrees of freedom?

A

The number of degrees of freedom of a test specifies which specific null distribution to use (out of a family of possible distributions).
These null distributions are continuous probability distributions.

26
Q

How do you calculate degrees of freedom for X^2 test?

A

df = (number of categories) - 1

27
Q

What is a critical value?

A

A “critical value” is the value of a test statistic that marks the boundary of a specific area in the tail (or tails) of the sampling distribution under the null hypothesis.

28
Q

Simplified Goodness-of-Fit definition?

A

A Goodness-of-Fit test compares an observed frequency distribution with the frequency distribution that would be expected under a particular probability model.

29
Q

What are the assumptions of the X^2 GOF test?

A
  • no more than 20% of categories have Expected frequency <5
  • no category with Expected frequency <=1
  • each datum is random and independent
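
The two expected-frequency rules above can be checked mechanically. The sketch below uses a hypothetical helper name and made-up expected counts. Random, independent sampling is a property of the study design and cannot be checked this way.

```python
def gof_assumptions_met(expected):
    """Check the expected-frequency rules of thumb for the X^2
    goodness-of-fit test:
    - no more than 20% of categories have expected frequency < 5
    - no category has expected frequency <= 1
    """
    n_small = sum(1 for e in expected if e < 5)
    none_too_small = all(e > 1 for e in expected)
    return none_too_small and n_small / len(expected) <= 0.20

# Hypothetical expected counts for five categories
print(gof_assumptions_met([10, 12, 8, 20, 6]))   # True
print(gof_assumptions_met([10, 4, 3, 20, 6]))    # False: 40% are below 5
```

If the check fails, one common remedy is to combine sparse categories (where that makes biological sense) before running the test.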
30
Q

What is a critical value?

A

The value of the test statistic where P = alpha.

31
Q

The 5% critical value

A

The 5% critical value is the value of the test statistic beyond which the area under the curve of the null distribution is equal to alpha = 0.05.

32
Q

What is relative risk?

A

The probability of an undesired outcome in the treatment group divided by the probability of the same outcome in the control group.

33
Q

What does it mean if RR > 1?

A

Probability of the undesired outcome is higher in the treatment group than in the control group.

34
Q

What does it mean if RR < 1?

A

Probability of the undesired outcome is higher in the control group than in the treatment group.
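
Relative risk is a ratio of two proportions, so it is easy to compute from counts. The following is a sketch with a hypothetical function name and made-up counts, not data from the course.

```python
def relative_risk(bad_treatment, n_treatment, bad_control, n_control):
    """Probability of the undesired outcome in the treatment group
    divided by its probability in the control group."""
    p_treatment = bad_treatment / n_treatment
    p_control = bad_control / n_control
    return p_treatment / p_control

# Hypothetical counts: 10 of 100 treated vs. 20 of 100 controls
rr = relative_risk(10, 100, 20, 100)
print(rr)  # 0.5, so RR < 1: the outcome is less likely with treatment
```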

35
Q

What are the odds of success?

A

The probability of success divided by the probability of failure. O = p/(1-p)

36
Q

What is the odds ratio?

A

The odds of success in one group divided by the odds of success in a second group.

37
Q

What is the odds ratio used for?

A

Often used in medical studies, to measure the change in the odds for a response variable (2 categories: e.g. cancer/no cancer) resulting from medical intervention compared with a control group (the explanatory variable with 2 categories: treatment and control).

38
Q

What does it mean if OR > 1?

A

The event has higher odds in the first group than in the second group.

39
Q

What does it mean if OR < 1?

A

The event has higher odds in the second group than in the first group.
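
A short sketch of the odds and odds ratio calculations, using O = p/(1-p). The function names and the probabilities (0.5 and 0.2) are assumptions for illustration only.

```python
def odds(p):
    """Odds of success: probability of success over probability of failure."""
    return p / (1 - p)

def odds_ratio(p_first, p_second):
    """Odds of success in the first group divided by the odds in the second."""
    return odds(p_first) / odds(p_second)

# Hypothetical probabilities of success in two groups
print(odds(0.5))                        # 1.0 (success and failure equally likely)
print(round(odds_ratio(0.5, 0.2), 2))   # OR > 1: higher odds in the first group
```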

40
Q

While relative risk and odds ratio allow us to estimate the magnitude of association, they do NOT directly test whether it can be caused by chance alone. What do we use to test this?

A

The X^2 contingency test.

41
Q

What is a X^2 contingency test?

A

The most commonly used test of association between two categorical variables.

42
Q

How do you calculate expected frequencies under a true null hypothesis?

A

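Use the multiplication rule: if the two variables are independent, the probability of falling in a given row AND a given column is the product of the row probability and the column probability.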

43
Q

How do you calculate X^2?

A

X^2 = the sum over all categories of (observed - expected)^2/expected
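
As a sketch, the statistic is the sum over categories of the squared difference between observed and expected counts, divided by the expected count. The function name and the counts below are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """X^2 = sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for two categories with equal expected frequencies
observed = [30, 20]
expected = [25, 25]
x2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # goodness-of-fit: (number of categories) - 1
print(x2, df)  # 2.0 1
```

The squaring is what keeps positive and negative discrepancies from cancelling out.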

44
Q

How do you state conclusions?

A

The nationality of the wine sold depended on what music was played at the store (X^2 test of independence; X^2 = 20.0, df = 1, P-value < 0.001).
OR:
The nationality of the wine sold was associated with the type of music played at the store (X^2 contingency test; X^2 = 20.0, df = 1, P-value < 0.001).
OR:
The nationality of the wine sold was contingent on the type of music played at the store (X^2 contingency test; X^2 = 20.0, df = 1, P-value < 0.001).
In parentheses: test used; value of test statistic; sample size or degrees of freedom; P-value.

45
Q

What test can you use when a contingency table has dimensions of 2 x 2?

A

Fisher’s Exact Test

46
Q

What is Fisher’s Exact Test?

A

Provides the exact P-value from a 2 x 2 contingency table.

47
Q

How do you calculate expectations?

A

Exp[row i, column j] = (row i total)(column j total)/grand total
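
The expectation formula can be applied to a whole table at once. This is a minimal sketch: the function name and the 2 x 2 table of counts are made up for illustration.

```python
def expected_frequencies(table):
    """Expected count for each cell of a contingency table:
    (row total)(column total) / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

# Hypothetical 2 x 2 table of observed counts
observed = [[30, 10],
            [20, 40]]
print(expected_frequencies(observed))  # [[20.0, 20.0], [30.0, 30.0]]
```

The expected counts keep the same row and column totals as the observed table.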

48
Q

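What decides if you use Fisher’s exact test over an X^2 contingency test?

A

Use Fisher’s exact test if too many expected frequencies are below 5, so that the assumptions of the X^2 contingency test are not met.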

49
Q

What is the normal distribution?

A

The normal distribution is a continuous probability distribution describing a bell-shaped curve. It is a good approximation to the frequency distributions of many biological variables (it is very common in nature).

50
Q

What is a probability distribution?

A

A probability distribution is the distribution of the variable in the entire population.

51
Q

What is a normal distribution fully described by?

A

Its mean (𝜇) and standard deviation (𝜎).

52
Q

What is the common rule for the mean, median, and mode in a normal distribution?

A

They are all the same.
- Side note: a normal distribution is symmetric around its mean.

53
Q

What fraction of random draws is within one standard deviation of the mean?

A

About 2/3 of random draws.

54
Q

What is the percentage of random draws that are within two standard deviations of the mean?

A

About 95% of random draws.
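
Both rules of thumb (about 2/3 within one standard deviation, about 95% within two) can be checked against the standard normal curve. A small sketch using Python's standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the curve between -k and +k standard deviations
within_one = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(round(within_one, 3))  # 0.683, roughly 2/3
print(round(within_two, 3))  # 0.954, roughly 95%
```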

55
Q

Importantly, the normal distribution can also be used to approximate the sampling distribution of estimates, especially sample means.

A
56
Q

What does continuous probability distribution have to do with probability density?

A

The probability that a value falls within a given range is the area under the probability density curve over that range (the integral of the probability density).

57
Q

What is the standard normal distribution?

A

A normal distribution with a mean of zero and a standard deviation of one.

58
Q

What does a standard normal table give us?

A

The probability of getting a random draw from a standard normal distribution greater than a given value.

59
Q

What about other normal distributions?

A
  • All normal distributions are shaped alike, just with different means and standard deviations.
  • Any normal distribution can be converted to a standard normal distribution, by: Z = (Y - 𝜇)/𝜎
60
Q

Fun fact:

A

Z also tells us how many standard deviations Y is from the mean.

61
Q

How are the probabilities of Y and Z connected through the standard normal distribution?

A

The probability of getting a value greater than Y from its normal distribution is the same as the probability of getting a value greater than the corresponding Z from the standard normal distribution.
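
A sketch of the conversion, where Z is Y minus the mean, divided by the standard deviation. The mean (100), standard deviation (15) and value of Y (130) are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

def z_score(y, mu, sigma):
    """Number of standard deviations that y lies from the mean."""
    return (y - mu) / sigma

# Hypothetical normal population and observation
mu, sigma, y = 100.0, 15.0, 130.0
z = z_score(y, mu, sigma)

p_greater_than_y = 1 - NormalDist(mu, sigma).cdf(y)  # from Y's own distribution
p_greater_than_z = 1 - NormalDist().cdf(z)           # from the standard normal

print(z)                                                      # 2.0
print(round(p_greater_than_y, 4), round(p_greater_than_z, 4))
```

The two probabilities agree, which is why a single standard normal table serves every normal distribution.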

62
Q

How are normal distribution and sample mean related?

A

If a variable Y has a normal distribution in a population, then the distribution of sample means (Y with a line) is also normal.

63
Q

What is sampling distribution?

A

The probability distribution of all values for an estimate that we might obtain when we sample a population.

64
Q

What does σ subscript Y with a line mean?

A

Standard error of the mean.

65
Q

Recall:

A

Larger samples yield smaller standard errors.
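
The standard error of the mean is the population standard deviation divided by the square root of the sample size. A short sketch with a made-up standard deviation of 10 shows how it shrinks as n grows:

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

# Hypothetical population standard deviation of 10
print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0
```

Quadrupling the sample size only halves the standard error.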

66
Q

What is the central limit theorem?

A

The sum or mean of a large number of measurements randomly sampled from a non-normal population is approximately normally distributed.
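
A small simulation can illustrate the theorem. This is only a sketch under stated assumptions: the population is taken to be exponential (strongly right-skewed) with mean 1, and the sample size (50), number of samples (2000) and seed are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n random draws from a right-skewed (exponential) population."""
    return mean(random.expovariate(1.0) for _ in range(n))

# 2000 sample means, each from a hypothetical sample of size 50
means = [sample_mean(50) for _ in range(2000)]
center, spread = mean(means), stdev(means)

# For a normal distribution, about 2/3 of values lie within one SD of the mean
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(round(center, 2), round(spread, 2), round(within_one_sd, 2))
```

Even though single draws are far from normal, the sample means should cluster near 1 with a spread close to 1/sqrt(50), and roughly 2/3 of them should fall within one standard deviation of their center.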

67
Q

What can the standard normal distribution be used to calculate?

A

The probability of drawing a sample with a mean in a given range.
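
As a sketch of that calculation: treat the sample mean as normally distributed around the population mean, with the standard error as its standard deviation. The function name and all numbers (population mean 100, standard deviation 15, sample size 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_in_range(low, high, mu, sigma, n):
    """Probability that the mean of a sample of size n falls between
    low and high, assuming sample means are normally distributed."""
    standard_error = sigma / sqrt(n)
    sampling_distribution = NormalDist(mu, standard_error)
    return sampling_distribution.cdf(high) - sampling_distribution.cdf(low)

# Hypothetical population: mean 100, SD 15, samples of size 25 (SE = 3)
print(round(prob_mean_in_range(97, 103, 100, 15, 25), 3))  # 0.683
```

The range 97 to 103 is one standard error either side of the mean, so the result matches the "about 2/3" rule.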