Stats 216 Flashcards

1
Q

What is “Bootstrapping”?

A

Bootstrapping is a method for answering the question:
“How can we estimate sampling variability if we only have one sample?”

Suppose we are answering the question “What proportion of breakups occur on Monday?” but we only have one sample of 50 breakups as reported on Facebook. In this sample, 26% of the breakups (13 out of 50) happened on a Monday. Because this is just one sample of 50 breakups, the sample value is very unlikely to exactly equal the proportion for all breakups. The true proportion is probably close to 26%, but we don’t know how close. The bootstrap method lets us estimate how much the sample proportion would vary from sample to sample.

We treat our sample (13 Monday breakups out of 50) as a stand-in for the population and select breakups from it at random, one at a time, to create a new sample of 50 breakups. After each selection we put the breakup back before drawing again (sampling with replacement), so the same breakup can be chosen more than once. Repeating this resampling lets us create as many new samples as we want, and the variation in the proportion across those resamples estimates the sampling variability.
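
The resampling described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own tool: the function name `bootstrap_proportions`, the choice of 1,000 resamples, and the fixed seed are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_proportions(sample, n_resamples=1000, seed=0):
    """Resample `sample` with replacement; return the proportion of 1s in each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    proportions = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)  # sampling with replacement
        proportions.append(sum(resample) / n)
    return proportions

# Observed sample: 13 Monday breakups (coded 1) and 37 other breakups (coded 0).
observed = [1] * 13 + [0] * 37
boot = bootstrap_proportions(observed)

print("observed proportion:", sum(observed) / len(observed))
print("bootstrap SD of the proportion:", round(statistics.stdev(boot), 3))
```

The spread (standard deviation) of the resampled proportions is the estimate of sampling variability.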

https://umontana.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=29c86e39-c1f7-4fde-bb50-adbb014fd399
https://www.youtube.com/watch?v=4ZLHFSzCmhg

2
Q

What are “marginal” and “conditional” distributions?

A

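Marginal and conditional distributions can both be found in the same two-way table.

Marginal distributions are the totals for the probabilities. They are found in the margins (that’s why they are called “marginal”). Conditional distributions are found inside the table: take the probabilities in a single row or column and divide each by that row’s or column’s total.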

The following table shows probabilities for rolling two dice. The total probabilities in the margins are the marginal distributions.
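
As a rough sketch of how both kinds of distribution come out of one table, the Python below builds the joint probabilities for two fair dice. The conditional calculation (cell divided by row total) is the standard definition rather than something stated on this card, and the variable names are illustrative.

```python
from fractions import Fraction

faces = range(1, 7)

# Joint distribution for two fair dice: every (first, second) pair has probability 1/36.
joint = {(a, b): Fraction(1, 36) for a in faces for b in faces}

# Marginal distribution of the first die: add up each row of the table.
marginal_first = {a: sum(joint[(a, b)] for b in faces) for a in faces}

# Conditional distribution of the second die, given that the first die shows 3:
# divide each cell in that row by the row total.
given = 3
conditional_second = {b: joint[(given, b)] / marginal_first[given] for b in faces}

print(marginal_first[3])      # 1/6
print(conditional_second[5])  # 1/6
```

Because the two dice are independent, the conditional distribution here matches the marginal one. In a table of dependent variables the two would differ.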

3
Q

What are the 6 principles regarding p-values from the American Statistical Association?

A

Principle 1: P-values can indicate how incompatible the data are with a specified statistical model.

Principle 2: P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.

Principle 3: Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.

Principle 4: Proper inference requires full reporting and transparency.

Principle 5: A p-value does not measure the size of an effect or the importance of a result.

Principle 6: By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

4
Q

Write a null hypothesis for “Are teens better at math than adults?”

A

Being a teenager or adult has no effect on mathematical ability.

5
Q

Write a null hypothesis for “Does taking aspirin every day reduce the chance of having a heart attack?”

A

Taking aspirin daily does not affect heart attack risk.

6
Q

Write a null hypothesis for “Do teens use cell phones to access the internet more than adults?”

A

Being a teenager or an adult has no effect on the use of cell phones to access the internet.

7
Q

Write a null hypothesis for “Does the color of cat food affect a cat’s choice of which food to eat?”

A

Cats express no food preference based on color.

8
Q

Write a null hypothesis for “Does chewing willow bark relieve pain?”

A

Chewing willow bark has no effect on pain relief.

There is no difference in pain relief after chewing willow bark versus taking a placebo.

9
Q

How do you know if you have run enough simulations using a Monte Carlo simulation?

A

You have run enough simulations when the confidence interval around the output of interest is narrow enough for your purpose. The confidence interval is calculated from the number of samples, the standard deviation of the output, and the chosen confidence level (typically 90%, 95%, or 99%).
Running more samples narrows the confidence interval.
Too few samples give inaccurate outputs and graphs (particularly histograms) that look “scruffy”.

Too many samples make the simulation slow, and plotting graphs and exporting or analyzing the data afterwards may take even longer.

If the output of greatest interest is graphical, run enough samples that the plot is stable, i.e. it would not change to any meaningful degree if you ran more.
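
One way to watch the confidence interval narrow is to track its half-width as the number of runs grows. The sketch below is only illustrative: `simulate_once` is a hypothetical stand-in for a real simulation (here, the sum of two dice), and the multiplier 1.96 assumes an approximate 95% interval for a mean.

```python
import math
import random
import statistics

def simulate_once(rng):
    """Hypothetical simulation output: the sum of two fair dice."""
    return rng.randint(1, 6) + rng.randint(1, 6)

def ci_half_width(outputs, z=1.96):
    """Approximate 95% confidence-interval half-width for the mean of the outputs."""
    return z * statistics.stdev(outputs) / math.sqrt(len(outputs))

rng = random.Random(1)  # fixed seed so the sketch is repeatable
outputs = []
half_widths = {}
for target in (100, 1_000, 10_000):
    # Keep adding runs until we reach the next target sample count.
    while len(outputs) < target:
        outputs.append(simulate_once(rng))
    half_widths[target] = ci_half_width(outputs)
    print(target, round(statistics.mean(outputs), 3), "+/-", round(half_widths[target], 3))
```

Since the half-width shrinks in proportion to one over the square root of the number of runs, each tenfold increase in runs narrows the interval by only about a factor of three.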

10
Q

What is a “Bernoulli Trial”?

A
11
Q

What is a “Binomial Random Variable”?

A
12
Q

What is a “parameter”? What is a “statistic”?

A

A “parameter” is a characteristic of a population.

A “statistic” is any measurement we make from a sample.

13
Q

What is a “population”? What is a “sample”?

A
14
Q

Why and when do we use bootstrapping?

A

We use bootstrapping to estimate sampling variability. We need to do this when we only have one sample.

For example, suppose we had only one sample of 50 breakups reported on Facebook and wanted to generalize to all of the breakups reported on Facebook. We could use bootstrapping to determine the variability (a measure of reliability) of our estimate based on that single sample of 50 breakups.

15
Q

Suppose you made an estimate about a population parameter using a statistic from a sample of the population. How would you quantify the reliability of your statistic?

A
16
Q

What does a margin of error quantify?

A

A margin of error quantifies the uncertainty in the estimate. It tells us the amount of “give or take” around the sample estimate that is reasonable.

Margin of Error = 2 × SD of the sampling distribution
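
As a worked example of this formula, the sketch below uses the breakup sample from the bootstrapping card (13 out of 50). It assumes the SD of the sampling distribution can be approximated with the textbook formula for a proportion. In this course that SD would more likely come from a bootstrap distribution, so treat the formula as a stand-in.

```python
import math

p_hat = 13 / 50  # sample proportion from the breakup example
n = 50           # sample size

# Assumption: approximate the SD of the sampling distribution with the
# standard formula for a proportion, sqrt(p * (1 - p) / n).
sd_sampling = math.sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = 2 * sd_sampling

print(round(sd_sampling, 3))      # 0.062
print(round(margin_of_error, 3))  # 0.124
```

So the estimate would be reported as 26%, give or take roughly 12 percentage points.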

17
Q

What is a “compatibility interval”?

A

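A compatibility interval is a range of plausible values for the population parameter: the values that are reasonably compatible with the sample data. When the margin of error is two SDs, we can be roughly 95% confident that the interval captures the population parameter.

Compatibility Interval = Sample Estimate ± Margin of Error

= Sample Estimate ± 2 × SD of the sampling distribution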

Note: While statisticians and polling organizations tend to use two SDs to compute the margin of error, this is a somewhat arbitrary choice. Some researchers choose one or three SDs of the sampling distribution.
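
To see how the choice of one, two, or three SDs changes the interval, here is a small sketch. The function name `compatibility_interval` is illustrative, and the SD value of 0.062 is an assumed figure for the breakup example (for instance, the SD of a bootstrap distribution), not a result taken from the course.

```python
def compatibility_interval(estimate, sd_sampling, multiplier=2):
    """Return (lower, upper) = estimate -/+ multiplier * SD of the sampling distribution."""
    margin = multiplier * sd_sampling
    return estimate - margin, estimate + margin

estimate = 0.26      # sample proportion from the breakup example
sd_sampling = 0.062  # assumed SD of the sampling distribution

for k in (1, 2, 3):
    lower, upper = compatibility_interval(estimate, sd_sampling, k)
    print(f"{k} SD: {lower:.3f} to {upper:.3f}")
```

A larger multiplier gives a wider interval, which is more likely to capture the parameter but less precise.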

18
Q

In any statistical estimate, we are concerned with two things:

A

The estimate and the uncertainty.

19
Q

To evaluate the external validity evidence, we need to consider…

A

… representativeness and uncertainty.

Representativeness: Is the sample representative of the population?

Uncertainty: Did the researchers account for uncertainty in the estimate?

In the case of the average global temperatures lesson:

Representativeness: Yes. This study used a random sample of points on the Earth. This is an unbiased sampling method, which means that the sample is a representative sample.

Uncertainty: Yes. In this case the uncertainty comes from sampling variability. We accounted for this uncertainty with the margin of error and compatibility interval.

Notes:

When evaluating external validity, make sure that your response attends to both representativeness and uncertainty.

For representativeness, you should focus on whether the sampling method is biased. This study uses random sampling, which is an unbiased method.

For uncertainty, you should consider sampling variability. In this case we estimated sampling variability using the bootstrap model and we accounted for the uncertainty from sampling variability with our margin of error and compatibility interval.

Also, please be sure that your evaluation of external validity does not attend to extraneous factors other than representativeness and uncertainty.

20
Q

Write Null Hypothesis

Are teens better at math than adults?

A

Age has no effect on mathematical ability.

21
Q

Write Null Hypothesis

Does taking aspirin every day reduce the chance of having a heart attack?

A

Taking aspirin daily does not affect heart attack risk.

22
Q

Null Hypothesis Practice

Do teens use cellphones more to access the internet than adults?

A

Being a teenager or an adult does not affect how often cellphones are used to access the internet.

23
Q

To attribute a causal relationship, there are three criteria a researcher needs to establish:

A
  • Association of the Cause and Effect: There needs to be an association between the cause and effect. (We do this via Hypothesis Testing)
  • Timing: The cause needs to happen BEFORE the effect.
  • No Plausible Alternative Explanations: ALL other possible explanations for the effect need to be ruled out. Random assignment removes any systematic differences between the groups (other than the treatment), and thus helps to rule out plausible alternative explanations.
24
Q

What are “sampling” and “assignment”?

A

Sampling refers to how participants were selected from the population.

Assignment refers to how the selected participants (participants in the sample) are assigned to comparison groups. For example, treatment group or control group.
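
The difference between the two steps can be sketched in Python. The population of 10,000 ID numbers, the sample of 100, and the 50/50 split are illustrative values chosen for this example, and the variable names are hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ID numbers for a community of 10,000 people.
population = list(range(10_000))

# Random sampling: every person has the same chance of being selected.
sample = rng.sample(population, 100)

# Random assignment: shuffle the sampled people, then split them into two groups.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:50], shuffled[50:]

print(len(sample), len(treatment), len(control))
```

Sampling decides who is in the study. Assignment decides which group each sampled person ends up in.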

25
Q

What is meant by “validity”?

A

Validity is the degree to which inferences and conclusions are meaningful and accurate.

26
Q

Rate the internal and external validity of the following.

Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 100 individuals to be in the study by using a computer to randomly select 100 names from a database. Once they have the 100 individuals, they once again use a computer to randomly assign 50 of the individuals to a control group (e.g. stick with their standard diet) and 50 individuals to a treatment group (e.g. follow the new diet). They record the total weight loss of each individual after one month.

A

The researchers used random selection to obtain their sample and random assignment when putting individuals in either a treatment or control group. By doing so, they’re able to generalize the findings from the study to the overall population and they’re able to attribute any differences in average weight loss between the two groups to the new diet.

27
Q

Rate the internal and external validity of the following.

Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 100 individuals to be in the study by using a computer to randomly select 100 names from a database. However, they decide to assign individuals to groups based solely on gender. Females are assigned to the control group and males are assigned to the treatment group. They record the total weight loss of each individual after one month.

A

The researchers used random selection to obtain their sample, but they did not use random assignment when putting individuals in either a treatment or control group. Instead, they used a specific factor – gender – to decide which group to assign individuals to. By doing this, they’re able to generalize the findings from the study to the overall population but they are not able to attribute any differences in average weight loss between the two groups to the new diet. The internal validity of the study has been compromised because the difference in weight loss could actually just be due to gender, rather than the new diet.

28
Q

Rate the internal and external validity of the following.

Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 100 male athletes to be in the study. Then, they use a computer program to randomly assign 50 of the male athletes to a control group and 50 to the treatment group. They record the total weight loss of each individual after one month.

A

The researchers did not use random selection to obtain their sample since they specifically chose 100 male athletes. Because of this, their sample is not representative of the overall population so their external validity is compromised – they will not be able to generalize the findings from the study to the overall population. However, they did use random assignment, which means they can attribute any difference in weight loss to the new diet.

29
Q

Rate the internal and external validity of the following.

Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 50 male athletes and 50 female athletes to be in the study. Then, they assign all of the female athletes to the control group and all of the male athletes to the treatment group. They record the total weight loss of each individual after one month.

A

The researchers did not use random selection to obtain their sample since they specifically chose 100 athletes. Because of this, their sample is not representative of the overall population so their external validity is compromised – they will not be able to generalize the findings from the study to the overall population. Also, they split individuals into groups based on gender rather than using random assignment, which means their internal validity is also compromised – differences in weight loss might be due to gender rather than the diet.

30
Q

Population

A

A (target) population is the complete set of individuals in which we are interested.

31
Q

Sampling Frame

A

A sampling frame is a list, map, or other specification of individuals in the population.

32
Q

Sample

A

A sample is a subset of a population.

33
Q

parameter

A

A parameter is a numerical fact about a population.

34
Q

statistic

A

A statistic is a numerical fact about a sample.

35
Q

Bias

A

Bias is a systematic error in the sampling, measurement, or estimation procedures that results in a statistic being consistently larger or consistently smaller than the parameter it estimates.

36
Q

Selection bias

A

Selection bias occurs when the target population does not coincide with the sampled population.

37
Q

Convenience samples

A

Convenience samples are made up of sampling units that are easy to find.

38
Q

Judgment samples

A

Judgment samples are chosen because someone judges that “this group reflects the population.”

39
Q

Self-selected samples

A

Self-selected samples are made up of volunteers.

40
Q

Undercoverage

A

Undercoverage occurs when the sampling frame fails to include some members of the target population.

41
Q

Overcoverage

A

Overcoverage occurs when units not in the target population are included in the sample.

42
Q

Nonresponse Bias

A

Nonresponse bias – the failure to obtain responses from some members of the chosen sample, those members being “different” in important ways from the members who do respond.

43
Q

Response Bias

A

Response bias occurs when a response to a survey tends to systematically differ from the true value.

44
Q

Deciding on Hypothesis Test

I want to test whether a sample mean (of a normally distributed interval variable) significantly differs from a hypothesized value.

A

A one sample t-test allows us to test whether a sample mean (of a normally distributed interval variable) significantly differs from a hypothesized value.
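
The test statistic behind a one-sample t-test can be computed by hand, as in the rough sketch below. The data are made up for illustration and the function name `one_sample_t` is hypothetical. Turning the statistic into a p-value needs the t distribution, which a statistics library (for example SciPy's `scipy.stats.ttest_1samp`) would normally supply.

```python
import math
import statistics

def one_sample_t(data, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    t = (statistics.mean(data) - hypothesized_mean) / standard_error
    return t, n - 1

# Made-up measurements, for illustration only.
scores = [52, 48, 55, 51, 49, 53, 50, 54]
t, df = one_sample_t(scores, hypothesized_mean=50)
print(round(t, 2), df)
```

The statistic measures how many standard errors the sample mean lies from the hypothesized value.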

45
Q

Deciding on Hypothesis Test

I want to test whether a sample median differs significantly from a hypothesized value.

A

A one sample median test allows us to test whether a sample median differs significantly from a hypothesized value.

46
Q

Deciding on Hypothesis Test

I want to test whether the proportion of successes on a two-level categorical dependent variable significantly differs from a hypothesized value.

A

A one sample binomial test allows us to test whether the proportion of successes on a two-level categorical dependent variable significantly differs from a hypothesized value.
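
An exact binomial test can be sketched with only the standard library. The example reuses the breakup sample (13 Mondays out of 50) and assumes, purely for illustration, a hypothesized proportion of 1/7 (breakups equally likely on every day). The two-sided rule used here, adding up every outcome that is no more likely than the observed one, is one common convention and not the only one.

```python
from math import comb

def binomial_test_two_sided(successes, n, p0):
    """Exact two-sided p-value: total probability of outcomes no more likely than the observed one."""
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    return sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))

# 13 Monday breakups out of 50, tested against the assumed value 1/7.
p_value = binomial_test_two_sided(13, 50, 1 / 7)
print(round(p_value, 4))
```

A small p-value indicates that the observed count would be unusual if the hypothesized proportion were true.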

47
Q

Deciding on Hypothesis Test

I want to test whether the observed proportions for a categorical variable differ from hypothesized proportions.

A

A chi-square goodness of fit test allows us to test whether the observed proportions for a categorical variable differ from hypothesized proportions.
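
The chi-square goodness-of-fit statistic compares observed counts with the counts expected under the hypothesized proportions. The sketch below uses made-up counts from 60 rolls of a die, with a hypothetical function name. It computes only the statistic and degrees of freedom, because the p-value needs the chi-square distribution from a statistics library (for example SciPy's `scipy.stats.chisquare`).

```python
def chi_square_statistic(observed_counts, expected_proportions):
    """Sum of (observed - expected)^2 / expected over all categories."""
    total = sum(observed_counts)
    expected = [total * p for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))

# Made-up counts for faces 1 to 6 over 60 rolls; a fair die predicts 10 of each.
counts = [8, 12, 9, 11, 10, 10]
stat = chi_square_statistic(counts, [1 / 6] * 6)
df = len(counts) - 1  # degrees of freedom = number of categories - 1

print(round(stat, 3), df)
```

A statistic near zero means the observed proportions sit close to the hypothesized ones.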