NHST & Sampling Flashcards

1
Q

Define sampling error.

A

The difference between the population value of interest and the sample value. Can be any quality or property of the data, like variance or mean; occurs bc the sample only represents an estimation of the actual population data.

2
Q

Define sampling distributions

A

The distribution of a sample statistic (e.g., a mean) when sampled under known sampling conditions from a known population. Effectively the same statistic computed over and over across many repeated samples, w/ the results collected into one distribution.

3
Q

Define the null hypothesis

A

Any difference b/w sample and population statistic is due to sampling error. The sample and population both represent the same quantity; there is no real difference.

4
Q

Define the alternative hypothesis

A

Any difference b/w sample and population statistic is probably not the result of sampling error. The sample and population do not represent the same quantity.

5
Q

Define NHST

A

Significance tests are a broad set of quantitative techniques for evaluating the probability of observing the data under the assumption that the null hypothesis is true. Lets us decide whether the data are unlikely enough under the null hypothesis to reject it; it does not tell us how probable either hypothesis is.

6
Q

Define statistical power

A

The probability of rejecting the null hypothesis when it is false, or correctly rejecting the null hypothesis.

7
Q

Define a p-value

A

The probability of observing a result at least as extreme as the one in the sample if sampling error alone were at work (i.e., if the null hypothesis were true).

8
Q

Define alpha value

A

The probability of rejecting the null hypothesis when it is true; called a significance level. Effectively the cutoff percentage for the risk of erroneously rejecting the null.

9
Q

Explain why the mean is an unbiased statistic and the variance is a biased statistic

A
  1. The mean is an unbiased statistic bc the expected value (long-run average) of the sample mean is equal to the mean of the population. Any sample mean that differs from the population mean is equally likely to be too high or too low.
  2. The variance is a biased statistic bc the expected sample variance, when calculated by dividing by N, is smaller than the population variance. On average it underestimates the population value, which is why inferential statistics divide by N - 1 instead.
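
A minimal simulation sketch of the two claims above, assuming a normally distributed population. The population mean, SD, sample size, and number of trials are hypothetical values chosen for illustration, not from the card.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population and sampling conditions (assumptions, not from the card)
POP_MEAN, POP_SD, N, TRIALS = 50.0, 10.0, 5, 20000

means, var_n, var_n_minus_1 = [], [], []
for _ in range(TRIALS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    means.append(statistics.fmean(sample))
    var_n.append(statistics.pvariance(sample))         # divides by N
    var_n_minus_1.append(statistics.variance(sample))  # divides by N - 1

avg_mean = statistics.fmean(means)
avg_var_n = statistics.fmean(var_n)
avg_var_n_minus_1 = statistics.fmean(var_n_minus_1)

print(f"population mean {POP_MEAN}, average sample mean {avg_mean:.2f}")
print(f"population variance {POP_SD ** 2}, average variance dividing by N {avg_var_n:.1f}")
print(f"average variance dividing by N - 1 {avg_var_n_minus_1:.1f}")
```

Across many samples the average sample mean should land close to the population mean, the divide-by-N variance should come out noticeably too small, and the divide-by-(N - 1) variance should land close to the population variance.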
10
Q

Explain why sampling error occurs

A

Sampling error occurs bc the sample only represents an estimation of the actual population data. The sample could have different properties from the population, or misrepresent it.

11
Q

Explain what two problems sampling error causes in psychological research

A

1.) Our sample values might not be equal to the population values.

2.) Because of this obfuscation, we can run into a number of difficulties testing scientific hypotheses.

12
Q

Explain what the sampling distribution for the t-test is based on

A

T-test distribution: the sampling distribution is based on drawing repeated random samples of a given size from a population w/ an assumed mean. Each sample mean is compared to the assumed population mean and divided by the standard error estimated from the sample; the resulting t distribution (whose shape depends on degrees of freedom) shows how far sample means are expected to fall from the population mean through sampling error alone.

13
Q

Explain what the sampling distribution for the ANOVA is based on

A

ANOVA distribution: the sampling distribution is based on the ratio of the population variance as estimated between groups vs. within groups.

14
Q

Explain what the standard error of the mean is conceptually. What does the formula tell you about the relationship between sample size and sampling error?

A

Conceptually, the SEM is the standard deviation of the sampling distribution of the mean. The equation for SEM (the standard deviation divided by the square root of N) tells you that sampling error decreases as sample size increases.
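
A short sketch of how the SEM shrinks as N grows, using the standard formula SEM = SD / sqrt(N). The SD of 10 and the sample sizes are hypothetical values for illustration.

```python
import math

def sem(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

SD = 10.0  # hypothetical population standard deviation
for n in (4, 25, 100, 400):
    print(f"N = {n:>3}: SEM = {sem(SD, n):.2f}")
```

Because N sits under a square root, the sample size has to be quadrupled to cut the SEM in half.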

15
Q

Explain the basic logic of NHST.

A

If we make certain assumptions about the population (e.g., mu = 3) and the sampling process (e.g., random sampling, N = 25), we can determine:

a.) the expected sample mean.
b.) the expected difference between an observed sample mean and the population mean when a sampling error is made.

This means that we are evaluating a mean difference relative to how much we would expect means to differ on average (SEM); in a Z-test, the test statistic is exactly that ratio.
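
The logic above can be sketched as a Z-test. mu = 3 and N = 25 come from the card; the population SD and the observed sample mean are hypothetical values added only so the arithmetic can run.

```python
import math
from statistics import NormalDist

MU, N = 3.0, 25      # assumptions stated on the card
SIGMA = 1.0          # hypothetical population SD (not from the card)
SAMPLE_MEAN = 3.5    # hypothetical observed sample mean (not from the card)

sem = SIGMA / math.sqrt(N)      # how much means are expected to differ on average
z = (SAMPLE_MEAN - MU) / sem    # observed mean difference relative to the SEM

# Two-tailed p-value: chance of a difference at least this large if the null is true
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"SEM = {sem:.2f}, z = {z:.2f}, two-tailed p = {p_two_tailed:.4f}")
```

With these made-up numbers the observed mean sits 2.5 SEMs from mu, which would be an unusual result if only sampling error were at work.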

16
Q

Explain when to reject/fail to reject the null.

A

Reject null = probability of observing the difference (p) < .05 (i.e., below 5%, the conventional α)

Fail to reject null = probability of observing the difference (p) ≥ .05

17
Q

Explain what the role of the p-value is in NHST

A

Serves as a probability value that denotes whether or not a result would be considered statistically significant by convention. It is compared to the chosen significance level (α): if p is below α, the result is called significant.

18
Q

Explain two issues to consider when you use a sample to draw conclusions about a population

A

1. The sample is only an estimate of the population, and therefore does not necessarily have the same properties as the population. Capturing the population value is our actual goal, so this presents some conceptual problems.

2. Depending on how we select our sample, our results can be due to biases in the sample, having an unusual sample, or to chance alone. This is why findings need to be replicated in scientific studies.

19
Q

Explain the difference between the directional and non-directional hypotheses

A

Non-directional - Ha: μ1 ≠ μ2

“There is a difference b/w the groups, but we do not predict whether our sample mean will be greater or less than the population.”

Directional - Ha: μ1 > μ2 (or μ1 < μ2)

“There is a difference b/w the groups, and we predict in advance whether our sample mean will be greater than (or less than) the population.”

20
Q

Explain use of one or two-tailed tests when direction is or is not specified

A

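The cutoff (critical value) for the test statistic depends on whether the hypothesis is directional. If it is non-directional, then we use a two-tailed test, which splits α across both tails. If it is directional, we use a one-tailed test, which puts all of α in the predicted tail.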
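
A small sketch of how the cutoff changes w/ the number of tails, assuming a Z-test (standard normal distribution) and α = .05. For a t-test the cutoffs would also depend on degrees of freedom, which this sketch does not cover.

```python
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Two-tailed: alpha is split, so each tail holds alpha / 2
two_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA / 2)

# One-tailed: all of alpha sits in the predicted tail
one_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA)

print(f"two-tailed cutoff: +/-{two_tailed_cutoff:.2f}")
print(f"one-tailed cutoff: {one_tailed_cutoff:.2f}")
```

The one-tailed cutoff is smaller, so a directional test reaches significance more easily, but only when the effect falls in the predicted direction.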

21
Q

Explain the conceptual definition of within group and between group variances (MSwithin and MSbetween)

A

a.) Within group variance/MSwithin: each of the three sample variances (each calculated w/ N - 1) is an estimate of the population variance. We estimate the population variance separately within each sample or condition, then average the three estimates.

b.) Between group variance/MSbetween: we use the sample means in each condition to create a sampling distribution representing only those samples. We can then calculate the variance of these means to estimate the variance of the sampling distribution of the means; multiplying that by the number of scores per group turns it into a second estimate of the population variance.

22
Q

Explain what is the F-ratio and why do we “want” to get a high value for this ratio?

A

F-ratio: the ratio of the population variance as estimated between groups vs. within groups.

We want to get a high value bc it suggests that we are sampling from groups that have different population means: the means vary more than sampling error alone would predict. The further the value is above 1, the stronger the evidence that the group means differ.
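
A minimal sketch of the F-ratio arithmetic for a one-way design, assuming equal group sizes (so the within-group variances can simply be averaged). The three groups of scores are hypothetical.

```python
import statistics

# Hypothetical scores for three conditions w/ equal n (not real data)
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
n = len(groups[0])  # scores per group

# MSwithin: average of the sample variances (each uses N - 1)
ms_within = statistics.fmean([statistics.variance(g) for g in groups])

# MSbetween: variance of the group means, scaled up by n
group_means = [statistics.fmean(g) for g in groups]
ms_between = n * statistics.variance(group_means)

f_ratio = ms_between / ms_within
print(f"MSwithin = {ms_within}, MSbetween = {ms_between}, F = {f_ratio}")
```

Here the group means are spread far apart relative to the spread of scores inside each group, so F comes out well above 1. Whether a given F is significant still has to be checked against the F distribution for the relevant degrees of freedom.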

23
Q

If you get an F-ratio of 1, what can you conclude without even looking at an F table?

A

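If you get an F-ratio of 1, you can conclude that the between-group and within-group estimates of the population variance are the same. In that case, the differences among group means are no larger than expected from sampling error, so you fail to reject the null that the means are the same across groups.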

24
Q

What is statistical power and what is its relationship to Type 2 error?

A

1.) Statistical power is defined as the probability of correctly rejecting the null hypothesis when it is false.

2.) Power and Type II error are complements: power = 1 - β, where β = probability of failing to reject H0 when it is false (a Type II error). If there is more statistical power (e.g. larger sample size and larger effect), then we are less likely to miss an effect that really exists.

25
Q

What does power depend on–what increases power?

A

Power is increased by a larger sample size and a larger effect size (e.g. a larger correlation or Cohen’s d).
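
A rough sketch of how sample size and effect size drive power, assuming a one-tailed, one-sample Z-test w/ a standardized effect size d. The function name `z_test_power` and the example values are made up for illustration; other tests need different power formulas.

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-tailed, one-sample Z-test for a standardized effect size d."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # cutoff under the null
    # Under the alternative, the test statistic is centered on d * sqrt(n)
    return 1 - std_normal.cdf(z_crit - effect_size * math.sqrt(n))

# Hypothetical combinations of effect size and sample size
for d, n in [(0.2, 25), (0.5, 25), (0.5, 100)]:
    print(f"d = {d}, N = {n}: power = {z_test_power(d, n):.2f}")
```

Holding everything else constant, power rises when either d or N goes up; w/ no effect at all (d = 0) the rejection rate falls back to α.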

26
Q

Explain what it means to commit a Type 1 and Type 2 error.

A

1. Type I Error: Erroneously rejecting the null hypothesis. Your result is significant (p < .05), so you reject the null hypothesis, but the null hypothesis is actually true.

2. Type II Error: Erroneously failing to reject (retaining) the null hypothesis. Your result is not significant (p > .05), so you don’t reject the null hypothesis, but it is actually false.

27
Q

Explain the concept of factorial ANOVA, main effect, and interaction.

A

Factorial ANOVA - same as a regular (one-way) ANOVA, except it has two or more categorical independent variables, each w/ two or more levels.

  • Main Effect - the effect of one variable, averaging over all levels of the other variables in the experiment. (e.g. the means for each level of one variable are compared after collapsing across the other variable)
  • Interaction Effect - the effect of one of the variables differs depending on the level of the other variable. (e.g. smoking and drinking combined is more dangerous than would be expected from just smoking or just drinking)
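
A descriptive sketch of main effects and an interaction in a 2 x 2 design, following the card's smoking/drinking example. The cell means are invented risk scores and equal cell sizes are assumed; this only shows the arithmetic on the means, not the significance test.

```python
# Hypothetical cell means (risk scores) for a 2 x 2 design; not real data.
cell_means = {
    ("nonsmoker", "nondrinker"): 10.0,
    ("nonsmoker", "drinker"): 20.0,
    ("smoker", "nondrinker"): 20.0,
    ("smoker", "drinker"): 50.0,
}

def marginal_mean(factor_index, level):
    """Average the cell means for one level of a factor, collapsing over the other factor."""
    values = [m for cell, m in cell_means.items() if cell[factor_index] == level]
    return sum(values) / len(values)

# Main effects: difference b/w marginal means
smoking_main_effect = marginal_mean(0, "smoker") - marginal_mean(0, "nonsmoker")
drinking_main_effect = marginal_mean(1, "drinker") - marginal_mean(1, "nondrinker")

# Interaction: does the smoking effect change across levels of drinking?
smoking_effect_in_drinkers = cell_means[("smoker", "drinker")] - cell_means[("nonsmoker", "drinker")]
smoking_effect_in_nondrinkers = cell_means[("smoker", "nondrinker")] - cell_means[("nonsmoker", "nondrinker")]
interaction = smoking_effect_in_drinkers - smoking_effect_in_nondrinkers

print(smoking_main_effect, drinking_main_effect, interaction)
```

An interaction contrast of zero would mean smoking adds the same amount of risk for drinkers and nondrinkers; here it does not, so the made-up data show an interaction on top of both main effects.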
28
Q

Why do we estimate population variance in two separate ways?

A

We estimate it in two ways to calculate the F-ratio. The F-ratio contrasts two mathematically distinct estimates of the population variance: one based on the sampling distribution of the means (between groups) and one based on the variances within each condition. If the null is true, both estimate the same quantity and the ratio should be close to 1; a ratio well above 1 suggests the means differ.

29
Q

What do the two population variances represent?

A

MSbetween: calculate the variance of the group means around the grand mean, treating them as a hypothetical sampling distribution. This gives the variance of the means (scaled by the group size to estimate the population variance).

MSwithin: pool the variances of the samples bc, under the null, they all should theoretically represent the same quantity. This gives the average variance of the scores within each condition.

30
Q

When can we say a result (a difference) is statistically significant? Does that mean that it is also important?

A

When its p-value (aka. probability of a difference at least this large occurring due to sampling error alone) is lower than .05. This does not mean that the result is important or has any actual scientific significance; only that the test statistic exceeds the chosen critical value.

31
Q

Identify when to use each type of test (z-test, t-test, one-way ANOVA, factorial ANOVA, chi-squared) and explain why

A

1.) Z-test: compares the sample mean to the population mean. Must know both population mean and population standard deviation.

2.) T-test: compares sample mean to the population mean. Requires the population mean, but NOT the population standard deviation, which is estimated from the sample.

3.) One-way ANOVA: used when there is one independent variable w/ more than two levels. Estimates variance using MSbetween and MSwithin. Tells that there is a difference, but not which groups differ.

4.) Factorial ANOVA: two or more independent variables w/ two or more levels. Tests main effects and interactions. Tells that there is a difference, but not which groups differ.

5.) Chi-squared: used when there are two or more categorical variables. Tells association, but not what caused it.

32
Q

Identify common misinterpretations of NHST

A
  1. A p-value does not tell us that our findings are relevant, clinically significant or of any scientific value whatsoever.
  2. The null and alternative hypotheses are often constructed to be mutually exclusive. If one is true, the other must be false.
  3. Because NHSTs are often used to make a yes/no decision about whether the null hypothesis is a viable explanation, mistakes (Type I and Type II errors) can be made.
33
Q

Compare a z-test to a t-test

A
  1.) Z-test is a statistical hypothesis test that follows a normal distribution while T-test follows a Student’s t-distribution (a family of distributions whose shape depends on degrees of freedom).
  2.) As a rule of thumb, a T-test is appropriate when you are handling small samples (n < 30) while a Z-test is appropriate when you are handling moderate to large samples (n ≥ 30).
  3.) Z-tests can be used when the population SD is known; t-tests estimate it from the sample bc the population SD is unknown.
34
Q

Compare 3 types of t-tests

A

One Sample - compares the sample mean to a known or assumed population mean. Population SD is unknown.

  • Independent - 2 separate samples are taken from different groups and their means are compared.
  • Dependent - 2 sets of scores are taken from the same group and tested against each other. (e.g. post + pre-test)
35
Q

Compare critical values to p-values

A

A.) Critical Values: a point on the test distribution that is compared to the test statistic to determine whether to reject the null hypothesis. If the absolute value of your test statistic is greater than the critical value, you can declare statistical significance and reject the null hypothesis. Critical values correspond to α.

B.) P-values: the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis. Assesses against a distribution; tells us that, if the null were true, results this extreme would occur roughly X percent of the time, and allows us to reject the null when that probability is below α.
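
A brief sketch showing that the two approaches lead to the same decision, assuming a two-tailed Z-test at α = .05. The function name `decide` and the z values are hypothetical.

```python
from statistics import NormalDist

def decide(z, alpha=0.05):
    """Return (reject by critical value, reject by p-value) for a two-tailed Z-test."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)    # cutoff that corresponds to alpha
    p_value = 2 * (1 - std_normal.cdf(abs(z)))      # probability of a result this extreme
    return abs(z) > critical, p_value < alpha

# Hypothetical test statistics
for z in (0.5, 1.9, 2.1, -2.5):
    by_critical, by_p = decide(z)
    print(f"z = {z:>4}: reject by critical value = {by_critical}, reject by p-value = {by_p}")
```

Comparing the test statistic to the critical value and comparing p to α are two views of the same rule, so the two columns always agree.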

36
Q

Explain why we subtract 1 from N (degree of freedom) when we calculate variance for inferential statistics

A

We want the average of the sample variances for all possible samples to equal the population variance.

The n - 1 correction is used bc the population mean is unknown and has to be replaced by the sample mean; dividing by a smaller number inflates the calculated variance slightly. This is done to make it a more accurate estimate of the population variance, and is why the corrected variance is understood as “unbiased.”

37
Q

What two concepts are related to using n - 1 to estimate the population variance?

A
  1. We divide by N - 1 bc the observations are, on average, closer to the sample mean than to the actual population mean.
  2. The mean is a least squares statistic; the sum of squared deviations around the sample mean is as small as it can possibly be. Dividing by N would therefore underestimate the population variance, and N - 1 corrects for that.
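
A tiny worked sketch of the least squares point, using a hypothetical sample and a hypothetical population mean (which in practice would be unknown).

```python
import statistics

POP_MEAN = 50.0                           # hypothetical; normally unknown
sample = [42.0, 47.0, 51.0, 55.0, 60.0]   # hypothetical scores

def sum_sq(values, center):
    """Sum of squared deviations of the values around a chosen center."""
    return sum((x - center) ** 2 for x in values)

sample_mean = statistics.fmean(sample)
ss_around_sample_mean = sum_sq(sample, sample_mean)
ss_around_pop_mean = sum_sq(sample, POP_MEAN)

print(f"sample mean = {sample_mean}")
print(f"SS around sample mean = {ss_around_sample_mean}")
print(f"SS around population mean = {ss_around_pop_mean}")
```

The sum of squares around the sample mean can never exceed the sum of squares around any other value, so using the sample mean in place of the population mean makes the scores look less spread out than they really are.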
38
Q

What does effect size allow us to do w/ respect to sampling error?

A

The larger the effect that exists between variables, the easier it will be to detect that effect when we conduct our study.

39
Q

What does an estimated effect size help us do?

A

Choose an appropriate number of subjects so we can ensure that our research designs have enough statistical power.