ESTIMATING THE DIFFERENCE BETWEEN THE MEANS OF INDEPENDENT POPULATIONS Flashcards

1
Q

What does “two independent populations” refer to?

A

Two populations (with means μ1 and μ2) whose scores have no association with each other.

2
Q

How can we estimate the difference between the two population means (between groups)?

A

By selecting a simple random sample of size n1 from population 1 and a simple random sample of size n2 from population 2. The two samples, taken separately and independently, are referred to as independent random samples.

3
Q

What is the point estimator of the difference between the means of populations 1 and 2?

A

The point estimator of the difference between the means of populations 1 and 2 is X̄1 − X̄2, the difference between the two sample means.

4
Q

What does this say/mean: E(sample mean 1 - sample mean 2)

A

It refers to the expected value of the difference between the two sample means, which equals the difference between the means of the two independent populations:

(pop mean 1 - pop mean 2)

5
Q

How is the standard deviation of the comparison distribution (distribution of the difference between 2 means) calculated?

A

Check the formula to make sure you have it right, but the important point here is that we calculate a pooled sample variance (because there are two groups, we want one estimate of the variance across those groups).
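As a rough sketch of the pooled-variance idea above, the snippet below weights each sample's variance by its degrees of freedom, then turns the pooled variance into the standard deviation of the comparison distribution. The function names and the two small groups of scores are made up for illustration; check the formula against your course materials.

```python
import math
import statistics

def pooled_variance(sample1, sample2):
    """Weighted average of the two sample variances (weights = degrees of freedom)."""
    df1, df2 = len(sample1) - 1, len(sample2) - 1
    s1_sq = statistics.variance(sample1)  # unbiased sample variance (divides by n - 1)
    s2_sq = statistics.variance(sample2)
    return (df1 * s1_sq + df2 * s2_sq) / (df1 + df2)

def standard_error_of_difference(sample1, sample2):
    """Standard deviation of the distribution of differences between means."""
    sp_sq = pooled_variance(sample1, sample2)
    # Variance of each distribution of means is pooled variance / n; the two are summed.
    return math.sqrt(sp_sq / len(sample1) + sp_sq / len(sample2))

# Hypothetical scores for two independent groups
group1 = [4, 6, 5, 7, 8]
group2 = [1, 3, 5, 3, 2, 4]

sp_sq = pooled_variance(group1, group2)
se_diff = standard_error_of_difference(group1, group2)
print(round(sp_sq, 4), round(se_diff, 4))
```

With equal sample sizes the pooled variance is just the plain average of the two sample variances; the weighting only matters when n1 and n2 differ.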

6
Q

We have two sample means coming from two independent populations. What is the right comparison distribution in that case?

A

A distribution of the difference between two means, which is built from each sample’s distribution of sample means.

7
Q

When using a distribution of the difference between two means, we assume that…

A

Under the null hypothesis, the two independent populations have equal means. So the mean of the comparison distribution is always 0.

8
Q

Do we also assume the standard deviations of the two populations are the same? What do we do instead?

A

Yes! But unlike the mean, the standard deviation of the comparison distribution is not the difference between the two standard deviations. That difference would be 0, which would mean there is no spread in our distribution, just a vertical line.

Instead, we turn to the variance!

9
Q

How is the variance of the distribution of differences between means calculated? What is another word for it?

A

The sum of the variance of the distribution of means for sample 1 and the variance of the distribution of means for sample 2.

Its square root is the standard error for the distribution of the difference between two means.
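One way to convince yourself of the "sum of the two variances" rule is a small simulation. The sketch below draws both samples from the same hypothetical normal population (so the null hypothesis is true by construction) and checks that the differences between sample means center on 0 with a variance close to the sum of the two variances of the distributions of means. The population values, sample sizes, and number of repetitions are all arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50, 10  # hypothetical population parameters
N1, N2 = 25, 25            # hypothetical sample sizes
REPS = 5000                # number of simulated pairs of samples

diffs = []
for _ in range(REPS):
    sample1 = [random.gauss(POP_MEAN, POP_SD) for _ in range(N1)]
    sample2 = [random.gauss(POP_MEAN, POP_SD) for _ in range(N2)]
    diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))

# Variance of each distribution of means is (population variance) / n.
var_means_1 = POP_SD ** 2 / N1
var_means_2 = POP_SD ** 2 / N2
expected_var = var_means_1 + var_means_2  # variance of the comparison distribution

simulated_mean = statistics.fmean(diffs)
simulated_var = statistics.variance(diffs)
print("mean of differences:", round(simulated_mean, 2))
print("variance of differences:", round(simulated_var, 2), "expected:", expected_var)
```

In practice the population variance is unknown, so the pooled sample variance stands in for it.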

10
Q

Remember, Imane: you started to understand this once you realized that even though there are two different populations, they represent one big population (the idea behind a between-groups design). So we assume each has the same variance (under no effect, it is as if they are the same population!).

A

Exactly!

Even though the two groups come from different populations, in an independent-groups design (as in a two-sample t-test) we treat them as two subgroups of one larger population. Under the null hypothesis of no effect, the comparison is made as if both subgroups were samples from the same population, which is why we assume they have the same variance.

Here’s the key idea in simple terms:

If there is no effect, the two groups are two independent random samples from the same big population, so they share one underlying variance.
Pooling the two sample variances gives a single estimate of that shared variance.

11
Q

What is the confidence interval around the difference between the means of two independent pop?

A

All confidence intervals have the same form: point estimate +/- (critical value x standard error).
For this particular one,

(sample mean 1 - sample mean 2) +/- (t critical x standard error of the difference between the two means)
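The interval formula above can be sketched in a few lines. The summary values below are hypothetical, and the critical t is read from a standard t table (2.262 is the two-tailed 95% value for df = 9) rather than computed, so no extra library is needed.

```python
def confidence_interval(mean1, mean2, se_diff, t_crit):
    """Return (lower, upper) limits around the difference between two sample means."""
    diff = mean1 - mean2        # point estimate of pop mean 1 - pop mean 2
    margin = t_crit * se_diff   # critical t times the standard error of the difference
    return diff - margin, diff + margin

# Hypothetical summary statistics: n1 = 5, n2 = 6, so df = 9
lower, upper = confidence_interval(mean1=6.0, mean2=3.0, se_diff=0.9027, t_crit=2.262)
print(round(lower, 3), round(upper, 3))
```

If the interval excludes 0, the two sample means differ by more than sampling error alone would plausibly produce at that confidence level.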

12
Q

How to calculate degrees of freedom?

A

df = n1 + n2 - 2 (because you have two samples now, and each one contributes n - 1 degrees of freedom)

13
Q

As df approaches infinity…

A

The t distribution approaches the normal distribution, so we lean towards the z score table (because with more data we are getting closer to being sure about the population standard deviation).
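To see this convergence in numbers, the sketch below compares a few two-tailed 95% critical t values, copied by hand from a standard t table, with the matching z cutoff computed by Python's standard library. The choice of which df rows to show is arbitrary.

```python
from statistics import NormalDist

# z cutoff that leaves 2.5% in each tail of the standard normal distribution
z_crit = NormalDist().inv_cdf(0.975)

# Two-tailed 95% critical t values, copied from a standard t table
T_CRIT_95 = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

gaps = []
for df, t_crit in sorted(T_CRIT_95.items()):
    gap = t_crit - z_crit  # how much wider the t cutoff is than the z cutoff
    gaps.append(gap)
    print(f"df = {df:>3}: t = {t_crit:.3f}, z = {z_crit:.3f}, gap = {gap:.3f}")
```

The gap shrinks steadily as df grows, which is why large samples give nearly the same cutoffs as the z table.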

14
Q

sample mean 1 - sample mean 2 is the point estimate of…

A

Population mean 1 - population mean 2

15
Q

In a t test for two independent means, what are we assessing in practice? What is another word for a t-test for independent means?

A

The effects of 2 levels of the independent variable (groups) on the dependent variable (observed scores). Another word: between-subjects t-test.

16
Q

What is the goal of a t-test for independent means?

A

The hypothesis testing question is whether the means of these two groups are different enough to permit you to conclude that the 2 populations they represent have different means.
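In practice, "different enough" is judged by dividing the difference between the sample means by the standard error of the difference and comparing the result with a critical t. The sketch below assumes the pooled-variance version of the test; the scores are invented and the cutoff 2.262 is the two-tailed .05 value for df = 9 taken from a t table.

```python
import math
import statistics

def independent_t(sample1, sample2):
    """Return (t, df) for a pooled-variance t-test for independent means."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / df
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    # Under the null hypothesis the comparison distribution has mean 0.
    t = (statistics.fmean(sample1) - statistics.fmean(sample2) - 0) / se_diff
    return t, df

# Hypothetical scores for two independent groups
group1 = [4, 6, 5, 7, 8]
group2 = [1, 3, 5, 3, 2, 4]

t, df = independent_t(group1, group2)
T_CRIT = 2.262  # two-tailed .05 cutoff for df = 9, from a t table
reject_null = abs(t) > T_CRIT
print(round(t, 3), df, reject_null)
```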

17
Q

Is effect size when testing two independent sample means subject to sampling error?

A

Yes! There is a standard error for the effect size, as well as a confidence interval. If the confidence interval around d does not contain 0, the effect is considered significant.
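A minimal sketch of that idea, assuming Cohen's d is computed with the pooled standard deviation. The standard error used here is a common large-sample approximation and the 1.96 multiplier is the normal-curve cutoff; neither comes from these cards, so treat the interval as rough, especially with samples as small as the hypothetical ones below.

```python
import math

def cohens_d(mean1, mean2, pooled_var):
    """Difference between means expressed in pooled standard deviation units."""
    return (mean1 - mean2) / math.sqrt(pooled_var)

def approx_se_of_d(d, n1, n2):
    """Large-sample approximation to the standard error of d (an assumption here)."""
    return math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))

# Hypothetical summary statistics
d = cohens_d(mean1=6.0, mean2=3.0, pooled_var=20 / 9)
se_d = approx_se_of_d(d, n1=5, n2=6)

# Rough 95% interval using the normal cutoff 1.96
lower, upper = d - 1.96 * se_d, d + 1.96 * se_d
significant = not (lower <= 0 <= upper)
print(round(d, 3), round(se_d, 3), significant)
```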

18
Q

What are the 4 underlying assumptions for a t-test for two independent populations?

A
  1. Normality: Population Distributions Should Be Normal
    What it means: The scores in each population should follow a bell-shaped curve (normal distribution). This is important because t-tests assume normality.
    When it’s okay to violate this:
    If the samples are large, the Central Limit Theorem helps us out; the sampling distribution of the mean becomes approximately normal regardless of the population distribution.
    If you’re using a two-tailed test, a violation of normality has less impact.
    When it’s problematic: If the populations are skewed (not symmetric) in opposite directions, the test results can be misleading because the skewness distorts the means differently for each group.
  2. Homogeneity of Variance: Variances Are Equal Between Groups
    What it means: The variability (spread) of scores in the two populations should be roughly the same.
    When it’s okay to violate this:
    If the sample sizes are equal, the test can handle even fairly large differences in variance.
    When it’s problematic: If the variances are very different (e.g., one group is much more spread out than the other) and the sample sizes are very unequal, it can distort the Type I error rate (the chance of incorrectly rejecting the null hypothesis).
    Rule of thumb: A ratio of largest variance to smallest variance > 1.5 might cause problems, especially if group sizes are very different.
  3. Random Sampling
    What it means:
    The data should be collected randomly from the population of interest so that the sample represents the population well.
    This prevents bias in your results.
    Random sampling makes the test results generalizable to the broader population.
  4. Independence (of Groups and Scores)
    What it means:
    Between groups: The groups being compared should be independent (e.g., participants in one group are not related to those in the other group).
    Within a group: Each participant’s score should not be influenced by the scores of others in the same group. For example, one student’s test score should not depend on their friend’s score.

Why These Assumptions Matter
If these assumptions are violated, the accuracy of the t-test results can be affected.
For example, violating normality might make the test less reliable, and violating homogeneity of variance might increase the chance of incorrect conclusions.
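The variance-ratio rule of thumb from the homogeneity assumption is easy to check before running the test. The sketch below uses the 1.5 threshold quoted in the card; the function name and the scores are made up, and it assumes neither sample has zero variance.

```python
import statistics

RATIO_THRESHOLD = 1.5  # rule-of-thumb cutoff quoted in the card above

def variance_ratio(sample1, sample2):
    """Largest sample variance divided by smallest (assumes neither variance is 0)."""
    v1 = statistics.variance(sample1)
    v2 = statistics.variance(sample2)
    return max(v1, v2) / min(v1, v2)

# Hypothetical scores for two independent groups
group1 = [4, 6, 5, 7, 8]
group2 = [1, 3, 5, 3, 2, 4]

ratio = variance_ratio(group1, group2)
possible_problem = ratio > RATIO_THRESHOLD
print(round(ratio, 2), possible_problem)
```

A flagged ratio matters most when the group sizes are also very unequal, as the card notes.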

EXPLANATIONS FOR EACH:
  1. Normality
    What it means: The data should look like a “bell curve” (normal distribution).
    Why it’s important: The math of the t-test assumes this shape so it can figure out probabilities correctly.
    Example: Imagine trying to predict a soccer game’s score, but one team always scores in crazy patterns (like 1, 50, 3). If the numbers aren’t normal, it’s harder to know what to expect.
  2. Equal Variance (Homogeneity of Variance)
    What it means: The spread (variance) of scores should be similar in both groups.
    Why it’s important: If one group has wildly different spread (like one team having all players between 5’5” and 6’0” while the other has players between 4’0” and 7’0”), the comparison might not be fair. The math assumes the groups have the same “bounciness” in their data.
  3. Random Sampling
    What it means: The people (or items) in your study should be picked randomly.
    Why it’s important: If you only choose certain people (like just soccer players from your favorite team), the results might not represent the whole population you care about.
  4. Independence
    What it means:
    The two groups you’re comparing shouldn’t affect each other.
    Within a group, one person’s result shouldn’t affect another’s.
    Why it’s important: If players on one soccer team always copy each other’s moves, their scores aren’t truly “independent.” The math assumes everyone is acting on their own.