Introduction to Inferential Statistics Flashcards
How can we infer something about the population based on what we find in the sample?
Normal distribution
Empirical rule
Central limit theorem
Are t-values and standard errors used to estimate population parameters?
Yes
What are inferential statistics used for?
To draw conclusions and make inferences about population parameters by analyzing data collected in a sample
To infer something about the population parameter using sample statistics
Will parameter estimates based on a sample exactly equal the population parameter?
No, because they vary from sample to sample
So, when reporting a statistic, we also typically report an interval (e.g., 95% confidence interval) which we believe includes the population parameter
What is a confidence interval?
A range of values within which a population parameter (e.g., mean or proportion) is likely to fall, based on a sample from that population
If you repeated a study of the same size many times, 95% of the resulting confidence intervals would cover (include) the true population parameter
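The repeated-study idea can be checked with a small simulation. This is a minimal sketch, not a definitive method: the population mean and SD, the sample size, and the number of studies are made-up values, and the interval uses the normal cutoff (about 1.96) as an approximation to the critical t value for a sample of 50.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 100.0, 15.0   # hypothetical population values
N, STUDIES = 50, 2000            # sample size per study, number of repeated studies
Z_95 = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 (normal approximation)

def ci_covers_mean():
    """Draw one sample, build its 95% CI, and report whether it covers POP_MEAN."""
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / N ** 0.5
    return mean - Z_95 * sem <= POP_MEAN <= mean + Z_95 * sem

coverage = sum(ci_covers_mean() for _ in range(STUDIES)) / STUDIES
print(f"Share of intervals covering the population mean: {coverage:.3f}")
```

The printed share should land close to 0.95, though it will not be exactly 0.95 in any one run.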
Do statistics describe the sample?
Yes
And parameters describe the population
Since samples represent the population, do statistics estimate the parameters?
Yes
What notations are used for samples?
Number of people: n
Mean: X (with a bar on top)
Variance: s^2
Standard deviation: s or SD
What notations are used for the population?
Number of people: N
Mean: mu
Variance: sigma^2
Standard deviation: sigma
What are the primary characteristics of a normal distribution?
Symmetric and unimodal (one peak)
Why is a normal distribution the most important distribution in inferential statistics?
Its characteristics form the foundational assumptions underlying many inferential statistics
Can you apply the empirical rule to any normal distribution?
Yes
What is the empirical rule?
68% of the observations fall within 1 standard deviation of the mean
95.4% of the observations fall within 2 standard deviations of the mean
99.7% of the observations fall within 3 standard deviations of the mean
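The three percentages can be reproduced from the normal curve itself. A minimal sketch using Python's standard library; the helper name share_within is made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal distribution

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

Because z-scores rescale any normal distribution to the standard normal, the same proportions hold whatever the mean and SD are.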
If something has a normal distribution, can you use the central limit theorem?
Yes
What are the three distributions?
Sample distributions (distribution of the sample)
Population distribution (distribution of the population)
Sampling distribution (distribution of a statistic over a set of theoretical samples; distribution of sample means; plotted means of various samples)
Is the sampling distribution of the sample means approximately normal?
Yes
The sampling distribution of the sample means becomes “more normal” as n (the size of each sample) increases
The mean of the sampling distribution of sample means will be the same as the mean of the population
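These properties can be seen in a simulation. The sketch below assumes a deliberately skewed population (exponential, with mean 1) chosen only for illustration. It draws many samples of each size and summarizes the resulting sample means.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def sample_means(n, draws=2000):
    """Means of `draws` samples of size n from an exponential population (mean 1, SD 1)."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(draws)]

def skewness(values):
    """Average cubed z-score; 0 for a perfectly symmetric distribution."""
    m, s = statistics.mean(values), statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

results = {}
for n in (2, 10, 50):
    means = sample_means(n)
    results[n] = (statistics.mean(means), statistics.stdev(means), skewness(means))
    mean_of_means, sd_of_means, skew = results[n]
    print(f"n={n:>2}: mean={mean_of_means:.3f}  SD={sd_of_means:.3f}  skew={skew:.2f}")
```

As n grows, the mean of the sample means stays near the population mean of 1, their spread shrinks, and the skew moves toward 0 (more symmetric, "more normal").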
Since the distribution of the sample means is approximately a normal distribution, can the empirical rule be applied?
Yes
Do we know that about 95% of all sample means are within 2 standard errors (SDs of the sampling distribution) of the population mean?
Yes, based on the empirical rule applied to the sampling distribution
This theoretical assumption is the basis for inferential statistics
What is the standard error of the mean?
The standard deviation of the sampling distribution of the mean; analogous to the SD of raw scores in the population
SEM = SD of the sample/square root of the sample size
How is the standard deviation related to the standard error of the mean?
Just as the standard deviation is the unit for measuring how far a raw score lies from the population mean, the SEM is the unit for measuring how far a sample mean lies from the population mean
What will make the SEM change?
The SD of the sample
The size of the sample
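The SEM formula and its two drivers can be sketched in a few lines. The scores and the helper name sem_from_sd are hypothetical, used only to illustrate the calculation.

```python
import math
import statistics

def sem_from_sd(sd, n):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return sd / math.sqrt(n)

scores = [72, 85, 90, 66, 78, 88, 95, 70]   # hypothetical sample scores
sd = statistics.stdev(scores)               # sample SD (n - 1 in the denominator)
print(f"SD={sd:.2f}, SEM={sem_from_sd(sd, len(scores)):.2f}")

# Holding the SD at 10, a larger sample gives a smaller SEM.
for n in (25, 100, 400):
    print(f"n={n}: SEM={sem_from_sd(10, n)}")
```

Quadrupling the sample size halves the SEM, because n enters the formula through its square root.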
Why are the standard error (SE) increments of the distribution of the sample means so much narrower than the SD increments of raw scores?
Because we are just plotting means, not raw scores
Extreme scores (outliers) are averaged out within each sample mean
What are z-scores?
Indicate how far a score is from the mean in a population context
Raw scores expressed in standard deviation units
A z-score indicates how far a raw score is from the population mean
What are t-scores?
Indicate how far a score is from the mean in a sample context
Sample means expressed in standard error units
A t-score indicates how far a sample mean is from another mean
Sample means standardized using the sample SD follow the t-distribution
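The two scores can be compared side by side. A minimal sketch: the function names, the population values, and the sample are all hypothetical, and the t-score here compares a sample mean with a single comparison mean.

```python
import math
import statistics

def z_score(raw, pop_mean, pop_sd):
    """Distance of a raw score from the population mean, in SD units."""
    return (raw - pop_mean) / pop_sd

def t_score(sample, comparison_mean):
    """Distance of the sample mean from a comparison mean, in standard error units."""
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - comparison_mean) / sem

print(z_score(130, 100, 15))                 # raw score of 130, population mean 100, SD 15

sample = [104, 110, 98, 107, 101, 112, 95, 105]   # hypothetical sample, mean 104
print(round(t_score(sample, 100), 3))
```

The z-score divides by the SD of raw scores, while the t-score divides by the standard error, which is why the same 4-point gap from 100 gives a t-score of about 1.95 here.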
Do t-scores differ depending on sample size?
Yes
Is there a different t-distribution for each discrete number of degrees of freedom?
Yes, because the degrees of freedom depend on the sample size (n - 1 for a single sample)
As the sample size increases, does the critical t-value get closer to the critical z-value?
Yes
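The convergence is visible in a standard t-table. The sketch below hard-codes a few two-tailed 95% critical t values (rounded to three decimals, as printed in most textbook tables) and compares them with the matching z cutoff. The dictionary name is made up for illustration.

```python
from statistics import NormalDist

# Two-tailed 95% critical t values by degrees of freedom (standard table values).
T_CRIT_95 = {1: 12.706, 5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

z_crit = NormalDist().inv_cdf(0.975)  # two-tailed 95% cutoff for the normal curve

for df, t_crit in T_CRIT_95.items():
    print(f"df={df:>3}: t={t_crit:.3f}  gap from z={t_crit - z_crit:.3f}")
print(f"z cutoff: {z_crit:.3f}")
```

The gap shrinks steadily as the degrees of freedom rise, so with large samples the t and z cutoffs are nearly interchangeable.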
What is a confidence interval?
The interval (range of scores) which we believe includes the population parameter
We can be 95% confident that ______ on average between…
Since we rarely know the population mean, what do we use instead?
The mean score of our sample
The standard error of the mean
The assumption that the sample means are approximately normally distributed (which the Central Limit Theorem justifies)
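These three ingredients combine into a 95% confidence interval around a sample mean. A minimal sketch with a hypothetical sample of 8 scores; the critical t value of 2.365 is the standard two-tailed 95% table value for 7 degrees of freedom.

```python
import math
import statistics

sample = [104, 110, 98, 107, 101, 112, 95, 105]   # hypothetical scores
T_CRIT_DF7 = 2.365   # two-tailed 95% critical t for df = n - 1 = 7 (table value)

mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))
margin = T_CRIT_DF7 * sem          # half-width of the interval

lower, upper = mean - margin, mean + margin
print(f"mean={mean:.1f}, SEM={sem:.3f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

A different sample size needs a different critical t value, since the degrees of freedom change with n.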
What is a critical t value?
The “critical value” is the cutoff point that separates the most extreme 5% of the distribution from the rest
A test statistic value (t) exceeding the cutoff point is statistically significant, p<0.05
What do narrower confidence intervals indicate?
That we have estimated the population parameter more precisely
Generally, the width of the confidence interval is inversely related to the size of the sample (larger samples give narrower intervals)
What are the four sources of variability?
Differences attributable to group membership (between group)
Sampling error (within group)
Measurement error (within group)
Individual differences (within group)
What is important to determine when comparing groups?
Whether each group constitutes its own population (i.e., whether the groups come from different populations)
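If the 95% CIs do not overlap between groups, the difference is significant (p<0.05)
If the 95% CIs do overlap, is the difference significant?
Not necessarily; intervals that overlap only slightly can still differ at p<0.05, so check the 95% CI around the mean difference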
How do you calculate the critical t score when you are comparing groups?
Based on degrees of freedom equal to the total number of subjects minus 2 (n1 + n2 - 2)
*used when calculating the 95% CI around the mean difference for two groups
What are the two approaches for comparing the 95% CI of two groups?
Do CI for each group separately and then see if they overlap
95% CI around the mean difference (CI for the difference in means between groups; if the interval crosses zero, they are not significantly different)
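Both approaches can be sketched on the same data. The two groups below are hypothetical, the mean-difference interval assumes equal variances (a pooled SD, matching the total-minus-2 degrees of freedom), and the critical t values are standard two-tailed 95% table values.

```python
import math
import statistics

group_a = [12, 15, 11, 14, 13, 16]   # hypothetical scores, n = 6
group_b = [9, 11, 10, 8, 12, 10]     # hypothetical scores, n = 6
T_CRIT_DF5, T_CRIT_DF10 = 2.571, 2.228   # table values for df = 5 and df = 10

def ci_mean(sample, t_crit):
    """95% CI around one group's mean."""
    mean = statistics.mean(sample)
    margin = t_crit * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin, mean + margin

# Approach 1: separate intervals, then check for overlap.
ci_a, ci_b = ci_mean(group_a, T_CRIT_DF5), ci_mean(group_b, T_CRIT_DF5)
overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Approach 2: one interval around the mean difference (pooled variance).
n_a, n_b = len(group_a), len(group_b)
pooled_var = ((n_a - 1) * statistics.variance(group_a)
              + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
diff = statistics.mean(group_a) - statistics.mean(group_b)
diff_ci = (diff - T_CRIT_DF10 * se_diff, diff + T_CRIT_DF10 * se_diff)

print(f"CI A={ci_a[0]:.2f}..{ci_a[1]:.2f}, CI B={ci_b[0]:.2f}..{ci_b[1]:.2f}, overlap={overlap}")
print(f"difference={diff:.2f}, 95% CI=({diff_ci[0]:.2f}, {diff_ci[1]:.2f})")
```

In this example the separate intervals do not overlap and the interval around the difference excludes zero, so both approaches point to a significant difference.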