Module 6 - Inferential Statistics Flashcards
Inferential Statistics
- what we do to make inferences about a population based on our sample
- how we test hypotheses
- make inferences from sample to population
Population
- the larger group of all participants of interest to the researcher
Sample
- subset of the population
- never represent the population perfectly due to sampling error
Sampling Error
- natural variability you expect from one sample to another
- not really an error
Population Parameter
- a descriptive statistic (a measure of central tendency or of variability)
- computed from everyone in the population
Sample Statistic
- a descriptive statistic (e.g., the mean) computed from everyone in the sample
- not a true representation of the population parameter because of sampling error
- an approximation of the population parameter
- deviates from the parameter because of sampling error
- close to the parameter, but not perfect
If we had an infinite number of samples
- the distribution of the means of an infinite number of samples would form a normal curve
- even if each sample itself is skewed, the plot of means would be normally distributed
Central Limit theorem
- if we draw a large number of samples from a population at random, the means of those samples will form a normal distribution
- in practice, we can never draw an infinite number of samples
Sampling Distribution of the Mean
- plot or distribution of means from different samples of the same population
- makes a normal curve
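The idea above can be sketched with a small simulation (Python standard library only; the exponential population, n = 30 per sample, and 2000 samples are arbitrary illustrative choices):

```python
import random
import statistics

random.seed(0)

# Population: a right-skewed exponential distribution with mean 1.0.
# Draw many samples and record each sample's mean.
sample_means = []
for _ in range(2000):
    sample = [random.expovariate(1.0) for _ in range(30)]  # n = 30 per sample
    sample_means.append(statistics.mean(sample))

# The sampling distribution of the mean clusters around the population
# mean (1.0), even though each individual sample is skewed.
print(round(statistics.mean(sample_means), 2))   # close to 1.0
print(round(statistics.stdev(sample_means), 2))  # close to sigma/sqrt(n) = 1/sqrt(30) ~ 0.18
```

Plotting `sample_means` as a histogram would show the roughly bell-shaped curve the Central Limit Theorem predicts.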
Law of Large Numbers
- the larger the sample, the more the mean of each sample will approximate the mean of the population
- the larger the sample, the less the mean is impacted by outliers
- the larger the sample, the smaller the standard error of the mean, and therefore the sample means will be similar in value to one another
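A minimal sketch of the Law of Large Numbers, assuming a normal population with mean 50 and SD 10 (arbitrary values):

```python
import random
import statistics

random.seed(1)

# Spread of sample means for a small vs a large sample size,
# drawing from a normal population with mean 50 and SD 10.
spreads = {}
for n in (10, 1000):
    means = [statistics.mean([random.gauss(50, 10) for _ in range(n)])
             for _ in range(500)]
    spreads[n] = statistics.stdev(means)

# Larger samples -> sample means cluster much more tightly around 50.
print({n: round(s, 2) for n, s in spreads.items()})
```

With n = 10 the means scatter widely; with n = 1000 they sit almost on top of the population mean.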
Characteristics of Sampling Distributions
- approximate the population mean
- approximately normal in shape
- can answer probability questions about the population
Standard deviation of the Sampling Distribution of the Mean
- Standard Error of the Mean
Standard Error of the Mean
- defines the variation around the population mean (μ, "mu")
- a fixed percentage of sample means falls within 1, 2, or 3 standard error units of the mean:
- 68% of the sample means fall within ±1 standard error unit of the mean of the sampling distribution
- 95% fall within ±2 standard error units
- 99.7% fall within ±3 standard error units (just like the standard deviation)
- represents difference due to chance
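These coverage figures can be checked by simulation (a sketch; μ = 100, σ = 10, and n = 25 are hypothetical values):

```python
import math
import random
import statistics

random.seed(2)

mu, sigma, n = 100.0, 10.0, 25
se = sigma / math.sqrt(n)  # standard error of the mean = sigma / sqrt(n)

# Draw many samples and count how many sample means land within
# 1 and 2 standard error units of the population mean.
means = [statistics.mean([random.gauss(mu, sigma) for _ in range(n)])
         for _ in range(4000)]
within_1 = sum(abs(m - mu) <= 1 * se for m in means) / len(means)
within_2 = sum(abs(m - mu) <= 2 * se for m in means) / len(means)
print(round(within_1, 2), round(within_2, 2))  # roughly 0.68 and 0.95
```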
We can never obtain the true Sampling Distribution of the Mean because
- we can never collect an infinite number of samples
- therefore, we can't compute the true standard error of the mean
Confidence Intervals
- we can estimate the standard error of the mean to calculate these
- the smaller the standard error, the smaller (narrower) our confidence intervals will be
- want our standard error to be as small as possible
- want the sampling distribution to be tall and skinny
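A rough 95% confidence interval from a single sample can be sketched as below (hypothetical data; the 1.96 multiplier is the large-sample z value, and a t critical value would be more appropriate for a sample this small):

```python
import math
import statistics

# One sample: estimate the SEM from the sample SD, then build a 95% CI.
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))  # estimated standard error
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem
print(round(ci_low, 2), round(ci_high, 2))  # interval around the sample mean of 13.5
```

A smaller SEM (less variable data or a bigger sample) shrinks the interval, which is why we want the standard error small.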
Influences on the size of the standard error of the mean
- if the variability of the variable is large within the population, the standard error will be large
- if the variability of the variable is small within the population, the standard error will be small, and therefore the distribution will be tall and skinny
- Law of Large Numbers: the larger the sample size, the smaller the standard error, due to less influence of outliers
Null Hypothesis
- used for hypothesis testing
- no difference between our sample mean and the population mean (they come from the same distribution)
- no difference between 2 group means because they come from the same population
Reject Null Hypothesis
- what we want
- says the 2 groups come from 2 different population distributions
- there is a difference between groups
Fail to reject null hypothesis
- results from not having enough evidence
- says the 2 groups come from the same population distribution
- no difference exists
- we say "fail to reject" because we can never prove anything true
test the null hypothesis by
- computing a test statistic
- test statistic = observed difference / difference due to chance (standard error of the mean)
big or small test stat?
- want a big test statistic, so it falls in the critical region of rejection (and therefore we can reject the null hypothesis)
- the observed difference (numerator) should be a large number
- want the difference due to chance (denominator) to be very small
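The test-statistic formula above, worked as a one-sample z with made-up numbers (μ, σ, the sample mean, and n are all hypothetical):

```python
import math

# One-sample z statistic: observed difference over difference due to chance (SEM).
mu, sigma = 100.0, 15.0        # known population parameters (hypothetical values)
sample_mean, n = 106.0, 36     # observed sample mean and sample size
se = sigma / math.sqrt(n)      # 15 / 6 = 2.5
z = (sample_mean - mu) / se    # observed difference 6 over chance difference 2.5
print(z)  # 2.4 -- beyond +/-1.96, so reject the null at alpha = 0.05
```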
Z-distribution
- used to determine whether our sample mean differs from the population mean
- if our observed Z value falls into the extreme regions, which are defined by the alpha value, we reject the null hypothesis
- want a large Z value that falls in the rejection region
what does α = 0.05 mean
- means 5 out of 100 times we would be making an error
- 5% chance of incorrectly rejecting the null hypothesis when it is true; Type 1 Error
- as we lower the alpha value, we can be more confident
- significant results occur by chance fewer than 5/100 times
T-Test
- examines whether 2 group means come from the same population or different populations
- the more variable the scores, the harder it is to see group differences
- want a large t value: a big numerator (between-group difference) and a small denominator (within-group difference / standard error)
the further along the t and z values are in the distribution…
- closer to the extreme ends
- making it more unlikely to occur by chance
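A sketch of an independent-samples t statistic using the pooled within-group variance (the two groups of scores are made up for illustration):

```python
import math
import statistics

# Independent-samples t: between-group difference over within-group error.
group_a = [5, 7, 6, 8, 7, 6]    # hypothetical scores, quiet room
group_b = [9, 8, 10, 9, 11, 9]  # hypothetical scores, noisy room

na, nb = len(group_a), len(group_b)
# Pooled variance combines the within-group variability of both groups.
pooled_var = ((na - 1) * statistics.variance(group_a) +
              (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))  # standard error of the difference
t = (statistics.mean(group_b) - statistics.mean(group_a)) / se_diff
print(round(t, 2))  # a large t, far out in the tail of the t-distribution
```

A big between-group difference (numerator) over a small within-group error (denominator) pushes t toward the extreme end of the distribution.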
Between group Variance
- Treatment Effect
- Numerator of the test stat
Within group Variance
- Denominator of the test stat
- variability of scores within each group
_______ is key to Inferential statistics
- Variance
Variation in the DV could be due to
- chance/ error variance
- variance due to the IV
- confound variance
Chance/ Error Variance
- random variation in the DV that comes from individual differences
- inherent and cannot be eliminated
- because it is random, it does not impact the overall mean of the group
- contributes to within-group variance
ex. ideally everyone in the noisy group would get the exact same score on the math test. However, that is not realistic due to chance variance
Systematic variation
- influences the entire group mean
- creates between group differences
- 2 types: variance due to the IV and confound variance
Variance due to the IV
- type of systematic variation
- good kind of variance
- what we want; this is the treatment effect
- comes from the manipulation of the IV
ex. manipulate the IV; noisy room vs quiet room. expect participants in the noisy room to make more math errors
Confound Variance
- type of systematic variance
- creates between group differences
- confounds; unintentional IVs
- not what we want, very bad; lowers internal validity
- acts like variance due to IV
ex. if the IV was the type of room (noisy vs quiet), but one room was hotter than the other, the temperature is a confound. It contributes variance in the math errors between groups, but it is not the source of variance we want
F Ratio
- used to statistically test if the IV causes changes in the DV
- F = between-group variance / within-group variance
- F = (IV variance [treatment effect] + confound variance) / error (chance) variance
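The F ratio can be computed by hand from group data (a sketch; the two groups of math-error counts are hypothetical):

```python
import statistics

# One-way F ratio: between-group variance over within-group variance.
groups = {
    "quiet": [4, 5, 6, 5, 5],  # hypothetical math-error counts
    "noisy": [8, 9, 7, 9, 8],
}
all_scores = [s for g in groups.values() for s in g]
grand_mean = statistics.mean(all_scores)

k = len(groups)            # number of groups
n_total = len(all_scores)  # total number of scores

# Between-group sum of squares: how far each group mean sits from the grand mean.
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                 for g in groups.values())
# Within-group sum of squares: variability of scores inside each group.
ss_within = sum(sum((s - statistics.mean(g)) ** 2 for s in g)
                for g in groups.values())

ms_between = ss_between / (k - 1)       # between-group variance
ms_within = ss_within / (n_total - k)   # within-group variance
f_ratio = ms_between / ms_within
print(round(f_ratio, 2))  # large F: group means differ far more than chance alone predicts
```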
We want F ratio to be
- as big as possible
- want numerator to be as big as possible to maximize the b/w group variance
- denominator as small as possible to minimize w/in group variance
- makes us more confident there is a group difference in the DV
BUT the numerator has to be big because of the IV variance, not the confound variance **
small F ratio
- high within-group variance relative to between-group variance
- fail to reject the null hypothesis because there is little between-group variance
F ratio= 1
- the IV did not impact the DV; no treatment effect
- consistent with the null hypothesis being true (we fail to reject it)
- no between-group difference
- all variation was due to chance variance
- numerator and denominator are the same value
F > 1
- reject the null hypothesis when F is large enough to reach significance
- we see a difference in the DV between groups
how to increase b/w group variance in F ratio
- increase the IV variance to make sure IV manipulations are causing a large effect size
how to decrease w/in group variance in F ratio
- cannot eliminate error variance
- but since denominator is like the standard error of the mean, we can reduce it by increasing our sample size
In each instance we are testing 3 hypotheses
- test the null hypothesis to see whether our empirical observations are due to chance. If the F ratio is large, we have large between-group variance and reject the null hypothesis
- because we have between-group variance, we need to know whether it comes from IV variance or from confounds
- after ruling out confounds and rejecting the null hypothesis, we can make inferences about the causal relationship of the IV on the DV
alpha value/ level of significance
- scientists feel comfortable setting it at 5% or lower
- we are saying that 5/100 times we would be making an error, but we bet this is not one of those times
- defines values that are unlikely to be due to chance
Probability Value/ P- Value
- for each test statistic (Z, t, or F) we can calculate a p-value
if p value< set alpha value
- reject the null hypothesis and results are statistically significant
if p value > set alpha value
- fail to reject the null hypothesis and results are statistically non-significant
- results likely occurred by chance
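The two decision rules can be written as a tiny helper (the function name `decide` is my own, not standard terminology):

```python
def decide(p_value, alpha=0.05):
    """Compare the p-value against alpha and report the decision."""
    if p_value < alpha:
        return "reject the null hypothesis (statistically significant)"
    return "fail to reject the null hypothesis (non-significant)"

print(decide(0.03))  # below alpha -> reject
print(decide(0.20))  # above alpha -> fail to reject
```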
Testing the null hypothesis has four possible decisional outcomes
- 2 represent possible correct decisions
- 2 represent the main errors: Type 1 or Type 2
Type 1 Error
- incorrectly rejecting the null hypothesis when the null hypothesis is actually true
- the observed value was due to chance
- the alpha value even says it: "5/100 times we incorrectly reject the null hypothesis"
- we never know when we make this error, but we are confident that setting a low alpha value makes it less likely to occur
Type 2 Error
- the IV did cause a difference, but we didn't detect it
- failing to reject the null hypothesis when the null hypothesis is actually false
- occurs when F is too small
Power
- 1 − β (beta)
- probability we will reject the null hypothesis when it is false
- probability we will detect an effect
Probability of Type 1 error
- alpha (α)
Probability of Type 2 error
- beta (β)
ideally we want type 1 and type 2 to be
- both very low
lower alpha in relation to type 1 and 2
- makes it harder to reject the null hypothesis
- makes it more likely you will miss a small effect
- makes a Type 1 error less likely
- but makes a Type 2 error more likely: you are more likely to fail to reject the null hypothesis when it is false and miss a small treatment effect
is type 1 or 2 more serious?
- Type 1 errors are considered more serious
- more serious to say the IV had an effect when it didn't
Effect Size
- how much the groups differ on the DV
- the effect of the IV on the DV
- the numerator of the F ratio
- NOT AFFECTED BY THE SIZE OF THE SAMPLE
relationship bw effect size and sample size
- if we have a small effect size, we have a small numerator in the F ratio. Because we want the F ratio to be large, we need to decrease the denominator by increasing the sample size
- if we have a large effect size, we have a large numerator in the F ratio. We can still have a large F ratio even if the denominator isn't very small, so we would not need a big sample size
- a larger effect is easier to detect
statistical significance is a function of
- effect size and sample size
when power is low, what error is most common?
- Type 2 errors
increase the power by
- increasing the sample size
- and therefore can reduce type 2 errors
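A sketch of how power grows with sample size, simulating a one-sample z test against a true effect of 0.5 SD (all numbers here are illustrative assumptions):

```python
import math
import random
import statistics

random.seed(3)

def estimate_power(n, effect=0.5, critical_z=1.96, trials=1000):
    """Fraction of simulated experiments whose z statistic lands in the rejection region."""
    hits = 0
    for _ in range(trials):
        # The true effect is 0.5 SD, so the null (mean = 0) is actually false.
        sample = [random.gauss(effect, 1.0) for _ in range(n)]
        z = statistics.mean(sample) / (1.0 / math.sqrt(n))  # known sigma = 1
        if abs(z) > critical_z:
            hits += 1  # correctly rejected the null
    return hits / trials

p10 = estimate_power(10)    # low power with a small sample
p100 = estimate_power(100)  # much higher power with a large sample
print(p10, p100)
```

With the small sample, most simulated experiments miss the real effect (a Type 2 error); with the large sample, nearly all of them detect it.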
statistical significance
- observed group differences are unlikely to be due to chance or error
- done by increasing the sample size
practical significance
- ensure that the differences we observe have practical value in the real world
- is the treatment effect large enough to have value in a practical sense
- ex. a new drug for depression lowers symptoms by one point; this may be statistically significant but not practical in the real world
reduce Type 1 and Type 2 error by
- Type 2: use a larger sample size
- Type 1: lower alpha (α), the significance level