exam 2 Flashcards
Problems with z scores
Z test procedure– use one sample mean from a known population to test hypotheses about an unknown population
To use a z score we need to know the pop mean and std deviation from which we draw our sample
We often have a solid idea of what the population's mean should be
We have to know population std deviation, but we usually don’t
Solution to this is the t statistic
T statistic
test hypothesis about an unknown population when the value of the population std deviation is unknown
We estimate it with our data
Formula is very similar to the z score formula
Main difference– uses an estimated standard error in the denominator
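As a sketch of the formula (hypothetical data, Python used only for illustration): t divides the difference between the sample mean and the hypothesized population mean by the estimated standard error.

```python
import math
import statistics

def t_statistic(sample, mu):
    """t = (M - mu) / estimated standard error, where the estimated
    standard error uses the SAMPLE standard deviation (ddof = 1)."""
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample std deviation (divides by n - 1)
    se = s / math.sqrt(n)          # estimated standard error
    return (m - mu) / se

# hypothetical scores, testing against a population mean of 10
print(round(t_statistic([12, 9, 11, 13, 10], 10), 3))   # 1.414
```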
Estimated standard error
estimate of the real standard error when the population std deviation is unknown
We don’t have info about population, but we do have info about the sample so we use that in place of the population
estimated standard error goal
provide an estimate of the standard distance between a sample mean and the population mean
Estimated slightly differently across three types of t tests
All have slightly diff estimation, but they all follow the same formula
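A sketch of the three estimates side by side (all data hypothetical): each test plugs sample statistics into the same s-over-root-n pattern, just computed on different quantities.

```python
import math
import statistics

# One-sample t-test: se = s / sqrt(n)
sample = [12, 9, 11, 13, 10]                 # hypothetical scores
se_one = statistics.stdev(sample) / math.sqrt(len(sample))

# Independent-samples t-test: se = sqrt(sp2/n1 + sp2/n2), sp2 = pooled variance
g1, g2 = [4, 6, 5, 7], [8, 9, 7, 10]         # hypothetical groups
ss1 = sum((x - statistics.mean(g1)) ** 2 for x in g1)
ss2 = sum((x - statistics.mean(g2)) ** 2 for x in g2)
sp2 = (ss1 + ss2) / ((len(g1) - 1) + (len(g2) - 1))   # pooled variance
se_ind = math.sqrt(sp2 / len(g1) + sp2 / len(g2))

# Paired-samples t-test: se = s_D / sqrt(n), computed on difference scores
t1, t2 = [10, 12, 9, 11], [13, 15, 11, 14]   # hypothetical time 1 / time 2
diffs = [b - a for a, b in zip(t1, t2)]
se_pair = statistics.stdev(diffs) / math.sqrt(len(diffs))
```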
One sample t-test
comparing the mean of one sample with a known population mean
Same logic as a z test
We still don’t have any control group– it’s one sample and one group, and we want to see if, after applying treatment, the group still belongs to the known population
Diff between z-test– estimating the population std deviation/variance
Null and alternative hypothesis remain the same
one sample t-test examples
Are the starting salaries of CSU grads different from the national average (63,795)
We know the mean– 63,795 and want to see if CSU grads match that
Did our participants score differently from the median scale point of 25
Independent samples t-test
compare mean of one group with the mean of a different group
Most typical one used in psych
Allows researchers to evaluate the mean difference between two populations using data from two separate samples
Want to see if the two samples you are comparing belong to two different populations
Looking at the difference between means
independent samples t-test null, alt, and cohens d
Null hypothesis– two samples come from the same underlying population– there’s no real differences between them
Assuming that the difference is zero
Alternative hypothesis– two samples come from different populations
Cohen’s d
Taking the sample mean difference over the pooled sample standard deviation
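A minimal sketch of that formula with made-up groups (the function name and data are hypothetical):

```python
import math
import statistics

def cohens_d(g1, g2):
    """Cohen's d for independent samples:
    (M1 - M2) / pooled sample standard deviation."""
    ss1 = sum((x - statistics.mean(g1)) ** 2 for x in g1)
    ss2 = sum((x - statistics.mean(g2)) ** 2 for x in g2)
    pooled_var = (ss1 + ss2) / ((len(g1) - 1) + (len(g2) - 1))
    return (statistics.mean(g1) - statistics.mean(g2)) / math.sqrt(pooled_var)

# hypothetical groups
print(round(cohens_d([8, 9, 7, 10], [4, 6, 5, 7]), 2))   # 2.32
```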
Assumptions of independent samples t-test
The two populations are normally distributed
Can address this with a big sample size (at least 30)
Two samples are random samples from their populations
Homogeneity of variance
States the two populations you are estimating have the same variance (spread around mean)
Necessary to justify pooling
Solution– make sample sizes equal
There is never a case where you can safely violate the homogeneity of variance assumption
False
Can be safely violated when sample sizes are equal, or by using the Welch two-samples test
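A sketch of Welch's alternative (hypothetical data): it skips pooling, lets each sample estimate its own variance, and adjusts the degrees of freedom, so it does not assume homogeneity of variance.

```python
import math
import statistics

def welch_t(g1, g2):
    """Welch's two-sample t statistic and its approximate
    (Welch-Satterthwaite) degrees of freedom. Unlike the pooled
    test, it does NOT assume homogeneity of variance."""
    v1, v2 = statistics.variance(g1), statistics.variance(g2)
    n1, n2 = len(g1), len(g2)
    se2 = v1 / n1 + v2 / n2      # unpooled squared standard error
    t = (statistics.mean(g1) - statistics.mean(g2)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

t_val, df_val = welch_t([8, 9, 7, 10], [4, 6, 5, 7])
print(round(t_val, 2), round(df_val, 1))   # 3.29 6.0
```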
independent samples t test example
do men and women have different emotional responses to romantic comedy movies?
Paired samples t test
compare mean of one group with the mean of a group that is matched or connected to the first in some way
Ex– couples because they are related to each other in some way
Could be over time, or just genuinely related to each other
paired samples t test repeated measures
participants measured in two conditions (or two time points)
Dataset consists of two scores on one variable per individual
All participants should be scored twice on the same variable to see if there’s some improvement
Like a Quasi experiment
paired samples t test matched pairs
different participants in each condition, but they are matched somehow
Dataset consists of two related groups who are scored on the same variable
Could be related or a case like people that have the same IQ
Ex– is there a difference in eating behavior between mothers and daughters
paired samples t test logic
Everyone in the population is tested at baseline
Everyone in the population is then treated
Everyone in the population is tested after treatment
We want to assess whether there is a systematic change in the same population between the first and second measures
paired samples null and alt
Null hypothesis– the population of difference scores has a mean of zero
Alternative hypothesis– the population of difference scores does not have a mean of zero
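A minimal sketch of the paired test (hypothetical before/after scores): compute each participant's difference score, then run a one-sample test on those differences against zero.

```python
import math
import statistics

def paired_t(before, after):
    """Paired-samples t: test whether the mean of the difference
    scores differs from zero (H0: mu_D = 0). Uses df = n - 1."""
    diffs = [b - a for a, b in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # estimated SE of the mean difference
    return statistics.mean(diffs) / se

# hypothetical before/after scores for four participants
print(round(paired_t([10, 12, 9, 11], [13, 15, 11, 14]), 2))   # 11.0
```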
Degrees of freedom
Problem with t-test– we are estimating the population standard error using our sample’s standard deviation
We don’t know how close our estimate is to the population standard deviation
To compute the sample variance, we use the sample mean
This restricts the amount of variance we can have
Degrees of freedom is the number of scores in a sample that are statistically independent and free to vary
Statistically free to vary = values not restricted by sample mean
essentially just sample size -1
The higher your degrees of freedom, the more accurate your estimated standard error, and thus your t-statistic, will be
Higher sample sizes (degrees of freedom) will be more accurate in terms of how they represent the population
The first n - 1 numbers can be literally anything, but once we know the mean, the last number is fixed
Important
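A quick sketch of the "last number is fixed" idea, with made-up numbers:

```python
# Suppose n = 4 and the sample mean must equal 5 (so the total is 20).
n, mean = 4, 5
free_scores = [2, 9, 4]              # the first n - 1 scores: anything at all
last = n * mean - sum(free_scores)   # the last score is forced by the mean
print(last)                          # 5
scores = free_scores + [last]
assert sum(scores) / n == mean       # the mean comes out exactly right
```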
T distribution
sampling distribution in a t test is still a distribution of all possible sample means
t distribution comparison between z-test and t-test
A z score distribution is the set of z scores for all possible sample means of a particular sample size (n)
A t score distribution is the set of t-scores for all possible sample means of a particular degrees of freedom (n-1)
Family of distributions of all possible degrees of freedom
Pick the one you want to use based on samples degrees of freedom
Key– we can use the t-distribution in the same way we used the z distribution if we know the degrees of freedom
T-distribution shape
Shape differs across possible degrees of freedom
Different sample sizes = different degrees of freedom = different t-distributions
Generally looks pretty normal if sample size is at least 30
Greater degrees of freedom, the more normal it will be
Probability and the t distribution
We can determine probabilities of extreme scores in t distributions just as we did for z distributions
Use a t distribution table
Use RStudio
T scores of greater absolute value are more extreme, and therefore more rare
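Without a table or software, the tail probability can be approximated by simulating the t distribution (a t variate is a standard normal divided by the square root of a chi-square over its df). This is purely an illustration; in practice you would use a t table or software.

```python
import random

def sim_two_tailed_p(t_obs, df, reps=50_000, seed=0):
    """Monte Carlo estimate of P(|T| >= t_obs) for a t distribution
    with the given df, using T = Z / sqrt(chi2_df / df)."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        z = rng.gauss(0, 1)
        chi2 = sum(rng.gauss(0, 1) ** 2 for _ in range(df))
        t = z / (chi2 / df) ** 0.5
        extreme += abs(t) >= t_obs
    return extreme / reps

# df = 10; 2.228 is the tabled two-tailed .05 critical value,
# so the estimate should come out near .05
print(round(sim_two_tailed_p(2.228, 10), 3))
```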
One sample t-test assumptions
Observations in our sample are independent
Score of one observation has no bearing on another observation
If they are dependent on one another, can use paired samples t-test
Comparing to a known value
Ex– would be inappropriate if a sample was a mother and daughter
Population distribution must be normal
Often violated– rare to see a truly normal population distribution
T-tests are robust to this violation when sample sizes are at least n=30
One-sample t-test hypothesis testing framework
Step one– state null and alternative hypothesis
Step two– set your alpha level and find your critical region/value
Step three- calculate test statistic
Step four– compare t to the critical t and make a decision
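The four steps can be sketched end to end with hypothetical data (n = 10, so df = 9; 2.262 is the tabled two-tailed critical t at alpha = .05):

```python
import math
import statistics

# Step 1: state hypotheses. H0: mu = 50, H1: mu != 50 (two-tailed)
mu0 = 50
# Step 2: alpha = .05, df = n - 1 = 9 -> tabled critical t = 2.262
t_crit = 2.262
# Step 3: calculate the test statistic from the (hypothetical) sample
sample = [53, 55, 48, 56, 52, 54, 51, 57, 49, 55]
n = len(sample)
m = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)   # estimated standard error
t = (m - mu0) / se
# Step 4: compare |t| to the critical value and make a decision
decision = "reject H0" if abs(t) > t_crit else "fail to reject H0"
print(round(t, 2), decision)   # 3.18 reject H0
```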
practical significance
is this effect big enough to matter?
Even a small effect might be practically important in the right context
Statistical significance
(rejecting the null hypothesis) is not the same as practical significance
Huge samples mean higher power to reject the null even when the effect is very small
Effect size
measurement of the absolute magnitude of a treatment effect, independent of the size of the samples being used
Sample size doesn’t matter, just focusing on how big the effect was
For t tests we can use cohen’s d
Effect size measure we use to compare two means (t-test)
Measures whether or not the mean difference matters
Interpreting effect sizes
Cohen suggested guidelines for interpreting Cohen’s d
Negligible– 0-.19
Small– .20-.49
Medium–.50-.79
Large– .80 or higher
Not the best because it all depends on context
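Those cutoffs can be sketched as a small lookup function (the thresholds are Cohen's; the function itself is just an illustration):

```python
def interpret_d(d):
    """Cohen's rough guidelines for |d| (context still matters)."""
    d = abs(d)                 # sign only indicates direction
    if d < 0.20:
        return "negligible"
    if d < 0.50:
        return "small"
    if d < 0.80:
        return "medium"
    return "large"

print(interpret_d(0.35))   # small
print(interpret_d(-1.1))   # large
```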
DFs effect on power
When you select a score as a cutoff point between the body and tail, it will be more extreme in a t-distribution than in a normal distribution– so the fewer the degrees of freedom, the more extreme the critical value and the lower the power
Interpretation of standard error of mean differences
independent samples t test
Measure of average distance between a sample statistic and the corresponding population parameter in the sampling distribution
Uneven samples
(interpretation of std error of mean differences)
when samples are different sizes, the larger sample provides a better estimate of the variance than the smaller sample
Solutions– pooled variance
Use pooled variance to calculate the standard error
If the samples are equal, both the pooled and unpooled formulas will yield the same result
If they are not equal, the pooled will be better
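A sketch of both formulas with hypothetical equal-sized groups, showing that pooling weights each sample by its df and that with equal n the two versions agree:

```python
import math
import statistics

def pooled_se(g1, g2):
    """SE of the mean difference using pooled variance:
    sp2 = (SS1 + SS2) / (df1 + df2), weighting each sample by its df."""
    ss1 = sum((x - statistics.mean(g1)) ** 2 for x in g1)
    ss2 = sum((x - statistics.mean(g2)) ** 2 for x in g2)
    sp2 = (ss1 + ss2) / ((len(g1) - 1) + (len(g2) - 1))
    return math.sqrt(sp2 / len(g1) + sp2 / len(g2))

def unpooled_se(g1, g2):
    """Unpooled version: each sample estimates its own variance."""
    return math.sqrt(statistics.variance(g1) / len(g1)
                     + statistics.variance(g2) / len(g2))

# With equal sample sizes the two formulas agree
a, b = [4, 6, 5, 7], [8, 9, 7, 12]
print(round(pooled_se(a, b), 4) == round(unpooled_se(a, b), 4))   # True
```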
Independent sampling distributions
distribution of differences between the means
A difference between means is not the same as a difference score
Ex– mean difference between two different samples
Control sample mean - experiment sample mean
Determines probability of mean difference scores
paired sampling distributions
distribution of means of difference scores
Ex– time 1 score - time 2 score