t tests Flashcards
two ways to generate two sets of scores
- repeated measures design
- independent groups design
repeated measures design
- every subject is exposed to each treatment condition and scores measured
- comparisons are between scores for the same individuals under different conditions
- analyze using the paired t test
independent groups design
- each subject is exposed to a single treatment condition and scores measured
- comparisons are between scores from different individuals under different conditions
- analyze using the independent t test
what are some sources of variation between scores?
effects of treatment
individual differences
- differences in baseline score
- differences in responsiveness to treatment
effects associated with uncontrolled variables
measurement error
repeated measures eliminates variation due to individual differences (each subject's baseline cancels out when scores are compared within the same individual)!
repeated measures: calculating difference (D) scores
Di = yBi - yAi
difference = (condition B) - (condition A)
this converts two sets of scores (conditions A and B) into a single set of scores (D)
- we can then look at variation among the D scores instead of variation among the raw scores
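A minimal Python sketch of the D-score calculation above. The scores are made up for illustration, and the names `scores_a`, `scores_b`, and `d_scores` are hypothetical.

```python
from statistics import mean

# Hypothetical paired scores (made up): one pair per subject, same order in both lists.
scores_a = [12.0, 15.0, 11.0, 18.0, 14.0]  # condition A
scores_b = [14.0, 15.0, 13.0, 21.0, 13.0]  # condition B

# Di = yBi - yAi, computed within each subject.
d_scores = [b - a for a, b in zip(scores_a, scores_b)]

# Two sets of scores are now a single set of D scores with one mean, D-bar.
d_bar = mean(d_scores)

print(d_scores)  # [2.0, 0.0, 2.0, 3.0, -1.0]
print(d_bar)     # 1.2
```

Note that pairing matters: shuffling one list relative to the other would change every D score.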
is there an effect of treatment?
Is there a difference between scores measured under the two conditions?
if there is no effect of treatment, the two sets of scores are identical
Di = yBi - yAi = 0
we wouldn’t expect all D scores to be exactly zero, but the mean of all D scores should be approximately 0
null hypothesis: H0: muD=0
how to test null hypothesis
determine the probability that observed sample mean would have been obtained from a population where muD=0
- the p value is the probability of obtaining the observed data (or data more extreme), if H0 is true
2 approaches to test H0
approach 1: calculate 95% CIs
- use sample data to obtain a sampling distribution (similar to the DSM) with a mean of D-bar
- determine the location of muD=0 (H0) within this distribution
approach 2: hypothesis testing
- position the same sampling distribution with a mean of muD=0 (H0)
- determine the location of the observed D-bar within this distribution
GLM for D scores
GLM is:
Di = D-bar + error
based on D scores, not raw scores
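As a small illustration of this model, the Python sketch below splits each D score into D-bar plus an error term. The D scores are made up, and the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical D scores (made up for illustration).
d_scores = [2.0, 0.0, 2.0, 3.0, -1.0]

# Model: Di = D-bar + error, so error is whatever D-bar does not explain.
d_bar = mean(d_scores)
errors = [d - d_bar for d in d_scores]

# Adding the error back to D-bar recovers each original D score.
reconstructed = [d_bar + e for e in errors]

# The errors sum to (approximately) zero because D-bar is the mean.
print(round(sum(errors), 10))
```

The model is fitted to the D scores only; the raw condition A and B scores never enter it.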
bootstrapping 95% CI for muD
- use observed D scores to generate an infinite hypothetical population of D scores
- randomly sample n=16 D scores (match original sample size) from pop and calculate D-bar
- repeat many times, generating a new random sample and calculating D-bar each time
- generate a distribution of D-bar
- instead of distribution of sample means (DSM), this is the distribution of mean differences (DMD)
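The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's method: the 16 D scores are made up, sampling from the "infinite population" is done as sampling with replacement, the 95% CI is taken as the middle 95% of the DMD (a percentile interval), and `bootstrap_dmd` is a hypothetical helper name.

```python
import random
from statistics import stdev

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical D scores (made up), n = 16 to match the sample size in the notes.
d_scores = [2.1, -0.4, 1.3, 0.8, 3.0, 1.1, -1.2, 0.5,
            2.4, 1.9, 0.2, 1.6, -0.3, 2.8, 0.9, 1.4]


def bootstrap_dmd(d, n_resamples=10_000):
    """Resample n D scores with replacement, many times; return each D-bar."""
    n = len(d)
    return [sum(random.choices(d, k=n)) / n for _ in range(n_resamples)]


# Distribution of mean differences (DMD), sorted so percentiles can be read off.
dmd = sorted(bootstrap_dmd(d_scores))
b = len(dmd)

# Assumed percentile method: cut off 2.5% in each tail.
lower = dmd[(25 * b) // 1000]
upper = dmd[(975 * b) // 1000 - 1]

# Standard deviation of the DMD, i.e. the bootstrap standard error.
se = stdev(dmd)

print(f"95% CI for muD: [{lower:.2f}, {upper:.2f}], SE = {se:.2f}")
```

If 0 falls outside the interval, the data are unlikely under H0: muD=0.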
bootstrapping 95% CI in R
boot.t.test()
- calculates the standard deviation of the DMD (the standard error)
- reports a p value: the probability of obtaining the observed data, if the null hypothesis is true
how to test H0
DMD estimates the distribution of all sample means that would be obtained from a population that matched our sample
we can locate 0 in the distribution and focus on the difference between D-bar and 0
subtract the value of D-bar from all values in the distribution
- this keeps the same difference between D-bar and 0, but now 0 is the mean of the DMD
If D-bar - muD is small == observed data are likely if H0 is true
If D-bar - muD is large == observed data are unlikely if H0 is true
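A hedged Python sketch of this shifting step. The D scores are made up, and the two-sided p value (the share of the shifted DMD at least as far from 0 as the observed D-bar) is an assumed way of turning "small" versus "large" into a number; the notes do not say how the p value is computed.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical D scores (made up), n = 16.
d_scores = [2.1, -0.4, 1.3, 0.8, 3.0, 1.1, -1.2, 0.5,
            2.4, 1.9, 0.2, 1.6, -0.3, 2.8, 0.9, 1.4]
n = len(d_scores)
d_bar = sum(d_scores) / n

# Bootstrap DMD: resample with replacement and record each resample's mean.
dmd = [sum(random.choices(d_scores, k=n)) / n for _ in range(10_000)]

# Subtract D-bar from every value so the distribution is centered on 0 (H0).
null_dmd = [m - d_bar for m in dmd]

# Assumed two-sided p value: how often sampling variation alone produces a
# mean difference at least as far from 0 as the observed D-bar.
p_value = sum(abs(m) >= abs(d_bar) for m in null_dmd) / len(null_dmd)

print(f"D-bar = {d_bar:.3f}, p = {p_value:.4f}")
```

A small p value corresponds to the "D-bar - muD is large" case: the observed D-bar sits far out in the tail of the shifted distribution.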
standardizing D-bar - muD
convert scores to z scores
z(yi) = (yi - y-bar) / Sy
subtract the mean of the DMD (0) then divide by standard deviation of the DMD
the standard deviation of the DMD (the standard error) represents the average size of D-bar - muD
- a standard deviation is, roughly, the average deviation score
standardized values express the observed difference between D-bar and muD=0 in units of the average difference we would expect if H0 is true
- e.g., a standardized value of 2 means the observed difference D-bar - muD is twice as large as the average difference that would be expected due to sampling variation, if H0 is true
central limit theorem and t statistics
t = D-bar / Sd-bar = D-bar / (Sd / sqrt(n))
if t=2, the observed difference between D-bar and H0 is twice as large as the average difference expected due to sampling variation
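A short Python sketch of the t formula above, using made-up D scores. Sd is taken to be the sample standard deviation (n - 1 in the denominator), which is what `statistics.stdev` computes.

```python
import math
from statistics import mean, stdev

# Hypothetical D scores (made up), n = 16.
d_scores = [2.1, -0.4, 1.3, 0.8, 3.0, 1.1, -1.2, 0.5,
            2.4, 1.9, 0.2, 1.6, -0.3, 2.8, 0.9, 1.4]

n = len(d_scores)
d_bar = mean(d_scores)           # observed mean difference
s_d = stdev(d_scores)            # Sd: sample standard deviation of the D scores
s_d_bar = s_d / math.sqrt(n)     # Sd-bar: estimated standard error of D-bar

# Under H0, muD = 0, so the numerator D-bar - muD is just D-bar.
t = d_bar / s_d_bar

print(f"D-bar = {d_bar:.3f}, Sd-bar = {s_d_bar:.3f}, t = {t:.2f}")
```

With these made-up values t comes out a little under 4, i.e. the observed D-bar is nearly four times the average difference expected from sampling variation alone.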
t distribution
start with a normally distributed population of D scores with muD=0
- the population is assumed to be normally distributed (an assumption of the t test)
- the population can have any standard deviation, because the t statistic standardizes the values of D-bar - muD based on the observed value of Sd
define sample size (n)
randomly sample n D scores from the population and calculate t based on the sample (t = (D-bar - muD) / Sd-bar)
repeat one million times and plot distribution
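The simulation described above can be sketched in Python. Assumptions: n = 16, a population standard deviation of 5 (any value should give the same t distribution), and 20,000 repeats instead of one million to keep the run short. `simulate_t` is a hypothetical helper, and the plot is replaced by a tail-proportion check against 2.131, the two-sided 5% critical value for 15 degrees of freedom.

```python
import math
import random

random.seed(3)  # fixed seed so the sketch is reproducible


def simulate_t(n, mu=0.0, sigma=5.0, n_sims=20_000):
    """Draw n D scores from a normal population with mean mu, compute t, repeat."""
    t_values = []
    for _ in range(n_sims):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        d_bar = sum(sample) / n
        # Sample standard deviation Sd (n - 1 in the denominator).
        s_d = math.sqrt(sum((d - d_bar) ** 2 for d in sample) / (n - 1))
        # t = (D-bar - muD) / Sd-bar, with Sd-bar = Sd / sqrt(n).
        t_values.append((d_bar - mu) / (s_d / math.sqrt(n)))
    return t_values


t_values = simulate_t(n=16)

# Share of simulated t values beyond +/-2.131; this should be close to 0.05.
tail = sum(abs(t) > 2.131 for t in t_values) / len(t_values)

print(f"proportion with |t| > 2.131: {tail:.3f}")
```

Changing `sigma` should leave the tail proportion essentially unchanged, which is the point of standardizing by Sd.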