t tests Flashcards
two ways to generate two sets of scores
- repeated measures design
- independent groups design
repeated measures design
- every subject is exposed to each treatment condition and scores measured
- comparisons are between scores for the same individuals under different conditions
- analyze using paired t test
independent groups design
- each subject is exposed to a single treatment condition and scores measured
- comparisons are between scores from different individuals under different conditions
- analyze using the independent t test
what are some sources of variation between scores?
effects of treatment
individual differences
- differences in baseline score
- differences in responsiveness to treatment
effects associated with uncontrolled variables
measurement error
repeated measures eliminates variation due to individual differences!
repeated measures: calculating difference (D) scores
Di = yBi - yAi
difference = (condition B) - (condition A)
this converts two sets of scores (conditions A and B) into a single set of scores (D)
- instead of comparing variation between two sets of raw scores, we can examine variation in a single set of D scores
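a minimal sketch in R (hypothetical study1 data frame with one column per condition, matching the t test examples later on):
study1 <- data.frame(a = c(10, 12, 9, 14),   # scores under condition A
                     b = c(13, 15, 11, 16))  # scores under condition B
D <- study1$b - study1$a   # one D score per subject
mean(D)                    # D-bar, the mean difference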
is there an effect of treatment?
Is there a difference between scores measured under the two conditions?
if there is no effect of treatment, the two sets of scores are identical
Di = yBi - yAi = 0
we wouldn't expect all D scores to be exactly zero, but the mean of all D scores should be approximately 0
null hypothesis: H0: muD=0
how to test null hypothesis
determine the probability that the observed sample mean (D-bar) would have been obtained from a population where muD = 0
- p value is the probability of obtaining observed data, if H0 is true
2 approaches to test H0
- 95% CIs: use the sample data to obtain a sampling distribution (similar to the DSM) with a mean of D-bar, then determine the location of muD = 0 (H0) within this distribution
- hypothesis testing: position the same sampling distribution with a mean of muD = 0 (H0), then determine the location of the observed D-bar within this distribution
GLM for D scores
the GLM is:
Di = D-bar + errori
based on D scores, not raw scores
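one way to see this model in R: an intercept-only lm() fit to the D scores estimates D-bar and tests it against 0 (a sketch with hypothetical D scores):
D <- c(3, 3, 2, 2, 4, 1)  # hypothetical D scores

# intercept-only GLM: Di = b0 + errori, where b0 estimates D-bar
fit <- lm(D ~ 1)
summary(fit)  # the intercept's t and p values match t.test(D)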
bootstrapping 95% CI for muD
- use the observed D scores to generate an infinite hypothetical population of D scores
- randomly sample n = 16 D scores (matching the original sample size) from the population and calculate D-bar
- repeat many times, generating a new random sample and calculating D-bar each time
- plot the resulting distribution of D-bar values
- instead of the distribution of sample means (DSM), this is the distribution of mean differences (DMD)
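a sketch of this resampling loop in R (hypothetical D scores; resampling the observed scores with replacement stands in for sampling from the infinite hypothetical population):
set.seed(1)
D <- rnorm(16, mean = 2, sd = 3)   # hypothetical sample of n = 16 D scores

# resample n D scores with replacement and take the mean, many times
DMD <- replicate(10000, mean(sample(D, size = length(D), replace = TRUE)))

hist(DMD)  # the distribution of mean differences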
bootstrapping 95% CI in R
boot.t.test()
- calculates the standard deviation of the DMD (the standard error)
- gives a precise p value for the probability of obtaining the observed data, if the null hypothesis is true
how to test H0
DMD estimates the distribution of all sample means that would be obtained from a population that matched our sample
we can locate 0 in the distribution and focus on the difference between D-bar and 0
subtract the value of D-bar from all values in the distribution
- this preserves the difference between D-bar and 0, but now 0 is the mean of the DMD
if D-bar - muD is small, the observed data are likely if H0 is true
if D-bar - muD is large, the observed data are unlikely if H0 is true
standardizing D-bar - muD
convert scores to z scores
zi = (yi - y-bar) / Sy
subtract the mean of the DMD (0) then divide by standard deviation of the DMD
the standard deviation of the DMD represents the average value of D-bar - muD expected due to sampling variation
- the standard deviation is, in effect, the average deviation score
standardized values tell us how large the observed difference between D-bar and muD = 0 is, relative to the average difference expected if H0 is true
- e.g., if the standardized value is 2, the observed difference D-bar - muD is twice as large as the average difference that would be expected due to sampling variation, if H0 is true
central limit theorem and t statistics
t = (D-bar - muD) / Sd-bar = (D-bar - muD) / (Sd / sqrt(n)); under H0, muD = 0, so t = D-bar / Sd-bar
if t=2, the observed difference between D-bar and H0 is twice as large as the average difference expected due to sampling variation
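as a sketch in R (hypothetical D scores; muD = 0 under H0):
D <- c(3, 3, 2, 2, 4, 1, 3, 2)   # hypothetical D scores
n <- length(D)

se     <- sd(D) / sqrt(n)        # Sd-bar, the standard error
t_stat <- (mean(D) - 0) / se     # muD = 0 under H0
t_stat                           # matches t.test(D)$statistic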
t distribution
start with a normally-distributed population of D scores with muD = 0
- the population has a normal distribution, an assumption of the CLT
- the population can have any standard deviation, as the t statistic standardizes the value of D-bar - muD based on the observed value of Sd
define the sample size (n)
randomly sample n D scores from the population and calculate t for the sample: t = (D-bar - muD) / Sd-bar
repeat one million times and plot the distribution
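a sketch of this simulation in R (a normal population with muD = 0; the sd of 5 is arbitrary, since t standardizes by the observed Sd):
set.seed(1)
n <- 16

one_t <- function() {
  D <- rnorm(n, mean = 0, sd = 5)     # sample n D scores from the population
  (mean(D) - 0) / (sd(D) / sqrt(n))   # t = (D-bar - muD) / Sd-bar
}

t_vals <- replicate(1e6, one_t())
hist(t_vals, breaks = 200)  # approximates the t distribution with df = n - 1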
what does the t distribution represent ?
all values of t that would be expected, given the sample size, if H0: muD = 0 is true
what does the shape of the t distribution depend on?
sample size!
when calculating t, Sd is being used to estimate the corresponding population parameter
- estimate is less accurate for samples with smaller n
- t statistics will therefore be less precise for smaller n, resulting in some estimates of t that are unusually large or small
- the result is that t distributions for smaller n have wider tails
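this can be seen by overlaying t densities for different df in R:
# t densities: smaller df (smaller n) gives wider tails
curve(dt(x, df = 30), from = -4, to = 4, ylab = "density")
curve(dt(x, df = 5), add = TRUE, lty = 2)
curve(dt(x, df = 2), add = TRUE, lty = 3)
legend("topright", legend = c("df = 30", "df = 5", "df = 2"), lty = 1:3)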
applying the CLT t statistic: 95% CI
for the 95% CI, we use the same t distribution but rescale it so it has a mean of D-bar and a standard deviation of Sd-bar
- 95% CI defined by the boundaries of the central 95% of this distribution
95%CI = D-bar +/- (tcrit x Sd-bar)
using R to find the 95% CI
- find tcrit:
qt(p = 0.025, df = 15)  # lower
qt(p = 0.975, df = 15)  # upper
- insert the values into the equation:
95% CI = D-bar +/- (tcrit x Sd-bar)
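putting it together as a sketch in R (hypothetical D scores with n = 16, so df = 15):
set.seed(1)
D <- rnorm(16, mean = 2, sd = 3)   # hypothetical D scores

D_bar  <- mean(D)
se     <- sd(D) / sqrt(length(D))  # Sd-bar
t_crit <- qt(p = 0.975, df = 15)   # tcrit for the 95% CI

c(D_bar - t_crit * se, D_bar + t_crit * se)  # matches t.test(D)$conf.int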
paired t test in R
t.test(study1$b, study1$a, paired = TRUE)
t.test(study2$b, study2$a, paired = TRUE)
- runs a paired t test for the difference between conditions a and b
results:
t statistic, df, p value, 95% CI, mean difference (D-bar)
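these can be pulled out of the object that t.test() returns (an htest list); a sketch reusing the hypothetical study1 data frame from above:
study1 <- data.frame(a = c(10, 12, 9, 14), b = c(13, 15, 11, 16))  # hypothetical

res <- t.test(study1$b, study1$a, paired = TRUE)
res$statistic  # t
res$parameter  # df
res$p.value    # p value
res$conf.int   # 95% CI
res$estimate   # mean difference (D-bar)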
bootstrapping t test in R
boot.t.test(study1$b, study1$a, paired = TRUE, R = 100000)
boot.t.test(study2$b, study2$a, paired = TRUE, R = 100000)
- from the MKinfer package
results:
p value, standard deviation of the DMD (standard error), 95% CI
independent groups design
compare two sets of scores generated through independent groups design
ex. x = grouping variable (CTRL, DRUG), y = all measured scores
fitting GLM with independent groups designs
recode xi with 0 or 1
- CTRL= 0
- DRUG= 1
use the model that predicts the value of y:
y-hati = b0 + b1xi
GLM is a linear model
- b0 is intercept, b1 is slope
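a sketch of this fit in R (hypothetical data; lm() dummy-codes the grouping factor as 0/1 internally):
# hypothetical independent-groups data, matching the t.test examples below
independent <- data.frame(
  x = rep(c("CTRL", "DRUG"), each = 4),
  y = c(5, 6, 5, 7, 8, 9, 7, 10)
)

# b0 (intercept) estimates the CTRL mean; b1 (slope) estimates DRUG - CTRL
fit <- lm(y ~ x, data = independent)
summary(fit)  # the slope's t test matches t.test(y ~ x, var.equal = TRUE)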
hypothesis testing
CTRL is referred to as the 0 group, DRUG as the 1 group
H0: no difference between groups
- no difference between the means of the populations
H0: mu1 - mu0 = 0
H1: mu1 - mu0 ≠ 0
bootstrapping with independent t test
group 0 is expanded to infinite hypothetical pop
group 1 is expanded to infinite hypothetical pop
n0 scores are sampled at random from hyp pop 0
n1 scores are sampled at random from hyp pop 1
calculate the mean of each sample, then the difference between the sample means
repeat many times, then plot the distribution of the difference between sample means (DDSM)
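a sketch of this loop in R (hypothetical group scores; resampling with replacement stands in for the infinite hypothetical populations):
set.seed(1)
y0 <- rnorm(8, mean = 5, sd = 2)   # hypothetical CTRL scores (n0 = 8)
y1 <- rnorm(8, mean = 7, sd = 2)   # hypothetical DRUG scores (n1 = 8)

# resample each group separately, then take the difference in sample means
DDSM <- replicate(10000, {
  mean(sample(y1, replace = TRUE)) - mean(sample(y0, replace = TRUE))
})

hist(DDSM)  # distribution of the difference between sample means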
CLT and independent t tests
NHST: mean of t distribution is 0
standard deviation of t distribution is the standard error
- the observed value of (y-bar1 - y-bar0) has to be standardized by dividing by the standard error
paired vs independent t statistic equation
paired:
t = (observed difference between D-bar and H0) / (average difference between D-bar and H0 expected due to sampling variation) = (D-bar - muD) / Sd-bar
independent
t = (observed difference between (y-bar1 - y-bar0) and H0) / (average difference between (y-bar1 - y-bar0) and H0 expected due to sampling variation) = ((y-bar1 - y-bar0) - (mu1 - mu0)) / S(y-bar1 - y-bar0)
S(y-bar1 - y-bar0)
with paired t statistic, we had one set of scores (D)
with the independent design, there's variation in (y-bar1 - y-bar0) due to variation in both y1 and y0
variance sum law: if we sum (or subtract) multiple independent values of y, the variance of the result will be the sum of the variances of each value of y
- the variance of (y-bar1 - y-bar0) will be the sum of the variances of y-bar1 and y-bar0
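a quick numerical check of the variance sum law in R (two independent simulated variables):
set.seed(1)
y1 <- rnorm(1e5, sd = 2)   # variance approx. 4
y0 <- rnorm(1e5, sd = 3)   # variance approx. 9

var(y1 - y0)               # approx. 13 = 4 + 9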
independent t statistic
t = ((y-bar1 - y-bar0) - (mu1 - mu0)) / sqrt((Sy1^2 / n1) + (Sy0^2 / n0))
Welch's t statistic
under H0, mu1 - mu0 = 0, so that term drops out:
t = (y-bar1 - y-bar0) / sqrt((Sy1^2 / n1) + (Sy0^2 / n0))
pooled variance Sp^2
according to H0, there’s no difference between our two samples
- both samples came from the same population
if the two samples came from the same population, Sy1^2 and Sy0^2 both estimate the same population variance parameter
so we can generate a single sample statistic to estimate the population variance
pooled variance equation
Sp^2 = (Sum(y1i - y-bar1)^2 + Sum(y0i - y-bar0)^2) / ((n1 - 1) + (n0 - 1))
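as a sketch in R (hypothetical group scores):
y1 <- c(8, 9, 7, 10)   # hypothetical DRUG scores
y0 <- c(5, 6, 5, 7)    # hypothetical CTRL scores

ss1 <- sum((y1 - mean(y1))^2)   # Sum(y1i - y-bar1)^2
ss0 <- sum((y0 - mean(y0))^2)   # Sum(y0i - y-bar0)^2

sp2 <- (ss1 + ss0) / ((length(y1) - 1) + (length(y0) - 1))
sp2  # pooled variance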
two-sample vs Welch's t test
the two-sample t test is appropriate if the samples have similar variances
Welch's t test is appropriate if the samples have very different variances
- when the variances differ, the estimate of the population variance (sigma^2) will be less accurate, so the t statistic will be more prone to error
- to compensate, Welch's t test uses a t distribution with fewer df than the two-sample t test to convert the test statistic to a p value
- this makes it more difficult to achieve statistical significance, counteracting the increased type I error rate that would result from the increased potential for error
independent test in R
t.test(y ~ x, data = independent)
- (outcome ~ predictor) is the formula
- by default, R runs Welch's t test (doesn't assume equal variances)
to run the regular two-sample t test, modify the var.equal argument:
- t.test(y ~ x, data = independent, var.equal = TRUE)
bootstrap
- boot.t.test(y ~ x, data = independent, R = 100000)
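a self-contained run comparing the three tests, with hypothetical simulated data (boot.t.test needs the MKinfer package installed):
library(MKinfer)  # provides boot.t.test(); install.packages("MKinfer") if needed

set.seed(1)
independent <- data.frame(
  x = rep(c("CTRL", "DRUG"), each = 8),
  y = c(rnorm(8, mean = 5, sd = 2), rnorm(8, mean = 7, sd = 2))
)

t.test(y ~ x, data = independent)                    # Welch's (default)
t.test(y ~ x, data = independent, var.equal = TRUE)  # two-sample, pooled variance
boot.t.test(y ~ x, data = independent, R = 100000)   # bootstrap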