t tests Flashcards

1
Q

two ways to generate two sets of scores

A
  • repeated measures design
  • independent groups design
2
Q

repeated measures design

A
  • every subject is exposed to each treatment condition and scores are measured
  • comparisons are between scores for the same individuals under different conditions
  • analyze using paired t test
3
Q

independent groups design

A
  • each subject is exposed to a single treatment condition and scores are measured
  • comparisons are between scores from different individuals under different conditions
  • analyze using the independent t test
4
Q

what are some sources of variation between scores?

A

effects of treatment

individual differences
- differences in baseline score
- differences in responsiveness to treatment

effects associated with uncontrolled variables
measurement error

a repeated measures design eliminates variation due to individual differences!

5
Q

repeated measures: calculating difference (D) scores

A

Di = yBi - yAi

difference = (condition B score) - (condition A score)

this converts two sets of scores (conditions A and B) into a single set of scores (D)
- we can then look at variation among the D scores rather than comparing variation between the two sets of scores
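a rough R sketch of this conversion (the data frame name study1 and the made-up scores are assumptions, echoing the t.test examples later in the deck):

# hypothetical example data: 16 subjects measured under conditions a and b
set.seed(1)
study1 <- data.frame(a = rnorm(16, mean = 10, sd = 2),
                     b = rnorm(16, mean = 12, sd = 2))
D <- study1$b - study1$a   # Di = yBi - yAi, one difference score per subject
mean(D)                    # D-bar, the mean difference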

6
Q

is there an effect of treatment?

A

Is there a difference between scores measured under the two conditions?

if there is no effect of treatment, the sets of scores are identical
Di = yBi - yAi = 0

we wouldn't expect every D score to be exactly zero, but the mean of all D scores should be approximately 0

null hypothesis: H0: muD = 0

7
Q

how to test null hypothesis

A

determine the probability that the observed sample mean (D-bar) would have been obtained from a population where muD = 0
- the p value is the probability of obtaining the observed data if H0 is true

8
Q

2 approaches to test H0

A
  1. use the sample data to obtain a sampling distribution (similar to the DSM) with a mean of D-bar
    - determine the location of muD = 0 (H0) within this distribution
    - calculate the 95% CI
  2. position the same sampling distribution with a mean of muD = 0 (H0)
    - determine the location of the observed D-bar within this distribution
    - hypothesis testing
9
Q

GLM for D scores

A

GLM is:
Di = D-bar + error

based on D scores, not raw scores
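a quick way to see this model in R (a sketch reusing the hypothetical D scores from the earlier card): an intercept-only linear model fitted to D returns D-bar as its single estimate.

# D is the vector of difference scores, e.g. D <- study1$b - study1$a
fit <- lm(D ~ 1)       # intercept-only GLM: Di = D-bar + error
coef(fit)              # the intercept equals mean(D)
head(residuals(fit))   # the error term for each subject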

10
Q

bootstrapping 95% CI for muD

A
  1. use the observed D scores to generate an infinite hypothetical population of D scores
  2. randomly sample n = 16 D scores (matching the original sample size) from the population and calculate D-bar
  3. repeat many times, generating a new random sample and calculating D-bar each time
  4. generate a distribution of D-bar (see the R sketch below)
    - instead of the distribution of sample means (DSM), this is the distribution of mean differences (DMD)
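a rough sketch of these steps in base R (assumes D is the vector of 16 observed difference scores from the earlier sketch; a real analysis would use boot.t.test(), as on the next card):

set.seed(1)
boot_dbars <- replicate(10000,                        # step 3: repeat many times
  mean(sample(D, size = length(D), replace = TRUE)))  # steps 1-2: resample n = 16 D scores, take D-bar
hist(boot_dbars)                                      # step 4: the distribution of mean differences (DMD)
quantile(boot_dbars, probs = c(0.025, 0.975))         # percentile 95% CI for muD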
11
Q

bootstrapping 95% CI in R

A

boot.t.test()
- calculates the standard deviation of the DMD (the standard error)
- gives a precise p value for the probability of obtaining the observed data if the null hypothesis is true

12
Q

how to test H0

A

the DMD estimates the distribution of all sample means that would be obtained from a population that matched our sample

we can locate 0 in the distribution and focus on the difference between D-bar and 0

subtract the value of D-bar from all values in the distribution
- this preserves the difference between D-bar and 0, but now 0 is the mean of the DMD

if D-bar - muD is small, the observed data are likely if H0 is true

if D-bar - muD is large, the observed data are unlikely if H0 is true

13
Q

standardizing D-bar - muD

A

convert scores to z scores

zyi = (yi - y-bar) / Sy

subtract the mean of the DMD (0), then divide by the standard deviation of the DMD

the standard deviation of the DMD represents the average value of D-bar - muD
- the standard deviation essentially calculates the average deviation score

the standardized value tells us how the observed difference between D-bar and muD = 0 compares with the average difference that would be expected if H0 is true
- e.g. a standardized value of 2 means the observed difference between D-bar and muD is twice as large as the average difference that would be expected due to sampling variation if H0 is true

14
Q

central limit theorem and t statistics

A

t = D-bar / Sd-bar = D-bar / (Sd / sqrt(n))

if t=2, the observed difference between D-bar and H0 is twice as large as the average difference expected due to sampling variation
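a minimal check of this in R (assuming the hypothetical D vector from the earlier sketches):

n <- length(D)
t_stat <- mean(D) / (sd(D) / sqrt(n))   # t = D-bar / (Sd / sqrt(n)), with muD = 0 under H0
t_stat                                  # should match t.test(D)$statistic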

15
Q

t distribution

A

start with a normally-distributed population of D scores with muD = 0
- the population has a normal distribution, an assumption of the CLT
- the population can have any standard deviation, as the t statistic standardizes the values of D-bar - muD based on the observed value of Sd

define the sample size (n)

randomly sample n D scores from the population and calculate t for the sample (t = (D-bar - muD) / Sd-bar)

repeat one million times and plot the distribution
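a sketch of this simulation in R (100,000 repeats instead of a million to keep it quick; n = 16 is an assumption matching the earlier cards):

set.seed(1)
n <- 16
t_sim <- replicate(100000, {
  d <- rnorm(n, mean = 0, sd = 1)   # sample n D scores from a normal population with muD = 0
  mean(d) / (sd(d) / sqrt(n))       # t for this sample
})
hist(t_sim, breaks = 100, freq = FALSE)   # simulated t distribution
curve(dt(x, df = n - 1), add = TRUE)      # overlays the theoretical t distribution (df = 15)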

16
Q

what does the t distribution represent?

A

all values of t that would be expected, given the sample size, if H0: muD = 0 is true

17
Q

what does the shape of the t distribution depend on?

A

sample size!

when calculating t, Sd is being used to estimate the corresponding population parameter
- the estimate is less accurate for samples with smaller n
- t statistics will therefore be less precise for smaller n, resulting in some estimates of t that are unusually large or small
- as a result, t distributions for smaller n have wider tails (see the sketch below)
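a small R sketch of this (the df values are arbitrary choices for illustration):

curve(dnorm(x), from = -4, to = 4, lty = 3, ylab = "density")   # normal curve for reference
curve(dt(x, df = 30), add = TRUE)                               # larger n: close to normal
curve(dt(x, df = 5), add = TRUE, lty = 2)                       # smaller n: wider tails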

18
Q

applying the CLT: t statistic 95% CI

A

for the 95% CI we take the same t distribution but centre it at D-bar with a standard deviation of Sd-bar
- the 95% CI is defined by the boundaries of the central 95% of this distribution

95% CI = D-bar +/- (tcrit x Sd-bar)

19
Q

using R to find 95% CI

A
  1. find tcrit
    qt(p = 0.025, df = 15)   # lower
    qt(p = 0.975, df = 15)   # upper
  2. insert the values into the equation
    95% CI = D-bar +/- (tcrit x Sd-bar)
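put together, a sketch of the full calculation (assuming the hypothetical D vector with n = 16, so df = 15):

t_crit <- qt(p = 0.975, df = 15)                  # upper tcrit; the lower is just -t_crit
mean(D) + c(-1, 1) * t_crit * sd(D) / sqrt(16)    # 95% CI = D-bar +/- (tcrit x Sd-bar)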
20
Q

paired t test in R

A

t.test(study1$b, study1$a, paired = TRUE)
t.test(study2$b, study2$a, paired = TRUE)
- runs a paired t test on the differences between conditions a and b

results:
t statistic, df, p value, 95% CI, mean difference (D-bar)

21
Q

bootstrapping t test in R

A

boot.t.test(study1$b, study1$a, paired = TRUE, R = 100000)
boot.t.test(study2$b, study2$a, paired = TRUE, R = 100000)
- from the MKinfer package

results:
p value, standard deviation (standard error of the DMD), 95% CI

22
Q

independent groups design

A

compare two sets of scores generated through an independent groups design
e.g. x = grouping variable (CTRL, DRUG), y = all measured scores

23
Q

fitting GLM with independent groups designs

A

recode xi as 0 or 1
- CTRL = 0
- DRUG = 1

use the model that predicts the value of y:
y-hati = b0 + b1xi

the GLM is a linear model
- b0 is the intercept, b1 is the slope
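a sketch of fitting this GLM in R (the data frame name independent echoes the t.test examples at the end of the deck; the scores are made up):

# hypothetical data: x recoded 0 (CTRL) / 1 (DRUG), y = measured scores
set.seed(1)
independent <- data.frame(x = rep(c(0, 1), each = 16),
                          y = c(rnorm(16, mean = 10, sd = 2), rnorm(16, mean = 12, sd = 2)))
fit <- lm(y ~ x, data = independent)
coef(fit)   # b0 (intercept) = mean of the CTRL group, b1 (slope) = difference between group means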

24
Q

hypothesis testing

A

CTRL is referred to as the 0 group and DRUG as the 1 group
H0: no difference between the groups
- no difference between the means of the populations
H0: mu1 - mu0 = 0
H1: mu1 - mu0 ≠ 0

25
Q

bootstrapping with independent t test

A

group 0 is expanded to an infinite hypothetical population
group 1 is expanded to an infinite hypothetical population

n0 scores are sampled at random from hypothetical population 0
n1 scores are sampled at random from hypothetical population 1

calculate the mean of each sample, then calculate the difference between the sample means

repeat many times, then plot the distribution of the difference between sample means (DDSM)
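a rough base-R sketch of this procedure (reusing the hypothetical independent data frame from the GLM sketch):

y0 <- independent$y[independent$x == 0]   # CTRL scores
y1 <- independent$y[independent$x == 1]   # DRUG scores
set.seed(1)
ddsm <- replicate(10000,
  mean(sample(y1, length(y1), replace = TRUE)) -
  mean(sample(y0, length(y0), replace = TRUE)))
hist(ddsm)   # distribution of the difference between sample means (DDSM)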

26
Q

CLT and independent t tests

A

NHST: the mean of the t distribution is 0
the standard deviation of the t distribution is the standard error

  • we have to standardize the observed value of (y-bar1 - y-bar0) by dividing by the standard error
27
Q

paired vs independent t statistic equation

A

paired:
t = (observed difference between D-bar and H0) / (average difference between D-bar and H0 expected due to sampling variation) = (D-bar - muD) / Sd-bar

independent:
t = (observed difference between (y-bar1 - y-bar0) and H0) / (average difference between (y-bar1 - y-bar0) and H0 expected due to sampling variation) = ((y-bar1 - y-bar0) - (mu1 - mu0)) / S(y-bar1 - y-bar0)

28
Q

S(y-bar1 - y-bar0)

A

with the paired t statistic, we had one set of scores (D)

with the independent design there is variation in (y-bar1 - y-bar0) due to variation in both y1 and y0

variance sum law: if we sum multiple values of y, the variance of the sum will be the sum of the variances of each value of y
- the variance of (y-bar1 - y-bar0) will be the sum of the variances of y-bar1 and y-bar0

29
Q

independent t statistic

A

t = ((y-bar1 - y-bar0) - (mu1 - mu0)) / sqrt((Sy1^2 / n1) + (Sy0^2 / n0))

30
Q

Welch’s t statistic

A

t = (y-bar1 - y-bar0) / sqrt((Sy1^2 / n1) + (Sy0^2 / n0))
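a minimal check in R (assuming the y0 and y1 vectors from the bootstrap sketch above):

t_welch <- (mean(y1) - mean(y0)) /
  sqrt(var(y1) / length(y1) + var(y0) / length(y0))
t_welch   # matches t.test(y ~ x, data = independent) up to sign (R takes group 0 minus group 1)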

31
Q

pooled variance Sp^2

A

according to H0, there is no difference between our two samples
- both samples came from the same population

if the two samples came from the same population, Sy1^2 and Sy0^2 both estimate the same population variance parameter

so we generate a single sample statistic to estimate the population variance

32
Q

pooled variance equation

A

Sp^2 = (Sum(y1i - y-bar1)^2 + Sum(y0i - y-bar0)^2) / ((n1 - 1) + (n0 - 1))
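a sketch of this calculation in R (again assuming the hypothetical y0 and y1 vectors from the earlier sketches):

sp2 <- (sum((y1 - mean(y1))^2) + sum((y0 - mean(y0))^2)) /
       ((length(y1) - 1) + (length(y0) - 1))
sp2   # a weighted average of var(y1) and var(y0)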

33
Q

two-sample vs Welch's t test

A

the two-sample t test is appropriate if the samples have similar variances

Welch's t test is appropriate if the samples have very different variances
- in that case the estimate of sigma^2 will be less accurate, so the t statistic will be more prone to error
- to compensate, Welch's t test uses a t distribution with fewer df than the two-sample t test to convert the test statistic to a p value
- this makes it more difficult to achieve statistical significance, counteracting the increased type I error rate that would result from the increased potential for error

34
Q

independent t test in R

A

t.test(y ~ x, data = independent)
- (outcome ~ predictor) is the formula
- by default, R runs Welch's t test (it doesn't assume equal variances)

to run the regular two-sample t test, change the var.equal argument
- t.test(y ~ x, data = independent, var.equal = TRUE)

bootstrap
- boot.t.test(y ~ x, data = independent, R = 100000)