Bivariable Associations Flashcards
if p ≤ .05 then we…
REJECT NULL
if p > .05 then we…
FAIL TO REJECT NULL
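A minimal sketch of this decision rule in Python. The function name `decide` and the default `alpha=0.05` are illustrative assumptions, not part of the cards.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule from the two cards above."""
    # p <= alpha -> reject the null; p > alpha -> fail to reject the null
    if p_value <= alpha:
        return "REJECT NULL"
    return "FAIL TO REJECT NULL"


print(decide(0.03))
print(decide(0.20))
```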
What are the different bivariable statistical methods?
– T-test (independent samples t-test) and
ANOVA
– Chi-square test of independence
– Correlation
What test do you use if you have a categorical variable and a continuous variable? (2 categories)
T-test
What test do you use if you have a categorical variable and a continuous variable? (3 or more categories)
ANOVA
What test do you use if you have two categorical variables?
Chi-square test of independence
What test do you use if you have two continuous variables?
Correlation
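The four cards above amount to a small lookup on variable types. A hedged Python sketch follows; the function name `choose_test` and the string labels "categorical" and "continuous" are assumptions made for illustration only.

```python
from typing import Optional


def choose_test(var1: str, var2: str, n_groups: Optional[int] = None) -> str:
    """Return the bivariable test the cards suggest for two variable types.

    n_groups is the number of categories of the categorical variable and is
    only needed when one variable is categorical and the other continuous.
    """
    kinds = sorted([var1, var2])
    if kinds == ["categorical", "categorical"]:
        return "Chi-square test of independence"
    if kinds == ["continuous", "continuous"]:
        return "Correlation"
    if kinds == ["categorical", "continuous"]:
        if n_groups is None or n_groups < 2:
            raise ValueError("need the number of categories (at least 2)")
        # Two groups -> t-test; three or more groups -> ANOVA
        return "T-test" if n_groups == 2 else "ANOVA"
    raise ValueError("variable types must be 'categorical' or 'continuous'")


print(choose_test("categorical", "continuous", n_groups=2))
print(choose_test("continuous", "categorical", n_groups=4))
```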
What is a critical value
The cutoff value of the test statistic associated with a particular significance level
Remember
– Z tests use the z distribution
– T tests use the t distribution
– ANOVA uses the F distribution
– Chi-square test of independence uses the χ2 distribution.
List each statistic for the test of association
– t statistic (t-test)
– F statistic (ANOVA)
– χ2 statistic (chi-square test of independence)
What is a t test
A statistical procedure that allows us to test whether the means of two groups are the same or different.
What is the null hypothesis for t tests. Put equation and words
–H0: μ1 = μ2
μ1 – μ2 = 0
–In words: the mean of population 1 = the mean of population 2 (tested using the means of sample 1 and sample 2)
–Rejecting the null hypothesis implies that the two sample means are statistically significantly different
• Failure to reject the null hypothesis implies that the two sample means are not statistically significantly different (we treat them as the same).
• And thus we assume this is true for the populations.
what is the t test formula
t = numerator / denominator
For numerator: [(x bar1 – x bar2) – (μ1 – μ2)], where μ1 – μ2 = 0 under H0
For denominator: √[ s2 pooled × ((1 / n1) + (1 / n2)) ]
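A short Python sketch of the pooled t statistic, using only the standard library. It assumes the usual pooled variance, s2 pooled = [(n1 – 1)s1² + (n2 – 1)s2²] / (n1 + n2 – 2), which the card does not spell out, and it takes μ1 – μ2 = 0 as stated by H0. The function name and the sample data are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Independent samples t statistic with a pooled variance (H0: mu1 = mu2)."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: df-weighted average of the two sample variances (assumed formula)
    s2_pooled = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    # Denominator: standard error of the difference between the two means
    se_diff = sqrt(s2_pooled * (1 / n1 + 1 / n2))
    # Numerator: difference in sample means minus 0 (the difference under H0)
    return (mean(sample1) - mean(sample2)) / se_diff


group_a = [5, 7, 9]  # hypothetical data
group_b = [1, 3, 5]  # hypothetical data
print(pooled_t(group_a, group_b))
```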
Two assumptions of the t test
•Both parent populations are normally distributed
• Both populations have equal variance
(homogeneity of variance)
How do you test for homogeneity of variance
– Levene’s test for equality of variances:
• Null hypothesis: variance1 = variance2
• p ≤ 0.05 implies that the variances are different
• p > 0.05 implies that the variances are the same
(– Violations of homogeneity of variance may be ignored as long as the samples being used have equal or
approximately equal sizes)
How do you calculate degrees of freedom for t test (t-test degrees of freedom POOLED)
Degrees of freedom (df) = (n1+n2-2)
How do you calculate t-test degrees of freedom (unequal variances)
df conservative = the smaller of df1 and df2
df1 = (n1 – 1) and df2 = (n2 – 1)
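As a quick worked example of the two degrees-of-freedom rules, here is a small Python sketch. The function names and the sample sizes (12 and 20) are hypothetical.

```python
def df_pooled(n1, n2):
    """Pooled (equal variance) t-test degrees of freedom."""
    return n1 + n2 - 2


def df_conservative(n1, n2):
    """Conservative df for unequal variances: the smaller of df1 and df2."""
    return min(n1 - 1, n2 - 1)


print(df_pooled(12, 20))        # 12 + 20 - 2
print(df_conservative(12, 20))  # smaller of 11 and 19
```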
When should you use ANOVA
Used with 3 or more groups to test for mean differences
What is the null hypothesis for ANOVA
H0: μ1=μ2=μ3
What is the alternative hypothesis for ANOVA
H1: not all of μ1, μ2, μ3 are equal (at least one μi ≠ μj)
What is the null hypothesis for ANOVA in words
All populations have the same mean
What is the alt hypothesis for ANOVA in words
Not all populations have the same mean
What is the statistic for ANOVA
F statistic
If F > Fcrit then we…
reject the null hypothesis, meaning that at least one group mean differs significantly from the others
IF F < Fcrit then we…
fail to reject the null hypothesis because there is not enough evidence that the means differ
What is the summary of procedures for testing bivariable associations
• Step 1: determine what kind of variables you have
• Step 2: determine which statistical procedure to use based
on the two variables
• Step 3: compute the appropriate test statistic
• Step 4: determine if the test statistic exceeds the critical
value (the value that corresponds to p ≤ 0.05)
– If it exceeds the critical value: REJECT the null hypothesis
– If it does not exceed the critical value: FAIL TO REJECT the null
hypothesis
Purpose of ANOVA (f crit)
ANOVA produces an F statistic, which we test for
significance (Is F > Fcrit at the p = .05 level?)
Note about Levene’s test for equality of variances
We use Levene’s test for equality of variances—If the p-value associated with Levene’s test is less than alpha, the homogeneity of variance assumption is violated. If p > α, the homogeneity of variance assumption is met.
What is the ANOVA f statistic
A ratio of the Between Group Variation divided by
the Within Group Variation: F=between/within=MSB/MSW
*A large F is evidence against H0, since it indicates that there is more difference between groups than within groups.
How do you calculate F
Mean Square Between / Mean Square Within
= MSB / MSW
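To make F = MSB / MSW concrete, here is a hedged Python sketch that computes it from raw group data. It assumes the standard definitions MSB = SSB / (k – 1) and MSW = SSW / (N – k), matching the degrees of freedom given elsewhere in these cards. The function name and the data are made up.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic: Mean Square Between / Mean Square Within."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total N
    grand_mean = mean([x for g in groups for x in g])
    # Between-group sum of squares: group means around the grand mean
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return msb / msw


groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # hypothetical data
print(anova_f(groups))
```

With these numbers SSB = 42 and SSW = 6, so MSB = 21, MSW = 1, and F = 21.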
For f crit, we need to know two things:
-The # of groups
– The total N
How do we find f crit
We look across the top of the table for the numerator df
and along the side for the denominator df
How to calculate degrees of freedom for ANOVA (there are two: one for the numerator and one for the denominator)
– Numerator df: # groups – 1 (K – 1)
– Denominator df: N – K, where N = total sample size (how many people are in the study) and K = # groups
– Written as F(dfn, dfd)
– Ex: N = 45; 4 groups: dfn = 4 – 1 = 3, dfd = 45 – 4 = 41. We look at DF (3, 41)
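The card's example (N = 45, 4 groups) can be checked with a couple of lines of Python. The function name `anova_df` is an illustrative assumption.

```python
def anova_df(n_total, n_groups):
    """Return (numerator df, denominator df) for a one-way ANOVA."""
    df_numerator = n_groups - 1          # groups - 1
    df_denominator = n_total - n_groups  # N - K
    return df_numerator, df_denominator


print(anova_df(45, 4))  # the card's example: DF (3, 41)
```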
Which test do you use if you have two continuous variables
Correlation: Pearson Product Moment Correlation
Steps to complete t-test:
– State the null hypothesis (H0: μ1 = μ2)
– Assess Levene’s test (H0: variance1 = variance2)
– Compute appropriate t statistic and df
• Fail to reject levene’s -> equal variance -> pooled method
– df = n1 + n2 – 2
• Reject levene’s -> unequal variance -> unequal variance method
– dfconservative = the smaller of df1 and df2
– Compare t-test statistic to critical value
• If t-test stat > t crit -> reject the null
• If t-test stat < t crit -> fail to reject the null
Steps to complete ANOVA test:
Compute F statistic and use F distribution
• df
– Numerator: Groups-1
– Denominator: N – (# Groups)
– At the given df at p < .05
» if F > Fcrit, then we reject null hypothesis
What does correlation test measure
– Strength of the relationship: strong, moderate, weak, or no
relationship
– Direction of the relationship: positive (+) or negative (-)
(a measure of the linear association between two variables, X and Y)
What does -1, 0, and 1 represent for correlation test
– +1 is total positive correlation; all the data points fall on a
line with positive slope.
– 0 is no correlation,
– −1 is total negative correlation ; all the data points fall on a line with a negative slope.
What does it mean if r is greater than or equal to .7, is greater than or equal to .3 but is less than .7, and is less than .3
– |r| ≥ 0.7 indicates a strong association
– 0.3 ≤ |r| < 0.7 indicates a moderate association
– |r| < 0.3 indicates a weak association
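These cutoffs translate directly into code. A minimal Python sketch, where the function name and the returned labels are assumptions chosen to mirror the card:

```python
def describe_r(r):
    """Label the strength and direction of a correlation coefficient r."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    else:
        strength = "weak"
    # Direction comes from the sign; r == 0 has no direction
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return strength, direction


print(describe_r(0.82))
print(describe_r(-0.45))
```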
What is the symbol of Pearson Product Moment Correlation Coefficient
r
What is the null hypothesis for PPM correlation test? Equation and words
H0: ρ = 0
HA: ρ ≠ 0
– If we reject, we are saying that r is significantly different from 0 and thus there
is a real association between the variables
– If we fail to reject, we are saying that r is not significantly different from 0 and thus
we have no evidence of a real association between the variables
The significance of the correlation (r) is tested using a t- statistic:
tstat = r/SEr
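A hedged sketch of this calculation in Python. The card does not define SEr; the code assumes the standard formula SEr = √[(1 – r²) / (n – 2)]. The function name is hypothetical.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, with df = n - 2."""
    # Standard error of r (assumed formula; not given on the card)
    se_r = sqrt((1 - r ** 2) / (n - 2))
    return r / se_r


# r = .82 with n = 131 gives df = 129
print(correlation_t(0.82, 131))
```

The resulting t statistic is then compared with tcrit at df = n – 2.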
How do you calculate D.O.F. for PPM
n – 2, where n is the total sample size
If tstat ≥ tcrit at p < .05 and df=n-2, then we…
REJECT NULL
If tstat < tcrit at p < .05 and df=n-2, then we…
FAIL TO REJECT NULL
What are the two ways that we report PPM
r (df) = XXX p < .05 OR r (df) = XXX p > .05
Ex: (r (129) = .82, p < .05)
r = .82, df = 129 , p < .05 significant
When do you use chi square test of independence
when you are measuring two categorical variables
What are the parametric tests and what does parametric mean
t test, ANOVA, PPM
Parametric means the test makes assumptions about the shape or form of the probability distribution from which the data were drawn
What are non parametric tests and list an ex.
A family of tests that do not rely on assumptions about the shape or form of the probability distribution from which the data were drawn
Chi square
What is a contingency table
A two-way table showing the cross-tabulations between two variables where the variables have been classified into mutually exclusive categories and the cell entries are frequencies.
what is the symbol for expected frequencies and how do you determine this.
fe
-Add the columns and the rows to get the totals.
-Multiply the row total and the column total for the cell in
question and then divide that product by the Total
number of all respondents.
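The expected-frequency rule (row total × column total ÷ grand total) can be sketched in Python as follows. The function name and the 2×2 table of counts are made up for illustration.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


observed = [[10, 20],
            [30, 40]]  # hypothetical contingency table
print(expected_frequencies(observed))
```

Here the row totals are 30 and 70, the column totals are 40 and 60, and the grand total is 100, so the top-left expected frequency is 30 × 40 / 100 = 12.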
What is the null hypothesis for chi square. equation and words.
H0: the two variables are independent (not related).
How do you test null hypothesis for chi square
– Compute chi-square test statistic χ2
– Determine df
– Determine if χ2 > χ2 critical at p ≤ .05
What is alternative hypothesis for chi square
Ha: variables are dependent/related
How do you calculate D.O.F. for chi square
(rows-1) * (columns -1)
How do you calculate chi square stat
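(1) the sum over all cells of
(2) the difference between the observed value and the
expected value SQUARED, which is then
(3) divided by the expected frequency.
In symbols: χ2 = Σ [(fo – fe)2 / fe], where fo is the observed frequency and fe is the expected frequency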
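Putting the three steps together with the degrees-of-freedom rule, a minimal Python sketch might look like this. The function name and the counts are hypothetical; the critical value 3.841 is the standard χ2 table value for df = 1 at p = .05.

```python
def chi_square(observed):
    """Return (chi-square statistic, df) for a contingency table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, f_obs in enumerate(row):
            # Expected frequency: row total * column total / grand total
            f_exp = row_totals[i] * col_totals[j] / grand_total
            # (observed - expected) squared, divided by expected, summed over cells
            stat += (f_obs - f_exp) ** 2 / f_exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


stat, df = chi_square([[10, 20], [30, 40]])  # hypothetical counts
CRITICAL_DF1 = 3.841  # chi-square critical value for df = 1 at p = .05
print(stat, df, "REJECT NULL" if stat > CRITICAL_DF1 else "FAIL TO REJECT NULL")
```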
What is the ANOVA F test in regression for and what is its null hypothesis, what is its equation
Used to test whether y is linearly related to x
• H0: β1=0 Ha: β1≠0
The null hypothesis states that y is not linearly related to x
• F = MSR/MSE
(explained variance / unexplained variance)
DFR = number of covariates (for simple linear regression = 1)
DFE = DFT – DFR = n – 2 in simple linear regression (DFT is n – 1 because only one mean is calculated)
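For simple linear regression, F = MSR / MSE can be computed by hand from a least-squares fit. The Python sketch below assumes ordinary least squares for the slope and intercept (not stated on the card). The function name and the data are made up.

```python
from statistics import mean


def regression_f(x, y):
    """F = MSR / MSE for simple linear regression, with DFR = 1 and DFE = n - 2."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # Ordinary least squares slope (b1) and intercept (b0)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # explained variation
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # unexplained variation
    dfr, dfe = 1, n - 2
    return (ssr / dfr) / (sse / dfe), dfr, dfe


x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]  # hypothetical data
print(regression_f(x, y))
```

With these numbers SSR = 3.6 and SSE = 2.4, so MSR = 3.6, MSE = 2.4 / 3 = 0.8, and F = 4.5 on (1, 3) degrees of freedom.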