midterm Flashcards
what is the p value
- the probability of obtaining a result as extreme or more extreme than the observed result, assuming the null hypothesis is true
- not the significance level - that is ⍺, the cutoff we compare p against
p <= ⍺ vs p > ⍺
p <= ⍺ : reject H0
p > ⍺ : retain H0
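The decision rule can be sketched in Python - a minimal example using a one-sample t test, where the sample scores and the null mean of 5.0 are made up:

```python
# Minimal sketch of the p <= alpha decision rule (hypothetical data).
from scipy import stats

alpha = 0.05
sample = [5.1, 4.9, 5.8, 6.2, 5.5, 4.7, 5.9, 6.1]  # made-up scores
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)  # H0: mu = 5.0

decision = "reject H0" if p_value <= alpha else "retain H0"
```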
what does it mean when H0 is rejected
the result is statistically significant - the observed data would be unlikely if H0 were true, so we infer an effect in the population
what does a confidence interval mean?
if we repeated our experiment many times and built an interval each time, a certain percentage of those intervals would contain the true population value
ex. a 95% confidence interval means 95% of intervals constructed this way would capture the true value
how do we form confidence intervals?
confidence level = {(1 - ⍺) x 100}%
ex. ⍺ = .05 gives a 95% CI, ⍺ = .01 gives a 99% CI
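Building such an interval can be sketched with scipy's t interval (the data values here are made up):

```python
# Sketch: a (1 - alpha) x 100% confidence interval for a mean.
import numpy as np
from scipy import stats

data = [4.8, 5.2, 5.9, 4.5, 5.6, 5.1, 6.0, 5.4]  # made-up scores
alpha = 0.05
conf_level = 1 - alpha  # 0.95 -> a 95% CI

mean = np.mean(data)
sem = stats.sem(data)  # standard error of the mean
ci_low, ci_high = stats.t.interval(conf_level, len(data) - 1, loc=mean, scale=sem)
```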
what happens to CIs as sample size increases
our estimates become more precise and CIs become narrower
what happens to CIs as ⍺ decreases?
CIs become larger or wider
effect sizes for pearson r, correlation ratio squared (R^2), and cohen's d:
        small   medium   large
r       0.10    0.30     0.50
R^2     0.01    0.09     0.25
d       0.2     0.5      0.8
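Cohen's d can be computed from two samples with the pooled standard deviation - a sketch, where cohens_d is a hypothetical helper and the inputs are made up:

```python
# Sketch: Cohen's d using the pooled SD (hypothetical helper).
import numpy as np

def cohens_d(x, y):
    nx, ny = len(x), len(y)
    # pooled variance weights each sample variance by its df
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)
```

By the benchmarks above, a d near 0.5 would count as a medium effect.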
Type I and Type II errors
Type I: reject H0 when it is true - false positive
Type II: retain H0 when it is false - false negative
what is ⍺ and β in errors in hypothesis testing
alpha: probability of committing type I error
beta: probability of committing type II error
what is power?
the probability of correctly rejecting a false H0
1 - β
alpha and beta relationship
higher alpha means lower beta - a less conservative test, so power to reject the null is higher
but it also means a higher probability of a false positive - a tradeoff, good and bad
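The alpha/beta tradeoff can be made concrete with the power of a one-sided z test for a mean - a sketch, where ztest_power is a hypothetical helper and the effect size and n are made up:

```python
# Sketch: power (1 - beta) of a one-sided z test, to show that
# raising alpha lowers beta (hypothetical helper and inputs).
import numpy as np
from scipy.stats import norm

def ztest_power(effect_size, n, alpha=0.05):
    z_crit = norm.ppf(1 - alpha)  # rejection cutoff under H0
    return 1 - norm.cdf(z_crit - effect_size * np.sqrt(n))  # 1 - beta

low_alpha_power = ztest_power(0.5, 25, alpha=0.01)
high_alpha_power = ztest_power(0.5, 25, alpha=0.10)  # less conservative
```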
assumptions of a single mean t test
variable, X, is normally distributed
independence of observations
t stats follow the t distribution - which approaches the normal distribution as sample size (df) gets bigger
between subjects design
independent samples - each participant only goes through one of two conditions in an experiment
correlated samples
dependent subjects, paired samples, repeated measures - participants go through both conditions
independent samples t test assumptions
- dependent variable normally distributed
- standard deviations of both populations are the same - homogeneity of variance
- each subject is independent
what is s^2 under homogeneity of variance?
a pooled estimate of within group variance
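The pooled estimate weights each group's sample variance by its degrees of freedom - a sketch, where pooled_variance is a hypothetical helper taking the two sample variances and group sizes:

```python
# Sketch: pooled within-group variance under homogeneity of variance
# (hypothetical helper; s1_sq and s2_sq are sample variances).
def pooled_variance(s1_sq, n1, s2_sq, n2):
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
```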
Anova: levels
treatments - different values or categories of independent variable/factor
ex. instruction method - 1. in person, 2. online, 3. hybrid, etc
single factor (one way) designs
a single IV with two or more levels
- can be repeated measures or independent groups design
factorial designs
more than one independent variable with two or more levels
- multiply the number of levels in each factor with each other
ex. two factors: the 1st has two levels, the 2nd has three = a 2x3 design with 6 conditions
One way anova assumptions
DV distribution is normal within each group
variance of population distributions are equal for each group - homogeneity of variance
independence of observations
null and alternative for one way anova
H0: µ1 = µ2 = µ3
H1: not all µ’s are the same
familywise type I error rate
probability of making at least one Type I error in the family of tests if the null hypotheses are true
= 1-((1-alpha)^c)
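The formula is easy to evaluate - a sketch, where familywise_error is a hypothetical helper and c is the number of independent tests:

```python
# Sketch: familywise Type I error rate across c independent tests
# (hypothetical helper).
def familywise_error(alpha, c):
    return 1 - (1 - alpha) ** c
```

Even with ⍺ = .05, three tests already push the familywise rate to about .14.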
what do we do if the overall F test is significant in one way Anova
we recommend post hoc tests
MSM
one of the two sources of variance in anova - variance explained by the model
the between-groups variance that is due to the IV or different treatments/levels of a factor
MSR
variance within groups, or residual variance
within each group, there is some random variation in the scores for subjects
F stat
assessing the relative magnitude of variance explained by the model and residual variance
large F value means a greater difference between groups - MSM is large compared to MSR
F distributions tend to be right skewed
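The two variance sources can be computed by hand and checked against scipy - a sketch with made-up group scores:

```python
# Sketch: MSM, MSR, and F computed from sums of squares (made-up data).
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([1.0, 2.0, 3.0])]
k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

ss_model = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between groups
ss_resid = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within groups
ms_model = ss_model / (k - 1)   # MSM
ms_resid = ss_resid / (N - k)   # MSR
f_stat = ms_model / ms_resid
```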
F tests and T tests when the number of groups is 2
F=t^2
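The F = t^2 identity can be checked directly with scipy (the two groups below are made up):

```python
# Sketch: with two groups, one-way ANOVA F equals t^2 from an
# independent samples t test (made-up data).
from scipy import stats

g1 = [4.0, 5.1, 6.2, 5.5, 4.8]
g2 = [6.3, 7.0, 5.9, 6.8, 7.4]

t_stat, p_t = stats.ttest_ind(g1, g2)   # assumes equal variances by default
f_stat, p_f = stats.f_oneway(g1, g2)
```

The two tests also give the same p value, which is why they always agree at the same alpha.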
notation:
i, g, k, Ng, Xig, Xbarg, Xbar
i= an observation, a score
g= a group - group1, group 2, etc
k=total groups
Ng= size of group g
Xig = observation i in group g
Xbarg= group mean for group g
Xbar=grand mean - across all groups and observations
η^2 vs ω^2
η^2 (eta squared) is positively biased - overestimates the amount of variance in the DV that can be explained by the IVs
- ω^2 (omega squared) is unbiased
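The bias correction can be made concrete - a sketch computing both effect sizes from ANOVA sums of squares, where the helpers and the example values are hypothetical:

```python
# Sketch: eta squared vs omega squared from ANOVA sums of squares
# (hypothetical helpers and inputs).
def eta_squared(ss_model, ss_total):
    return ss_model / ss_total

def omega_squared(ss_model, ss_total, k, ms_resid):
    # corrects the model SS by the residual variance it would pick up by chance
    return (ss_model - (k - 1) * ms_resid) / (ss_total + ms_resid)
```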
effect sizes for ω^2
small = 0.01
medium = 0.06
large = 0.14
APA format
- decimal places
- 7 things to report
2 decimal places, or 3 for p values
should have:
F stat with its df, exact p value, and effect size measures
means and SDs for each group as well
write an APA report for an ANOVA test of fitness vs ego strength with results:
group 1 (low fit): M = 4.40, SD = 0.92
group 2 (high fit): M = 6.36, SD = 0.55
F(1, 8) = 5.32
ω^2 = .61
to investigate whether level of fitness had an effect on ego strength, we conducted a one way between subjects ANOVA. This analysis revealed a significant effect of fitness on ego strength, F(1, 8) = 5.32, p < .05, ω^2 = .61. Participants in the low fitness group (M = 4.40, SD = 0.92) had significantly lower ego strength than those in the high fitness group (M = 6.36, SD = 0.55). We conclude that having high as opposed to low fitness may increase ego strength.
when do the results of an independent samples t test and a between subjects ANOVA for two groups on the same data set disagree?
only when they use different alpha levels - with the same alpha they always agree, since F = t^2 for two groups
what needs to be included in an APA summary for an ANOVA test with more than 2 means
a recommendation for post hoc tests
“post hoc tests are needed to understand which pairs of means differ significantly”
what is the linear model of ANOVA
Yij=µ+⍺j+Eij
Yij = dependent variable
µ=grand mean of treatment populations
⍺j=treatment effect for group j - not alpha level!!
Eij=experimental error - part that allows individual scores to vary (µ+⍺j is constant for every score in a population)
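The model can be illustrated by simulating scores from it - a sketch where µ, the treatment effects ⍺j, and the group size are all made up:

```python
# Sketch: simulating scores from Yij = mu + alpha_j + Eij
# (hypothetical mu, treatment effects, and group size).
import numpy as np

rng = np.random.default_rng(0)
mu = 50.0                                      # grand mean
treatment_effects = {1: -2.0, 2: 0.0, 3: 2.0}  # alpha_j, sum to zero

# mu + alpha_j is constant within a group; Eij adds the individual variation
scores = {g: mu + effect + rng.normal(0.0, 1.0, size=1000)
          for g, effect in treatment_effects.items()}
```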
what are the 5 assumptions in ANOVA
- independence - one participant has no effect on another participant’s performance. this assumption can be violated when participants replicate each other’s responses and when several subjects are sampled from the same class
- identical distribution (within group) - we assume we don’t know more about any one participant’s score than we do about others
- identical distribution (between groups) - groups differ only in their means
- homogeneity of variance - the variance of the random variable, Eij, is the same for all groups
- normal distribution - the random variable, Eij, has a normal distribution in all groups
what happens when the assumption of independence is violated?
- underestimation of true variability = increased type I error rate
what happens when the assumption of identical distribution within groups is violated?
- mean may not be an accurate representation of the population of interest - biased results (bc participants in the group may belong to different sub populations)
- error term (MSR) is inflated and the power of the test is reduced
what could cause the homogeneity of variance assumption to be violated?
- groups defined by classification factor (athletes vs non athletes) are part of the same sample
- experimental manipulations
what happens when the homogeneity of variance assumption is violated?
excessive type I error rates