Power And Effect Size Flashcards
With F-ratios that exceed F-critical we…
- reject the null hypothesis;
- conclude that the independent variable(s) influence(s) the dependent variable;
- have a statistically significant effect.
When a finding does not reach significance at the alpha level (p ≥ 0.05) we…
- fail to reject the null hypothesis:
- H0: all means are equal, so no evidence of an effect of the treatment.
- No evidence of a statistical difference.
“no statistical difference” does not…
- prove the null hypothesis.
- We simply do not have evidence to reject it.
- A failure to find a significant effect does not necessarily mean the means are equal.
So it is difficult to have confidence in the null hypothesis:
Perhaps an effect exists, but our data is too noisy to demonstrate it.
Sometimes we will incorrectly fail to reject the null hypothesis –
- a Type II error.
- There really is an effect, but we did not find it.
Statistical power is the probability of…
detecting a real effect
Power is given by:
1 − β
where β is the probability of making a Type II error.
- In other words, power is the probability of not making a Type II error.
Power is your ability to find a …
difference when a real difference exists.
The power of a study is determined by three factors:
- Alpha level.
- Sample size.
- Effect size:
- Association between DV and IV.
- Separation of means relative to error variance.
Power and alpha
By making alpha less strict, we can…
- increase power (e.g. α = 0.05 instead of α = 0.01).
- However, we also increase the chance of a Type I error.
Low N’s have very little…
Power
Power saturates with many…
Subjects
Power and sample size
One of the most useful aspects of power analysis is the estimation of the
sample size required for a particular study
- Too small a sample and a real effect may be missed.
- Too large a sample makes for an unnecessarily expensive study.
Different formulae/tables for calculating sample size are required according to
Experimental design
Power and effect size
•As the separation between two means increases the power…
Also increases
Power and effect size
As the variability about a mean decreases power …
Also increases
Measures of effect size for ANOVA
- Measures of association: eta-squared (η²), R-squared (R²), omega-squared (ω²).
- Measures of difference: d, f.
Eta squared is the proportion of the total variance that…
Is attributed to an effect
Eta-squared equation
η² = SS_treatment / SS_total
Partial eta-squared is the
proportion of the effect + error variance that is attributable to the effect
Partial eta-squared equation
ηp² = SS_treatment / (SS_treatment + SS_error)
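The two equations above can be sketched directly in code; the SS values below are made-up numbers for illustration only:

```python
# Sketch: eta-squared and partial eta-squared from sums of squares.
# The SS values are hypothetical, not from any study in this deck.

def eta_squared(ss_treatment, ss_total):
    """Proportion of the total variance attributable to the effect."""
    return ss_treatment / ss_total

def partial_eta_squared(ss_treatment, ss_error):
    """Proportion of effect + error variance attributable to the effect."""
    return ss_treatment / (ss_treatment + ss_error)

ss_treatment, ss_error = 30.0, 70.0
ss_total = ss_treatment + ss_error  # one-way design: no other effects
print(eta_squared(ss_treatment, ss_total))          # 0.3
print(partial_eta_squared(ss_treatment, ss_error))  # 0.3
```

In a one-way design with a single effect, SS_total = SS_treatment + SS_error, so the two measures coincide; they diverge once other factors contribute to SS_total.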
Measures of association
Eta-squared and partial eta-squared are both kinds of
measures of association in the sample
Measures of association- R squared
In general R2 is the proportion of…
variance explained by the model
- Each ANOVA can be thought of as a regression-like model in which each IV and each interaction between IVs is a predictor variable.
- In general R2 is given by
R squared equation
R² = SS_model / SS_total
Measures of association
Omega-squared is an estimate of the
proportion of population variance in the dependent variable accounted for by the independent variable.
Measures of difference -d
When there are only two groups d is the…
standardised difference between the two groups
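For two groups, d is commonly computed with the pooled standard deviation; a minimal sketch, with hypothetical group summaries:

```python
import math

# Sketch: Cohen's d for two groups via the pooled standard deviation.
# The group means, SDs, and sizes are made-up numbers for illustration.

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Standardised difference between two group means."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

print(cohens_d(mean1=24.0, mean2=19.0, sd1=10.0, sd2=10.0, n1=20, n2=20))  # 0.5
```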
Measures of difference - f
Cohen’s (1988) f for the one-way between-groups analysis of variance can be calculated as follows:
f = √(ω² / (1 − ω²))
It is an averaged standardised difference between the three or more levels of the IV (even though the formula above doesn’t look like it).
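A minimal sketch of this conversion; the ω² value fed in is a hypothetical illustration:

```python
import math

def cohens_f(omega_squared):
    """Cohen's f from omega-squared: f = sqrt(w2 / (1 - w2))."""
    return math.sqrt(omega_squared / (1 - omega_squared))

# A hypothetical omega-squared near 0.06 maps onto a "medium" f of about 0.25:
print(round(cohens_f(0.0588), 3))
```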
Measures of difference
Cohen’s f conventions for small, medium, and large effects
- Small effect: f = 0.10
- Medium effect: f = 0.25
- Large effect: f = 0.40
What can GPower, a simple power-analysis program freely available on the web, do?
This program can be used to calculate the sample size required for different effect sizes and specific levels of statistical power for a variety of different tests and designs.
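The kind of calculation such a program performs can be approximated with the noncentral F distribution. The sketch below is a rough stand-in, not GPower's actual code; the design values (4 groups, f = 0.25, α = .05, target power .80) are illustrative assumptions:

```python
# Sketch of the computation a power-analysis tool performs, using scipy's
# noncentral F distribution. All design values here are illustrative.
from scipy.stats import f as f_dist, ncf

def anova_power(effect_f, n_total, k_groups, alpha=0.05):
    """Power of a one-way between-groups ANOVA for a given Cohen's f."""
    dfn, dfd = k_groups - 1, n_total - k_groups
    nc = effect_f ** 2 * n_total               # noncentrality parameter
    f_crit = f_dist.ppf(1 - alpha, dfn, dfd)   # critical F under H0
    return 1 - ncf.cdf(f_crit, dfn, dfd, nc)   # P(F > F_crit | effect exists)

# Smallest total N giving at least 80% power for a medium effect (f = 0.25):
n = 8  # start with at least two subjects per group
while anova_power(0.25, n, k_groups=4) < 0.80:
    n += 1
print(n, "subjects in total")
```

Note the trade-off the flashcards describe: power rises with N but saturates, so each additional subject buys less power near the top of the curve.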
There are two ways to decide what effect size is being aimed for:
- On the basis of previous research
- Meta-Analysis: Reviewing the previous literature and calculating the previously observed effect size (in the same and/or similar situations)
- On the basis of theoretical importance
- Deciding whether a small, medium or large effect is required.
The former strategy is preferable but the latter strategy may be the only available strategy.
Calculating f on the basis of previous research
- This example is based on a study by Foa, Rothbaum, Riggs, and Murdock (1991, Journal of Consulting and Clinical Psychology).
- The subjects were 48 trauma victims who were randomly assigned to one of four groups. The four groups were
- 1) Stress Inoculation Therapy (SIT) in which subjects were taught a variety of coping skills;
- 2) Prolonged Exposure (PE) in which subjects went over the traumatic event in their mind repeatedly for seven sessions;
- 3) Supportive Counseling (SC), which was a standard therapy control group; and
- 4) a Waiting List (WL) control.
- The dependent variable was PTSD Severity
What should we report?
- Practically any effect size measure is better than none, particularly when there is a non-significant result.
- SPSS provides some measures of effect size (though not f)
- Meta-analysis (e.g. the estimation of effect sizes over several trials) requires effect size measures
- Calculating sample sizes for future studies requires effect size information
Things to be avoided if possible
- “Canned” effect sizes
- The degree of measurement accuracy is ignored by using fixed estimates of effect size
- Retrospective justification
- Saying that a non-significant result means there is no effect because the power was high
- Saying that there is a non-significant result because the statistical power was low
What are canned effect sizes?
The degree of measurement accuracy is ignored by using fixed estimates of effect size
What is retrospective justification?
- Saying that a non-significant result means there is no effect because the power was high.
- Saying that there is a non-significant result because the statistical power was low.