WEEK 3: Hypothesis Testing and Experimental Design Flashcards
Learning Objectives:
- Understand and be able to explain how sampling error affects statistical analyses
- Understand and be able to explain the difference between the null and experimental hypotheses
- Understand and be able to explain what the p value represents
- Understand and be able to explain the null hypothesis and the relationship between the null hypothesis and the p value.
Key definitions:
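Standard error - the estimated amount by which a sample statistic (e.g. the sample mean) ‘deviates’ from the population value. We looked at this kind of error last lecture; this week we are looking at sampling error
Sampling error - the discrepancy between a sample and the population it is meant to represent; in these notes, mainly the case where the sample does not represent the population, i.e. not sampling from the right population
Null hypothesis - the hypothesis that there is no effect of the IV on the DV; this is the hypothesis we actually test
P value - the probability of getting results at least as extreme as the ones observed if the null hypothesis is true (it is not the probability that the null hypothesis is true)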
Sampling and Sampling Error (worked example)
With a normally distributed population, the mean scores for the samples taken from that population should also be normally distributed.
So we take lots of samples from the population and their means should be normally distributed.
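A minimal sketch of this idea, assuming a made-up population of puzzle times (mean 60 s, SD 10 s) and samples of 25 people. The function name and all numbers are illustrative, not from the lecture. It draws many samples and checks that their means cluster around the population mean with a spread close to SD / sqrt(n).

```python
import random
import statistics

def sample_means(pop_mean, pop_sd, sample_size, n_samples, seed=0):
    """Draw n_samples samples from a normal population and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(pop_mean, pop_sd) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population: puzzle times with mean 60 s and SD 10 s.
means = sample_means(pop_mean=60, pop_sd=10, sample_size=25, n_samples=5000)

mean_of_means = statistics.fmean(means)   # should sit close to 60
sd_of_means = statistics.stdev(means)     # should sit close to 10 / sqrt(25)
expected_se = 10 / 25 ** 0.5              # = 2.0

print(round(mean_of_means, 2), round(sd_of_means, 2), expected_se)
```

The spread of the sample means is what the standard error estimates.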
We now take just 2 of those samples. Say they’re completing a puzzle and we’re measuring the time it takes; we give one sample caffeine and the other a placebo. We would expect the caffeine group to take less time to complete the puzzle.
HOWEVER, now imagine that within our population of students there is a group of students who practice doing puzzles. We could end up with samples from 2 different populations (students, and students who practice puzzles). This puts us in danger of making a type I error - where we are saying there is an effect when in fact there is no effect.
So now let’s say that you’ve selected from your intended populations and you are comparing the speed of puzzle solving for a population who practice and a population who do not practice. The samples that come from the undergrad students show an effect, but because they’re already practiced, the samples from the population that practice puzzles don’t show an effect.
In this case, if we had accidentally sampled from the population where there is no effect, we could make a type II error and think that there isn’t an effect, when there would have been an effect had we sampled from our intended population
What if there were only a couple of people in the sample that practiced puzzles? How would this affect results?
Depending on the size of the sample, it probably wouldn’t affect the results considerably; the data would still likely be normally distributed.
What is a type I error?
Type I error (α) - falsely rejecting null hypothesis, think there is an effect when there isn’t one
What is a type II error?
Type II error (β) - failing to reject the null hypothesis when it is false, think there isn’t an effect when there is one
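A rough simulation sketch of the two error types, assuming a two-sample z test with a known population SD. This is a simplification chosen so the example needs only the standard library, and is not necessarily the test used in the course. The population values (mean 60 s, SD 10 s, a 5 s caffeine effect, 20 people per group) are made up.

```python
import random
from statistics import NormalDist, fmean

def z_test_p(sample_a, sample_b, sd):
    """Two-sided p value for a difference in means, assuming a known population SD."""
    se_diff = sd * (1 / len(sample_a) + 1 / len(sample_b)) ** 0.5
    z = (fmean(sample_a) - fmean(sample_b)) / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_diff, n=20, sd=10, alpha=0.05, runs=4000, seed=1):
    """Proportion of simulated experiments in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        placebo = [rng.gauss(60, sd) for _ in range(n)]
        caffeine = [rng.gauss(60 - true_diff, sd) for _ in range(n)]
        if z_test_p(placebo, caffeine, sd) < alpha:
            rejections += 1
    return rejections / runs

# Null hypothesis true (no effect): every rejection is a type I error.
type_1_rate = rejection_rate(true_diff=0)
# Null hypothesis false (5 s effect): every non-rejection is a type II error.
type_2_rate = 1 - rejection_rate(true_diff=5)

print(type_1_rate, type_2_rate)
```

With no real effect, the rejection rate should sit near α. With a real but modest effect and only 20 people per group, most experiments miss it.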
P value and the Null Hypothesis
The null hypothesis states that there is no effect of the IV on the DV, that there will be no difference between the two groups/ conditions.
The p value answers the question - if the null hypothesis is true, what is the probability of getting results at least as extreme as these?
p = the probability that a systematic effect at least this large is incorrectly found in a population where the null hypothesis is true… i.e. what is the chance of getting results like these by fluke
We usually want the p value to be less than .05, so there is a less than 5% chance of results like these arising by fluke if the null hypothesis is true - a significant outcome is then taken as evidence of an effect in the wider population
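One way to make "the chance of getting results like these by fluke" concrete is a permutation test. This is a sketch only: the technique is swapped in for illustration and is not named in the lecture, and the puzzle times are made up. If the null hypothesis is true, the group labels are arbitrary, so we shuffle them many times and count how often the shuffled difference is at least as large as the real one.

```python
import random
from statistics import fmean

def permutation_p_value(group_a, group_b, n_shuffles=10000, seed=2):
    """Two-sided p value: share of random relabellings whose mean difference
    is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

# Made-up puzzle times in seconds, for illustration only.
caffeine = [52, 55, 49, 58, 51, 54, 50, 53]
placebo = [60, 57, 62, 59, 64, 56, 61, 58]

p = permutation_p_value(caffeine, placebo)
print(p)
```

Here the groups barely overlap, so very few shuffles reproduce a gap that large and p comes out well below .05.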
Probability values
–> The highest a probability value can go is 1
This would mean that a difference at least as large as the one you found between your samples would be found 100% of the time if the null hypothesis is true.
In other words the result is entirely consistent with chance or sampling error.
–> Probability values can be as low as 0
A value close to 0 (e.g. p < .001) would mean that a difference at least as large as the one you found between your samples would be observed by chance less than .1% of the time if the null hypothesis is true
In other words the data are very unlikely under the null hypothesis, so it is likely there is an effect
Null hypothesis significance testing
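Rejecting the null hypothesis: p < .05
If the null hypothesis were true, a result at least this extreme would occur less than 5% of the time, so we reject the null hypothesis and conclude it is likely that there is an effect in the population
Failing to reject the null hypothesis: p > .05
If the null hypothesis were true, a result at least this extreme would occur more than 5% of the time, so we have no good evidence of an effect in the population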
Sampling, The p value & the standard error
Sampling error can occur if you sample from an unintended population
The standard error assesses the estimated error that results from using a sample to estimate a population value
The p value assesses the probability of results at least as extreme as those observed if the null hypothesis is true (i.e. if there is no effect in the population)
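For reference, a small sketch of how the standard error of the mean is usually estimated from a single sample: the sample SD divided by the square root of the sample size. The function name and the puzzle times are made up for illustration.

```python
import statistics

def standard_error(sample):
    """Estimated standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5

times = [52, 55, 49, 58, 51, 54, 50, 53]  # made-up puzzle times in seconds
se = standard_error(times)
print(round(se, 3))
```

Because n is in the denominator, a bigger sample gives a smaller standard error for the same spread of scores.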
Significance criterion
α, the risk of making a type I error, is usually set at .05 (significance level), or 5%.
The limitations of statistical significance testing…
- Statistical probability cannot be used as a measure of the magnitude of the result, as it may reflect either the effect size or the sample size…
…two studies conducted in the same way may produce very different results, in terms of statistical significance, simply because they have different sample sizes.
Significance simply tells you how likely results like these would be by fluke if there were no effect, not how big the effect of one variable on another was; it doesn’t reflect the magnitude of the result
e.g.
Small effect size, small sample - non significant result
small effect size, large sample - significant result
The larger the sample, the less likely it is that a difference of a given size would arise by chance alone, even when the effect size is small.
Too few participants can result in a type II error
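A sketch of the small-sample vs large-sample point, assuming a two-sample z test (known-SD approximation) and an observed standardised difference of exactly d = 0.2 in both studies. The function name and the group sizes are illustrative assumptions.

```python
from statistics import NormalDist

def p_for_effect(d, n_per_group):
    """Two-sided p value from a two-sample z test when the observed
    standardised mean difference is exactly d (known-SD approximation)."""
    z = d * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(z))

# Same small effect size, two different sample sizes.
small_sample_p = p_for_effect(d=0.2, n_per_group=20)    # about .53, non-significant
large_sample_p = p_for_effect(d=0.2, n_per_group=500)   # about .002, significant

print(round(small_sample_p, 3), round(large_sample_p, 4))
```

The effect is identical in both cases; only the sample size changes the verdict, which is why the p value cannot stand in for effect size.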
If the p value is high….
We fail to reject the null hypothesis: the data give no good evidence of an effect in the population (this does not prove the null hypothesis is true)
This could be because we sampled from the wrong population, or because the sample was too small
If the p value is 1
100% of the time we will see a difference at least this large if the null hypothesis is true
Whatever outcome we see is entirely consistent with pure chance
If the p value is .001
Only .1% of the time will we see a difference at least this large if the null hypothesis is true, so we reject the null hypothesis
Experimental Design
> Sampling
- Sample from the intended population
- Make sure the sample is large enough (power analysis)
> Hypothesis
- Be aware that you are testing the null hypothesis but write an experimental hypothesis
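A back-of-envelope sketch of the power analysis mentioned under Sampling, using the common normal-approximation formula n per group ≈ 2 × ((z for α + z for power) / d)². The function name, the 80% power default and the effect sizes are assumptions for illustration. Dedicated power software uses the t distribution and gives slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided,
    two-sample comparison of means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha
    z_power = NormalDist().inv_cdf(power)           # value for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.2, 0.5, 0.8):   # illustrative small, medium and large effect sizes
    print(d, n_per_group(d))
```

The smaller the effect you expect, the more participants you need to avoid a type II error.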
Reporting P values
- Use the established cut off of less than .05 (< .05)
- Report p values to 3 decimal places
- Consider the limitations of p values and report Confidence Intervals
…Some psychologists would suggest we scrap the p value in favour of the CI (see p.246 D & R)
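A sketch of reporting a confidence interval alongside the p value, assuming a normal critical value (for samples this small a t critical value would give a somewhat wider interval). The function name and the puzzle times are made up for illustration.

```python
from statistics import NormalDist, fmean, variance

def mean_diff_ci(group_a, group_b, confidence=0.95):
    """Approximate CI for mean(group_a) - mean(group_b), using a normal
    critical value rather than a t critical value."""
    diff = fmean(group_a) - fmean(group_b)
    se = (variance(group_a) / len(group_a) + variance(group_b) / len(group_b)) ** 0.5
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff - z * se, diff + z * se

# Made-up puzzle times in seconds, for illustration only.
caffeine = [52, 55, 49, 58, 51, 54, 50, 53]
placebo = [60, 57, 62, 59, 64, 56, 61, 58]

low, high = mean_diff_ci(caffeine, placebo)
print(round(low, 2), round(high, 2))
```

Unlike a bare p value, the interval shows both the direction and the plausible size of the difference; here it lies entirely below zero, so the caffeine group is faster.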