12. Reliability & Validity Flashcards
What is reliability
- Reliability refers to how consistent or dependable a test is.
- A reliable test carried out in the same circumstances, on the same participants, should give the same (or very similar) results every time.
- There are different types of reliability
Different types of reliability
- Internal reliability
- External reliability
- Inter-observer reliability
What is Internal reliability
- Different parts of the test should give consistent results.
- E.g. if an IQ test contains sections of supposedly equal difficulty, participants should achieve similar scores on all sections.
- Internal reliability of a test is assessed using the split-half method. This splits the test into 2 halves (e.g. odd & even numbered questions) & compares each participant's score on the 2 halves. The 2 sets of scores should produce a high positive correlation.
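The split-half method can be sketched in a few lines of Python. This is only an illustration: the data are made up, and the names `pearson_r` and `split_half_r` are hypothetical helpers, not part of any standard procedure. It totals each participant's odd & even numbered questions, then correlates the 2 sets of totals.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_r(item_scores):
    """item_scores holds one list of per-question marks per participant."""
    # Questions 1, 3, 5, ... sit at list indices 0, 2, 4, ...
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return pearson_r(odd_totals, even_totals)

# Made-up data: 5 participants, 6 questions each (1 = correct, 0 = wrong)
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
r = split_half_r(scores)
print(round(r, 2))
```

A value of `r` close to +1 would suggest the 2 halves are measuring consistently. A value near 0 would suggest poor internal reliability.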
What is External reliability
- The test should produce consistent results regardless of when it's used.
- E.g. if you took the same IQ test on 2 different days, you should achieve the same (or a very similar) score.
- External reliability of a test is assessed using the test-retest method. This involves repeating the test using the same participants. A reliable test should produce a high positive correlation between the 2 sets of scores.
- A problem w this is that the participants may have changed in some way since the 1st test (e.g. they may have learnt more). To avoid this, external reliability can be checked using the equivalent forms test. This compares participants’ scores on 2 different, but equivalent (equally hard), versions of the test.
What is Inter-observer reliability
- The test should give consistent results regardless of who administers it.
- E.g. if 2 researchers observe behaviour & categorise infants as showing signs of a strong attachment or weak attachment, they should both record the same score.
- This can be assessed by correlating the scores that each researcher produces for each participant. A high positive correlation should be found.
What is validity
- Validity refers to how well a test measures what it claims to.
- E.g. an IQ test w only maths questions would not be a valid measure of general intelligence.
- There are different types of validity
Different types of validity
- Face validity
- Concurrent validity
- Ecological validity
- Temporal validity
What is Face validity
The extent to which the test looks, to the participants, like it will measure what it is supposed to be measuring.
What is Concurrent validity
The extent to which the test produces the same results as another established measure. Inferential tests can be used to determine whether both measures are highly correlated, & therefore valid.
What is Ecological validity
The extent to which the results of the test reflect real life.
What is Temporal validity
The extent to which the test provides results that can be generalised across time.
How to assess validity
Validity can be assessed in different ways:
- A quick (but not very thorough) way of assessing validity is to simply look at the test & make a judgement on whether it appears to measure what it claims to.
- Comparing the results of the test w the results of an existing measure (that’s already accepted as valid) can help determine the validity of the test.
- The results of the test can be used to predict results of future tests. If the initial results correlate w the later results, it suggests that the test has some validity & can continue to be used.
How can Reliability & Validity be improved
- Standardising research
- Operationalising variables
Improving reliability & validity: What is Standardising research
- Involves creating specific procedures which are followed every time the test is carried out. This ensures that all the researchers will test all the participants in exactly the same way (e.g. same time of day, environment & instructions).
- This reduces the possibility of extraneous variables affecting the research. Therefore, it will help to improve external reliability & inter-observer reliability.
Improving reliability & validity: What is Operationalising variables
- Involves clearly defining all the research variables.
- For eg, in a study of whether watching aggressive TV influences aggressive behaviour, the terms ‘aggressive TV’ & ‘aggressive behaviour’ need to be defined.
- ‘Aggressive TV’ could include cartoons or human actors. One of these might influence behaviour, & the other might not. This needs to be taken into account when planning, conducting & drawing conclusions from the investigation.
- Similarly, ‘aggressive behaviour’ could refer to physical & verbal aggression, or just physical aggression.
- Clarifying this from the start improves the reliability & validity of the test.