A2 Reliability Flashcards
RELIABILITY
Reliability is a measure of consistency.
In general terms, if a particular measurement is made twice and produces the same result then that measurement is described as being reliable.
If a test or measure in psychology assessed some ‘thing’ on a particular day (e.g. intelligence), then we would expect the same result on a different day, unless the ‘thing’ itself had changed.
For example, the result might legitimately differ if we tested a different person with a different IQ, or if the same person’s IQ had genuinely gone up a little.
Psychologists tend not to measure concrete things but are more interested in abstract concepts such as attitudes, aggression, memory and IQ, which makes it especially important to check that their measuring tools are consistent.
WAYS OF ASSESSING RELIABILITY
Psychologists have devised ways of assessing whether their measuring tools are reliable.
TEST-RETEST
The most straightforward way of checking reliability is the test-retest method.
This simply involves administering the same test or questionnaire to the same person on different occasions.
If the test or questionnaire is reliable then the results obtained should be the same, or at least very similar, each time they are administered.
This method is most commonly used with questionnaires and psychological tests (such as IQ tests) but can also be applied to interviews.
There must be sufficient time between test and retest to ensure that the participant/respondent cannot simply recall their original answers, but not so long that their attitudes, opinions or abilities may have changed.
In the case of a questionnaire or test, the two sets of scores would be correlated to make sure they are similar.
If the correlation turns out to be significant (and positive) then the reliability of the measuring instrument is assumed to be good.
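As a rough sketch of that correlational check (assuming Pearson’s r and the scipy library, neither of which the notes specify, and with invented scores):

```python
# Hypothetical test-retest check: the scores below are invented for illustration.
from scipy.stats import pearsonr

# Scores for the same ten respondents on two administrations of the same questionnaire.
test = [22, 31, 27, 18, 35, 29, 24, 33, 20, 26]
retest = [24, 30, 28, 17, 36, 27, 25, 32, 21, 27]

r, p = pearsonr(test, retest)  # correlation coefficient and its p-value
print(f"r = {r:.2f}, p = {p:.4f}")

if p < 0.05 and r > 0:
    print("Significant positive correlation: test-retest reliability looks good.")
else:
    print("Weak or non-significant correlation: the measure may not be reliable.")
```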
INTER-OBSERVER RELIABILITY
Reliability is a particular issue in observational research, as one observer’s interpretation of events may differ widely from another’s, introducing subjectivity, bias and unreliability into the data collection process.
The recommendation is that would-be observers should conduct their observations in teams of at least two.
However, inter-observer reliability must be established.
This may involve a small-scale trial run of the observation (a pilot study) to check that observers are applying the behavioural categories in the same way, or a comparison of the observers’ records may be reported at the end of the study.
Observers need to watch the same event, or sequence of events, but record their data independently.
As with the test-retest method, the data collected by the two observers should be correlated to assess its reliability.
Similar methods would apply to other forms of observation, such as content analysis (though this would be referred to as inter-rater reliability), as well as to interviews if they are to be conducted by different people (known as inter-interviewer reliability).
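A similar sketch for the inter-observer case, again assuming Pearson’s r via scipy and using invented tallies for ten observation intervals:

```python
# Hypothetical inter-observer check: two observers watch the same ten observation
# intervals and independently tally how often a target behaviour occurs in each.
from scipy.stats import pearsonr

observer_a = [3, 5, 2, 7, 4, 6, 1, 5, 3, 4]  # Observer A's tallies per interval
observer_b = [3, 4, 2, 7, 5, 6, 1, 4, 3, 5]  # Observer B's tallies for the same intervals

r, _ = pearsonr(observer_a, observer_b)
print(f"Inter-observer correlation: r = {r:.2f}")
```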
MEASURING RELIABILITY
Reliability is measured using correlational analysis. In test-retest and inter-observer reliability, the two sets of scores are correlated.
The correlation coefficient should exceed +.80 for the measuring instrument to be judged reliable.
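A minimal sketch of how that cut-off might be applied; the +.80 figure comes from the notes above, while the function name and example values are hypothetical:

```python
def meets_reliability_criterion(r: float, threshold: float = 0.80) -> bool:
    """Return True if a correlation coefficient exceeds the conventional +.80 cut-off."""
    return r > threshold

print(meets_reliability_criterion(0.85))  # True: reliability acceptable
print(meets_reliability_criterion(0.62))  # False: reliability needs improving
```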
IMPROVING RELIABILITY
Questionnaires
* The reliability of questionnaires over time should be measured using the test-retest method. Comparing the two sets of data should produce a correlation that exceeds +.80.
* A questionnaire that produces low test-retest reliability may require some of the items to be ‘deselected’ or rewritten.
* For example, if some questions are complex or ambiguous, they may be interpreted differently by the same person on different occasions. One solution might be to replace some of the open questions (where there may be more room for misinterpretation) with closed, fixed-choice alternatives, which may be less ambiguous.
Interviews
* For interviews, the best way of ensuring reliability is to use the same interviewer each time.
* If this is not possible or practical, all interviewers must be properly trained so that, for example, one particular interviewer does not ask questions that are too leading or ambiguous.
* This is more easily avoided in structured interviews where the interviewer’s behaviour is more controlled by the fixed questions.
* Interviews that are unstructured and more ‘free flowing’ are less likely to be reliable.
Observations
* The reliability of observations can be improved by making sure that behavioural categories have been properly operationalised, and that they are measurable and self-evident (for instance, the category ‘pushing’ is much less open to interpretation than ‘aggression’).
* Categories should not overlap (‘hugging’ and ‘cuddling’ for instance), and all possible behaviours should be covered on the checklist.
* If categories are not operationalised well, or are overlapping or absent, different observers have to make their own judgements of what to record where and may well end up with differing and inconsistent records.
* If reliability is low, then observers may need further training in using the behavioural categories and/or may wish to discuss their decisions with each other so they can apply their categories more consistently.
Experiments
* In an experiment it is the procedures that are the focus of reliability.
* In order to compare the performance of different participants (as well as comparing the results from different studies) the procedures must be the same (consistent) every time.
* Therefore, in terms of reliability, an experimenter is concerned with the use of standardised procedures.