Week 9: Reliability and Validity Flashcards
What Makes a Good Experiment?
■ Reliability.
■ Sensitivity.
■ Validity.
What’s Reliability?
■ Consistency/dependability of a measure.
■ The ability of a measure to produce the same or similar results on repeated administrations.
■ In short: is it consistent at measuring?
■ Internal reliability.
■ External reliability.
What’s Internal Reliability?
■ The extent to which a measure is consistent within itself.
■ Split-half reliability:
– Compares the results of one half of a test with the other half.
– Split the items into two groups and correlate them (a minimal sketch follows this card).
– A participant scoring high on one half should also score high on the other half.
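A minimal sketch in Python of the split-half calculation described above, using a hypothetical 30-participant x 10-item array of responses (any real dataset or stats package would follow the same steps):

import numpy as np

rng = np.random.default_rng(0)
scores = rng.integers(1, 6, size=(30, 10))  # hypothetical: 30 participants x 10 Likert-style items

first_half = scores[:, :5].sum(axis=1)   # each participant's total on items 1-5
second_half = scores[:, 5:].sum(axis=1)  # each participant's total on items 6-10

# Pearson correlation between the two half-scores
r = np.corrcoef(first_half, second_half)[0, 1]
print(f"Split-half correlation: r = {r:.2f}")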
But what if all the good questions were at the beginning and the rubbish ones at the end?
■ A first-half/second-half split may then give a weaker correlation than is fair.
■ Any correlation might just be down to chance, depending on how the questionnaire was split.
What’s Cronbach’s alpha?
■ A statistic that, in effect, splits the items in every possible way and correlates all of the resulting halves with one another.
■ A much more robust measure of internal reliability than a single split.
■ Should be high; a common rule of thumb is a coefficient > .70.
■ SPSS also reports how each item correlates with all the other items, and how alpha would change if an item were removed.
■ The higher the alpha, the more reliable the questionnaire (a sketch of the calculation follows this card).
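A minimal sketch of Cronbach's alpha using the standard item-variance formula (closely related to the "all possible halves" idea above), together with the "alpha if item deleted" check; the data here are hypothetical:

import numpy as np

def cronbach_alpha(scores):
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    k = scores.shape[1]                               # number of items
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item across participants
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of participants' total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

rng = np.random.default_rng(0)
scores = rng.integers(1, 6, size=(30, 10))  # hypothetical responses
print(f"alpha = {cronbach_alpha(scores):.2f}")

# "Alpha if item deleted": recompute alpha with each item left out in turn
for item in range(scores.shape[1]):
    reduced = np.delete(scores, item, axis=1)
    print(f"Without item {item + 1}: alpha = {cronbach_alpha(reduced):.2f}")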
What’s External Reliability?
■ The extent to which a measure varies from one use to another.
■ Test-retest reliability:
– The stability of a test over time.
– A good test is consistently reliable from one occasion to the next.
– Administer the test now, then give the same test later to the same participants.
– A good test will show a high correlation between the two sets of scores.
What's inter-rater reliability (a form of external reliability)?
– Usually used in observational studies.
– The degree to which different raters give consistent estimates of the same behaviour.
– Check reliability with a correlation between the raters' scores (a sketch follows this card).
■ To improve inter-rater reliability:
– Clear categories/definitions.
– Training.
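A minimal sketch of the inter-rater check, assuming two raters have independently counted the same target behaviour across eight observation sessions (hypothetical counts); the correlation is the check described above, with exact agreement shown as a quick extra:

import numpy as np

rater_1 = np.array([3, 5, 2, 8, 6, 4, 7, 1])  # hypothetical counts per session
rater_2 = np.array([4, 5, 2, 7, 6, 5, 7, 2])

# Correlation between the two raters' estimates
r = np.corrcoef(rater_1, rater_2)[0, 1]
print(f"Inter-rater correlation: r = {r:.2f}")

# A quick extra check: proportion of sessions with exact agreement
agreement = np.mean(rater_1 == rater_2)
print(f"Exact agreement: {agreement:.0%}")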
How can you improve reliability in general?
– Improve quality of items.
– Increase/decrease number of items.
– Increase sample size.
– Choose appropriate sample.
– Control conditions.
What’s sensitivity?
■ The ability to detect even a small effect of the IV on the DV. Helped by:
– A large sample size.
– A sizeable/varied effect of the manipulation.
– Controlling unwanted variability (see the simulation sketch after this card).
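A minimal simulation sketch of sensitivity, assuming a small true effect of the IV on the DV (a hypothetical effect size of 0.3 SD): with more participants per group, the same small effect is detected far more often:

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def detection_rate(n_per_group, effect=0.3, n_sims=2000):
    # Simulate many two-group experiments and count how often p < .05
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)        # DV scores with no effect
        treatment = rng.normal(effect, 1.0, n_per_group)   # DV scores shifted by the IV
        if ttest_ind(control, treatment).pvalue < 0.05:
            hits += 1
    return hits / n_sims

for n in (20, 80, 200):
    print(f"n = {n:>3} per group: effect detected in {detection_rate(n):.0%} of simulated studies")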
What are the benefits of sensitivity?
■ Not too hard or too easy.
■ Wide range of scores.
■ E.g. the effect of high levels of alcohol on reading ability:
– IV = alcohol intake.
– DV? Two possibilities:
– Ask participants if they found the words hard to read (Yes/No) - a narrow, insensitive measure.
– Time them reading the word list - a wide range of scores, so a more sensitive measure.
■ Properties of your sample also matter:
– E.g. Facebook/Instagram usage - students or pensioners?
What’s Validity?
■ Validity can refer to a test/measure or to a whole study/experiment.
■ Test validity - the ‘truthfulness’ of a measure in that it measures what it claims to.
– Content validity.
– Construct validity.
– Criterion validity.
What’s content validity?
– Face validity - whether the test appears to measure what it claims to.
– The purpose of the test is obvious to participants.
– Risk: an obvious purpose can invite socially desirable responding.
What’s construct validity?
The degree to which a test measures the construct/psychological concept at which it is aimed.
– Convergent validity: the degree to which it correlates with other measures assessing the same construct.
– Divergent/discriminant validity: the degree to which it does not correlate with measures assessing different constructs.
What's criterion validity?
Whether a test reflects a certain set of abilities, i.e. the degree to which a measurement can accurately predict specific criterion variables.
– Concurrent validity: how well a test correlates with a previously validated measure given at the same time.
– Predictive validity: how well it predicts future performance.
What types of validity apply to a study/experiment?
– External validity
– Internal validity
– Ecological validity
What’s external validity?
– The extent to which the results of a study can be generalised to different populations, settings and conditions.
■ E.g. which styles of leadership work best in a hostile military situation?
– Test different styles in military training war games.
– But how like real combat are military training exercises?
What are some aspects of high external validity?
– Extend to new people/situations.
– Measure the construct that it claims to measure.
– Use a representative sample.
– Replicate using a new group.
What is internal validity?
– When we can be confident that manipulating the IV affects the DV.
■ This is the main goal of experimental research.
■ Causation requires 3 criteria:
– Show co-variation/correlation.
– Show time-order relationships.
– Eliminate all other possible causes.
■ Free of confounding = internally valid.
What are the 5 key threats to internal validity?
– Testing intact groups (i.e. non-random allocation).
– Order/practice/fatigue/transfer.
– Presence of extraneous variables.
– Unequal loss across groups.
– Expectancy effects/demand characteristics.
What's the problem with testing intact (natural) groups?
– A threat in between-subjects designs only.
– E.g. cannabis and memory.
– E.g. Psychology vs. Philosophy students.
What are order/practice/fatigue/transfer effects?
■ A threat in within-subjects designs only.
– Order/practice effects:
■ Better scores in the 2nd/3rd condition, etc.
– Fatigue/boredom effects:
■ Poorer scores in the 2nd/3rd condition, etc.
■ Overcome by counter-balancing the order of conditions (a sketch follows this card).
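A minimal sketch of counter-balancing, assuming three within-subjects conditions with hypothetical labels: participants are cycled through all possible orders, so each condition appears equally often in each position:

from itertools import cycle, permutations

conditions = ["A", "B", "C"]
orders = list(permutations(conditions))  # all 6 possible orders of the 3 conditions

participants = [f"P{i:02d}" for i in range(1, 13)]  # 12 hypothetical participants
for participant, order in zip(participants, cycle(orders)):
    print(participant, "->", " then ".join(order))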
What are differential transfer effects?
■ The effects of one condition affect participants’ performance in subsequent conditions.
E.g. Mnemonic learning techniques.
E.g. Mogg et al. (2008).
– Overcome by using a between-subjects design or a within-subjects design with each condition sufficiently spaced.
What are extraneous variables?
– May cause variation in scores but are NOT part of the experimental manipulation.
– E.g. which experimenter conducted the experiment.
– Room temperature.
– Trial order.
– Assessment tools/techniques.
How do we deal with extraneous variables?
– Control everything you possibly can!
– E.g. Same experimenter, constant room temperature etc.
– Counter-balance everything.
– Standardised assessment tools/techniques.
What’s unequal participant loss across groups?
– Most experiments suffer from participant drop-out.
– Only a problem in between-subjects designs.
What are the different types of expectancy effects?
■ Expectancy effects:
– Participant and/or experimenter.
■ Participant expectancy effects:
– E.g. Cannabis and memory.
– Placebo.
■ Experimenter expectancy effects:
– E.g. Rosenthal and Fode (1963).
How do you control participant and experimenter expectancy effects?
– Double-blind procedure: neither the participant nor the experimenter knows whether it's drug or placebo (a sketch follows this card).
– The expectancies of both are held constant across conditions.
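A minimal sketch of setting up a double-blind allocation, with hypothetical participant IDs and container labels: a third party holds the key, so neither the participant nor the experimenter can tell drug from placebo during testing:

import random

random.seed(0)
participants = [f"P{i:02d}" for i in range(1, 11)]
conditions = ["drug"] * 5 + ["placebo"] * 5
random.shuffle(conditions)

# Neutral container labels hide the conditions; the key is sealed and held by a third party
container_labels = [f"A{i:02d}" for i in range(1, 11)]
sealed_key = dict(zip(container_labels, conditions))

# All the experimenter (and participant) ever sees: participant -> container label
experimenter_list = dict(zip(participants, container_labels))
for participant, label in experimenter_list.items():
    print(participant, "receives container", label)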
What are demand characteristics?
– Cues/information participants use to guide their behaviour: what they believe the experimenter wants/expects them to do.
– Rumours of the study.
– Setting.
– Explicit or implicit communication.
How can demand characteristics be overcome?
– Deception.
– Double-blind procedures.
– Unobtrusive manipulations/measures.
– Between-subjects design.
How do you balance internal and external validity?
■ Experimental design = control = internal validity.
■ ‘Real world’ DV = external validity.
■ E.g. ask participants to remember to do something at a particular point in the study as well as recall a list of words.
■ Multi-method approach: conduct a controlled experiment and a naturalistic study.
■ If both lead to the same conclusions, that's convergent validity.
What’s ecological validity?
– How much a method of research resembles ‘real life’.
– Are the results generalisable across different settings?
– Lab-based research = artificial.
– Representative sample.