reliability and validity Flashcards
internal validity
the extent to which the observed effect is due to the independent variable (IV) and nothing else
to achieve this, confounding variables need to be avoided
threats to internal validity
history - data collection normally takes a while, so outside events may intervene, e.g. political changes, a natural disaster, a pandemic
sampling - relates to the method used (which creates most issues with bias); random sampling should be best; pre-screening participants
attrition/experimental mortality - drop-out rates; certain types of people may be more likely to drop out
maturation - participants change over time, e.g. when studying children
testing and instruments - testing multiple times with the same measures creates order/practice effects
equivalent versions of the task may need to be designed
using different researchers can also have an impact
Rosenthal/Pygmalion effect
high expectations lead to high performance
ways of improving internal validity
- standardised procedures = every participant is treated in the same way, e.g. if using a placebo it should look the same as the actual drug
- counterbalancing = vary the order of conditions across participants to reduce order/practice effects (see the sketch after this list)
- blinding = the participant, the researcher, or both are blinded to which condition the participant is in
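a rough sketch of full counterbalancing in a within-subjects design, using made-up condition labels:

```python
from itertools import permutations

# Hypothetical conditions in a within-subjects design
conditions = ["A", "B", "C"]

# Full counterbalancing: every possible order of the conditions
orders = list(permutations(conditions))

def order_for_participant(participant_index, orders):
    # Rotate through the orders so each one is used roughly equally often
    return orders[participant_index % len(orders)]

for p in range(6):
    print(p, order_for_participant(p, orders))
```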
external validity
the extent to which our results generalise beyond the study context, e.g. ecological validity (real-life settings) or population validity (other groups of people)
WEIRD (Western, Educated, Industrialised, Rich, Democratic) samples affect this
reliability
extent to which measures are reproducible or consistent over time
external reliability
test-retest = correlation between scores at one time point and another
inter-observer or inter-rater = the extent to which researchers agree in their ratings or coding
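a rough sketch of how these could be computed, using made-up data (simple percent agreement is shown for inter-rater reliability; Cohen's kappa, which corrects for chance agreement, is often preferred):

```python
import numpy as np

# Hypothetical scores from the same participants at two time points
time1 = np.array([12, 15, 9, 20, 14, 18])
time2 = np.array([13, 14, 10, 19, 15, 17])

# Test-retest reliability: correlation between time 1 and time 2 scores
test_retest_r = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest r = {test_retest_r:.2f}")

# Hypothetical codes from two raters for the same observations
rater1 = ["yes", "no", "yes", "yes", "no", "yes"]
rater2 = ["yes", "no", "no", "yes", "no", "yes"]

# Inter-rater agreement: proportion of observations coded the same
agreement = np.mean([a == b for a, b in zip(rater1, rater2)])
print(f"percent agreement = {agreement:.2f}")
```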
internal reliability
internal consistency of the test
e.g. scoring consistently across the items of a personality questionnaire
a common measure of this is Cronbach's alpha
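a rough sketch of Cronbach's alpha computed from made-up item scores, using the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores):

```python
import numpy as np

# Hypothetical item scores: rows = participants, columns = questionnaire items
items = np.array([
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
])

k = items.shape[1]                          # number of items
item_variances = items.var(axis=0, ddof=1)  # variance of each item
total_variance = items.sum(axis=1).var(ddof=1)  # variance of total scores

# Cronbach's alpha
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.2f}")
```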
why are validity and reliability important
credibility of research
scientific rigor
ensure findings don’t conflict with each other
true experiments and RCT validity
internal validity high
external validity low
correlational study validity
internal validity low
external validity high
questionnaire validity
- face validity- extent to which a questionnaire is evidently measuring what it is supposed to
- content validity- do the items/content tap into what we are trying to measure
- criterion validity - when the questionnaire is tested, do different groups of people (expected to differ on the construct) actually score differently? (see the sketch after this list)
- construct validity - concerns the theory around the construct: how closely do our interpretation and measurement of the construct correspond to the construct as it actually exists?
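a rough sketch of a known-groups check of criterion validity, comparing made-up questionnaire totals for two groups with an independent-samples t-test:

```python
from scipy import stats

# Hypothetical questionnaire totals for two groups expected to differ
# on the construct (e.g. a clinical group vs a control group)
group_a = [32, 28, 35, 30, 33, 29]
group_b = [21, 24, 19, 23, 22, 20]

# Known-groups criterion validity: do the groups score differently?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```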