RELIABILITY AND VALIDITY Flashcards
Define reliability
the consistency or repeatability of your measurement
what are the three types of reliability
stability of the measure (test-retest)
internal consistency of the measure (split-half, Cronbach's alpha)
agreement or consistency across raters (inter-rater reliability)
what does test-retest reliability look at?
whether your test measures the same thing every time you use it
same questions given on two occasions and the data correlate
test-retest - how do you address the stability of the measure?
- you administer the measure at one point in time (test)
- you then give the same measure to the same participant at a later point in time (retest)
- correlate the scores on the two measures
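A minimal sketch of the correlation step, assuming hypothetical scores and Python's scipy.stats.pearsonr (neither is from the card itself):

```python
# Sketch: correlate test and retest scores for the same participants.
from scipy.stats import pearsonr

test_scores = [12, 15, 9, 20, 17, 11]     # hypothetical time-1 scores
retest_scores = [13, 14, 10, 19, 18, 12]  # hypothetical time-2 scores, same participants

r, p = pearsonr(test_scores, retest_scores)
print(f"test-retest reliability r = {r:.2f}")  # higher r = more stable measure
```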
what are the two main problems with test-retest
Memory effect -
participants may remember the experiment > this improves their second score
- too short a time between = greater risk of memory effects
Practice effect -
performance improves because of practice in test taking
- too long a time between = risk of other variables (additional learning)
what does split half reliability look at?
whether your measure is internally consistent
split the questions in half and correlate data from the two halves
split half reliability - how do you test whether your measure is internally consistent?
- administer a single measure at one time to a group of participants
- split the measure into two halves and correlate the scores
- higher correlation means greater reliability
e.g. a 20-item measure: score the first half (10 items) and the second half (10 items), then test the correlation between the two halves
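A minimal sketch of the 20-item example, using made-up data and Python's numpy/scipy (assumptions not from the card):

```python
# Sketch: split a 20-item measure into two 10-item halves and correlate
# the half-scores across participants.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
true_score = rng.normal(size=(30, 1))                        # each participant's underlying level (made up)
items = true_score + rng.normal(scale=0.7, size=(30, 20))    # 30 participants x 20 items (made up)

first_half = items[:, :10].sum(axis=1)     # score on items 1-10
second_half = items[:, 10:].sum(axis=1)    # score on items 11-20

r, _ = pearsonr(first_half, second_half)
print(f"split-half correlation r = {r:.2f}")  # higher r = more internally consistent
```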
pros and cons of split-half reliability
PRO
eliminates memory/practice effects
CON
are the two halves really equivalent?
two methods of assessing internal consistency
split-half method
Cronbach's alpha
what does Cronbach's alpha assess
internal consistency of your measure
tells you how well the items or questions in your measure appear to reflect the same underlying construct
good internal consistency = individuals respond in a consistent way across the items
how is Cronbach's alpha measured
mathematically equivalent to the average of all possible split-half reliabilities
coefficient alpha can range from 0 to 1 > closer to 1 = better reliability
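A minimal sketch of coefficient alpha, computed from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the data and Python code are illustrative assumptions, not from the card:

```python
# Sketch: Cronbach's alpha from an items-by-participants score matrix.
import numpy as np

items = np.array([[3, 4, 3, 5],
                  [2, 2, 3, 2],
                  [4, 5, 4, 4],
                  [3, 3, 2, 3],
                  [5, 4, 5, 5]])           # 5 participants x 4 items (made up)

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of participants' total scores

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")   # closer to 1 = better internal consistency
```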
what does inter-rater reliability look at?
whether different raters measure the same thing
checking the match between two or more raters or judges
e.g. coding videos of infants' "looking time" - need to check agreement among the coders
how is inter-rater reliability calculated
nominal/ordinal scale
- the percentage of times different raters agree
interval or ratio scale
- correlation coefficient
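A minimal sketch of both calculations, assuming hypothetical ratings from two coders (the data and Python functions are assumptions, not from the card):

```python
# Sketch: percent agreement for nominal codes, correlation for
# interval/ratio ratings, each from two raters.
import numpy as np
from scipy.stats import pearsonr

# nominal codes (e.g. "looking" = 1, "not looking" = 0) from two coders
coder_a = np.array([1, 0, 1, 1, 0, 1, 0, 0])
coder_b = np.array([1, 0, 1, 0, 0, 1, 0, 0])
percent_agreement = (coder_a == coder_b).mean() * 100
print(f"percent agreement = {percent_agreement:.0f}%")

# interval/ratio ratings (e.g. looking time in seconds) from two coders
times_a = [4.2, 1.1, 3.5, 6.0, 2.8]
times_b = [4.0, 1.3, 3.6, 5.8, 3.0]
r, _ = pearsonr(times_a, times_b)
print(f"inter-rater correlation r = {r:.2f}")
```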
define Validity
the credibility of the measure
are we measuring what we think we are?
why is validity an issue
many variables in social research cannot be directly observed
- motivation, satisfaction, helplessness
types of validity
face validity
content validity
criterion validity (concurrent, predictive)
construct validity (convergent, discriminant/divergent)
what is face validity
items appear to relate to construct
a weak, subjective method for assessing validity
a good first step to validity assessment
what is content validity
the extent to which the measure is representative of a sampling of relevant dimensions
does it cover all aspects of the construct that it is meant to measure
how much does the measure cover the content of the definition?
what is criterion-related validity
checking the performance of your measure against an external criterion
agrees with external sources
what are the two types of criterion-related validity
concurrent
predictive
define concurrent criterion validity
a means of establishing the validity of your measurement by comparing it to a gold standard
> i.e. an existing validated measure of the same construct
agrees with pre-existing “gold standard” measure
what is predictive criterion validity
assessing the validity of your measure against what you theoretically predict to happen
agrees with future behaviour
define construct validity
how well the measure and other constructs relate to each other (consistent with a theory)
what are the two types of construct validity
convergent
divergent
define divergent construct validity
assessing validity by comparing measures of constructs that theoretically should not be related to each other and are observed to not relate to each other
> theoretically should not and in fact are not related <
i.e. you should be able to discriminate/diverge between dissimilar constructs
define convergent construct validity
assessing validity by comparing measures of constructs that theoretically should be related and are observed to relate to each other
> theoretically should and in fact are related <
i.e. there is correspondence or convergence between similar constructs
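A minimal sketch of the convergent/divergent pattern, using made-up measures and Python (all names and data are assumptions, not from the cards):

```python
# Sketch: construct validity is supported when the new measure correlates
# strongly with a theoretically related measure (convergent) and weakly
# with an unrelated one (divergent).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
new_measure = rng.normal(size=50)                               # made-up scores
related_measure = new_measure + rng.normal(scale=0.5, size=50)  # should correlate
unrelated_measure = rng.normal(size=50)                         # should not correlate

r_conv, _ = pearsonr(new_measure, related_measure)
r_div, _ = pearsonr(new_measure, unrelated_measure)
print(f"convergent r = {r_conv:.2f} (expect high)")
print(f"divergent  r = {r_div:.2f} (expect near 0)")
```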