Reliability and Validity Flashcards
Why do we examine reliability and validity?
- to tell whether the test is good, i.e. whether it accurately measures the concept the researcher is targeting
Reliability (3pts)
- measures consistently
- repeatability or consistency of measurement
- (at different points in time or across different circumstances)
X = T + E
what is this equation for and what does each component mean?
- from a theoretical standpoint, an observed score X for a trait has 2 components
T: true score
E: error
=> reliability increases when the variance of T accounts for a higher proportion of the variance in the observed scores
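The X = T + E decomposition can be sketched in plain Python. This is an illustrative simulation with made-up numbers (true-score SD of 10, error SD of 5), not data from the source:

```python
import random
import statistics

random.seed(0)
n = 10_000

# Hypothetical true scores T and random errors E (illustrative SDs)
T = [random.gauss(50, 10) for _ in range(n)]   # true scores, var(T) = 100
E = [random.gauss(0, 5) for _ in range(n)]     # measurement error, var(E) = 25

# Observed scores: X = T + E
X = [t + e for t, e in zip(T, E)]

# Reliability = proportion of observed-score variance due to true scores.
# Theoretically var(T) / var(X) = 100 / 125 = 0.80 in this setup.
reliability = statistics.pvariance(T) / statistics.pvariance(X)
```

Because T and E are independent, var(X) ≈ var(T) + var(E), so the estimated reliability lands near 0.80: the larger T's share of the observed variance, the more reliable the measure.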
Ways to test reliability
Test-retest reliability
Inter-rater agreement
Parallel forms of reliability
Internal consistency
These tests are done across:
test-retest: time
inter-rater agreement: consistency across people (raters)
parallel forms of reliability: theoretically-equivalent measurements
internal consistency: the different individual parts tend to give related and similar answers (eg on a scale)
Good split-half reliability vs poor test-retest reliability
Good split-half: the general pattern remains the same across Part A and Part B (of the test); the scores vary only slightly
Poor test-retest:
Same measure at different times, but the scores change dramatically
What is internal consistency?
- consistency across all individual items that make up a measurement scale
- calculate a statistic, e.g. Cronbach’s alpha (α), to check the internal consistency of a scale (how well it measures the underlying construct)
Three types of int. consistency
- split-half reliability
- odd-even reliability
- Cronbach’s alpha
split half:
- score based on half of the items (part A) correlated with score on the other half (part B)
odd-even:
- score based on even-numbered items correlated w/ score on odd-numbered items
Cronbach’s alpha:
- measure of reliability based on the average of all possible splits into 2 equal sets of items
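As a rough sketch with hypothetical item scores (the 4-item, 6-respondent data below is invented for illustration), Cronbach's alpha can be computed directly from its standard formula, α = k/(k−1) · (1 − Σ var(item) / var(total)):

```python
import statistics

# Hypothetical 4-item scale: rows = respondents, columns = items
scores = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
]

k = len(scores[0])                     # number of items
items = list(zip(*scores))             # transpose: one tuple per item
item_vars = [statistics.pvariance(col) for col in items]
totals = [sum(row) for row in scores]  # each respondent's total score

# Cronbach's alpha: high when items covary strongly relative to their
# individual variances (i.e. the total-score variance dominates)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / statistics.pvariance(totals))
```

Here the items move together across respondents, so alpha comes out high (above 0.9); if the items were unrelated, the sum of item variances would approach the total-score variance and alpha would drop toward 0.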
Does a high value for alpha indicate unidimensionality (scale measures a single factor)?
Unidimensionality - degree to which the items all measure the same underlying construct
No
Internal consistency reflects the interrelatedness of a set of items, which is not the same as measuring a single factor
What assumptions does Cronbach’s alpha make?
- no residual correlations (uncorrelated errors)
- items have identical weighting (so each item/score on a scale contributes equally to the total)
- the scale is unidimensional (measures a single construct)
What do these values mean (Cronbach’s alpha)
- 0.70 to 0.80: acceptable and good reliability (respectively)
- 0.85 to 0.95: 20% of the observed score could be error
- 0.95 or above: may be too much redundancy in the items (measuring too much of the same thing)
Validity
Face
Content
Criterion-related
- concurrent
- predictive
Construct
- convergent
- discriminant
Face: (a basic measure of validity) validity of the test at face value => does it appear to measure the target variable? Is it self-evident?
Content: the items on the test cover the entire, representative range of possible items the test should include
Criterion-related: does the measure effectively PREDICT key outcome criteria?
- administer two tests and see how the results compare (concurrent)
- how the results can predict a future outcome (predictive)
Construct: the extent to which the test measures a theoretical construct
- related to what it should theoretically be related to (convergent)
- not related to what it should not be related to (discriminant)