W11 - Reliability Flashcards
What is the difference between psychological research and psychological assessment
Psychological Research:
Generalisations about representative samples of people.
Psychological Assessment:
Generalisations about specific individuals (n = 1)
What are some psychological assessment standards
- Nature of underlying construct(s)
- Basic psychometric principles and procedures: Requirements and Limitations
- Directions for administration and properties: Purpose, relevant standard errors, reliability and validity
What is a valid test
A test is valid if it accurately measures what it purports to measure
What is a reliable test
Property of consistency in measurement.
Reliability and validity: necessity and sufficiency
Reliability is a necessary, but insufficient, requirement for validity.
(i.e. a valid test cannot be unreliable, but a reliable test may not be valid)
Is reliability a binary decision?
No. Reliability is continuous, not categorical (Reliable/Not Reliable)
What is the first equation of Classical Test Theory
Xi = T + Ei
Xi = Observed score on test occasion i
T = True score
Ei = Error on test occasion i (unsystematic variance)
What are the two properties of errors in classical test theory
- Endogenous: Factors about test-taker (client’s condition)
- Exogenous: Factors outside test-taker (psychologist measurement)
What are the assumptions of Classical Test Theory
- Expected value of error = 0
- Errors do not correlate with one another
- Errors do not correlate with true scores
- Expected value of test = True Score (On repeated administration of test, on average people will score their true score)
Elaborate on first assumption of classical test theory
- Expected value of error is zero
Across many test occasions, positive and negative errors cancel out, so the errors average to 0
Elaborate on the second assumption of classical test theory
- Errors do not correlate with one another
The error on test i is unrelated to the error on test j
Elaborate on the third assumption of classical test theory
- Errors do not correlate with true score
r(t, e) = 0. Whether the error is positive or negative is unrelated to the true score
Elaborate on the fourth assumption of classical test theory
- Expected value of test equals true score
Average of all observed scores = True Score
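A minimal sketch of assumptions 1 and 4, assuming a single hypothetical test-taker with a true score of 100 and normally distributed errors with an SD of 5 (both values, and the function name, are made up for illustration). Over many simulated test occasions the mean error should sit close to 0 and the mean observed score close to the true score.

```python
import random
import statistics

def simulate_observed_scores(true_score, error_sd, n_occasions, seed=0):
    """Return observed scores X_i = T + E_i for one test-taker.

    Errors are drawn independently with mean 0, which builds in
    assumptions 1 and 2 (zero expected error, uncorrelated errors).
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_occasions)]

TRUE_SCORE = 100  # hypothetical true score
scores = simulate_observed_scores(TRUE_SCORE, error_sd=5, n_occasions=10_000)

mean_observed = statistics.fmean(scores)
mean_error = statistics.fmean([x - TRUE_SCORE for x in scores])

print("mean observed score:", round(mean_observed, 2))
print("mean error:", round(mean_error, 2))
```

Any single observed score can be several points off, but the average over occasions converges on the true score.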
What is the second equation of Classical Test Theory
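$\sigma_x^2 = \sigma_\tau^2 + \sigma_\epsilon^2 + 2\,\mathrm{Cov}(\tau, \epsilon)$
$\sigma_x^2$: Variance of observed scores
$\sigma_\tau^2$: Variance of true scores
$\sigma_\epsilon^2$: Variance of error scores
$\mathrm{Cov}(\tau, \epsilon)$: Covariance between true scores and error scores, which is 0 under assumption (3)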
What is the third equation of classical test theory, relating to how reliability is calculated
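Reliability $= \rho_{x\tau}^2 = \dfrac{\sigma_\tau^2}{\sigma_x^2} = \dfrac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma_\epsilon^2} = \dfrac{\text{Signal}}{\text{Signal} + \text{Noise}}$
$\rho_{x\tau}^2$: Theoretical reliability coefficient
$\sigma_\tau^2$: Variance of true scores
$\sigma_x^2$: Variance of observed scores
$\sigma_\epsilon^2$: Variance of error scores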
Perspectives on reliability: Conceptual and statistical basis
Conceptual: Observed score in relation to:
(a) True Score; (b) Measurement error
Statistical: (a) Proportion of variance; (b) Correlations
Perspective 1: Using true score and proportion of variance
Reliability is the ratio of true score variance to observed score variance (same as the third equation)
$r_{xx} = s_t^2 / s_x^2$
Perspective 2: Using measurement error and proportion of variance
Reliability is the lack of error variance
$r_{xx} = 1 - s_\epsilon^2 / s_x^2$
Perspective 3: Using true score and correlations
Reliability is the squared correlation between observed scores and true scores
$r_{xx} = r_{xt}^2$
Perspective 4: Using measurement error and correlation
Reliability is the lack of (squared) correlation between observed scores and error scores
$r_{xx} = 1 - r_{x\epsilon}^2$
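The four perspectives are the same quantity written four ways. A small sketch, using hypothetical toy data chosen so that the errors average 0 and are exactly uncorrelated with the true scores (in real testing the true scores and errors are never observed directly, which is why this only works as an illustration):

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Population variance (divide by n)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / math.sqrt(variance(xs) * variance(ys))

# Hypothetical toy data: errors have mean 0 and zero covariance with true scores.
true_scores = [8, 8, 12, 12]
errors = [-1, 1, -1, 1]
observed = [t + e for t, e in zip(true_scores, errors)]

var_t, var_e, var_x = variance(true_scores), variance(errors), variance(observed)

perspective_1 = var_t / var_x                               # true score variance ratio
perspective_2 = 1 - var_e / var_x                           # lack of error variance
perspective_3 = correlation(observed, true_scores) ** 2     # squared r(x, t)
perspective_4 = 1 - correlation(observed, errors) ** 2      # 1 - squared r(x, e)

for value in (perspective_1, perspective_2, perspective_3, perspective_4):
    print(round(value, 3))  # each prints 0.8
```

Here the true score variance is 4 and the error variance is 1, so the observed variance is 5 and every perspective gives 4/5 = 0.8.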
Reliability in practice: since we do not know the true score variance, what assumptions do we add by using parallel tests?
We administer two parallel tests. To count as parallel, the tests must:
(1) Be tau-equivalent: each person's true score is the same on both tests
(2) Have the same error variance
What are the 3 ways of testing reliability
- Test-retest
- Parallel-form reliability
- Split-half reliability
All three assume that the two sets of scores being correlated come from parallel forms of the test
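Why correlating two parallel forms works: if both forms share the same true scores and have equal, independent error variance, their correlation estimates the reliability. A rough simulation sketch, assuming made-up values (true score SD of 12, error SD of 6, 5000 test-takers) that give a theoretical reliability of 144 / (144 + 36) = 0.8:

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

rng = random.Random(42)
n_people = 5000
true_sd, error_sd = 12.0, 6.0  # assumed values for illustration only

theoretical_reliability = true_sd**2 / (true_sd**2 + error_sd**2)

# Same true score on both forms (tau equivalence), equal error variance.
true_scores = [rng.gauss(100, true_sd) for _ in range(n_people)]
form_a = [t + rng.gauss(0, error_sd) for t in true_scores]
form_b = [t + rng.gauss(0, error_sd) for t in true_scores]

estimated_reliability = pearson(form_a, form_b)
print("theoretical:", round(theoretical_reliability, 3))
print("estimated from parallel forms:", round(estimated_reliability, 3))
```

The estimate only tracks the theoretical value because the simulated forms really are parallel. If the true scores drift between forms or the error variances differ, the correlation no longer equals the reliability.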
What is test-retest reliability
Correlation between original test and retest (Same test, different time)
What is the use of test-retest reliability
Useful for stable traits, not useful for transient states
What are the cons of test-retest reliability
- Carryover Effects (Smaller gap between test and re-test)
- Googling Answers
- Bored
- Remember test items
- True score may vary
- Participants may fail to return (Bigger gap between test and re-test)
Trade-off between participants failing to return (bigger gap) and carryover effects (smaller gap)
What is parallel-form reliability
Correlation between two parallel forms of the test
Parallel form reliability: What must be ensured?
- Parallel form must measure same set of true scores
- Parallel form must have equal variance to the original form
What are the pros of parallel form reliability
It can be used on the same day
What are the cons of parallel form reliability
- Might not truly be parallel
- Affects true score
- Carryover effects
- Even though there are no direct memory effects from the original test, test-takers might still learn from it
What is split-half reliability
Correlations between 2 sub-tests split from 1 test
What are the pros of split-half reliability
Only one test. Easy
What are the cons of split-half reliability
- Might not truly be parallel
- Deflation of reliability estimate as subtests have only half the items compared to main test
What is Cronbach’s alpha
Mean of all possible split-half reliabilities, scaled up to a full test instead of a half-test
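In practice alpha is not computed by enumerating every split. A sketch using the standard computational formula, alpha = [k / (k - 1)] × [1 - (sum of item variances) / (variance of total scores)], which is not given in these notes and is brought in here as an assumption. The response data are hypothetical (5 respondents, 3 items).

```python
import statistics

def cronbach_alpha(rows):
    """Cronbach's alpha from a list of respondents' item scores.

    rows: one list of item scores per respondent, all the same length.
    """
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # regroup scores item by item
    item_variances = [statistics.variance(item) for item in items]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: rows are respondents, columns are items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))  # 0.933
```

The item variances here are 2.5, 1.3 and 1.3, and the variance of the total scores is 13.5, so alpha = 1.5 × (1 - 5.1 / 13.5) ≈ 0.933.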
Is Cronbach’s alpha legit? Why?
Not really. It provides a conservative, lower-bound estimate of reliability, and recent studies suggest it is of limited use.
What does the reliability coefficient (rxx) fail to do?
It is not expressed in test score units, so it does not tell us how much measurement error is ‘typical’ for a score.
What is the standard error of measurement (SEm)
The typical size of the error scores (i.e. the SD of the errors)
What is the formula for SEm
SEm = sx [√(1 - rxx)]
sx = standard deviation of observed scores
rxx = reliability coefficient
If the test is completely unreliable, what is the standard error of measurement
SEm = sx [√(1 - rxx)]
If rxx = 0, then √(1 - 0) = 1
Hence, SEm = sx
Standard error of measurement = Standard deviation of observed scores
If the test is completely reliable, what is the standard error of measurement
SEm = sx [√(1 - rxx)]
Since rxx = 1, √(1 - 1) = 0
Hence, SEm = 0
There is no measurement error
What is the direction of association between reliability and SEm as a proportion of SD
Negative. As reliability increases, SEm decreases
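A short sketch of SEm = sx √(1 - rxx) across several reliabilities, assuming an IQ-style scale with sx = 15 (the function name and the example reliabilities are illustrative choices, not from the notes):

```python
import math

def standard_error_of_measurement(sd_observed, reliability):
    """SEm = sx * sqrt(1 - rxx)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd_observed * math.sqrt(1 - reliability)

SD_OBSERVED = 15  # assumed SD of observed scores

sem_by_reliability = {
    rxx: standard_error_of_measurement(SD_OBSERVED, rxx)
    for rxx in (0.0, 0.5, 0.84, 0.9, 1.0)
}

for rxx, sem in sem_by_reliability.items():
    print(f"rxx = {rxx:.2f}  SEm = {sem:.2f}")
```

At rxx = 0 the SEm equals the full SD of 15, at rxx = 1 it is 0, and in between it shrinks as reliability rises (for example, rxx = 0.84 gives 15 × 0.4 = 6).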
According to Nunnally, what level of reliability is needed?
Bare minimum = 0.9
Desirable = 0.95
What is the equation to predict a client’s true scores
T_hat = (rxx)(x) + (1 - rxx)(𝜇T)
- T_hat = predicted true score
- rxx = reliability
- x = observed score
- 𝜇T = population mean for the test (e.g. IQ = 100)
What is the predicted true score if the test was completely unreliable
T_hat = (rxx)(x) + (1 - rxx)(𝜇T)
If rxx = 0,
T_hat = 𝜇T (population mean)
What is the predicted true score if the test was completely reliable
T_hat = (rxx)(x) + (1 - rxx)(𝜇T)
If rxx = 1,
T_hat = x (that is, the observed scores)
What is the direction of association between reliability and predicted true scores (T_hat)
As reliability increases, T_hat moves closer to observed scores.
As reliability decreases, T_hat regresses towards the population mean.
What are true score confidence intervals built upon
Standard error of estimation
What is the equation for standard error of estimation
SEe = sx √[rxx(1 - rxx)]
- Similar to the standard error of measurement (SEm), but with an extra rxx inside the square root.
- Note: sx = standard deviation of observed test scores
What defines the 95% CI for predicted true scores
Lower Bound: [T_hat - (1.96 x SEe)]
Upper Bound: [T_hat + (1.96 x SEe)]
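Putting the predicted true score, SEe and the 95% CI together. A sketch assuming a hypothetical client with an observed IQ of 130 on a test with rxx = 0.9, population mean 100 and sx = 15 (all example values, and the function name, are made up):

```python
import math

def true_score_interval(observed, reliability, pop_mean, sd_observed, z=1.96):
    """Return (T_hat, lower bound, upper bound) for the true score."""
    # Predicted true score: weighted blend of observed score and population mean.
    t_hat = reliability * observed + (1 - reliability) * pop_mean
    # Standard error of estimation: sx * sqrt(rxx * (1 - rxx)).
    se_estimation = sd_observed * math.sqrt(reliability * (1 - reliability))
    return t_hat, t_hat - z * se_estimation, t_hat + z * se_estimation

t_hat, lower, upper = true_score_interval(
    observed=130, reliability=0.9, pop_mean=100, sd_observed=15
)
print(round(t_hat, 2), round(lower, 2), round(upper, 2))  # 127.0 118.18 135.82
```

T_hat = 0.9 × 130 + 0.1 × 100 = 127 and SEe = 15 × √0.09 = 4.5, so the interval is 127 ± 8.82. The interval is centred on the predicted true score, not on the observed score of 130.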
How do correlations between measures compare with correlations between constructs? Why?
- The observed correlation between two measures x and y is lower than the true correlation between the underlying constructs whenever the measures are less than perfectly reliable
- Because the observed correlation is attenuated (reduced) by measurement error
What does the disattenuation formula aim to do
- Estimates the correlation if 2 constructs were not affected by measurement error
- Corrects for the fact that measurement error attenuates the correlation between 2 constructs measured
What is the maximum correlation between 2 measures x and y
Max rxy = √(rxx × ryy)
- rxx: Reliability of test x
- ryy: Reliability of test y
- rxy: Observed correlation between measures x and y
What is the disattenuation formula
r’xy = rxy / √(rxx × ryy)
- r’xy: Correlation between the 2 constructs without measurement error
- rxy: Observed correlation between the 2 measures
- rxx: Reliability of the measure of construct x
- ryy: Reliability of the measure of construct y
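A worked sketch of the ceiling on the observed correlation and of the disattenuation formula, assuming hypothetical values rxx = 0.81, ryy = 0.64 and an observed rxy = 0.36 (function names are illustrative):

```python
import math

def max_observed_correlation(rxx, ryy):
    """Upper limit on the observed correlation: sqrt(rxx * ryy)."""
    return math.sqrt(rxx * ryy)

def disattenuate(rxy, rxx, ryy):
    """Estimated construct-level correlation: rxy / sqrt(rxx * ryy)."""
    return rxy / math.sqrt(rxx * ryy)

rxx, ryy, rxy = 0.81, 0.64, 0.36  # hypothetical reliabilities and observed r

ceiling = max_observed_correlation(rxx, ryy)
corrected = disattenuate(rxy, rxx, ryy)

print(round(ceiling, 2))    # 0.72
print(round(corrected, 2))  # 0.5
```

With these reliabilities the two measures cannot correlate above 0.72 even if the constructs were perfectly related, and an observed 0.36 corresponds to an estimated construct-level correlation of 0.5.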
How to increase the correlation between test scores and constructs
- Increase the relationship between construct and test (quality of items)
- Remove inconsistency in test administration and interpretation (reduce exogenous measurement error)
- Increase the number of test items (quantity of items)
What is the Spearman-Brown Prophecy Formula
r’xx = (nrxx) / [1 + (n-1)(rxx)]
- r’xx: Reliability of expanded test
- rxx: Reliability of original test
- n: Expansion factor (e.g. n = 2 means double the items)
What is the relationship between expansion factor and reliability of expanded test. What is the caveat?
- Negatively accelerated: reliability rises as n increases, but by less with each further increase
- Which has practical benefit
- New items must be as good as the original ones. If not, the reliability of the expanded test (r’xx) might even be worse
If we know our desired reliability, what is the formula to work out the expansion factor
n = [r’xx(1 - rxx)] / [rxx(1 - r’xx)]
r’xx = Reliability of expanded test
rxx = Reliability of original test
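A sketch of both Spearman-Brown formulas, assuming a hypothetical original reliability of 0.6 (function names are illustrative). The same n = 2 case is what corrects a split-half correlation back up to full test length.

```python
import math

def spearman_brown(rxx, n):
    """Predicted reliability after lengthening the test by a factor of n."""
    return n * rxx / (1 + (n - 1) * rxx)

def required_expansion_factor(rxx, target):
    """Expansion factor n needed to move from rxx to the target reliability."""
    return target * (1 - rxx) / (rxx * (1 - target))

original = 0.6  # hypothetical reliability of the original test

# Diminishing returns: each extra multiple of items adds less reliability.
predicted = {n: spearman_brown(original, n) for n in (1, 2, 3, 4)}
for n, value in predicted.items():
    print(n, round(value, 3))

needed_n = required_expansion_factor(original, target=0.9)
print(round(needed_n, 2))  # 6.0
```

Doubling the test lifts 0.6 to 0.75, but going from 3× to 4× the items only adds about 0.04. Reaching 0.9 would take roughly six times as many items, and only if the new items are as good as the originals.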