Threats To Psychometric Quality (Test Bias) Flashcards
Test Bias
- test bias arises due to factors inherent in a test
- systematically prevents accurate and impartial measurement
- implies systematic (not random) variation
- test bias compromises the validity of test score interpretation between groups
- it can systematically obscure differences (or the lack of differences) among groups
- e.g., are there genuine differences between age groups, genders, ethnicities, etc.?
Importance of Detecting Test Bias
Genuine Group Differences Vs Score Bias
- just because two groups differ in their mean test score does not imply test bias
- but it is important to consider/check whether tests are unbiased
Types of Test Bias
- Construct Bias - bias in the meaning of a test
- i.e., internal bias and measurement bias
- occurs when a test has different meanings/interpretations for two groups
- Predictive Bias - bias in the use of a test
- i.e., external bias and prediction bias
- occurs when a test’s use has different implications for two or more groups
- construct and predictive bias are independent of one another
Construct Bias
- when a test has been interpreted differently by different groups, it suggests that the responses/scores have different meanings for the groups
- e.g., responses for males may reflect a different construct than females
- it is concerned with issues in the relationship between a group’s observed scores and their true scores
- construct bias can lead to situations whereby two groups may have the same true experiences (ability) but report different observed scores because they have interpreted the test differently
Predictive Bias
- when a test’s use has different implications for two or more groups
- relates to external bias in the use of a test (not the measurement itself)
- it concerns the relationship between scores (outcomes) on two different tests
-> does one score predict another score
-> is the score equally predictive for two different groups
Detecting Construct Bias
- tests for construct bias focus on a test’s internal structure
-> examine the internal structure of a test separately for two groups
- the way the parts of the test are related to each other
- the pattern of correlation among items
- correlations between each item and the total test score
- construct bias may be present if:
a) people in different groups respond to items in different ways
b) differing responses are related to group differences in the interpretation of the construct
Detecting Construct Bias
- no single method can be used to establish any bias
- 4 procedures for detecting construct bias:
1. item discrimination index
2. factor analysis
3. item functioning analysis
4. rank order consistency
Item Discrimination Index
- reflects the degree to which the item is related to the total test score
-> symbolised by D
- based on the proportion of people answering an item correctly/highly
-> calculated on each specific item within a test
- a high D (>.30) indicates:
❖ people who answer an item correctly tend to do better on the whole test
❖ the item strongly discriminates high and low scoring individuals
❖ the item is a good reflection of the construct
- a test item with a low D suggests:
❖ the item is not a good reflection of the construct being assessed
➢ Item Discrimination Index used to assess construct bias:
❖ Item discrimination index is computed separately for both groups & compared
o i.e., Is each item answered in a similar way for each group?
❖ If the values are approximately equal, then the item is deemed to reflect the construct in the same way for both groups
❖ If the values are unequal, then construct bias may be present, as the item may discriminate individuals well in one group but not another
o (i.e., item needs revising or removing)
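A minimal sketch of the group comparison above, assuming the upper-lower form of the index (D = proportion correct among high scorers minus proportion correct among low scorers) and items scored 0/1. The function names, the `fraction` parameter and the toy data are illustrative assumptions, not part of these notes.

```python
def discrimination_index(responses, item, fraction=0.27):
    """Upper-lower discrimination index D for one item.

    responses: list of per-person lists of 0/1 item scores.
    item: column index of the item.
    fraction: share of people placed in each of the high and low scoring
              groups (0.27 is a common convention, assumed here).
    """
    # Rank people by total test score, lowest first.
    ranked = sorted(responses, key=sum)
    k = max(1, round(len(ranked) * fraction))
    low, high = ranked[:k], ranked[-k:]
    p_high = sum(person[item] for person in high) / k
    p_low = sum(person[item] for person in low) / k
    return p_high - p_low


def compare_groups(group_a, group_b, fraction=0.27):
    """Return (item, D in group A, D in group B) for every item."""
    n_items = len(group_a[0])
    return [
        (i,
         discrimination_index(group_a, i, fraction),
         discrimination_index(group_b, i, fraction))
        for i in range(n_items)
    ]


# Made-up responses: 4 people x 3 items per group.
group_a = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
group_b = [[0, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]

results = compare_groups(group_a, group_b, fraction=0.5)
for item, d_a, d_b in results:
    print(f"item {item}: D(A) = {d_a:.2f}, D(B) = {d_b:.2f}")
```

In this toy data item 0 separates high and low scorers in group A (D = 1.0) but not in group B (D = 0.0), which is the kind of mismatch that would flag an item for review.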
Factor Analysis
- evaluates the internal structure of a test (groups items into “factors”)
- when items load strongly on a factor (>.40) they are believed to reflect that factor appropriately
- factor analyses are conducted on items separately for different groups
- if the same number of factors emerge for both groups = no evidence of construct bias
- if the factor numbers differ = a different internal structure for each group; e.g., one group may have one factor but the other has two or three
- thus the test scores will reflect different constructs for each group
Item Functioning Analysis
- based on item response theory: a participant’s true score can be estimated from the probability that they will answer an item correctly
- generate an item characteristic curve per item and compare it across both groups
-> the curves should be uniform across groups if there is no bias
-> a biased item will produce non-uniform curves
- analysis is quite complex and requires a large sample size (ensuring varied groups)
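To illustrate what comparing item characteristic curves can look like, here is a sketch that assumes a two-parameter logistic model. That model is one common item response model but is not named in these notes, and the parameter values below are hypothetical; in practice they would be estimated from each group's responses.

```python
import math


def icc(theta, a, b):
    """Probability of a correct answer at trait level theta.

    a: item discrimination, b: item difficulty (two-parameter logistic
    model, an assumption made for this illustration).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def max_curve_gap(params_a, params_b, thetas):
    """Largest absolute gap between two groups' curves over the trait grid."""
    return max(abs(icc(t, *params_a) - icc(t, *params_b)) for t in thetas)


thetas = [x / 10 for x in range(-30, 31)]  # trait levels from -3.0 to 3.0

# Same (a, b) in both groups: the curves coincide.
same = max_curve_gap((1.2, 0.0), (1.2, 0.0), thetas)

# Same discrimination but a harder item for group B: the curves separate.
shifted = max_curve_gap((1.2, 0.0), (1.2, 0.8), thetas)

print(f"identical parameters: max gap = {same:.3f}")
print(f"shifted difficulty:   max gap = {shifted:.3f}")
```

A gap near zero at every trait level corresponds to the "uniform" case; a visible gap means people with the same trait level have different chances of answering correctly depending on their group.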
Rank Order Consistency
- calculate the mean score for each item separately for each group
- the items are then ranked in order of difficulty within each group
-> if the difficulty ranks differ across the groups, then construct bias may exist
- Spearman rank correlations are then calculated between the groups’ ranked item scores
-> if the correlation between the item means is low (<.90), there may be construct bias
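The procedure above can be sketched with the standard library alone. The helper names and the toy responses are assumptions made for illustration; Spearman's correlation is computed here as a Pearson correlation on average ranks, which handles tied item means.

```python
def item_means(responses):
    """Mean score per item; responses is a list of per-person score lists."""
    n = len(responses)
    return [sum(column) / n for column in zip(*responses)]


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation (undefined if either list is constant)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation between two lists of item means."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up responses: 4 people x 3 items per group, scored 0/1.
group_a = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
group_b = [[1, 0, 1], [1, 0, 0], [1, 1, 1], [0, 0, 1]]

rho = spearman(item_means(group_a), item_means(group_b))
print(f"rank-order consistency (Spearman rho) = {rho:.2f}")
if rho < 0.90:
    print("item difficulty order differs across groups: check for construct bias")
```

With only three items the correlation is very unstable, so this toy example shows the mechanics rather than a realistic analysis.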
Detecting Predictive Bias
- predictive bias concerns the degree to which a test’s scores are equally predictive of an outcome for two groups
- tests for predictive bias focus on whether scores on a test predict a criterion (outcome) with equal accuracy for two or more groups
What Information is Needed?
1. test scores
2. scores on a different measure
3. the general relationship between the two scores (for the whole sample)
4. the associations between the two scores within each participant group
Detecting Predictive Bias
- based on linear regression analyses (i.e., bivariate regression)
- assumes there is a linear relationship between test scores (X) and outcome (Y)
-> a straight line can be used to predict the outcome score from the test score
-> if tests are not biased, then one regression equation will apply equally to all groups
Two Steps To Analysing Test Bias
- does the test actually help predict the dependent/outcome variable?
- does the test predict the dependent/outcome variable equally well across groups?
Linear/Bivariate Regression
Bivariate Regression Consists Of:
- one continuous independent variable (X)
- a continuously measured dependent variable (Y)
Additional Terms:
- intercept: the expected value of Y when X is a value of 0
-> i.e., if the independent variable is 0 then what value is the dependent variable
- slope: the expected increase or decrease in the dependent variable (Y) for each one-unit increase in the independent variable (X)
- the intercept and slope of the equation help clarify the predictive ability of the test
-> generate a regression line (“line of best fit”)
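Written out, the regression line and the usual least-squares estimates of its two terms are as follows. The symbols a (intercept) and b (slope) are notation chosen here for illustration; the notes themselves only name the terms.

```latex
\hat{Y} = a + bX, \qquad
b = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}, \qquad
a = \bar{Y} - b\,\bar{X}
```

Setting X = 0 gives a as the predicted outcome, and increasing X by one unit changes the predicted outcome by b, which matches the definitions of intercept and slope above.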
Detecting Predictive Bias
- we want to assume all groups share a common regression equation
- so, we check the regression equation for each group separately
-> if non-biased = one regression equation applicable for all groups
-> each group’s intercept and slope are similar to the common (overall) intercept and slope
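As a sketch of this check, the snippet below fits a least-squares line to each group separately and to the pooled sample, then prints the three equations side by side. The function name `fit_line` and the data are made up for illustration, and a real analysis would also test whether any differences are statistically meaningful.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for predicting ys from xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical test scores (X) and outcome scores (Y) for two groups.
test_a, outcome_a = [1, 2, 3, 4, 5], [3, 5, 7, 9, 11]
test_b, outcome_b = [1, 2, 3, 4, 5], [5, 7, 9, 11, 13]

fit_a = fit_line(test_a, outcome_a)
fit_b = fit_line(test_b, outcome_b)
common = fit_line(test_a + test_b, outcome_a + outcome_b)

for label, (intercept, slope) in [("group A", fit_a),
                                  ("group B", fit_b),
                                  ("common", common)]:
    print(f"{label}: Y = {intercept:.2f} + {slope:.2f} * X")
```

In this made-up data both groups share the common slope, but their intercepts sit on either side of the common intercept, so a single equation would not apply equally to both groups.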
Four Types of Predictive Bias
- Intercept Bias
- Slope Bias
- Intercept + Slope Bias
- Outcome Score Bias
Intercept Bias
- indicates there is a discrepancy between the groups’ predicted scores
-> one group gets consistently higher predicted scores, and the size of this discrepancy does not change as the test score changes
- both groups may have slopes similar to the common regression equation
-> but their intercept values differ from the common intercept
-> the test does not appear to work the same way for both groups
Slope Bias
- groups have similar intercepts but their slopes differ in magnitude
- indicates that the groups’ predicted scores differ even when they have the same test score, and that the size of the difference changes as the test score changes
Intercept & Slope Bias
- where intercept bias and slope bias co-occur
-> both the intercept and the slope differ between the groups
-> a complex relationship between the size of predictor scores and outcome scores for different groups
-> this can result in many possible patterns
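The three cases can be compared by writing out the gap between the two groups' predicted scores at the same test score X. The subscripted symbols are notation assumed here: a_1 and b_1 are the intercept and slope for group 1, and a_2 and b_2 those for group 2.

```latex
\begin{align*}
\hat{Y}_1 - \hat{Y}_2 &= (a_1 - a_2) + (b_1 - b_2)\,X \\
\text{intercept bias } (b_1 = b_2):\quad \hat{Y}_1 - \hat{Y}_2 &= a_1 - a_2 \\
\text{slope bias } (a_1 = a_2):\quad \hat{Y}_1 - \hat{Y}_2 &= (b_1 - b_2)\,X
\end{align*}
```

With intercept bias alone the gap is constant, with slope bias alone it grows in proportion to X, and when both differ the gap can shrink, vanish at one test score, or change sign, which is why many patterns are possible.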