L9. Test bias Flashcards

1
Q

What is test bias?

A
  • When a factor inherent in the test systematically prevents accurate, impartial measurement
  • Systematic implies not just by chance / not random variation
  • Favouring one group over another
  • It is important that test scores do not discriminate against a particular group (e.g. by gender/sex and/or race)
  • No single method identifies it; look for a pattern of results as evidence
  • Helps establish the reliability (R) and validity (V) of test scores
  • Two forms of test bias: construct and predictive
2
Q

Construct bias

A
  • Bias in the MEANING of the test, based on the constructs it measures
  • When a test has different meanings for two groups
  • If systematically diff for diff groups, could conclude there is bias = scores have different meanings for different groups
  • Can lead to two groups having the same avg true scores but diff avg observed scores
  • Also look for this irrespective of whether there is a diff between these scores or not
  • If there is construct bias, there will probably be predictive bias too
  • 4 internal methods for measuring CB
  • An item is biased if:
    1. people belonging to diff groups respond in different ways
    2. it can be shown that these differences are not due to real group differences on the construct of interest, e.g. males consistently weighing more than females is not an issue with the scale
3
Q

Predictive bias

A
  • when a test has different implications for one group vs another
  • The USE of a test has diff implications for two or more groups
  • APPLIED research + REGRESSION estimates
  • Numerical diffs can be due to sampling fluctuations; PB represents systematic differences
  • Two steps:
    1. determine whether test scores predict the dependent variable to start with
    2. determine whether this prediction is equal across groups
  • 3 external methods to measure PB using bivariate regression
  • y = b0 + b1(x)

Measuring predictive bias assumes:
- DIFF groups SHARE a COMMON regression equation
- NOT biased = ONE common regression equation applies equally to diff groups (common intercept + common slope)
To test this:
- estimate separate regression equations for each group + compare each group-level regression equation with the common regression equation
- If the group-level equations do not correspond to the common regression equation = test bias
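
The comparison of group-level and common regression equations can be sketched as below. This is a minimal illustration, not a full bias analysis: the data are made up, `fit_line` is a hypothetical helper written for this card, and a real analysis would also test whether the differences exceed sampling fluctuation.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1(x); returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical data: test scores (x) and an outcome (y) for two groups.
group_a_x = [1, 2, 3, 4, 5]
group_a_y = [3, 5, 7, 9, 11]     # follows y = 1 + 2x
group_b_x = [1, 2, 3, 4, 5]
group_b_y = [5, 7, 9, 11, 13]    # follows y = 3 + 2x

# Common equation: fit one line to both groups pooled together.
common = fit_line(group_a_x + group_b_x, group_a_y + group_b_y)
# Group-level equations: fit each group separately.
fit_a = fit_line(group_a_x, group_a_y)
fit_b = fit_line(group_b_x, group_b_y)

print("common :", common)   # (2.0, 2.0)
print("group A:", fit_a)    # (1.0, 2.0)
print("group B:", fit_b)    # (3.0, 2.0)
```

Here both groups share the common slope but neither shares the common intercept, so the common equation over-predicts for group A and under-predicts for group B.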

4
Q

Methods for detecting construct bias
item discrimination index

A

Item discrimination index = d
- degree to which an item is related to the total test score
- Reflects the structure of associations among items
- High d = people who answer the item correctly tend to do better on the test as a whole than those who answer it wrong
- Strong d suggests the item is highly similar conceptually to most other test items

Example
- divide the respondents on a mechanical aptitude test into two groups, one with high scores / one with low scores
- If the item does reflect mechanical aptitude, would expect a high proportion of high scorers to answer it correctly and a relatively low proportion of low scorers to answer it correctly
- Indicates the item strongly discriminates among people with varying aptitude levels + reflects the construct being assessed well

  • To assess for construct bias, same concept except separate respondents into groups, e.g. male and female
  • Compute the item discrimination index separately for the two groups; if the indices are approx. the same = absence of CB, suggesting the item reflects the same construct for both genders
  • If biased, can remove the item / modify it to neutralise the bias
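
A rough sketch of the group-wise comparison, assuming the common definition of d as the proportion of high scorers answering correctly minus the proportion of low scorers answering correctly, with a simple split of respondents into a lower and an upper half on total score. The data and the function name are made up for illustration.

```python
def discrimination_index(item_correct, total_scores):
    """d = p(correct | high scorers) - p(correct | low scorers).

    Respondents are split into a lower and an upper half on total score
    (an assumption of this sketch; other cut-offs are also used).
    """
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    half = len(order) // 2
    low, high = order[:half], order[-half:]
    p_low = sum(item_correct[i] for i in low) / half
    p_high = sum(item_correct[i] for i in high) / half
    return p_high - p_low

# Hypothetical data: total test scores and 0/1 responses to one item.
totals = [2, 3, 4, 5, 6, 7, 8, 9]
item_males = [0, 0, 0, 1, 1, 1, 1, 1]
item_females = [0, 1, 0, 1, 1, 0, 1, 0]

d_males = discrimination_index(item_males, totals)
d_females = discrimination_index(item_females, totals)

print(d_males)    # 0.75
print(d_females)  # 0.0
```

In this made-up case the item discriminates well for males but not at all for females, which is the kind of gap that would flag the item for review.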
5
Q

Methods for detecting construct bias
factor analysis

A

Factor Analysis (PCA)
- Maximum likelihood estimation
- compare the number of dimensions present for each group separately
- if same no. of dimensions across groups = no CB
- E.g. males' responses on mechanical aptitude might have a clear unidimensional structure
- Would then do a PCA on females' responses + if also found unidimensional structure = does NOT suffer from construct bias (if multidimensional = DOES have construct bias)

6
Q

Methods for detecting construct bias
differential item functioning
uniform and non-uniform bias

A

DIF = differential item functioning
- estimate trait levels (participants’ true scores) directly from the test data within the context of Item Response Theory (IRT)
- IRT: based on the idea that there’s a mathematical function relating a person’s trait level to the probability they will answer a question correctly
- If know a group of people’s trait levels, can generate an item characteristic curve (ICC) showing this function for each item
- x axis = z scores of respondents’ trait levels
- y axis = probability of answering item correctly
- If have 2 groups can plot the ICCs separately for each group

Uniform bias
- ICCs differ in location but not in shape
- e.g. a male who scores 1 SD above the mean has a 60% chance of getting it right but a female has a 90% chance

Non-uniform bias
- ICCs differ in shape and location
- at some trait levels (SDs from the mean) males have a higher probability of answering correctly than females
- at other trait levels females have a higher probability of answering correctly than males
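
The difference between uniform and non-uniform bias can be illustrated with a small sketch. It assumes a two-parameter logistic ICC (discrimination `a`, difficulty `b`), which the card does not specify; all parameter values are invented.

```python
import math

def icc(theta, a, b):
    """Two-parameter logistic ICC: P(correct) at trait level theta (in z units).

    a = discrimination (shape/steepness), b = difficulty (location).
    The 2PL form is an assumption of this sketch.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

thetas = [-2, -1, 0, 1, 2]

# Uniform bias: same shape (a), different location (b).
# Group 1 finds the item easier at every trait level.
uniform_gap = [icc(t, 1.0, -0.5) - icc(t, 1.0, 0.5) for t in thetas]

# Non-uniform bias: different shape and location, so the curves cross.
# The advantage switches from one group to the other along the trait scale.
nonuniform_gap = [icc(t, 2.0, 0.2) - icc(t, 0.5, 0.0) for t in thetas]

for t, u, n in zip(thetas, uniform_gap, nonuniform_gap):
    print(f"theta={t:+d}  uniform gap={u:+.3f}  non-uniform gap={n:+.3f}")
```

The uniform gap keeps the same sign at every trait level, while the non-uniform gap is negative at low trait levels and positive at high ones.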

7
Q

Methods for detecting construct bias
(spearman) rank order-consistency

A

Rank Order Consistency
- Correlation-based method for detecting CB
- Calculate the mean of each item separately for each group
- Then calc the SPEARMAN rank correlation between the groups’ item means, i.e. ‘item difficulties’
- Low corr (less than .80) = construct bias

  • Requires a relatively large sample for congruence coefficient analysis, esp. when the analyses are conducted at the item level
  • Minimum N = 300 in each group
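
The rank-order consistency check can be sketched in plain Python as follows. The item means are invented and far fewer than a real test would have; the helper functions are written for this card, not taken from a library.

```python
def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical item means ('item difficulties') for five items.
means_group_a = [0.90, 0.75, 0.60, 0.45, 0.30]
means_group_b = [0.85, 0.70, 0.65, 0.40, 0.35]   # same difficulty order as A
means_group_c = [0.40, 0.85, 0.35, 0.70, 0.65]   # different difficulty order

rho_ab = spearman(means_group_a, means_group_b)
rho_ac = spearman(means_group_a, means_group_c)

print(round(rho_ab, 2), "flag CB:", rho_ab < 0.80)   # 1.0 flag CB: False
print(round(rho_ac, 2), "flag CB:", rho_ac < 0.80)   # -0.1 flag CB: True
```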
8
Q

methods for detecting predictive bias
Intercept
slope
intercept and slope bias

A
  1. intercept bias
    - same slope diff y-intercept
    - start in different places, same shape/angles
  2. Slope Bias
    - same y-intercept different slope
    - start in the same place but diverge quickly
    Example Calculated
  3. Intercept + Slope Bias
    - rare
    - complex interaction when slope and intercept co-occur

Assessing slope / intercept statistical significance (SS)
- set confidence intervals to 84%
- if the upper bound of the interval for the lower estimate OVERLAPS with the lower bound of the interval for the higher estimate, then no evidence for PB
- if there is NO OVERLAP (significantly different estimates), then evidence for PB
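
The 84% interval overlap check might look like the sketch below, applied here to slopes (the same logic would apply to intercepts). Assumptions: the data are invented, the interval uses a large-sample normal approximation rather than the t distribution, and the function names are hypothetical.

```python
from statistics import NormalDist, mean

def slope_with_ci(x, y, level=0.84):
    """OLS slope of y on x with an approximate confidence interval.

    Uses a normal critical value (large-sample approximation); with small
    samples a t-based interval would be wider.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = (sse / (n - 2) / sxx) ** 0.5          # standard error of the slope
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.405 for 84%
    return b1, b1 - z * se, b1 + z * se

def intervals_overlap(ci_1, ci_2):
    """True if two (lower, upper) intervals share any values."""
    return ci_1[0] <= ci_2[1] and ci_2[0] <= ci_1[1]

# Hypothetical test scores (x) and outcomes (y) for three groups.
x = [1, 2, 3, 4, 5, 6]
y_a = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]   # slope near 2
y_b = [1.2, 1.8, 3.1, 3.9, 5.2, 5.8]     # slope near 1
y_c = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]    # slope near 2

_, lo_a, hi_a = slope_with_ci(x, y_a)
_, lo_b, hi_b = slope_with_ci(x, y_b)
_, lo_c, hi_c = slope_with_ci(x, y_c)

overlap_ab = intervals_overlap((lo_a, hi_a), (lo_b, hi_b))
overlap_ac = intervals_overlap((lo_a, hi_a), (lo_c, hi_c))

print("A vs B overlap:", overlap_ab)   # False: evidence for slope bias
print("A vs C overlap:", overlap_ac)   # True: no evidence for slope bias
```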

9
Q

effect of Reliability on Bias

A
  • One group may have less reliable scores than the other group and this can cause differences in slopes and intercepts
  • this is construct bias which can impact predictive bias
  • Hence improving reliability can reduce predictive bias