Lec2 - Ch7 The importance of reliability Flashcards

1
Q

Why is reliability important?
What factors help us evaluate a test score?

A
  • reliability is crucial for evaluating a test score correctly
    > point estimate
    > confidence interval
2
Q

Point estimate
- what is it?

A
  • specific value
  • “best estimate” of an individual’s true score
3
Q

what kinds of point estimates can we find?

A
  • observed score as estimation of true score
  • adjusted true score estimate
    > takes measurement error into account
4
Q

Regression to the mean
- what is it?
- when does it occur?

A
  • phenomenon that the adjusted true score estimate takes into account
  • extreme scores on a first measurement tend to be closer to the mean on a second measurement, due to measurement error
    ! the prediction rests on measurement error being random
5
Q

what do the size and direction of the discrepancy between the observed score and the adjusted true score estimate depend on?

A
  • reliability of test scores
  • extremity of the individual’s observed scores
  • direction of the difference between the observed score and the mean of those scores
6
Q

how can the adjusted true score estimate be calculated?

A
  • see picture 1
    = mean of observed scores + reliability of test scores × (individual observed score - mean of observed scores)
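The formula in words can be sketched in Python. This is a minimal illustration, not taken from the course material: the function name and the example numbers (mean 100, observed score 130) are assumptions.

```python
def adjusted_true_score(observed, mean_observed, reliability):
    """Adjusted true score estimate: mean + reliability * (observed - mean)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return mean_observed + reliability * (observed - mean_observed)


# Hypothetical example: the same observed score under two reliabilities.
for rxx in (0.9, 0.5):
    estimate = adjusted_true_score(130, 100, rxx)
    print(f"reliability={rxx}: estimate={estimate:.1f}")
```

With lower reliability the estimate is pulled further toward the mean, and a more extreme observed score is pulled further than a score close to the mean.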
7
Q

what factors affect the adjusted true score estimate?

A
  • reliability
  • extremity of observed score
8
Q

how does reliability affect the adjusted true score estimate?
why?

A
  • as reliability decreases, the difference between the adjusted true score estimate and the observed score increases
    = poor reliability produces bigger discrepancies between the observed score and the adjusted true score estimate
    > a test with low reliability contains a lot of measurement error; more measurement error means stronger regression to the mean, which widens the gap between true score estimates and observed scores
9
Q

how does the extremity of observed scores affect the adjusted true score estimate?

A
  • it affects the difference between the observed score and the true score estimate
  • a more extreme observed score shows more regression to the mean, leading to a bigger discrepancy with the true score estimate
10
Q

why do we need caution when calculating the regression to the mean?

A
  • there might be no reason to calculate adjusted true score estimates
  • regression to the mean is not always a mathematical certainty in the long run
11
Q

Confidence intervals

A
  • reflect the accuracy of the point estimate
  • high reliability yields more precise estimates of true scores
  • highly reliable tests produce narrower confidence intervals
12
Q

how can we calculate the link between reliability and precision of the point estimate?

A
  • through the standard error of measurement (SEm)
  • see picture 2
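Picture 2 is not reproduced in these notes. The sketch below assumes the standard classical test theory formula, SEm = SD of observed scores × sqrt(1 - reliability); the function name and example values are illustrative only.

```python
import math


def standard_error_of_measurement(sd_observed, reliability):
    """SEm under the assumed formula: sd_observed * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd_observed * math.sqrt(1.0 - reliability)


# Hypothetical test with an observed-score SD of 15.
for rxx in (0.91, 0.75):
    sem = standard_error_of_measurement(15, rxx)
    print(f"reliability={rxx}: SEm={sem:.2f}")
```

Higher reliability gives a smaller SEm, which is the link between reliability and the precision of the point estimate.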
13
Q

how can a 95% confidence interval be computed?

A

-see picture 3
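Picture 3 is not reproduced in these notes. A common way to build the interval, assumed here, is observed score ± 1.96 × SEm, with SEm = SD × sqrt(1 - reliability). The function name and numbers are illustrative, and the card on debates about confidence intervals lists other defensible choices.

```python
import math


def confidence_interval_95(observed, sd_observed, reliability):
    """95% interval around an observed score, assuming observed +/- 1.96 * SEm."""
    sem = sd_observed * math.sqrt(1.0 - reliability)
    margin = 1.96 * sem
    return observed - margin, observed + margin


# Hypothetical example: observed score 110, SD 15, two reliabilities.
for rxx in (0.91, 0.75):
    low, high = confidence_interval_95(110, 15, rxx)
    print(f"reliability={rxx}: [{low:.2f}, {high:.2f}]")
```

The more reliable test produces the narrower interval.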

14
Q

what are the debates about confidence intervals on?

A
  • what confidence interval to use
  • precise definition of C.I.
  • whether to compute them with the standard error of measurement or the standard error of estimate
  • whether to apply them to observed scores (as estimates of true scores) or to adjusted true score estimates
15
Q

what is the distribution of observed scores?

A
  • according to true score theory, observed scores are distributed normally around true scores
  • the true score is the mean of this distribution; in practice the observed score is used as its estimate
16
Q

summary 1
- reliability and accuracy

A
  • reliability affects the confidence, accuracy and precision of true score estimation
  • reliability affects the standard error of measurement (SEm)
    → the SEm determines the width of the confidence interval around the true score estimate
17
Q

what does the correlation between observed scores on two measures depend on?

A
  • correlation between the true scores of the two constructs assessed by the two measures
  • reliabilities of the two measures
    !! the correlation between two variables is their covariance divided by the product of their two standard deviations
  • see picture 4
18
Q
  • how is the correlation between observed scores compared to the one between the constructs?
  • why?
A
  • the correlation between observed scores is always smaller than the one between the constructs (true scores)
    → this is due to measurement error
  • see picture 5
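The pictures are not reproduced in these notes. The sketch below assumes the standard classical test theory attenuation formula: observed correlation = true correlation × sqrt(reliability of X × reliability of Y). Names and values are illustrative.

```python
import math


def attenuated_correlation(true_corr, rel_x, rel_y):
    """Expected observed correlation, given the true correlation and both reliabilities."""
    return true_corr * math.sqrt(rel_x * rel_y)


# Hypothetical true correlation of .50 under different reliabilities.
for rel_x, rel_y in ((0.8, 0.8), (0.9, 0.4)):
    r_obs = attenuated_correlation(0.5, rel_x, rel_y)
    print(f"rel_x={rel_x}, rel_y={rel_y}: observed r={r_obs:.3f}")
```

The second case shows that one poorly reliable measure is enough to attenuate the observed correlation noticeably.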
19
Q

what are the implications of the connection between reliability and observed correlations?

A

-see picture 6
1- observed associations are always weaker than the true ones
> measurement is never perfect → measures are never perfectly reliable
> imperfect measurements weaken observed associations
2- the degree of attenuation is determined by the reliabilities of the measures
> poorer reliability → bigger attenuation
! observed correlation can be extremely attenuated even if only one measurement has poor reliability
3- error constrains the maximum association that could be found between two measures
4- you can estimate the true association between a pair of constructs

20
Q

correction for attenuation

A
  • see picture 7
  • computes the correlation that would be obtained without attenuation (it removes the effect of measurement error)
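Picture 7 is not reproduced in these notes. The sketch below assumes the usual correction: estimated true correlation = observed correlation / sqrt(reliability of X × reliability of Y). Names and values are illustrative.

```python
import math


def disattenuated_correlation(observed_corr, rel_x, rel_y):
    """Estimated true correlation after correcting for unreliability in both measures."""
    if rel_x <= 0 or rel_y <= 0:
        raise ValueError("reliabilities must be greater than 0")
    return observed_corr / math.sqrt(rel_x * rel_y)


# Hypothetical observed correlation of .40 with both reliabilities at .80.
print(round(disattenuated_correlation(0.4, 0.8, 0.8), 3))
```

Because the reliabilities are themselves estimates, the corrected value is also an estimate and can exceed 1 when the reliabilities are underestimated.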
21
Q

what are the two main statistical concepts that reliability influences?

A
  • effect sizes
  • statistical significance
22
Q

what are effect sizes?

A
  • values that express the results of a study as a matter of degree (how strong an effect or association is)
  • e.g. regression coefficients, odds ratios and Cohen’s d
    ! they are affected directly by measurement error and reliability
23
Q

what are the three most common effect sizes?
what do they represent?

A
  • correlations
    > association between two continuous variables
  • Cohen’s d
    > association between a dichotomous variable and a continuous variable
  • η²
    > association between a categorical variable (with more than two levels) and a continuous variable
24
Q

Cohen’s d

A
  • difference between two group means, expressed in standard deviation units of the dependent variable
    > minimum (in absolute value) is 0 (no difference between the two groups’ mean levels of the construct)
    > maximum is unlimited (usually around 1.5)
25
Q

what does Cohen’s d depend on?

A
  • true value of Cohen’s d
  • reliability of the measure of the continuous variable
26
Q

how are the distributions of observed scores and true scores different?

A
  • measurement error adds variance, so the distribution of true scores is narrower than that of observed scores
    → the wider observed scores distributions create overlap between the two groups
    → this overlap obscures the differences between the groups
    → Cohen’s d value is smaller (for the observed values), because of the obscured differences
    = large variance among observed scores reduces the observed effect size compared to the true effect size
  • see picture 8
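Picture 8 is not reproduced in these notes. As a rough sketch, assume random error leaves the group means unchanged and inflates the within-group standard deviation by 1 / sqrt(reliability); the observed d is then the true d × sqrt(reliability). The function name and values are illustrative.

```python
import math


def observed_cohens_d(true_d, reliability):
    """Expected observed Cohen's d, assuming d_observed = d_true * sqrt(reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return true_d * math.sqrt(reliability)


# Hypothetical true effect of d = 1.0 under decreasing reliability.
for rxx in (1.0, 0.81, 0.49):
    print(f"reliability={rxx}: observed d={observed_cohens_d(1.0, rxx):.2f}")
```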
27
Q

statistical significance
- what does it depend on?

A
  • related to a researcher’s confidence in the result
  • it depends on the effect size of the observed scores
28
Q

how does reliability influence statistical significance?

A
  1. reliability influences effect size
  2. effect size influences statistical significance
    - e.g. low reliability → small effect size → low statistical significance
29
Q

what are the implications of reliability affecting research results?

A

- see picture 9
- researchers should always consider reliability when interpreting effect sizes and statistical significance
- researchers should try to use highly reliable measures in their work
> some measures are difficult to obtain and expensive
- researchers should report reliability measures in their studies

30
Q

what is particularly important when constructing and refining a test?

A
  • item means
  • item variances
  • item discrimination
    ! items should enhance test’s internal consistency
31
Q

internal consistency
(reminder)

A
  • degree to which differences among persons’ responses to one item are consistent with differences among their responses to other items in the test
  • intrinsically linked to the correlations among its items
    → therefore, important to know which items enhance or decrease the correlation (Interitem Correlation Matrix)
32
Q

what should we pay attention to in an interitem correlation matrix?

A
  • Corrected item-Total correlation
  • Cronbach’s alpha if item deleted
  • (general) Cronbach’s alpha
33
Q

How can we evaluate the consistency of items?

A
  • Interitem correlation matrix
  • item discrimination (+ item-total correlation)
  • discrimination index (D) for binary items
34
Q

Item discrimination

A
  • degree to which item differentiates people who score high from people who score low (on the total test)
  • items with high discrimination values are better for reliability
35
Q

Corrected item-total correlation

A
  • index of item discrimination
    1. compute total score of test
    2. correct total test score by subtracting the individual item score (!)
    3. compute correlation between item and total score
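The three steps can be sketched in plain Python. The response matrix is made up for illustration (rows are people, columns are items), and the helper names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def corrected_item_total(responses, item):
    """Correlate one item with the total score of all the other items."""
    item_scores = [row[item] for row in responses]
    # Steps 1 and 2: total score minus the item's own score.
    rest_totals = [sum(row) - row[item] for row in responses]
    # Step 3: correlation between the item and the corrected total.
    return pearson(item_scores, rest_totals)


# Made-up data: 5 people, 4 items; the last item runs against the others.
responses = [
    [5, 4, 5, 2],
    [4, 4, 3, 1],
    [2, 1, 2, 4],
    [1, 2, 1, 5],
    [3, 3, 3, 3],
]
for i in range(4):
    print(f"item {i}: {corrected_item_total(responses, i):.2f}")
```

Subtracting the item from the total keeps the item from correlating with itself, which would inflate the index.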
36
Q

Discrimination Index (D)
- how to calculate it

A
  • it compares the proportions of high test scorers and low test scorers who answer the item correctly
    1. identify the top and bottom percentage of people by total score
    > percentage is arbitrary
    2. calculate proportion of people per group that answer item correctly
    3. calculate difference between two proportions
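A minimal sketch of these steps for binary (0/1) items. The data are made up, and the 30% cut-off is an arbitrary choice made only for this example.

```python
def discrimination_index(responses, item, fraction=0.3):
    """D = proportion correct in the high group minus proportion correct in the low group.

    responses: rows of 0/1 item scores, one row per person.
    fraction: share of people placed in each extreme group (arbitrary).
    """
    ranked = sorted(responses, key=sum)  # lowest total score first
    group_size = max(1, round(len(ranked) * fraction))
    low, high = ranked[:group_size], ranked[-group_size:]
    p_high = sum(row[item] for row in high) / group_size
    p_low = sum(row[item] for row in low) / group_size
    return p_high - p_low


# Made-up data: 10 people, 3 binary items.
responses = [
    [1, 1, 1], [1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 1, 0],
    [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0],
]
print(discrimination_index(responses, 0))
```

Here the total score includes the item being evaluated; people with tied totals are split by their order in the data, so results near the cut-off can depend on that order.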
37
Q

what discrimination index is better?

A
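  • as a difference between two proportions it can range from -1 to 1; useful items fall between 0 and 1 (a negative value means low scorers answer the item correctly more often than high scorers)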
  • items are better when they have a high Discrimination index (big difference between high and low scorers when answering item)
38
Q

R^2

A
  • other index used to evaluate item’s consistency
  • obtained when predicting scores on each item from scores on other items
  • ranges from 0 to 1 (larger values preferred)
    ! fails to differentiate between positive and negative associations among items
39
Q

what other characteristics of items could influence the internal consistency of the test?

A
  • item mean (difficulty)
  • item variance
39
Q

Cronbach’s Alpha if Deleted Item

A
  • other index used to evaluate item’s consistency
  • it tells us the reliability estimate (alpha) we would obtain if that item were deleted
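A minimal sketch of the idea, using the standard formula alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores). The data are made up and the helper names are hypothetical.

```python
def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(responses):
    """Cronbach's alpha for a matrix with one row per person, one column per item."""
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def alpha_if_deleted(responses, item):
    """Alpha recomputed on the test with one item removed."""
    reduced = [row[:item] + row[item + 1:] for row in responses]
    return cronbach_alpha(reduced)


# Made-up data: 5 people, 4 items; the last item runs against the others.
responses = [
    [5, 4, 5, 2],
    [4, 4, 3, 1],
    [2, 1, 2, 4],
    [1, 2, 1, 5],
    [3, 3, 3, 3],
]
print(f"alpha: {cronbach_alpha(responses):.2f}")
for i in range(4):
    print(f"alpha if item {i} deleted: {alpha_if_deleted(responses, i):.2f}")
```

An item whose deletion raises alpha above the overall value is a candidate for removal or for reverse scoring.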
40
Q

what is the importance of the item’s variance?

A
  • an item’s variance has implications on interitem correlations, item-total correlation, and “alpha if deleted” value
  • items that all respondents answer the same way have no variance and do not contribute to the reliability of the test
41
Q

what is the item’s mean difficulty?
what mean difficulty is ideal?

A
  • if 80% of respondents answer item 1 correctly, then the mean of item 1 is .80
    > this means that this item is relatively easy
  • the ideal mean difficulty is .50
    > this ensures maximal item variability
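For a binary (0/1) item with mean p, the variance is p × (1 - p), which peaks at p = .50. A quick check, with an illustrative function name:

```python
def binary_item_variance(p):
    """Variance of a 0/1 item whose mean (proportion correct) is p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return p * (1 - p)


for p in (0.5, 0.8, 1.0):
    print(f"mean={p}: variance={binary_item_variance(p):.2f}")
```

An item everyone answers correctly (p = 1.0) has zero variance, so it cannot correlate with other items.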
42
Q

what must we pay attention to when we encounter a counterindicative (reverse-keyed) item?

A
  • the score must be reversed
    > e.g. on a 1-5 scale, a response of 5 must be recorded as 1
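Reverse scoring can be sketched as (scale minimum + scale maximum) - score. The 1-5 scale defaults are an assumption for this example.

```python
def reverse_score(score, scale_min=1, scale_max=5):
    """Reverse a response on a scale running from scale_min to scale_max."""
    if not scale_min <= score <= scale_max:
        raise ValueError("score is outside the scale range")
    return scale_min + scale_max - score


print([reverse_score(s) for s in (1, 2, 3, 4, 5)])
```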