Lecture 2 + Wa Flashcards

1
Q

What are observed scores, true scores and measurement error?

A

Observed score = the score a person obtains on a test that measures a certain ability or characteristic

True score = the actual level of that ability or characteristic the person has

Measurement error = the effect of random factors on the observed score

2
Q

(…) minus the error score equals (…). What should be on the dots?

A

observed score and true score, respectively

3
Q

John’s couch is 200 centimeters wide. He measures it with a measuring tape and finds it to be 205 centimeters.

What is the observed score, true score and error score here?

A

observed = 205, true = 200 and error = 5

4
Q

What is the central idea in classical test theory (CTT)?

A

Every test taker has a true score on a test

5
Q

Why do observed scores not match true scores in practice?

A

Because measurement error exists and changes the observed score

6
Q

What are two core assumptions of classical test theory?

A
  1. Observed scores are true scores plus measurement error: X_o = X_t + X_e
  2. Measurement error is random

You do not need to know the formulas; they serve as support for basic understanding.

7
Q

What follows from the two core assumptions of CTT?

A
  1. mean(X_e) = 0 (the mean of the measurement error is equal to 0). This is because error is random (a nonzero mean would be systematic error)
  2. r_te = 0 (the correlation between true score and error is equal to 0). This is because error is random, so it does not depend on the level of the true score
  3. Observed score variance = true score variance + error variance: s_o^2 = s_t^2 + s_e^2 (note that variance is s^2)
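
The third implication follows from the first assumption together with r_te = 0. A short derivation sketch, writing s_o, s_t and s_e for the standard deviations of the observed, true and error scores (the subscripts are notation chosen here for illustration):

```latex
\begin{aligned}
s_o^2 &= s_t^2 + s_e^2 + 2\, r_{te}\, s_t\, s_e \\
      &= s_t^2 + s_e^2 \qquad \text{(because } r_{te} = 0\text{)}
\end{aligned}
```

The first line is the general rule for the variance of a sum of two variables; the covariance term drops out because true scores and errors are uncorrelated.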
8
Q

Reliability can be defined in two ways. Which ones?

A

Proportion of variance. Reliability is the proportion (ratio) of true score variance to observed score variance, which is also the formula. If there is a high proportion of noise (error variance), reliability will be low; if there is a high proportion of signal (true score variance), reliability will be high.

Shared variance. Reliability is the squared correlation between the observed scores and the true scores. A high squared correlation means high reliability, and vice versa.
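
To see that the two definitions give the same number, here is a minimal Python sketch. The toy data are hypothetical and chosen so that the error has mean 0 and is uncorrelated with the true scores; in practice true scores are unknown, so this only illustrates the definitions.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Population variance (divide by n)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def correlation(xs, ys):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / sqrt(variance(xs) * variance(ys))


# Hypothetical toy data: error has mean 0 and is uncorrelated with true scores
true_scores = [10, 10, 12, 12]
errors = [1, -1, 1, -1]
observed = [t + e for t, e in zip(true_scores, errors)]

# Definition 1: proportion of true score variance in observed score variance
reliability_ratio = variance(true_scores) / variance(observed)

# Definition 2: squared correlation between observed and true scores
reliability_shared = correlation(observed, true_scores) ** 2

print(reliability_ratio, reliability_shared)
```

Both values come out at about 0.5 here, because half of the observed score variance is error variance.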

9
Q

There are three methods that rely on multiple tests. What are they? Explain each.

A

Split-halves = you have two tests because you split one test in half
Test-retest = you administer the same test again
Cronbach’s alpha and omega = each item counts as its own test

10
Q

There are four test models regarding the reliability of a test. Which are these? (Do not explain.)

A

Parallel, tau-equivalent, essentially tau-equivalent and congeneric tests

11
Q

What are the restrictions/implications of the parallel test model?

A

Restrictions = true scores are equal across both tests, and error variances are equal across both tests

Implications = mean and variance of the true scores are equal across tests, and so are the mean and variance of the observed scores. The correlation between the true scores = 1, and the reliabilities are equal

> also most restrictive

12
Q

Which reliability estimates are based on the parallel test model?

A

Split-halves and test-retest

13
Q

What are the restrictions/implications of the tau-equivalent test model?

A

Restriction = true scores are equal across both tests (error variances may differ)

Implications = mean and variance of the true scores are equal, but for the observed scores only the mean is equal. The correlation between the true scores = 1

14
Q

What are the restrictions/implications of the essentially tau-equivalent test model?

A

Restriction = true scores may differ by a constant across tests, so the mean true scores need not be equal

Implications = variances of the true scores ARE equal, and the correlation between the true scores = 1

15
Q

Which reliability estimates are based on the essentially tau-equivalent test model?

A

Cronbach’s alpha

16
Q

What are the restrictions/implications of the congeneric test model?

A

Basically everything may differ between the tests (true score means and variances, error variances, reliabilities), except that the mean error is still 0 and the correlation between the true scores is still 1

> least restrictive

17
Q

Which reliability estimates are based on the congeneric test model?

A

Omega

18
Q

What are three methods of reliability estimation (CTT)? Explain them.

A

Alternate forms (parallel model; administer two versions of the same test; correlation = reliability)
Test-retest (parallel model; administer the same test twice; correlation = reliability)
Internal consistency (parallel or essentially tau-equivalent model; blocks of items = tests; a more complicated formula = reliability)

19
Q

Challenges of the three reliability estimation methods (CTT)?

A

Alternate forms = construction of the alternate test form
Test-retest = change in true scores between administrations
Both of the above, as well as internal consistency, can suffer from carry-over effects

20
Q

Within internal consistency there are three further methods. Which are these? Explain.

A

Split-half (parallel model; two halves; formula = reliability)
Cronbach’s alpha (KR20 for binary items; essentially tau-equivalent model; each item is a test; formula = reliability)
Omega (congeneric model or stricter; true score variance is estimated with factor analysis; reliability = true score variance / observed score variance)
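
The "formula" for Cronbach’s alpha can be sketched in Python as follows. This uses the standard raw alpha formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of the sum scores); the function name and the small data set are hypothetical and only meant as illustration.

```python
def sample_variance(xs):
    """Sample variance (divide by n - 1)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def cronbach_alpha(rows):
    """Raw Cronbach's alpha.

    rows: one list of item scores per respondent (all the same length).
    """
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # one tuple of scores per item
    sum_scores = [sum(row) for row in rows]
    item_variance_sum = sum(sample_variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / sample_variance(sum_scores))


# Hypothetical data: 4 respondents, 3 items
scores = [
    [1, 2, 1],
    [2, 3, 2],
    [3, 3, 4],
    [4, 5, 4],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

When the items vary together strongly, the variance of the sum scores is much larger than the sum of the item variances, and alpha approaches 1.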

21
Q

COTAN guidelines about reliability?

A

High impact (individual) = good: ≥ .9, sufficient: between .8 and .9, insufficient: < .8
Less impact (individual) = good: > .8, sufficient: between .7 and .8, insufficient: < .7
Group = good: > .7, sufficient: between .6 and .7, insufficient: < .6
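
The thresholds above can be turned into a small lookup, sketched here in Python. The function name and the category keys are made up for illustration, and treating values exactly on a boundary as belonging to the higher category is an assumption; the card does not specify that for every boundary.

```python
# (good, sufficient) lower bounds per type of test use, taken from the card above
COTAN_THRESHOLDS = {
    "high_impact_individual": (0.9, 0.8),
    "less_impact_individual": (0.8, 0.7),
    "group": (0.7, 0.6),
}


def cotan_rating(reliability, use):
    """Return 'good', 'sufficient' or 'insufficient' for a reliability value.

    Assumption: a value exactly on a boundary counts as the higher category.
    """
    good, sufficient = COTAN_THRESHOLDS[use]
    if reliability >= good:
        return "good"
    if reliability >= sufficient:
        return "sufficient"
    return "insufficient"


print(cotan_rating(0.732, "high_impact_individual"))
print(cotan_rating(0.732, "less_impact_individual"))
print(cotan_rating(0.732, "group"))
```

The same reliability value gets a different rating depending on how the test is used, which is the point of having three rows in the guideline.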

22
Q

What is the corrected item-total correlation?

A

Correlation between the item scores and the rest scores, i.e. the sum scores of the other items (item discrimination)

23
Q

What is the item total correlation?

A

Correlation between the item scores and the sum scores (the corrected version is better, because here the item is partly correlated with itself)
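
The difference between the two statistics is easiest to see side by side. A minimal Python sketch with hypothetical data and hypothetical function names:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def item_total_correlations(rows, item_index):
    """Return (item-total, corrected item-total) correlation for one item.

    rows: one list of item scores per respondent.
    """
    item = [row[item_index] for row in rows]
    sum_scores = [sum(row) for row in rows]
    # Rest score = sum score without the item itself
    rest_scores = [s - i for s, i in zip(sum_scores, item)]
    return pearson(item, sum_scores), pearson(item, rest_scores)


# Hypothetical data: 4 respondents, 3 items
scores = [
    [1, 2, 2],
    [2, 1, 3],
    [3, 4, 3],
    [4, 3, 5],
]
uncorrected, corrected = item_total_correlations(scores, item_index=0)
print(round(uncorrected, 3), round(corrected, 3))
```

In this example the corrected value is lower than the uncorrected one, because the sum score no longer contains the item itself.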

24
Q

Which factors affect reliability?

A

Test length (↑ length = ↑ reliability)
Sample heterogeneity (heterogeneous samples ↑ reliability because ↑ variance)
Correlation between pre- and posttest (a large correlation between pre- and posttest ↓ reliability of the difference scores), probably because similar pre- and posttest scores leave little true difference to measure
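
The effect of test length can be made concrete with the Spearman-Brown formula, which the cards do not name but which is the standard CTT way to predict reliability after lengthening a test with comparable (parallel) items. A minimal Python sketch with a hypothetical function name and illustrative values:

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor.

    Assumes the added items are comparable (parallel) to the existing ones.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


doubled = spearman_brown(0.6, 2)    # test made twice as long
halved = spearman_brown(0.6, 0.5)   # test cut in half
print(round(doubled, 3), round(halved, 3))
```

Doubling the test raises the predicted reliability and halving it lowers the predicted reliability, in line with "↑ length = ↑ reliability".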

25
Q

Why is reliability being reliant on heterogeneity undesirable?

A

Reliability should be a property of the test, not the sample

26
Q

What does a small measurement error mean for the reliability?

A

It is higher

27
Q

What is attenuation?

A

Reduction in an observed effect (e.g. an effect size or a correlation) due to measurement error

28
Q

What is Cohen’s d and what is its relationship with reliability?

A

An effect size (the number of SDs by which the groups differ). A less reliable test = a smaller Cohen’s d

This also means the group difference is less likely to be significant with less reliability

29
Q

What happens to a correlation if reliability is poor?

A

smaller correlation (again, less likely to be significant)

30
Q

Consider two tests that purport to measure the same construct. In a pilot study, a researcher finds their observed test score means to be the same, but their test score variances to not be the same. Which of the test models do these data follow?

A

Tau-equivalent tests

31
Q

Order the different test models on how restrictive they are from 1 (least restrictive) to 4 (most restrictive)

A

1. congeneric, 2. essentially tau-equivalent, 3. tau-equivalent, 4. parallel

32
Q

In a hypothetical dataset that contains the test scores on two tests, the true score mean and true score variance differ across the two tests. Which test model does this dataset follow?

A

The congeneric test model

33
Q

If you find the reliability of two tests measuring the same construct to be the same, what test model do these tests follow?

A

Parallel test model

34
Q

In a hypothetical dataset that contains the test scores on two tests, the true score mean and true score variance are equal across the two tests. Which test model does this dataset follow?

A

Parallel AND tau-equivalent

35
Q

Say you want to assess the consistency between the observed scores of one test and those of another test. Which method for estimating reliability do you use?

A

Alternate forms

36
Q

Which of the following criteria do two test forms need to meet, in order to legitimately use the alternate forms method of estimating reliability?

A

The tests need to have identical true scores and identical error variance

37
Q

Jimmy conducts a study into aggression, for which he uses the Aggression Questionnaire (AGQ; Buss & Perry, 1992). He wants to know how reliable the AGQ is. Therefore, he lets his respondents fill in the questionnaire again.

Which method of estimating reliability does Jimmy intend to use here?

A

Test-retest

38
Q

What problem(s) may occur if you estimate reliability using the test-retest method?

A

Change in true scores between the two administrations, and carry-over effects

39
Q

Indicate the properties of the three reliability measures below (raw alpha, standardized alpha, KR20)

A

Raw alpha = suitable for Likert scale items that do not differ too much in variance
Standardized alpha = suitable for Likert scale items that differ substantially in their item variance
KR20 = suitable for binary items

40
Q

What can be said about the difference between the raw alpha, standardized alpha and KR20 procedure?

A

Raw alpha and KR20 are based on the item covariances and item variances; standardized alpha only uses item correlations

41
Q

How does consistency affect reliability?

A

↑ consistency = ↑ reliability

42
Q

A researcher wants to assess the effectiveness of an assertiveness training. To this end she first measures assertiveness in a sample of randomly selected subjects (pretest). Next she administers the training and measures assertiveness again (posttest). To draw a conclusion about the effectiveness of the training, she calculates difference scores (the pretest scores minus the posttest scores). The pretest has a reliability of 0.8, and the posttest has a reliability of 0.8. The correlation between the pretest and posttest is large (around 0.9). Which statement below is correct?

A

The reliability of the difference scores will be far below 0.8

> Note: because the reliability of difference scores depends on the correlation between pre- and posttest

43
Q

If the pretest and posttest are both reliable, the reliability of the difference scores can still be relatively small if the pretest and posttest are…

A

… highly correlated
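
How strongly the pre-post correlation pulls down the reliability of difference scores can be sketched with the standard CTT formula for the special case where pretest and posttest have equal variances (an assumption made here to keep the formula short). The function name and the input values are illustrative only.

```python
def difference_score_reliability(rel_pre, rel_post, r_pre_post):
    """Reliability of difference scores (posttest minus pretest or vice versa).

    Assumes the pretest and posttest have equal observed score variances.
    """
    return ((rel_pre + rel_post) / 2 - r_pre_post) / (1 - r_pre_post)


# Both tests have reliability 0.8; only the pre-post correlation changes
low_corr = difference_score_reliability(0.8, 0.8, 0.2)
mid_corr = difference_score_reliability(0.8, 0.8, 0.5)
high_corr = difference_score_reliability(0.8, 0.8, 0.7)
print(round(low_corr, 3), round(mid_corr, 3), round(high_corr, 3))
```

With the reliabilities held fixed, the reliability of the difference scores drops as the pre-post correlation goes up.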

44
Q

A researcher develops a test to diagnose learning disabilities in children. He finds the following value for Omega: .732

According to the COTAN guidelines, how would you assess the reliability of this test?

A

insufficient

45
Q

Hank is doing research into job satisfaction. He wants to find out what the correlation is between work-life balance and working conditions. He uses two separate tests to measure these constructs. The work-life balance measure has an almost perfect reliability of 0.97. However, the working conditions measure has a quite low reliability of 0.39.

What can be said about the observed correlation between the two measures that Hank will probably find?

A

It will be strongly attenuated, because one of the measures is quite poor, so even a very high true correlation will result in a moderate observed correlation
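
The size of the attenuation in Hank’s case can be estimated with the standard CTT attenuation formula: observed correlation = true correlation × √(reliability of test 1 × reliability of test 2). A minimal Python sketch; the function name is hypothetical and the true correlations are assumed values for illustration.

```python
from math import sqrt


def attenuated_correlation(true_r, rel_x, rel_y):
    """Expected observed correlation given the true correlation and both reliabilities."""
    return true_r * sqrt(rel_x * rel_y)


# Reliabilities from the card: 0.97 and 0.39
ceiling = attenuated_correlation(1.0, 0.97, 0.39)   # assumed perfect true correlation
example = attenuated_correlation(0.8, 0.97, 0.39)   # assumed true correlation of 0.8
print(round(ceiling, 3), round(example, 3))
```

Even if the two constructs were perfectly correlated, the expected observed correlation would stay well below 1, because the poor reliability of the working conditions measure caps it.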

46
Q

Which statistic do we use when we want to know the consistency between one item and the other items of a test?

A

Corrected item-total correlation