Chapter 5 Flashcards

1
Q

a proportion that indicates the ratio between the true score variance on a test and the total
variance

A

reliability coefficient
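
As a rough illustration of the ratio described in this card, the sketch below computes a reliability coefficient from variance components. The function name and the sample variances are hypothetical, and it assumes total variance is simply true variance plus error variance.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Ratio of true score variance to total variance (true + error)."""
    total_variance = true_variance + error_variance
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical variance components, chosen only for illustration.
r = reliability_coefficient(true_variance=80.0, error_variance=20.0)
print(round(r, 2))
```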

2
Q

A statistic useful in describing sources of test score variability is the _______

A

variance

3
Q

Variance from true differences is _______

A

true variance

4
Q

Variance from irrelevant, random sources is ________________

A

error variance

4
Q

_______ refers to the proportion of the total variance attributed to true variance.

A

reliability

5
Q

_______ refers collectively to all of the factors associated with the process of measuring some variable, other than the variable being measured.

A

measurement error

5
Q

_______ is a source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process.

A

Random error

6
Q

_______ refers to a source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured.

A

systematic error

7
Q

terms that refer to variation among items within a test as well as to
variation among items between tests

A

item sampling or
content sampling

8
Q

_______ is an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.

A

Test-retest reliability
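
A minimal sketch of the correlation step in this card, assuming a Pearson correlation between the two administrations. The helper name and the score lists are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]
test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 3))
```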

9
Q

the estimate of test-retest reliability is often referred to as the ______

A

coefficient of
stability.

10
Q

The degree of the relationship between various forms of a test can be evaluated by means of an alternate-forms or parallel-forms
coefficient of reliability, which is often termed the ________

A

coefficient of equivalence.

11
Q

_______ refers to an estimate of the extent to which item sampling and other errors have affected test scores on versions of the same test when, for each form of the test, the means and variances of observed test scores are equal.

A

parallel forms
reliability

12
Q

_______ are simply different versions of a test that have been constructed so as to be parallel.

A

Alternate forms

13
Q

_______ refers to an estimate of the extent to which different forms of the same test have been affected by item sampling error or other error.

A

alternate forms reliability

14
Q

_______ is obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once.

A

split-half reliability

14
Q

Assigning odd-numbered items to one half of the test and even-numbered items to the other half yields an estimate of split-half reliability that is also referred to as _________

A

odd-even reliability.
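
The odd-even split can be sketched as below. The function name and the response data are hypothetical; the sketch assumes items are numbered from 1, so list index 0 holds item 1. The two resulting lists of half scores would then be correlated to estimate split-half reliability.

```python
def odd_even_half_scores(item_scores):
    """Return (odd_total, even_total) for one test-taker's item scores.

    Items are numbered from 1, so index 0 is item 1 (an odd item).
    """
    odd_total = sum(item_scores[0::2])
    even_total = sum(item_scores[1::2])
    return odd_total, even_total


# Hypothetical right/wrong (1/0) responses of three test-takers to six items.
responses = [
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
]
half_scores = [odd_even_half_scores(row) for row in responses]
odd_half = [odd for odd, _ in half_scores]
even_half = [even for _, even in half_scores]
print(odd_half, even_half)
```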

15
Q

_______ refers to the degree of correlation among all the items on a scale.

A

Inter-item consistency

15
Q

The _______ allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test.

A

Spearman–Brown formula
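
A short sketch of the adjustment, assuming the general form r_SB = n·r / (1 + (n − 1)·r), where n = 2 when a half-test correlation is stepped up to full test length. The function name and the sample correlation are hypothetical.

```python
def spearman_brown(r_half: float, n: float = 2.0) -> float:
    """Estimate reliability of a test n times as long as the one that gave r_half."""
    return n * r_half / (1 + (n - 1) * r_half)


# Hypothetical correlation between the two halves of a test.
full_test_estimate = spearman_brown(0.6)
print(round(full_test_estimate, 2))
```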

16
Q

_______ is the degree to which a test measures a single factor.

A

homogeneity

17
Q

_______ may be thought of as the mean of all possible split-half correlations, corrected by the Spearman–Brown formula.

A

coefficient alpha
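
In practice, coefficient alpha is usually computed from item and total-score variances rather than by averaging split-half correlations. The sketch below assumes the common formula alpha = k/(k − 1) · (1 − Σ item variances / total variance); the function name and the data are hypothetical.

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """scores: one row per test-taker, each row a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from four test-takers on a three-item scale.
ratings = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
alpha = coefficient_alpha(ratings)
print(round(alpha, 3))
```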

18
Q

_______ describes the degree to which a test measures different factors.

A

heterogeneity

19
Q

_______ is a measure used to evaluate the internal consistency of a test that focuses on the degree of difference that exists between item scores.

A

average proportional distance method
(APD)
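
One way to sketch the APD calculation, assuming the usual three steps: take the absolute difference between scores for every pair of items, average those differences, then divide by the number of response options minus one. The function name and the sample scores are hypothetical.

```python
from itertools import combinations


def average_proportional_distance(item_scores, n_response_options):
    """APD for one respondent's scores on items that share a response scale."""
    pairs = list(combinations(item_scores, 2))
    if not pairs or n_response_options < 2:
        raise ValueError("need at least two items and two response options")
    mean_abs_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return mean_abs_diff / (n_response_options - 1)


# Hypothetical scores on three items rated on a 7-point scale.
apd = average_proportional_distance([4, 5, 6], n_response_options=7)
print(round(apd, 3))
```

Smaller values indicate item scores that sit closer together, which suggests higher internal consistency.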

20
Q

Homogeneity vs. heterogeneity of test items (essay)

A

Recall that a test is said to be homogeneous in items if it is functionally uniform throughout. Tests designed to measure one factor, such as one ability or one trait, are expected to be homogeneous in items. For such tests, it is reasonable to expect a high degree of internal consistency. By contrast, if the test is heterogeneous in items, an estimate of internal consistency might be low relative to a more appropriate estimate of test-retest reliability.

21
Q

_______ is the degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure.

A

inter-scorer reliability

22
Q

A _______ is a trait, state, or ability presumed to be ever-changing as a function of situational and cognitive experiences.

A

dynamic characteristic

23
Q

A trait, state, or ability presumed to be relatively unchanging, such as intelligence, is a ______

A

static characteristic

24
Q

if some items are so difficult that no test-taker is able to obtain a perfect score, then the test is a _____

A

power test

25
Q

A _______ generally contains items of uniform level of difficulty (typically uniformly low) so that, when given generous time limits, all test-takers should be able to complete all the test items correctly.

A

speed test

25
Q

A _______ is designed to provide an indication of where a test-taker stands with respect to some variable or criterion, such as an educational or a vocational objective.

A

criterion-referenced test

26
Q

a value that according to classical test theory genuinely reflects an individual’s ability (or trait) level as measured by a particular test.

A

true score

26
Q

_________, also referred to as the true score (or classical) model of measurement, is the most widely used and accepted model in the psychometric literature today.

A

classical test theory (CTT)

27
Q

Proponents of _______ seek to estimate the extent to which specific sources of variation under defined conditions are contributing to the test score.

A

domain sampling theory

28
Q

_______ is based on the idea that a person’s test scores vary from testing to testing because of variables in the testing situation.

A

generalizability theory

29
Q

Cronbach encouraged test developers and researchers to describe the details of the particular test situation or ______ leading to a specific test score.

A

universe

30
Q

A _______ examines how generalizable scores from a particular test are if the test is administered in different situations.

A

generalizability study

30
Q

_______ include things like the number of items in the test, the amount of training the test scorers have had, and the purpose of the test administration.

A

facets

31
Q

These coefficients are similar to reliability coefficients
in the true score model.

A

coefficients of generalizability.

32
Q

In a _______, developers examine the usefulness of test scores in helping the test user make decisions.

A

decision study

33
Q

Another alternative to the true score model is ________

A

Item response theory (IRT)

34
Q

a synonym for Item response theory (IRT) in the academic literature is _____

A

latent-trait theory.

35
Q

A _______ is a categorical variable with two possible response values (Yes/No, Agree/Disagree, Success/Fail).

A

dichotomous item

36
Q

A _______ is a categorical variable (ordinal or nominal) with more than two possible values (e.g., strongly disagree, disagree, agree, strongly agree).

A

polytomous item

37
Q

_______ is a reference to an IRT model with very specific assumptions about the underlying distribution.

A

Rasch model

38
Q

The _______ is the tool used to estimate or infer the extent to which an observed score deviates from a true score.

A

standard error of measurement
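
A minimal sketch of the calculation, assuming the standard formula SEM = SD · sqrt(1 − r), where SD is the standard deviation of test scores and r is the reliability coefficient. The function name and the sample values are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


# Hypothetical test with a standard deviation of 10 and reliability of .84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
print(round(sem, 2))
```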

39
Q

a range or band of test scores that is likely to contain the true score.

A

confidence interval
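
A band of this kind can be sketched as the observed score plus or minus a z value times the standard error of measurement. The sketch assumes normally distributed error and uses z = 1.96 for roughly 95% confidence; the function name and the sample values are hypothetical.

```python
def confidence_interval(observed_score: float, sem: float, z: float = 1.96):
    """Return (lower, upper) bounds of the band around an observed score."""
    margin = z * sem
    return observed_score - margin, observed_score + margin


# Hypothetical observed score of 50 on a test whose SEM is 4.
lower, upper = confidence_interval(observed_score=50.0, sem=4.0)
print(round(lower, 2), round(upper, 2))
```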

40
Q

Comparisons between scores are made using the _________

A

standard error of the difference
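
A short sketch, assuming both scores are on the same scale (same standard deviation) and the common formula SD · sqrt(2 − r1 − r2), where r1 and r2 are the reliability coefficients of the two tests. The function name and the sample values are hypothetical.

```python
from math import sqrt


def standard_error_of_difference(sd: float, r1: float, r2: float) -> float:
    """Standard error of the difference between two scores on the same scale."""
    return sd * sqrt(2 - r1 - r2)


# Hypothetical tests sharing SD = 10, with reliabilities of .90 and .80.
se_diff = standard_error_of_difference(sd=10.0, r1=0.90, r2=0.80)
print(round(se_diff, 2))
```

A difference between two scores is usually treated as meaningful only when it is large relative to this value.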

41
Q

A _______ refers to a group of personality tests.

A

Personality test battery