Reliability, Validity Flashcards

1
Q

assumes that each person has a true score that would be obtained if there were no errors in measurement

A

classical test score theory

2
Q

assumes that the items that have been selected for any one test are just a sample of items from an infinite domain of potential items

A

domain sampling theory

3
Q

process of choosing test items that are appropriate to the content domain of the test

A

domain sampling

4
Q

another central concept in classical test theory; it considers the problems created by using a limited number of items to represent a larger and more complicated construct

A

domain sampling model

5
Q

using ____, the computer is used to focus on the range of item difficulty that helps assess an individual’s ability level

A

item response theory

6
Q

degree to which scores from a test are stable and results are consistent

A

reliability

7
Q

ratio of the variance of the true scores on a test to the variance of the observed scores

A

reliability coefficient
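In symbols, writing \sigma^2_T for the true-score variance and \sigma^2_X for the observed-score variance, this ratio is

r_{XX} = \frac{\sigma^2_T}{\sigma^2_X}

which approaches 1 as measurement error shrinks.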

8
Q

test reliability is usually estimated in one of three ways:

A
  • test-retest method
  • parallel forms method
  • internal consistency method
9
Q

consistency of the test results is considered when the test is administered on different occasions

A

test-retest method

10
Q

consistency of scores across different forms of the test is evaluated

A

parallel forms method

11
Q

performance of people on similar subsets of items selected from the same form of the measure is examined

A

internal consistency

12
Q

occurs when the first testing session influences scores from the second session

A

carryover effect

13
Q

compares two equivalent forms of a test that measure the same attribute

A

parallel forms / equivalent forms reliability

14
Q

determined by dividing the total set of items relating to a construct of interest into halves and comparing the results obtained from the two subsets of items thus created

A

split-half reliability
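A minimal sketch of the computation in Python, assuming a small hypothetical respondents-by-items score matrix; the final step applies the Spearman-Brown correction covered in card 18:

import numpy as np

# Hypothetical item responses: rows are respondents, columns are items.
scores = np.array([
    [1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 1],
])

# Divide the items into two halves (odd- vs. even-numbered items here).
half1 = scores[:, 0::2].sum(axis=1)
half2 = scores[:, 1::2].sum(axis=1)

# Correlate the two half-test scores.
r_halves = np.corrcoef(half1, half2)[0, 1]

# Spearman-Brown correction: estimate reliability of the full-length test.
split_half_reliability = 2 * r_halves / (1 + r_halves)
print(round(split_half_reliability, 3))

Splitting by odd and even items is only one defensible way to form the halves.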

15
Q

measure of internal consistency; considered to be a measure of scale reliability

A

coefficient alpha or cronbach’s alpha
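For a test with k items, item variances \sigma^2_i, and total-score variance \sigma^2_X, coefficient alpha is usually written as

\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_i \sigma^2_i}{\sigma^2_X}\right)

with higher values indicating that the items hang together more consistently.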

16
Q

used to estimate the reliability of binary measurements

A

kuder-richardson formula 20 (KR-20)
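For k items scored 0/1, with p_i the proportion passing item i, q_i = 1 - p_i, and \sigma^2_X the total-score variance:

KR_{20} = \frac{k}{k - 1}\left(1 - \frac{\sum_i p_i q_i}{\sigma^2_X}\right)

This is the special case of coefficient alpha for dichotomous items.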

17
Q

takes into account chance agreement

A

kappa statistic
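With p_o the observed proportion of agreement between raters and p_e the proportion of agreement expected by chance:

\kappa = \frac{p_o - p_e}{1 - p_e}

so 0 means agreement no better than chance and 1 means perfect agreement.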

18
Q

allows you to estimate what the correlation between the two halves would have been if each half had been the length of the whole test

A

spearman-brown formula
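If the two halves correlate r_{hh}, the corrected estimate for the full-length test is

r_{SB} = \frac{2 r_{hh}}{1 + r_{hh}}

and, more generally, lengthening a test by a factor of n gives r_{new} = \frac{n r}{1 + (n - 1) r}.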

19
Q

best method for assessing the level of agreement among several observers

A

kappa statistic

20
Q

agreement between a test score or measure and the quality it is believed to measure

A

validity

21
Q

3 types of evidence:

A
  • construct-related
  • criterion-related
  • content-related
22
Q

mere appearance that a measure has validity

A

face validity

23
Q

logical rather than statistical

A

content validity

24
Q

describes the failure to capture important components of a construct

A

construct underrepresentation

25
Q

occurs when scores are influenced by factors irrelevant to the construct

A

construct-irrelevant variance

26
Q

tells us just how well a test corresponds with a particular criterion

A

criterion validity evidence

27
Q

standard against which the test is compared

A

criterion

28
Q

SAT is the predictor and GPA is the criterion

A

predictive validity evidence

29
Q

correlation expressing the relationship between a test and a criterion

A

validity coefficient

30
Q

established through a series of activities in which a researcher simultaneously defines some construct and develops the instrumentation to measure it

A

construct validity evidence

31
Q

obtained when a measure correlates well with other tests believed to measure the same construct

A

convergent evidence for validity

32
Q

standardized tests that are designed to compare and rank test takers in relation to one another

A

norm-referenced test

33
Q

process of evaluating the learning of students against a set of pre-specified qualities or criteria, without reference to the achievement of others

A

criterion referenced test