Chapter 5 Flashcards by Mikhaela Pilar

refers to consistency in measurement: a measure that produces similar results under similar conditions

reliability

a statistic that quantifies reliability, ranging from 0 (not at all reliable) to 1 (perfectly reliable)

reliability coefficient

what is the score for not reliable at all

zero (0)

what is the score for perfectly reliable

one (1)

the individual’s score on a measure if there were no measurement error

true score

a person’s standing on the theoretical variable independent of any particular measurement

construct score

______ : reliability :: _____ : validity

true score; construct score

formula for observed score

X = T + E

refers to the difference between the observed score and the true score

error

standard deviation squared

variance

total variance equals _____ plus _____

true variance; error variance

the proportion of the total variance attributed to true variance

reliability
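
As a rough illustration of this definition, the sketch below computes reliability as true variance divided by total variance (true plus error). The function name `reliability_from_variances` is made up for this example, and the 67/33 split simply assumes that 67% true variance leaves 33% error variance.

```python
def reliability_from_variances(true_variance: float, error_variance: float) -> float:
    """Reliability = true variance / total variance, where total = true + error."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    total_variance = true_variance + error_variance
    if total_variance == 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Assumed example: 67 units of true variance and 33 units of error variance.
example_reliability = reliability_from_variances(67.0, 33.0)
print(round(example_reliability, 2))
```

With no error variance the ratio is 1 (perfectly reliable); with no true variance it is 0 (not reliable at all).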

percentage of true variance

67%

percentage of error due to test construction

18%

percentage of administration error

percentage of unidentified error

percentage of scorer error

refers to the inherent uncertainty associated with any measurement, even after care has been taken to minimize preventable mistakes

measurement error

consists of unpredictable fluctuations and inconsistencies of other variables in the measurement process

random error

a source of error that is typically constant or proportionate to what is presumed to be the true value of the variable being measured

systematic error

sources of error variance

test construction
test administration
test scoring and interpretation

which variables fall under test administration as a source of error variance

test taker variables
examiner-related variables

sources of error variance: variation may exist within items in a test or between tests

test construction

sources of error variance: may stem from the test environment

test administration

sources of error variance: pressing emotional problems, physical discomfort, lack of sleep, and the effects of drugs or medication

test taker variables

sources of error variance: physical appearance and demeanor may play a role

examiner-related variables

sources of error variance: computer testing reduces error in test scoring, but many tests still require expert interpretation

test scoring and interpretation

reliability estimates

test-retest reliability
split-half reliability
inter-scorer reliability

an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test

test-retest reliability
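
A minimal sketch of how such an estimate could be computed: correlate each person's score from the first administration with the same person's score from the second. The helper `pearson_r` and both score lists are invented for illustration; they are not data from the chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]

test_retest_estimate = pearson_r(first_administration, second_administration)
print(round(test_retest_estimate, 3))
```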

true or false: in test-retest reliability, as time passes, correlation between the scores obtained on each testing increases

false; decreases

this reliability is most appropriate for variables that should be stable over time, such as personality

test-retest reliability

with intervals greater than 6 months, the estimate of test-retest reliability is called _____

coefficient of stability

the degree of the relationship between various forms of a test, evaluated by means of an alternate-forms or parallel-forms coefficient of reliability

coefficient of equivalence

for each form of the test, the means and the variances of observed test scores are equal

parallel forms

typically designed to be equivalent with respect to variables such as content and level of difficulty

alternate forms

obtaining estimates of alternate-forms reliability and parallel-forms reliability is similar to obtaining an estimate of _____

test-retest reliability

obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once

split-half reliability

step 1 of split-half reliability

divide the test into equivalent halves

step 2 of split-half reliability

calculate a Pearson r between scores on the two halves of the test

step 3 of split-half reliability

adjust the half-test reliability using the Spearman-Brown formula

allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test

Spearman-Brown formula
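
The three steps can be sketched as follows. This is only an illustration under stated assumptions: the halves are formed with an odd-even split (one common way to divide a test, not necessarily the only acceptable one), the helper names are hypothetical, and the response matrix is made up. The Spearman-Brown adjustment for a test split in two is r_SB = 2r / (1 + r).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


def spearman_brown_full_length(half_test_r):
    """Step 3: adjust a half-test correlation to estimate whole-test reliability."""
    if half_test_r <= -1:
        raise ValueError("half-test correlation must be greater than -1")
    return 2 * half_test_r / (1 + half_test_r)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per test taker."""
    # Step 1: divide the test into halves (assumed here: odd vs. even items).
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    # Step 2: Pearson r between scores on the two halves.
    half_test_r = pearson_r(odd_totals, even_totals)
    # Step 3: Spearman-Brown adjustment.
    return spearman_brown_full_length(half_test_r)


# Hypothetical right/wrong (1/0) responses: 5 test takers, 6 items.
responses = [
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
print(round(split_half_reliability(responses), 3))
```

The adjustment is needed because each half is only half as long as the full test, and shorter tests tend to be less reliable.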

the degree of correlation among all the items on a scale

inter-item consistency

mean of all possible split-half correlations

coefficient alpha
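
For a concrete feel, here is a small sketch of coefficient alpha. It uses the common variance-based computing formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), rather than literally averaging every possible split half. The function name and the ratings are made up for illustration.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """item_scores holds one list of item scores per test taker."""
    k = len(item_scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across test takers.
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the total scores across test takers.
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 1-5 ratings: 5 test takers, 4 items.
ratings = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 4, 3],
]
print(round(coefficient_alpha(ratings), 3))
```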

typical range of coefficient alpha

0 to 1

coefficient alpha is corrected by what formula

Spearman-Brown formula

the degree of agreement or consistency between two or more scorers with regard to a particular measure

inter-scorer reliability

what reliability is often used when coding nonverbal behavior

inter-scorer reliability

a correlation coefficient used to determine the degree of consistency among scorers

coefficient of inter-scorer reliability

the nature of tests

homogeneity vs. heterogeneity of items
dynamic vs. static characteristics
restriction of range vs. inflation of range
speed test vs. power test

estimates the portion of a test score that is attributable to error

true score theory or classical test theory

if all the observed scores obtained over a period of time were averaged, the result would be closest to the true score

true score theory or classical test theory

the greater the number of items, the higher the reliability

true score theory or classical test theory
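
One way to see the link between test length and reliability is the general Spearman-Brown formula, r_new = n * r / (1 + (n - 1) * r), where n is the factor by which the number of items changes. The sketch below assumes the added items are equivalent in content and difficulty to the existing ones; the function name is hypothetical.

```python
def spearman_brown_prophecy(reliability, length_factor):
    """Estimated reliability after changing test length by length_factor.

    length_factor = 2 doubles the number of items; 0.5 halves it.
    Assumes the new items are equivalent to the existing ones.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    denominator = 1 + (length_factor - 1) * reliability
    if denominator <= 0:
        raise ValueError("formula is undefined for these inputs")
    return length_factor * reliability / denominator


# A test with reliability .60, kept as is, doubled, and tripled.
for factor in (1, 2, 3):
    print(factor, round(spearman_brown_prophecy(0.60, factor), 3))
```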

estimates the extent to which specific sources of variation under defined conditions are contributing to the test score

domain sampling theory

assumes that the items that have been selected for any one test are just a sample of items from an infinite domain of potential items

domain sampling theory

based on the idea that a person's test scores vary from testing to testing because of variables in the testing situation

generalizability theory

it is described in terms of its facets

universe

provides a way to model the probability that a person with x ability will be able to perform at a level of y

item response theory

refers to a family of theories and methods, with many other names used to distinguish specific approaches

item response theory

incorporates considerations of an item's level of difficulty and discrimination

item response theory

relates to an item not being easily accomplished, solved, or comprehended

difficulty

refers to the degree to which an item differentiates among people with higher or lower levels of the variable being measured

discrimination
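
To make difficulty and discrimination concrete, the sketch below uses the two-parameter logistic model, which is one member of the item response theory family and is chosen here purely for illustration (the cards do not name a specific model). It gives the probability of a correct response as 1 / (1 + exp(-a * (ability - b))), where b is the item's difficulty and a is its discrimination. The function and parameter names are assumptions.

```python
from math import exp


def probability_correct(ability, difficulty, discrimination=1.0):
    """Two-parameter logistic item response function (illustrative choice)."""
    return 1.0 / (1.0 + exp(-discrimination * (ability - difficulty)))


# When ability equals difficulty, the chance of success is 50%.
print(probability_correct(ability=0.0, difficulty=0.0))

# A more discriminating item separates nearby ability levels more sharply.
low_discrimination_gap = probability_correct(0.5, 0.0, 0.5) - probability_correct(-0.5, 0.0, 0.5)
high_discrimination_gap = probability_correct(0.5, 0.0, 2.0) - probability_correct(-0.5, 0.0, 2.0)
print(round(low_discrimination_gap, 3), round(high_discrimination_gap, 3))
```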

the _____ the reliability of the test, the _____ the standard error

higher; lower

the higher the _____ of the test, the lower the _____

reliability; standard error

can be used to estimate the extent to which an observed score deviates from a true score

standard error

a range or band of test scores that is likely to contain the true score

confidence interval
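
A short sketch tying these cards together, assuming the usual formula for the standard error of measurement, SEM = SD * sqrt(1 - reliability), and a band of observed score plus or minus z SEMs. The function names, the SD of 15, the reliability of .91, and the default z of 1.96 (the conventional multiplier for a 95% interval) are all illustrative assumptions.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability); higher reliability gives a lower SEM."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def confidence_interval(observed_score, sd, reliability, z=1.96):
    """Band of scores likely to contain the true score."""
    margin = z * standard_error_of_measurement(sd, reliability)
    return observed_score - margin, observed_score + margin


sem = standard_error_of_measurement(15, 0.91)   # about 4.5
low, high = confidence_interval(100, 15, 0.91)  # roughly 91.2 to 108.8
print(round(sem, 2), round(low, 2), round(high, 2))
```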

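percent of scores within ±1 SD of the mean

68.3%

percent of scores within ±2 SD of the mean

95.4%

percent of scores within ±3 SD of the mean

99.7%

percent of scores between the mean and 1 SD (one side only)

34.1%

percent of scores between 1 SD and 2 SD from the mean (one side only)

13.6%

percent of scores between 2 SD and 3 SD from the mean (one side only)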

2.1%
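
These percentages come from the normal curve and can be checked with a few lines of code. The sketch below uses the error function from Python's standard library; the helper names are made up for this example.

```python
from math import erf, sqrt


def percent_within(k):
    """Percent of a normal distribution within plus or minus k SD of the mean."""
    return 100 * erf(k / sqrt(2))


def percent_in_band(lower_sd, upper_sd):
    """Percent between lower_sd and upper_sd from the mean, on one side only."""
    return (percent_within(upper_sd) - percent_within(lower_sd)) / 2


for k in (1, 2, 3):
    print(k, round(percent_within(k), 1))

for lower, upper in ((0, 1), (1, 2), (2, 3)):
    print(lower, upper, round(percent_in_band(lower, upper), 1))
```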

when a time limit is long enough to allow test takers to attempt all items, and if some items are so difficult that no test taker is able to obtain a perfect score

power test

generally contains items of uniform level of difficulty so that, when given generous time limits, all test takers should be able to complete all the test items correctly

speed test

designed to provide an indication of where a test taker stands with respect to some variable or criterion

criterion-referenced test

examines how generalizable scores from a particular test are if the test is administered in different situations

generalizability study

error that is unpredictable

random error

error that is consistent and predictable, so it can be anticipated and prepared for

systematic error

summary of test-retest reliability

test-retest = different administrations - coefficient of stability

summary of parallel/alternate forms

parallel/alternate forms = different forms - coefficient of equivalence

summary of split-half reliability

split-half reliability = different halves of the test - Pearson r (correlation) & Spearman-Brown (adjustment)

reliability coefficient generally considered acceptable

0.8 and above

all items measure only one construct

homogeneous items

the items measure several different constructs

heterogeneous items

in which items is internal consistency high

homogeneous items

what coefficient is used to check for bias or inconsistency in scoring

coefficient of inter-scorer reliability

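the range of scores analyzed is narrowed by the sampling procedure; the correlation (and reliability estimate) tends to be lower

restriction of range

the range of scores analyzed is widened by the sampling procedure; the correlation (and reliability estimate) tends to be higher

inflation of range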

what is measured in power test

ceiling and floor limits

the highest level of ability the test can measure

ceiling limits

the lowest level of ability the test can measure

floor limits

Chapter 5 Flashcards

(90 cards)