Exam 2 Flashcards
consistency in measurement of some (real or hypothetical) characteristic
reliability
the theoretical number that is one’s perfectly accurate representation of knowledge as a score on a test
True Score
a person’s true score, plus some error
observed test score
unsystematic variability introduced into scores; random; can cause deviation from true score
error
Sources of Psychological Measurement Error
- test construction
- test administration
- test scoring
items are created/selected from a large population of possible items from within the domain
may affect how well one performs on a test; someone may perform very well on some subtests and poorly on others
content sampling
factors that influence the test taker’s attention, concentration, motivation, etc.
test administration
physical appearance, departure from test standardization procedure, not placing materials in proper orientation, incorrect timing, etc.
examiner influences
test environment, test anxiety, medication effects, extended testing session - fatigue
physical or psychological discomfort
estimates of the ratio of true score variance to total variance (cannot be negative)
reliability coefficients
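A minimal Python sketch (simulated, illustrative values only) of reliability as the ratio of true score variance to total variance:

```python
import numpy as np

# Hypothetical illustration: simulate true scores plus random error
# and estimate reliability as true-score variance / total variance.
rng = np.random.default_rng(0)
true_scores = rng.normal(100, 15, size=10_000)   # assumed true scores
error = rng.normal(0, 5, size=10_000)            # unsystematic error
observed = true_scores + error

reliability = true_scores.var() / observed.var()
print(round(reliability, 3))  # ~ 15^2 / (15^2 + 5^2) = 0.90
```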
test-retest/stability, interrater agreement, internal consistency, alternate forms, etc.
types of reliability
reliability indices should meet or exceed
.85 or .90
index of how an individual’s scores may vary over tests presumed to be parallel
standard error of measurement (SEm)
as rxx increases from 0 to 1…
SEm decreases from SD to 0
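A small sketch of the standard SEm formula, SEm = SD√(1 − rxx), showing the decrease from SD to 0 (values assumed for illustration):

```python
import math

def sem(sd: float, rxx: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - rxx)."""
    return sd * math.sqrt(1 - rxx)

# As rxx rises from 0 to 1, SEm falls from SD (15 here) to 0.
for rxx in (0.0, 0.5, 0.9, 1.0):
    print(rxx, round(sem(15, rxx), 2))
# 0.0 -> 15.0, 0.5 -> 10.61, 0.9 -> 4.74, 1.0 -> 0.0
```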
Methods of estimating rxx
- internal consistency
- alternate forms
- test-retest
- interrater agreement
static vs. dynamic characteristic
a static characteristic changes little over time
a dynamic characteristic changes considerably over time
degree to which evidence and theory support the interpretations of test scores for proposed uses of tests
relates to inferences or interpretations made about performance based on scores from the measure.
validity
- content validity
- criterion related validity
- construct validity
trinitarian model
Evidence based on:
- test content
- response processes
- internal structure
- relations with other variables
- consequences of testing
unified theory (Messick)
examination of test content
content validity
- test-retest
- alternate forms
- interrater agreement
Methods of Reliability
Rα
Cronbach’s coefficient alpha
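A minimal sketch of coefficient alpha, assuming a respondents-by-items score matrix (the function name and data layout are illustrative):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: rows = respondents, columns = test items.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)"""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)
```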
when an item belongs to more than one factor
cross loading
z scores for 68%
95%
99%
1
1.96
2.58
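A hedged example tying these z values to SEm: a 95% confidence band around an observed score (the SD, rxx, and score are assumed):

```python
import math

# Hypothetical example: 95% confidence band around an observed score
# using SEm and the matching z value (1.96).
sd, rxx, observed = 15, 0.91, 108        # assumed values
sem = sd * math.sqrt(1 - rxx)            # 15 * sqrt(0.09) = 4.5
lo, hi = observed - 1.96 * sem, observed + 1.96 * sem
print(f"95% CI: {lo:.1f} to {hi:.1f}")   # 99.2 to 116.8
```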
Which reliability estimate should be used with test-retest?
stability
SEdiff formula
SEdiff = SD√(2 − rxx1 − rxx2)
rxx1 = reliability of test 1
rxx2 = reliability of test 2
multiplied by z (e.g., 1.96 for 95%), the result is the number of points required for a SIGNIFICANT DIFFERENCE between two scores
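A small sketch of the SEdiff formula with assumed values:

```python
import math

def se_diff(sd: float, rxx1: float, rxx2: float) -> float:
    """SEdiff = SD * sqrt(2 - rxx1 - rxx2)."""
    return sd * math.sqrt(2 - rxx1 - rxx2)

# Hypothetical values: SD = 15, two subtests with rxx of .90 and .85.
sed = se_diff(15, 0.90, 0.85)
print(round(sed, 2))          # 7.5
print(round(1.96 * sed, 1))   # ~14.7 points needed for a significant
                              # difference at the 95% level
```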
Which reliability estimate should be used with internal consistency?
coefficient alpha, KR-20, split-half
Which reliability estimate should be used with alternate forms?
equivalence
compare the test in question with an already accepted standard measure administered at about the same time
concurrent validity
when correlations are superficially high because two tests are too similar
criterion contamination
test in question is compared to a standard gathered at a future time; predicts performance on a standard at a future time
ex: HS GPA and ACT predicting college GPA
predictive validity
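An illustrative sketch of predictive validity as the predictor-criterion correlation (the scores below are made up):

```python
import numpy as np

# Hypothetical sketch: predictive validity as the correlation between
# a predictor (ACT score) and a criterion gathered later (college GPA).
act = np.array([21, 25, 30, 18, 27, 33, 24, 29])          # made-up scores
gpa = np.array([2.7, 3.0, 3.6, 2.4, 3.2, 3.8, 2.9, 3.4])  # made-up GPAs

validity = np.corrcoef(act, gpa)[0, 1]
print(round(validity, 2))  # high r -> predictor forecasts the criterion
```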
experimental procedures that cause a change in the construct can be used to test a measure’s validity
theory consistent intervention effects
two or more measures designed to measure the same construct should produce high correlations (large amount of shared variance)
convergent validity
two or more measures designed to measure different constructs should produce low/near zero correlations (little shared variance)
divergent validity
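A simulated illustration of both patterns (the construct names and noise levels are assumptions):

```python
import numpy as np

# Hypothetical sketch: two anxiety measures (same construct) should
# correlate highly (convergent); an anxiety measure and a vocabulary
# test (different constructs) should correlate near zero (divergent).
rng = np.random.default_rng(1)
anxiety = rng.normal(size=500)
anxiety_a = anxiety + rng.normal(scale=0.5, size=500)  # measure 1
anxiety_b = anxiety + rng.normal(scale=0.5, size=500)  # measure 2
vocab = rng.normal(size=500)                           # unrelated construct

print(round(np.corrcoef(anxiety_a, anxiety_b)[0, 1], 2))  # ~ .80 (convergent)
print(round(np.corrcoef(anxiety_a, vocab)[0, 1], 2))      # ~ .00 (divergent)
```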
use scores to classify subjects into groups
discriminant analysis
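A minimal sketch using scikit-learn’s LinearDiscriminantAnalysis with made-up scores and group labels:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical sketch: use test scores to classify subjects into groups.
# X = two test scores per subject; y = known group membership.
X = [[85, 90], [88, 95], [60, 55], [58, 62], [83, 88], [62, 60]]
y = [1, 1, 0, 0, 1, 0]          # made-up group labels

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[80, 85]]))  # predicted group for a new subject
```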
conceptualization
is there a need? who will it be used for? content? administration? responses?
scaling
how we assign “points” to responses
try-out, analyze, revise
first “test” administration, analysis, revision
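One common analysis at the try-out step, sketched with illustrative data: item difficulty (proportion correct) and corrected item-total correlation:

```python
import numpy as np

# Hypothetical item analysis for the try-out step.
responses = np.array([  # rows = examinees, columns = items (1 = correct)
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
])

difficulty = responses.mean(axis=0)  # p value (proportion correct) per item
print(difficulty)

for j in range(responses.shape[1]):
    rest = responses.sum(axis=1) - responses[:, j]  # total minus this item
    r = np.corrcoef(responses[:, j], rest)[0, 1]
    print(f"item {j}: corrected item-total r = {r:.2f}")
```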
Spearman-Brown
Used to estimate reliability after changing test length
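A small sketch of the Spearman-Brown prophecy formula (the 20-item scenario is hypothetical):

```python
def spearman_brown(rxx: float, n: float) -> float:
    """Predicted reliability when test length is multiplied by n:
    r_new = n * rxx / (1 + (n - 1) * rxx)"""
    return n * rxx / (1 + (n - 1) * rxx)

# Hypothetical: a 20-item test with rxx = .70, doubled to 40 items.
print(round(spearman_brown(0.70, 2), 2))  # 0.82
# Also used to step up a split-half correlation (n = 2).
```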
KR-20
Used with dichotomous (right/wrong, 0/1) items
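A minimal KR-20 sketch for a 0/1 response matrix (the function name and data layout are illustrative):

```python
import numpy as np

def kr20(items: np.ndarray) -> float:
    """KR-20 for dichotomous (0/1) items; rows = examinees, columns = items.
    KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores)"""
    k = items.shape[1]
    p = items.mean(axis=0)            # proportion correct per item
    pq = (p * (1 - p)).sum()          # sum of item variances (p*q)
    total_var = items.sum(axis=1).var(ddof=0)
    return (k / (k - 1)) * (1 - pq / total_var)
```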