Ch 3 Flashcards

1
Q

Define Reliability

A

The consistency or reproducibility of test scores or assessment data

When an assessment is reliable, its results are dependable and meaningful

Also: the ability of test scores to be interpreted in a consistent and dependable manner across multiple administrations

2
Q

Caveats when considering the reliability of an instrument

A

1) Reliability refers to the RESULTS a test produces, not the test itself
2) An instrument that shows one type of reliability is not necessarily reliable in every other way
3) Results from tests are rarely consistent all of the time

3
Q

Classical Test Theory

A

aka TRUE SCORE MODEL

X = T + E
(Observed score X = true score T plus random error E)

- The model can be used to test the reliability, difficulty, and discriminatory properties of test items/scales

As test administrators: it is our responsibility to limit measurement error as best we can

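A minimal Python sketch of the model (the sample size, score scale, and normal-error assumption are illustrative choices, not from the chapter): simulate true scores, add random error, and observe that the reliability ratio defined on a later card falls out of the variances.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n = 1_000  # hypothetical number of test-takers

# T: each person's true standing on the construct
true_scores = rng.normal(loc=100, scale=15, size=n)

# E: random (unsystematic) error from administration, scoring, mood, etc.
error = rng.normal(loc=0, scale=5, size=n)

# Classical Test Theory: X = T + E
observed = true_scores + error

# Reliability = Var(T) / Var(X); here ~ 15^2 / (15^2 + 5^2) = 0.90
print(true_scores.var() / observed.var())
```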
4
Q

What are the two types of measurement error?

A
  1. Systematic - when a test consistently measures something other than what it's supposed to (aka imperfect construct validity), e.g., computer skills interfering with assessing math skills
  2. Unsystematic - aka random error - a collection of factors that contribute to variation in scores, including test construction, administration, and scoring
    - could also be related to individual characteristics of the test-taker
5
Q

What are the sources of measurement error?

A

CATT (Cats can measure)
(Content, Administration, Time, Test-Taker)

Time Sampling Error (3 contributing factors; see card 7)
Content Sampling Error
Test Administration Error
Test-Taker Variables

6
Q

Define Time Sampling Error

(source of measurement error)

A

Results from repeated administrations of a test to the same person

How much it matters largely depends on the construct being assessed:

  • personality = stable
  • emotional state = more variable
  • CONSIDER how likely the construct is to vary naturally
  • CONSIDER the time interval between administrations
7
Q

What three factors may impact time sampling error?

A
  1. Carryover effect: when the score on the first administration impacts scores on subsequent administrations
  2. Practice effect: when scores improve because the test-taker becomes more familiar/comfortable with the content being assessed
  3. Fatigue: performance may decrease as the test-taker becomes tired of repeated testing
8
Q

Content Sampling Error

Source of measurement error

A

Aka Domain Sampling Error

When test items don't fully reflect the construct being measured

  • it is very hard for any finite set of items to capture everything the test was designed to measure
  • MOST COMMON source of error observed in test scores
9
Q

Test Administration Error

Source of measurement error

A
  • deviation from the test protocol
  • unforeseen events that occur within the testing environment, e.g., power outage, fire drill, etc.

10
Q

Test Taker Variables

sources of measurement error

A

Individual differences in test-takers that can't be accounted for by the administrator
Can include motivation, fatigue, anxiety, ability, illness, mood, etc.

11
Q

How is reliability measured?

A

The Correlation Coefficient

This number indicates the strength of the relationship between the variance of the true scores and the variance of the observed scores:

Reliability = Variance(true scores) / Variance(observed scores)

12
Q

How do you interpret a correlation coefficient score?

measure of reliability

A

.83

-> 83% of the observed score variance reflects TRUE score variance, not measurement error

-1 = perfect negative correlation
0 = no correlation
+1 = perfect positive correlation
13
Q

Is -.95 or +.85 the stronger correlation?

A

-.95

The sign only shows the direction of the relationship; the absolute value shows its strength, and .95 > .85

The remaining percentage of variance = error

14
Q

How is reliability assessed/estimated?

A

a) Test-retest
b) Alternate forms
c) Internal consistency
d) Inter-rater reliability

15
Q

Test-Retest Reliability

way to estimate reliability

A

MOST COMMON method

Used to assess how stable/reliable a score is over time
A set of participants is tested using the SAME test on two separate occasions

*Carryover effects can be significant

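A hedged sketch of the test-retest logic, under the same illustrative assumptions as the earlier simulation (shared true scores, independent errors, and no carryover or practice effects modeled): correlating two administrations of the same test recovers the reliability.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 500
true_scores = rng.normal(100, 15, size=n)  # a stable construct

# Same true scores on both occasions, independent random errors each time
time1 = true_scores + rng.normal(0, 5, size=n)
time2 = true_scores + rng.normal(0, 5, size=n)

# Test-retest reliability = correlation between the two administrations
print(np.corrcoef(time1, time2)[0, 1])  # ~0.90, matching Var(T)/Var(X)
```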
16
Q

Alternate Forms Reliability

way to estimate reliability

A

Uses different but equivalent versions of a test

Also called Item Sampling (all items are drawn from the same pool of questions)

Forms should have similar means, variances, item difficulty and correlations with other measures

Can be a good way to overcome limitations of test-retest analysis

17
Q

Internal Consistency

way to estimate reliability

A

Goal: to see if items in a test are consistent with each other; do they represent a singular construct?

  1. Split-half reliability
  2. Kuder-Richardson
  3. Cronbach’s alpha
18
Q

Split-half reliability

way to measure internal consistency

A

Divide the test into two comparable halves

Calculate the correlation between the results of the two halves, then correct it for full test length using:

Spearman-Brown prophecy formula

SPLIT half = SPEARMAN-Brown

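A minimal sketch of the procedure (the odd/even split and the people-by-items array layout are my assumptions): correlate the two halves, then apply the Spearman-Brown prophecy formula to estimate the reliability of the full-length test.

```python
import numpy as np

def split_half_reliability(items: np.ndarray) -> float:
    """Split-half reliability of an (n_people, n_items) score matrix."""
    half_a = items[:, 0::2].sum(axis=1)  # odd-numbered items
    half_b = items[:, 1::2].sum(axis=1)  # even-numbered items
    r_half = np.corrcoef(half_a, half_b)[0, 1]
    # Spearman-Brown prophecy formula: corrects the half-test
    # correlation up to the full test length
    return 2 * r_half / (1 + r_half)
```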
19
Q

Kuder-Richardson formulas

internal consistency measure

A

- Uses a statistical procedure to estimate split-half reliability

KR-20: uses actual scores on each item
KR-21: uses the mean score across items (assumes items are of equal difficulty)

Can only be used with dichotomous response sets (e.g., true/false)

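A sketch of KR-20 using its standard formula (the people-by-items layout is an assumption; input must be 0/1 data).

```python
import numpy as np

def kr20(items: np.ndarray) -> float:
    """KR-20 for a dichotomous (0/1) (n_people, n_items) score matrix."""
    k = items.shape[1]
    p = items.mean(axis=0)                     # proportion passing each item
    item_var_sum = (p * (1 - p)).sum()         # sum of item variances p*q
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)
```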
20
Q

Cronbach’s alpha

A

Also: Coefficient Alpha

Can be used with Likert-type items (nondichotomous data)

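A matching sketch of coefficient alpha (same assumed array layout; unlike KR-20 it accepts nondichotomous item scores).

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_people, n_items) score matrix."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()  # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)
```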
21
Q

Inter-rater reliability

A

Assesses the consistency of scores assigned by different test administrators/raters

Expressed as the level of agreement vs. level of disagreement between raters

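The card only names level of agreement, so here is the simplest illustrative index, percent agreement (chance-corrected statistics such as Cohen's kappa exist, but this card doesn't cover them).

```python
import numpy as np

def percent_agreement(rater_a: np.ndarray, rater_b: np.ndarray) -> float:
    """Proportion of cases on which two raters assign the same score."""
    return float(np.mean(rater_a == rater_b))

# e.g., two raters scoring ten essays on a 1-3 scale
a = np.array([1, 2, 2, 3, 1, 2, 3, 3, 2, 1])
b = np.array([1, 2, 3, 3, 1, 2, 3, 2, 2, 1])
print(percent_agreement(a, b))  # 0.8
```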
22
Q

How to interpret a reliability coefficient

A

All of the methods of estimating reliability will give you a reliability coefficient between -1 and +1

Best: .90; generally acceptable: .80

Very high: .90+
High: .80+
Acceptable: .70+
Questionable: .60+
Unacceptable: below .60
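A tiny helper encoding this card's rule-of-thumb bands (the cutoffs come straight from the card).

```python
def interpret_reliability(r: float) -> str:
    """Map a reliability coefficient to the rule-of-thumb labels above."""
    if r >= 0.90:
        return "very high"
    if r >= 0.80:
        return "high"
    if r >= 0.70:
        return "acceptable"
    if r >= 0.60:
        return "questionable"
    return "unacceptable"

print(interpret_reliability(0.83))  # high
```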
23
Q

What is the Standard Error of Measurement (SEM)?

A

The standard deviation of the (assumed normal) distribution of scores a person would obtain over repeated testing

A person's true score is likely to fall within -2 to +2 SEM of the observed score (95% confidence interval)

The smaller the SEM -> less variance in scores -> higher degree of reliability

Inverse relationship between SEM and reliability

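A sketch using the standard SEM formula, SEM = SD x sqrt(1 - reliability); the formula is standard psychometrics but isn't spelled out on this card, and the numbers are illustrative.

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: SD = 15, reliability = .89 -> SEM ~ 5 points
s = sem(15, 0.89)
observed = 110
print(observed - 2 * s, observed + 2 * s)  # ~95% CI for the true score
```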
24
Q

How to increase the reliability of a test

(4 ways)

A

LOPH
(Length, Optimal time, Population, Heterogeneity)

(Improve reliability and reduce measurement error)

  1. Increase test length (see the Spearman-Brown sketch after this list)
    - more questions = more reliable
    - increases internal consistency (as long as good questions are added)
  2. Make sure the test is designed for the population you want to use it with
    - age, vocabulary, education level
  3. Increase heterogeneity of the test group
    - the more similar test-takers are, the more similar their scores will be; restricted score variance lowers the reliability coefficient
  4. Use an optimal time interval between tests
    - the time interval between tests plays a huge role in how reliable results are
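For item 1, the general form of the Spearman-Brown prophecy formula (the split-half card names the formula; this general form is standard but isn't spelled out in the deck) predicts how reliability changes when a test is lengthened by a factor n.

```python
def spearman_brown(r: float, n: float) -> float:
    """Predicted reliability when a test is lengthened by a factor of n."""
    return n * r / (1 + (n - 1) * r)

# Doubling a test with reliability .70: 2(.70) / (1 + .70) = ~.82
print(round(spearman_brown(0.70, 2), 2))  # 0.82
```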