Reliability & Validity Flashcards

1
Q

Reliability

A
  • Are the results consistent?

  • Provides an estimate of the proportion of unsystematic error; the degree of unsystematic error must be known to determine reliability

2
Q

Validity

A
  • Does it measure what it says it measures?
  • Overall eval of evidence and degree of trustworthiness
  • Determine if enough support exists to use the test in a certain way
3
Q

Classical Test Theory

A
  • Observed score = T + E
  • T is the true score: the score that would be obtained if the test were completely free from error
  • E is the error component
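
Not from the cards: as a minimal illustrative sketch (the means, SDs, and variable names are hypothetical), simulating observed = true + error shows why reliability can be read as the proportion of observed-score variance that is not error:

```python
import random
import statistics

random.seed(1)

# T: true scores; E: random (unsystematic) error added on top
true_scores = [random.gauss(75, 10) for _ in range(1000)]
observed = [t + random.gauss(0, 5) for t in true_scores]  # X = T + E

# Reliability under CTT = true-score variance / observed-score variance
reliability = statistics.variance(true_scores) / statistics.variance(observed)
print(round(reliability, 2))  # approx. 10**2 / (10**2 + 5**2) = 0.8
```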
4
Q

Unsystematic Error

A
  • Random errors: mood, health, fatigue
  • Administration differences
  • Scoring differences
  • Random guessing
5
Q

Systematic Error

A

Constant errors that occur every time the test is administered, such as a typo in an item

6
Q

Reliability Related to Validity

A
  • High validity can occur only if high reliability exists
  • High validity cannot occur with low reliability
  • High reliability does not guarantee high validity (reliability is necessary but not sufficient for validity)
7
Q

Correlation Related to Reliability

A
  • Correlation: Statistical technique used to examine consistency
  • Reliability is often based on consistency between two sets of scores
8
Q

Positive Correlation

A

As one increases, so does the other

9
Q

Negative Correlation

A

As one increases, the other decreases

10
Q

Correlation Coefficient (Pearson Product-Moment)

A
  • Correlation coefficient: numerical indicator of the relationship between two sets of data
  • PPM correlation coefficient - most common
  • Ranges from -1 to +1; the closer |r| is to 1, the stronger the relationship
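
As a minimal sketch of the computation (the score lists are made up for illustration), Pearson's r is the covariance of the two sets of scores divided by the product of their standard deviations:

```python
import statistics

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Scores from two administrations of the same test (illustrative data)
first = [80, 72, 91, 65, 88]
second = [78, 75, 93, 60, 85]
print(round(pearson_r(first, second), 3))  # ~0.965: strong positive relationship
```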
11
Q

Test-Retest

A
  • Give same test twice to same group
  • Correlation between first and second administrations (typically 2-6 weeks apart)
  • Possible influences: a shorter gap yields a higher correlation; changes in administration; interventions between administrations; practice effects
  • Ex: skills-based test
12
Q

Alternate Forms

A
  • Very difficult to construct
  • Correlation of scores from two equivalent forms of a test
  • Measures stability (over time) and equivalence (construct similarity)
  • Uses samples of different items from the same domain
13
Q

Internal Consistency

A
  • One administration
  • One form of instrument
  • Divides instrument and correlates the scores from the different portions
14
Q

Split-Half Reliability

A
  • Given once, then split in half to determine reliability
  • Need to divide the instrument into equivalent halves, such as even- and odd-numbered items
  • Problem: dividing the instrument in half makes the number of items smaller, which lowers the correlation

Also doesn't work if the test increases in difficulty, and the conventional Spearman-Brown length correction (sketched below) doesn't fully fix the problem
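
A minimal sketch of the odd/even split with the Spearman-Brown correction (the response matrix and function name are illustrative assumptions):

```python
import statistics

def split_half_reliability(rows):
    """Odd/even split-half reliability, corrected to full length via Spearman-Brown."""
    odd = [sum(r[0::2]) for r in rows]    # each person's total on odd-numbered items
    even = [sum(r[1::2]) for r in rows]   # each person's total on even-numbered items
    mo, me = statistics.mean(odd), statistics.mean(even)
    cov = sum((a - mo) * (b - me) for a, b in zip(odd, even)) / (len(rows) - 1)
    r_half = cov / (statistics.stdev(odd) * statistics.stdev(even))
    return (2 * r_half) / (1 + r_half)    # Spearman-Brown: estimate for the full test

# Rows = test takers, columns = item scores (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
print(round(split_half_reliability(responses), 3))  # ~0.873
```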

15
Q

Kuder-Richardson

A
  • KR-20: heterogeneous items (item difficulty may vary)
  • KR-21: homogeneous items measuring a single construct (cannot be used if items are not from the same domain or if they differ in difficulty)
  • Lower reliability coefficient than split-half
  • Purpose: Estimate the average of all split-half reliabilities from all ways of splitting the instrument
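
A minimal sketch of the KR-20 computation for dichotomous items, which estimates the average of all possible split-half coefficients per the card (the data are illustrative):

```python
import statistics

def kr20(rows):
    """KR-20 for dichotomous (1/0) items; rows = test takers, columns = items."""
    k = len(rows[0])                          # number of items
    totals = [sum(r) for r in rows]           # total score per person
    var_total = statistics.variance(totals)   # sample variance of total scores
    pq = 0.0
    for i in range(k):
        p = sum(r[i] for r in rows) / len(rows)  # proportion correct on item i
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)

responses = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
print(round(kr20(responses), 3))  # ~0.711
```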
16
Q

Coefficient Alpha (Cronbach's Alpha)

A
  • Used for non-dichotomous scoring
  • Ex: Likert scales
  • Cronbach’s alpha
  • Takes into account variance of each item
  • Conservative estimate of reliability
  • Most common
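
A minimal sketch for non-dichotomous items: alpha = k/(k-1) * (1 - sum of item variances / total-score variance), which is how each item's variance is taken into account. The respondent data are made up:

```python
import statistics

def cronbach_alpha(rows):
    """Cronbach's alpha; rows = respondents, columns = numeric item scores."""
    k = len(rows[0])
    item_vars = [statistics.variance([r[i] for r in rows]) for i in range(k)]
    total_var = statistics.variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point Likert responses
likert = [
    [4, 5, 4, 3],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 4, 3, 3],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(likert), 3))  # ~0.972
```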
17
Q

Standard Error of Measurement (SEM)

A
  • Provides estimate of range of scores if someone were to take instrument repeatedly
  • Based on idea that if someone takes test multiple times, scores would fall into a normal distribution
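
The cards don't give the formula; the standard one is SEM = SD * sqrt(1 - r), where r is the reliability coefficient. A quick sketch with illustrative numbers:

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Illustrative values: test SD = 10, reliability = 0.91 -> SEM = 3.0
s = sem(10, 0.91)

# ~68% band around an observed score of 75 (one SEM either side,
# per the normal-distribution idea on the card)
print(round(75 - s, 1), round(75 + s, 1))  # 72.0 78.0
```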
18
Q

SEM v. SD

A
  • SD is spread of scores between students
  • SEM is spread of scores for one student
  • SEM is computed from the same SD estimate
19
Q

Content-Related Validity

A
  • Test items measure the objectives they are supposed to measure
  • Focus on how content was determined
  • May be based on test creator’s own analysis of topic or expert analysis
  • How well do the test items reflect the domain of material being tested?
20
Q

Criterion-Related Validity

A
  • Test scores related to specific criterion/variable
  • Sources of criterion scores: academic achievement, level of education, performance in specialized training, job performance, psychiatric diagnosis, ratings by supervisors, correlations with previously available tests
21
Q

Concurrent Validity (Criterion-Related)

A
  • Scores on test and criterion measure are collected at same point
  • Ex: achievement, certification
  • Validity coefficients typically higher than for predictive validity
  • Requires reliable and bias-free criterion measures
22
Q

Predictive Validity (Criterion-Related)

A
  • Test is administered first and scores on criterion measure are collected at a later time
  • Ex: SAT, college GPA
  • Requires reliable and bias-free criterion measures
23
Q

Construct Validity

A
  • What do scores on this test mean or signify?
  • Construct: Grouping of variables that make up observed behavior patterns
  • Ex: Self-efficacy, personality
  • Measured by correlation of 2 scores or factor analysis
  • Often seen in psych tests
24
Q

Convergent v. Discriminant (Construct Validity)

A

- Convergent: Positive correlation with other tests measuring the same/similar construct

- Discriminant: Little or no correlation with tests measuring different constructs

25
Q

Threats to Construct Validity

A
  • Too many variables
  • Under-representation: failing to measure parts of the construct
  • Extra questions
  • Items are too similar
26
Q

Overall Threats to Validity

A
  • History: outside events during course of test
  • Maturation: natural development with age
  • Testing: repeat testing; changes due to practice
  • Instrumentation: changes in measurement procedures
  • Statistical regression: regression to mean after extreme score first time
  • Interaction: any combination of two of the above
  • Mortality: participants dropping out
  • Collection of subjects: bias in how subjects are recruited and assigned to groups
27
Q

Face Validity

A
  • Not a legitimate form of validity evidence
  • Based on the appearance of the measure and its test items

28
Q

Types of Validity Evidence

A
  • Test content
  • Response processes
  • Internal structure
  • Relations to other variables
  • Consequences of testing
29
Q

Item Analysis

A
  • Examine and eval each item in the test → get rid of items that don't work
  • Done during instrument development or revision
30
Q

Item Difficulty

A
  • Index reflecting proportion of people getting item correct
  • 0.0= no one got it correct
  • 1.0= everyone got it correct
  • 0.5= ideal for differentiation
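
A one-line sketch in the same spirit (the response matrix is illustrative):

```python
def item_difficulty(rows, item):
    """Proportion of test takers who answered the item correctly."""
    return sum(r[item] for r in rows) / len(rows)

responses = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0]]
print([item_difficulty(responses, i) for i in range(3)])  # [0.75, 0.75, 0.25]
```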
31
Q

Item Discrimination

A
  • Degree to which item correctly differentiates among test takers
  • Extreme group method: 2 groups - high scores, low scores (works with normal distribution)
  • Correlational method: correlate performance on the item with overall test performance
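
A sketch of the extreme group method (taking the top and bottom 27% of total scorers is a common convention; the data and function name are illustrative assumptions):

```python
def discrimination(rows, item, frac=0.27):
    """Extreme-group discrimination index: proportion correct among high
    scorers minus proportion correct among low scorers."""
    ranked = sorted(rows, key=sum)           # order test takers by total score
    n = max(1, int(len(ranked) * frac))      # size of each extreme group
    low, high = ranked[:n], ranked[-n:]
    p_high = sum(r[item] for r in high) / len(high)
    p_low = sum(r[item] for r in low) / len(low)
    return p_high - p_low                    # +1 = perfect discrimination

responses = [[1, 1, 0, 1], [1, 0, 0, 0], [1, 1, 1, 1],
             [0, 0, 0, 0], [1, 1, 0, 1], [0, 1, 0, 0]]
print(discrimination(responses, 2))  # 1.0: only the top scorer got item 2 right
```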
32
Q

Item Response Theory (IRT)

A
  • Focuses on each item; models the mathematical relationship between a test taker's ability and the probability of a correct response
  • 2 major assumptions: unidimensionality, local independence
  • Most common in testing where there is a right/wrong answer v. preference
  • Models student ability using each question instead of aggregate score
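
The cards don't name a specific model; the simplest instance is the one-parameter (Rasch) model, where the probability of a correct response depends only on the gap between ability theta and item difficulty b:

```python
import math

def p_correct(theta, b):
    """1PL (Rasch) item response function: P(correct | ability theta, difficulty b)."""
    return 1 / (1 + math.exp(-(theta - b)))

# An average-ability test taker (theta = 0) facing a hard item (b = 1)
# versus an easy item (b = -1)
print(round(p_correct(0.0, 1.0), 2), round(p_correct(0.0, -1.0), 2))  # 0.27 0.73
```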
33
Q

Unidimensionality

A

Each item measures one ability or trait

34
Q

Local Independence

A

A response to one item is unrelated to responses on the other items (once ability is accounted for)

35
Q

Selecting Tasks

A
  • Determine what info is needed
  • Consider how that info can be obtained
  • Search assessment resources
  • Eval possible instruments
36
Q

Administering Tests

A
  • Pre-testing procedures
  • Administration
  • Scoring: by hand, computer, Internet
37
Q

Communicating Results

A
  • Simple language
  • Individual v. Group
  • Written v. Oral
  • Communicate test’s strengths and limitations
  • Know the manual
  • Describe v. just report results
  • Use various results
  • Involve client
  • Encourage asking questions
  • Relate test to a goal
38
Q

Problems with Reporting Results

A
  • Acceptance
  • Readiness of client
  • Negative results
  • Flat profiles that don't show anything
  • Motivation and attitude
39
Q

Communicating Test Results for Parents

A
  • Identifying information
  • Reason for referral
  • Background info
  • Test results and interpretation
  • Diagnostic impressions and summary