RELIABILITY Flashcards

1
Q

involves the consistency of the measuring tool: the precision with which the test measures and the extent to which error is present in measurements.

A

RELIABILITY

2
Q

is an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance.

A

reliability coefficient

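
The ratio on this card can be illustrated with a minimal sketch (the variance figures below are hypothetical):

```python
# Reliability coefficient: the ratio of true-score variance to total
# variance (true-score variance + error variance).
def reliability_coefficient(true_variance, error_variance):
    total_variance = true_variance + error_variance
    return true_variance / total_variance

# A test with true-score variance 80 and error variance 20
# has reliability 80 / (80 + 20) = 0.80.
r_xx = reliability_coefficient(80, 20)
```
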
3
Q

Sources of Error Variance

A
  • Test construction
  • Test administration
  • Test scoring and interpretation
4
Q

Sources of Error Variance

item sampling or content sampling

A

Test construction

5
Q

Sources of Error Variance

Test administration:

A
  • test environment
  • test-taker variables
  • examiner-related variables
6
Q

Test Administration:

The room temperature, the level of lighting, and the amount of ventilation and noise.

A

Test Environment

7
Q

Test Administration:

Pressing emotional problems, physical discomfort, lack of sleep, and the effects of drugs or medication.

A

Test-taker variables.

8
Q

Test Administration:

Physical appearance and demeanor

A

Examiner-related variables

9
Q

Sources of Error Variance

Scorers and scoring systems

A

Test scoring and interpretation

10
Q

Reliability Estimates

A
  • Test-retest reliability
  • Parallel-Forms and Alternate-Forms Reliability Estimates (coefficient of equivalence)
  • Split-Half Reliability Estimates
  • KR-20 and Coefficient Alpha Formula
  • Measures of Inter-Scorer Reliability
11
Q

Reliability Estimates

Estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.

A

Test-retest reliability

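
As a sketch, the test-retest estimate is simply a Pearson correlation between the two administrations (the scores below are hypothetical):

```python
# Pearson r between scores from two administrations of the same test.
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

time1 = [10, 12, 15, 18, 20]   # first administration
time2 = [11, 13, 14, 19, 21]   # second administration, same people
r_tt = pearson_r(time1, time2)  # close to 1.0: scores rank-order stably
```
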
12
Q

tendencies to act, think, or feel in a certain manner in any given circumstance

A

TRAIT

13
Q

Test-retest reliability

When the interval between testing is greater than six months, the estimate of test-retest reliability is often referred to as the __

A

coefficient of stability

14
Q

Parallel-Forms and Alternate-Forms Reliability Estimates

What type of reliability estimate is obtained when two different versions of a test are constructed to be parallel?

A

Alternate-Forms Reliability

15
Q

Parallel-Forms and Alternate-Forms Reliability Estimates

What type of test reliability exists when the means and variances of observed test scores are equal for each test form?

A

Parallel-Forms Reliability

16
Q

Parallel-Forms and Alternate-Forms Reliability Estimates

What are the two ways to obtain an estimate of parallel-forms reliability?

A
  1. Administer the two test forms to the same group in a single session
  2. Administer the two forms to the same group on two different occasions (in which case factors such as motivation, fatigue, and practice effects also contribute to error variance)
17
Q

Parallel-Forms and Alternate-Forms Reliability Estimates

What is the primary source of error variance in alternate- or parallel-forms reliability?

A

Item Sampling

18
Q

Parallel-Forms and Alternate-Forms Reliability Estimates

What is one drawback of parallel-forms reliability testing due to its complexity and cost?

A

It is time-consuming and expensive

19
Q

Parallel-Forms and Alternate-Forms Reliability Estimates

What type of reliability can be obtained without developing an alternate test form or administering a test twice?

A

Internal Consistency Reliability

20
Q

Reliability Estimates

What reliability estimate is obtained by correlating two pairs of scores from equivalent halves of a single test?

A

Split-Half Reliability

21
Q

Split-Half Reliability

What is the first step in obtaining a split-half reliability estimate?

A

Divide the test into equivalent halves

22
Q

Split-Half Reliability

What statistical method is used to calculate the correlation between two halves of a test?

A

Pearson r

23
Q

Split-Half Reliability

Which formula is used to adjust the half-test reliability in split-half reliability estimates?

A

Spearman-Brown Formula

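
The split-half procedure and the Spearman-Brown adjustment, r_sb = n·r / (1 + (n − 1)·r), can be sketched as follows (the item data are hypothetical):

```python
# Split-half reliability: correlate odd- and even-item half scores,
# then step the half-test correlation up to full length (n = 2).
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_brown(r_half, n=2):
    """Reliability of a test n times as long as the one yielding r_half."""
    return n * r_half / (1 + (n - 1) * r_half)

scores = [            # rows: examinees; columns: 0/1 item scores
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
odd = [sum(row[0::2]) for row in scores]    # items 1, 3, 5
even = [sum(row[1::2]) for row in scores]   # items 2, 4, 6
r_half = pearson_r(odd, even)
r_full = spearman_brown(r_half)             # adjusted to full test length
```
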
24
Q

Split-Half Reliability

Why is dividing a test in the middle not recommended for split-half reliability testing?

A

It may not create equivalent halves

25
Q

Split-Half Reliability

What are three acceptable ways to split a test for split-half reliability estimation?

A
  1. Randomly assign items
  2. Assign odd-numbered items to one half and even-numbered items to the other
  3. Divide the test by content and difficulty
26
Q

Split-Half Reliability

Which formula estimates the reliability of a test when its length is changed?

A

Spearman-Brown Formula

27
Q

KR-20 and Coefficient Alpha Formula

What does inter-item consistency measure in a test?

A

The degree of correlation among all test items

28
Q

KR-20 and Coefficient Alpha Formula

What term refers to a test that measures a single trait, leading to higher inter-item consistency?

A

Homogeneity

29
Q

KR-20 and Coefficient Alpha Formula

What term describes a test that measures different factors, leading to lower inter-item consistency?

A

Heterogeneity

30
Q

KR-20 and Coefficient Alpha Formula

Which formula is used to determine the inter-item consistency of dichotomous items?

A

Kuder-Richardson Formula 20 (KR-20)

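
KR-20 can be sketched directly from its formula, (k / (k − 1)) · (1 − Σpq / σ²), where p is the proportion passing each item and σ² is the variance of total scores (the item data below are hypothetical):

```python
# Kuder-Richardson Formula 20 for dichotomously scored (0/1) items.
def kr20(scores):
    """scores: one list of 0/1 item scores per examinee."""
    k = len(scores[0])        # number of items
    n = len(scores)           # number of examinees
    # proportion answering each item correctly
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    totals = [sum(row) for row in scores]
    mean = sum(totals) / n
    var_total = sum((t - mean) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var_total)

scores = [
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
r_kr20 = kr20(scores)
```
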
31
Q

KR-20 and Coefficient Alpha Formula

Which formula is used to assess the internal consistency of tests with non-dichotomous items, such as Likert scales?

A

Coefficient Alpha

32
Q

KR-20 and Coefficient Alpha Formula

Which formula provides the mean of all possible split-half correlations?

A

Coefficient Alpha

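
Coefficient alpha generalizes KR-20 to non-dichotomous items: α = (k / (k − 1)) · (1 − Σσ²_item / σ²_total). A sketch with hypothetical Likert-type responses:

```python
# Cronbach's coefficient alpha from item variances and total-score variance.
def coefficient_alpha(scores):
    """scores: one list of item responses per respondent."""
    k = len(scores[0])
    n = len(scores)
    def var(xs):
        m = sum(xs) / n
        return sum((x - m) ** 2 for x in xs) / n
    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

likert = [            # hypothetical 1-5 ratings on four items
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [2, 1, 2, 2],
    [5, 5, 4, 4],
]
alpha = coefficient_alpha(likert)   # high: items covary strongly
```
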
33
Q

Reliability Estimates

What is another term for scorer reliability, which measures the consistency between different judges or raters?

A

Inter-Rater Reliability

34
Q

Measures of Inter-Scorer Reliability

What does inter-scorer reliability assess in a test?

A

The degree of agreement between two or more scorers, judges, or raters

35
Q

What statistical value is used to measure inter-scorer reliability?

A

Coefficient of Inter-Scorer Reliability

36
Q

Nature of Tests

What type of test items measure a single ability or trait and have a high degree of internal consistency?

A

Homogeneous Items

37
Q

Nature of Tests

What type of test items measure multiple abilities or traits, leading to lower internal consistency estimates?

A

Heterogeneous Items

38
Q

Nature of Tests

Which type of test items typically result in a high internal consistency reliability estimate?

A

Homogeneous Items

39
Q

Nature of Tests

For which type of test items is test-retest reliability a more appropriate measure than internal consistency?

A

Heterogeneous Items

40
Q

Nature of Tests

What type of characteristic is presumed to be ever-changing due to situational and cognitive experiences?

A

Dynamic Characteristic

41
Q

Nature of Tests

What type of characteristic is presumed to be relatively unchanging over time?

A

Static Characteristic

42
Q

Nature of Tests

A person’s mood, which changes depending on experiences and situations, is an example of what kind of characteristic?

A

Dynamic Characteristic

43
Q

Nature of Tests

A person’s fingerprint, which remains consistent throughout life, is an example of what kind of characteristic?

A

Static Characteristic

44
Q

Nature of Tests

What occurs when a test limits the variability of scores, potentially underestimating the true relationship between variables?

A

Restriction of Range
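
A hypothetical sketch of the phenomenon: correlating the same two variables over only a narrow band of scores (for example, admitted applicants only) yields a smaller coefficient than correlating them over the full range.

```python
# Restriction of range: trimming the sample to a narrow band of x
# shrinks the observed correlation between x and y.
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]          # noisy but strongly increasing
r_full = pearson_r(x, y)

# Keep only the middle of the x range:
pairs = [(a, b) for a, b in zip(x, y) if 3 <= a <= 6]
r_restricted = pearson_r([a for a, _ in pairs], [b for _, b in pairs])
# r_restricted is noticeably smaller than r_full
```
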

45
Q

Nature of Tests

What happens when a test expands the variability of scores, potentially exaggerating the true relationship between variables?

A

Inflation of Range

46
Q

Nature of Tests

A
  1. Homogeneity vs Heterogeneity of test items
  2. Dynamic vs Static characteristics
  3. Restriction or inflation of range
  4. Speed tests versus power tests
  5. Criterion-referenced tests
47
Q

it provides an estimate of the amount of error inherent in an observed score or measurement

A

Standard Error of Measurement (SEM)

Also known as the standard error of a score
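
A standard formula (not stated on the card) relates the SEM to the test's standard deviation and its reliability: SEM = SD · √(1 − r_xx). A minimal sketch with hypothetical values:

```python
# Standard error of measurement from the score SD and reliability.
def standard_error_of_measurement(sd, reliability):
    return sd * (1 - reliability) ** 0.5

# e.g. a scale with SD = 15 and reliability .91:
sem = standard_error_of_measurement(15, 0.91)   # 15 * sqrt(0.09) = 4.5
```
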

48
Q

is a judgment or estimate of how well a test measures what it purports to measure in a particular context

A

VALIDITY

49
Q

VALIDITY

Three Categories of Validity:

A
  1. Content Validity
  2. Criterion-Related Validity
  3. Construct Validity
50
Q

VALIDITY

Which type of validity is concerned with evaluating the actual content of a test to ensure it represents what it is supposed to measure?

A

Content Validity

51
Q

VALIDITY

Which validity category involves comparing test scores with other established measures to determine its accuracy?

A

Criterion-Related Validity

52
Q

VALIDITY

Which type of validity focuses on analyzing how test scores relate to a theoretical framework or concept?

A

Construct Validity

53
Q

VALIDITY

If a math test is checked to ensure it includes all necessary topics, which type of validity is being assessed?

A

Content Validity

54
Q

VALIDITY

If a new depression scale is compared to an existing clinical measure of depression, which type of validity is being examined?

A

Criterion-Related Validity

55
Q

VALIDITY

A psychologist analyzes whether a personality test aligns with established personality theories. What type of validity is this?

A

Construct Validity

56
Q

VALIDITY

What type of validity refers to how much a test appears to measure what it is supposed to, from the perspective of the test taker?

A

Face Validity

57
Q

VALIDITY

What is the term for a judgment about how relevant and appropriate test items seem to be?

A

Face Validity

58
Q

VALIDITY

Which type of validity can influence a test taker’s motivation and cooperation based on their perception of the test’s effectiveness?

A

Face Validity

59
Q

VALIDITY

What type of validity assesses how well a test represents the entire domain of behavior it is designed to measure?

A

Content Validity

60
Q

VALIDITY

What is a method developed by C. H. Lawshe to measure agreement among judges on the importance of test items?

A

Content Validity Ratio (CVR)
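
Lawshe's ratio can be sketched directly from its formula, CVR = (n_e − N/2) / (N/2), where n_e is the number of panelists rating the item "essential" and N is the panel size (the counts below are hypothetical):

```python
# Lawshe's content validity ratio for a single test item.
def content_validity_ratio(n_essential, n_panelists):
    half = n_panelists / 2
    return (n_essential - half) / half

# 8 of 10 judges rate the item essential: CVR = (8 - 5) / 5 = 0.6
cvr = content_validity_ratio(8, 10)
```

CVR ranges from −1 (no one rates the item essential) through 0 (exactly half do) to +1 (everyone does).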

61
Q

VALIDITY

What type of validity assesses how well test scores can predict an individual’s performance on a related measure of interest?

A

Criterion-Related Validity

62
Q

VALIDITY

What is the standard against which test scores are evaluated in criterion-related validity?

A

Criterion

63
Q

VALIDITY

What error occurs when a rater’s knowledge of test scores influences their ratings?

A

Criterion Contamination

64
Q

VALIDITY

What are the characteristics of a good criterion?

A

a. Relevant
b. Valid
c. Uncontaminated

65
Q

VALIDITY

Two Types of Criterion-Related Validity

A

a. Concurrent Validity
b. Predictive Validity

66
Q

VALIDITY

Which type of validity measures the relationship between test scores and a criterion measure obtained at the same time?

A

Concurrent Validity

67
Q

VALIDITY

Which type of validity assesses how well a test predicts future performance on a criterion measure?

A

Predictive Validity

68
Q

VALIDITY

What is the correlation coefficient that indicates the relationship between test scores and the criterion measure?

A

Validity Coefficient

69
Q

VALIDITY

What type of validity evaluates whether test scores accurately represent an abstract concept or theoretical idea?

A

Construct Validity

70
Q

VALIDITY

What is an informed, scientific concept developed to describe or explain a behavior, such as motivation or depression?

A

Construct

71
Q

VALIDITY

Which type of validity requires formulating hypotheses about how high and low scorers should behave?

A

Construct Validity

72
Q

Construct Validity

Evidence of Construct Validity

A
  • Evidence of homogeneity
  • Evidence of changes with age
  • Evidence of pretest–posttest changes
  • Evidence from distinct groups
  • Convergent evidence
  • Discriminant evidence
73
Q

Evidence of Construct Validity

What type of evidence supports construct validity by showing that a test measures a single concept?

A

Evidence of Homogeneity

74
Q

Evidence of Construct Validity

What type of evidence supports construct validity when test scores change with age as expected?

A

Evidence of Changes with Age

75
Q

Evidence of Construct Validity

Which type of evidence is based on changes in test scores after an intervention or over time?

A

Evidence of Pretest–Posttest Changes

76
Q

Evidence of Construct Validity

What type of construct validity evidence shows that test scores vary as expected among distinct groups?

A

Evidence from Distinct Groups

77
Q

Evidence of Construct Validity

Which type of evidence is shown when test scores correlate well with other measures of the same construct?

A

Convergent Evidence

78
Q

Evidence of Construct Validity

What type of evidence demonstrates that a test does not correlate with measures of unrelated constructs?

A

Discriminant Evidence

79
Q

TEST BIAS

Different Kinds of Test Bias

A
  • Rating Error
  • Halo Effect
  • Horn Effect
  • Contrast Error
  • Recency Bias
80
Q

TEST BIAS

3 TYPES OF RATING ERROR

A

a. Leniency error or generosity error
b. Severity error
c. Central tendency error

81
Q

RATING ERROR

What type of rating error occurs when a rater is overly forgiving in scoring, marking, or grading?

A

Leniency Error

(Generosity Error)

82
Q

RATING ERROR

A teacher gives almost all students high grades, even if some did poorly on their tests. What kind of rating error is this?

A

Leniency Error

(Generosity Error)

83
Q

RATING ERROR

What type of rating error occurs when a rater is overly harsh in scoring?

A

Severity Error

84
Q

RATING ERROR

A supervisor consistently gives employees low performance scores, even if they meet expectations. What kind of rating error is this?

A

Severity Error

85
Q

RATING ERROR

What type of rating error happens when a rater avoids extreme ratings and tends to score in the middle of the rating scale?

A

Central Tendency Error

86
Q

RATING ERROR

A teacher grades all students between 80-85, even though some deserve a 95 and others a 70. What kind of rating error is this?

A

Central Tendency Error

87
Q

TEST BIAS

What is the tendency to give a higher rating than deserved due to the failure to distinguish between different aspects of a person’s behavior?

A

Halo Effect

88
Q

TEST BIAS

A manager gives an employee excellent ratings in all areas because they have a friendly personality, even though their actual work performance is average. What kind of bias is this?

A

Halo Effect

89
Q

TEST BIAS

What bias occurs when one negative aspect of performance influences all other ratings, resulting in an overall lower score?

A

Horn Effect

90
Q

TEST BIAS

A teacher gives a student low marks in all subjects because the student misbehaved in class, even though they perform well academically. What kind of bias is this?

A

Horn Effect

91
Q

TEST BIAS

What type of rating error occurs when raters compare individuals to each other instead of evaluating them against performance standards?

A

Contrast Error

92
Q

TEST BIAS

A recruiter rates an average candidate poorly because they interviewed right after an exceptional candidate. What type of error is this?

A

Contrast Error

93
Q

TEST BIAS

What bias occurs when a leader bases their evaluation on an employee’s most recent performance rather than their overall performance?

A

Recency Bias

94
Q

TEST BIAS

A manager rates an employee poorly because they made a mistake last week, even though they performed well throughout the year. What kind of bias is this?

A

Recency Bias