Chapter 5: Reliability Flashcards
this term refers to the consistency of a measurement; it indicates whether a test yields stable and consistent results over time and across different contexts
reliability
what is reliability determined by?
Reliability is determined by the proportion of total variance in test scores that can be attributed to true variance (i.e., actual differences in the ability being measured). The higher this proportion, the more reliable the test.
this type of variance represents actual differences in ability.
For example, differences in math skills among friends taking a test reflect ______ ______
true variance
this term refers to an index that quantifies reliability, expressed as the ratio of true score variance to total variance.
reliability coefficient
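As a worked equation (standard classical-test-theory notation; the symbols are a gloss, not from the cards):

```latex
r_{xx} = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{total}}}
       = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}
```

For instance, a test whose total score variance is 50, of which 40 is true variance, has r_xx = 40/50 = .80.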
this type of variance represents variability due to irrelevant or random factors, such as distractions or fatigue, that affect scores even when the true ability is constant
Error Variance
this term encompasses all factors affecting the measurement process that are not related to the variable being assessed
Measurement Error
this type of measurement error is caused by unpredictable fluctuations, resulting in inconsistencies without a discernible pattern
This type of error can lead to score variability without biasing the results systematically.
Random Error
this type of measurement error refers to a consistent error that may affect scores in a predictable manner, making it possible to identify and correct
This type of error does not compromise the consistency of scores
Systematic Error
this source of error variance is seen during test construction and arises from differences among test items, both within a single test and across different tests
Item Sampling/Content Sampling
what are the three main sources of error variance?
- Test Construction
- Test Administration
- Test Scoring and Interpretation
this source of error variance refers to the conditions under which a test is administered, which can impact performance
Test Environment
this source of error variance refers to subjectivity in scoring, technical issues, or glitches that can lead to inconsistent results
Scorers and Scoring Systems
this source of error variance refers to factors such as emotional distress, physical discomfort, lack of sleep, or the influence of drugs and medication that can affect the test-taker’s performance and focus
Test-Taker Variables
this source of error variance refers to the physical appearance and demeanor of the examiner, their presence or absence, and their level of professionalism, all of which can influence the test-taker’s experience
Examiner-Related Variables
this source of error variance arises when the sample does not accurately represent the broader population.
Discrepancies in demographics, political affiliation, or other relevant factors can lead to biased results.
Sampling Error
this type of reliability estimate measures the consistency of results between different versions of a test designed to assess the same construct
it assesses how well two different forms of a test yield similar scores when administered to the same individuals.
Parallel-Forms and Alternate-Forms Reliability Estimates
this source of error refers to issues such as ambiguous wording in questionnaires or biased items that can skew results, favoring one response or candidate over another
Methodological Error
this type of reliability estimate is defined as a method for estimating the reliability of a measuring instrument by administering the same test to the same group at two different points in time
Test-Retest Reliability Estimate
when do we apply test-retest reliability estimate?
it is best used for measuring stable traits (e.g., personality) rather than fluctuating characteristics
this term refers to different versions of a test that are designed to be equivalent but may not meet the strict criteria of parallel forms
Alternate Forms
this term refers to two forms of a test that are statistically equivalent in terms of means and variances
Parallel Forms
what is the index for parallel and alternate-forms reliability?
coefficient of equivalence
what is the method for test-retest reliability?
correlate scores from the same individuals across two test administrations.
what is the index for test-retest reliability?
coefficient of stability
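A minimal sketch of that method in Python (the scores and variable names are invented for illustration):

```python
import numpy as np

# Invented scores for the same ten test-takers at two administrations.
time_1 = np.array([12, 15, 11, 18, 14, 16, 10, 17, 13, 15])
time_2 = np.array([13, 14, 12, 19, 13, 17, 11, 16, 14, 16])

# The coefficient of stability is the Pearson correlation
# between the first and second administrations.
coefficient_of_stability = np.corrcoef(time_1, time_2)[0, 1]
print(f"coefficient of stability = {coefficient_of_stability:.2f}")
```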
what are the reliability estimates categorized as external consistency estimates?
test-retest reliability
parallel-forms and alternate-forms reliability estimates
what are the reliability estimates categorized as internal consistency estimates?
internal consistency reliability
split-half reliability
this type of reliability can be assessed without creating alternate forms or re-administering tests; it involves evaluating the consistency of test items
Internal Consistency Reliability
this type of reliability estimate is obtained by correlating scores from two equivalent halves of a single test
Split-Half Reliability
what are the acceptable splitting methods for split-half reliability?
- Randomly assigning items to halves
- Using odd and even item assignments (odd-even reliability)
- Dividing by content to ensure both halves measure equivalent constructs
what are the steps in conducting split-half reliability?
- Divide the test into two equivalent halves.
- Calculate the Pearson correlation between the two halves.
- Adjust the reliability upward using the Spearman-Brown formula (sketched in code below).
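A minimal Python sketch of those three steps, using an odd-even split; the data and array names are invented for illustration:

```python
import numpy as np

# Invented dichotomous item scores: rows = test-takers, columns = items.
scores = np.array([
    [1, 0, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0, 1, 0],
])

# Step 1: split the test into halves with an odd-even item assignment.
half_a = scores[:, 0::2].sum(axis=1)   # items 1, 3, 5, 7
half_b = scores[:, 1::2].sum(axis=1)   # items 2, 4, 6, 8

# Step 2: Pearson correlation between the two half-test scores.
r_half = np.corrcoef(half_a, half_b)[0, 1]

# Step 3: Spearman-Brown correction estimates full-length reliability:
#   r_sb = 2 * r_half / (1 + r_half)
r_sb = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.2f}, Spearman-Brown adjusted = {r_sb:.2f}")
```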
what are the methods for estimating internal consistency reliability?
- KR-20
- KR-21
- Cronbach’s Alpha
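As worked equations (standard forms; k = number of items, p_i = proportion of test-takers passing item i, q_i = 1 − p_i, σ² = variance of total test scores, M = mean total score, σ_i² = variance of item i):

```latex
\text{KR-20} = \frac{k}{k-1}\left(1 - \frac{\sum_{i} p_i q_i}{\sigma^2}\right), \qquad
\text{KR-21} = \frac{k}{k-1}\left(1 - \frac{M(k - M)}{k\,\sigma^2}\right), \qquad
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i} \sigma_i^2}{\sigma^2}\right)
```

KR-21 is a shortcut that additionally assumes all items are of equal difficulty; Cronbach's alpha generalizes KR-20 to items that are not scored dichotomously.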
when is KR-20 appropriate?
when it is a true dichotomous test (true or false)
when is KR-21 appropriate?
when it is an artificial dichotomy (multiple choice, binary scoring)
this reliability estimate measures the degree of difference between item scores rather than their similarity; it is calculated from the absolute differences between item scores and is less affected by the number of items on a test
Average Proportional Distance
when is Cronbach’s alpha appropriate?
when items have more than two possible responses (e.g., Likert-scale items)
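A minimal sketch of the alpha computation in Python (the function name and the Likert responses are invented for illustration):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix."""
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented 5-point Likert responses: rows = respondents, columns = items.
responses = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
```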
this reliability estimate measures the correlation among all items on a scale from a single test administration, assessing the homogeneity of the test
Inter-Item consistency
when is inter-scorer reliability used?
Frequently used in coding nonverbal behavior.
For example, a researcher may create a checklist of behaviors (like looking downward or moving slowly) to quantify aspects of nonverbal cues indicating depressed mood.
this term refers to the degree of agreement or consistency between two or more scorers regarding a particular measure
Inter-Scorer Reliability
how is inter-scorer reliability calculated?
it is calculated using a correlation coefficient, referred to as the coefficient of inter-scorer reliability
what should the reliability coefficient be for a test to be considered highly reliable?
Considered excellent (grade A); crucial for high-stakes decisions.
0.90s
what should the reliability coefficient be for a test to be considered moderately reliable?
0.80s
what should the reliability coefficient be for a test to be considered to have low reliability?
Weak, indicating potential issues with the test’s effectiveness.
0.65-0.70s
what should the reliability coefficient be for a test to be considered unacceptable?
below 0.50
when is a test considered homogeneous?
A test is considered homogeneous if it is functionally uniform, measuring a single factor (e.g., one ability or trait).
these tests are designed to indicate how a test-taker performs relative to a specific criterion or standard (e.g., educational or vocational objectives)
they focus on measuring whether test-takers meet predetermined criteria rather than comparing their scores to those of others
criterion-referenced tests
when is a test considered heterogeneous?
A test is heterogeneous if it measures multiple factors or traits.
In such cases, internal consistency estimates may be lower, whereas test-retest reliability might provide a more appropriate measure of reliability.
what reliability estimates are appropriate for static characteristics?
test-retest or alternate-forms methods
these tests consist of items of uniform (typically low) difficulty, such that with generous time limits all test-takers could complete every item correctly; in practice, the time limit is set so that few, if any, test-takers can finish
Speed Tests
these tests are administered with a time limit long enough for test-takers to attempt all items
they contain difficult items, with the expectation that no test-taker can achieve a perfect score
Power Tests
this theory, also known as the true score model, is the most widely used model of measurement in psychology due to its simplicity relative to more complex models
Classical Test Theory
this term represents the value that genuinely reflects an individual’s ability or trait level as measured by a particular test.
This value is highly dependent on the specific test used
True Score
this theory posits that a test’s reliability is determined by how accurately the test score reflects the domain from which it samples
Domain Sampling Theory
this theory suggests that test scores can vary across different testing situations due to various situational factors
Generalizability Theory
this term refers to the complete range of items that could measure a specific behavior, viewed as a hypothetical construct
Domain of Behavior
this term assesses how well scores from a specific test can be generalized across different contexts.
Coefficients of generalizability represent the influence of particular facets on test scores.
Generalizability Study
this term evaluates the utility of test scores in assisting users in making informed decisions
Decision Study
this theory is an alternative to CTT that models the probability of an individual with a certain level of ability performing at a specific level
Often referred to as latent-trait theory because it measures constructs that are not directly observable (latent).
Item Response Theory (IRT)
what does discrimination mean for IRT?
discrimination measures the extent to which an item can differentiate between individuals with higher or lower levels of the trait or ability being assessed.
what are the key concepts in IRT?
Difficulty and Discrimination
what does difficulty mean for IRT?
The attribute of an item indicating how challenging it is to accomplish, solve, or comprehend.
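The cards do not name a specific IRT model, but a common formalization of these two concepts is the two-parameter logistic (2PL) model, in which the probability of a correct response depends on ability θ, item difficulty b, and item discrimination a:

```latex
P(X = 1 \mid \theta) = \frac{1}{1 + e^{-a(\theta - b)}}
```

Here b shifts the curve along the ability scale (harder items require higher θ for the same probability of success), and a controls its steepness (how sharply the item separates lower from higher ability levels).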
what are the two types of test items?
- Dichotomous Test Items
- Polytomous Test Items
this type of test item has two possible responses, such as true/false or yes/no
Dichotomous Test Items
this type of test item has three or more possible responses, where only one response is correct or aligned with the targeted trait or construct
Polytomous Test Items
this term refers to the range or band of test scores that is likely to contain the true score
confidence interval
this statistical tool is used to estimate or infer the extent to which an observed score deviates from a true score
standard error of measurement
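As a worked equation (standard form; σ is the standard deviation of the test scores and r_xx the test's reliability coefficient):

```latex
\mathrm{SEM} = \sigma \sqrt{1 - r_{xx}}
```

For example (numbers invented), a test with σ = 10 and r_xx = .84 has SEM = 10 × √.16 = 4, so an observed score of 50 yields an approximate 95% confidence interval of 50 ± 1.96 × 4, roughly 42 to 58. This is the kind of score band described in the confidence-interval card above.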
this statistical measure can aid a test user in determining how large a difference between two scores should be before it is considered statistically significant
standard error of the difference
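As a worked equation (standard form; SEM₁ and SEM₂ are the standard errors of measurement of the two scores being compared, and the second equality assumes both scores are on scales with the same standard deviation σ and reliabilities r₁ and r₂):

```latex
\sigma_{\text{diff}} = \sqrt{\mathrm{SEM}_1^2 + \mathrm{SEM}_2^2}
                     = \sigma \sqrt{2 - r_1 - r_2}
```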
how does restricted variance influence correlation coefficients?
it leads to lower correlation coefficients since the diversity of scores is limited
how does inflated variance influence correlation coefficients?
it results in higher correlation coefficients due to a broader spread of scores.