Module 2: Reliability Flashcards
Reliability
+ dependability or consistency of the instrument or scores obtained by the same person when re-examined with the same test on different occasions, or with different sets of equivalent items
+ Free from errors
+ Minimizing error
+ True score cannot be found
If tests are reliable, are they automatically reliable in all contexts?
No. Test may be reliable in one context, but unreliable in another
How can reliability be computed?
Estimate the range of possible random fluctuations that can be expected in an individual’s score
How many items should there be to have higher reliability?
The higher/greater the number of items, the higher the reliability will be.
What kind of sample should be used to obtain an observed score?
Use only a representative sample to obtain an observed score
Reliability Coefficient
index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance
Classical Test Theory (True Score Theory)
score on an ability test is presumed to reflect not only the testtaker’s true score on the ability being measured but also error
Error
+ refers to the component of the observed test score that does not have to do with the testtaker’s ability
+ Errors of measurement are random
What is the formula of the classical test theory?
X = T + E
X - observed score
T - true score
E - error
How can the true score be computed?
When you average all the observed scores obtained over a period of time, then the result would be closest to the true score
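A minimal sketch of this idea in code, using a hypothetical true score and normally distributed random error; averaging many observed scores cancels the error and approaches the true score:
```python
# Classical test theory sketch: X = T + E, with hypothetical values.
import random

random.seed(0)
true_score = 100                                                     # T: the stable level
observed = [true_score + random.gauss(0, 5) for _ in range(1000)]    # X = T + E

# Averaging many observed scores cancels random error,
# so the mean approaches the true score.
print(sum(observed) / len(observed))  # ~100
```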
What is a factor that contributes to consistency?
stable attributes
What are factors that contribute to inconsistency?
characteristics of the individual, test, or situation, which have nothing to do with the attribute being measured, but still affect the scores
What are the goals of reliability?
- To estimate errors
- Devise techniques to improve testing and reduce errors
Variance
useful in describing sources of test score variability
What are the two types of variance?
- True Variance
- Error Variance
True Variance
variance from true differences
Error Variance
variance from irrelevant random sources
Measurement Error
+ all of the factors associated with the process of measuring some variable, other than the variable being measured
+ difference between the observed score and the true score
Positive Variance
can increase one’s score
Negative Variance
decrease one’s score
What are the sources of error variance?
- Item Sampling/Content Sampling
- Test Administration
- Test Scoring and Interpretation
Item Sampling/Content Sampling
+ refers to variation among items within a test as well as to variation among items between tests
+ the extent to which a testtaker’s score is affected by the content sampled on a test, and by the way the content is sampled, is a source of error variance
Test Administration
testtaker’s motivation or attention, environment, etc.
Test Scoring and Interpretation
scorers and scoring systems are potential sources of error variance; tests may employ objective-type items amenable to computer scoring of well-documented reliability
Random Error
source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in measurement process (e.g., noise, temperature, weather)
Systematic Error
+ source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured
+ has consistent effect on the true score
+ SD does not change, the mean does
What is the relationship between reliability and variance?
+ Reliability refers to the proportion of total variance attributed to true variance
+ The greater the proportion of the total variance attributed to true variance, the more reliable the test
What can error variance do to a test score?
Error variance may increase or decrease a test score by varying amounts; consequently, the consistency of the test score, and thus the reliability, can be affected
True Score Formula
T′ = Rxx (X − X̄) + X̄
wherein
T′ - estimated true score
Rxx - reliability coefficient
X - obtained score
X̄ - mean score
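A hypothetical worked example of the estimated true score formula, with assumed reliability, obtained score, and mean:
```python
# Estimated true score: T' = Rxx * (X - mean) + mean (hypothetical values).
r_xx = 0.90       # reliability coefficient
x = 120           # obtained score
mean = 100        # mean of the test

estimated_true = r_xx * (x - mean) + mean
print(estimated_true)  # 118.0: the obtained score is regressed toward the mean
```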
What is an error in test-retest reliability?
time sampling
Test-Retest Reliability
+ an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the test
What is test-retest reliability appropriate for?
appropriate when evaluating the reliability of a test that purports to measure an enduring and stable attribute such as a personality trait
How is test-restest reliability established?
established by comparing the scores obtained from two successive measurements of the same individuals and calculating a correlation between the two sets of scores
When does the reliability coefficient of test-retest reliability become insignificant?
the longer the time that passes, the greater the likelihood that the reliability coefficient will be insignificant
Carryover Effects
occur when the test-retest interval is short, wherein the second test is influenced by the first because testtakers remember or have practiced the previous test = inflated correlation/overestimation of reliability
Practice Effect
scores on the second session are higher due to their experience of the first session of testing
Test Sophistication
items are remembered by the testtakers, especially the difficult ones or items that were highly confusing
Test Wiseness
test-taking skill that might inflate testtakers’ scores beyond their true ability
When does test-retest reliability have lower correlation?
test-retest with a longer interval might be affected by other extraneous factors, thus resulting in a low correlation
What does low correlation in test-retest reliability mean?
lower correlation = poor reliability
Mortality
problem of absences in the second session (just remove the first-session tests of those who were absent)
What does test-retest reliability measure?
coefficient of stability
What are the statistical tools that should be used for test-retest reliability?
Pearson R, Spearman Rho
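A minimal sketch of a test-retest estimate, assuming hypothetical score pairs: correlate the two administrations with Pearson r (here via scipy’s pearsonr):
```python
# Test-retest reliability: correlate scores from two administrations
# of the same test (hypothetical data).
from scipy.stats import pearsonr

first_admin  = [12, 15, 9, 20, 18, 14, 11, 17]
second_admin = [13, 14, 10, 19, 17, 15, 10, 18]

r, p = pearsonr(first_admin, second_admin)
print(f"test-retest reliability (coefficient of stability): {r:.2f}")
```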
What are the errors in Parallel Forms/Alternate Forms Reliability?
Item sampling (immediate administration); item sampling plus changes over time (delayed administration)
Parallel Forms/Alternate Forms Reliability
+ established when at least two different versions of the test yield almost the same scores
+ has the most universal applicability
+ true scores must be the same for two tests
+ means and the variances of the observed scores must be equal for the two forms
Parallel Forms
for each form of the test, the means and the error variances are EQUAL; same items, different positioning/numbering
Alternate Forms
simply a different version of a test that has been constructed so as to be parallel
What is required of parallel forms/alternate forms reliability?
The test should contain the same number of items and the items should be expressed in the same form and should cover the same type of content; range and difficulty must also be equal
What should be done if there is a test leakage during parallel/alternate forms reliability?
If there is test leakage, use the form that has not been widely administered.
Counterbalancing
technique to avoid carryover effects for parallel forms, by using different sequence for groups (e.g. G1 - listen to song before counseling, G2 - counseling first, before listening to the song)
When can the two different tests for parallel forms/alternate forms reliability be administered?
It can be administered on the same day or at a different time.
What is the most rigorous and burdensome form of reliability?
Parallel forms/alternate forms, because test developers have to create two forms of the test.
What is the main problem for parallel form/alternate form reliability?
There is a difference between the two tests
What are the factors that may affect parallel form/alternate form reliability test scores?
It may be affected by motivation, fatigue, or intervening events.
What are the statistical tools for parallel form/alternate form reliability?
Pearson R or Spearman Rho
What is Internal Consistency also known as?
Inter-Item Reliability
What is an error of Internal Consistency?
Item Sampling Homogeneity
Internal Consistency (Inter-Item Reliability)
+ used when tests are administered once
+ consistency among items within the test
+ measures the internal consistency of the test which is the degree to which each item measures the same construct
+ measurement for unstable traits
When can a test be said to have good internal consistency?
If all items measure the same construct, then the test has good internal consistency
What is internal consistency most useful for?
useful in assessing Homogeneity
Homogeneity
if a test contains items that measure a single trait (unifactorial)
Heterogeneity
degree to which a test measures different factors (more than one factor/trait)
When will a test have higher inter-item consistency?
more homogenous items = higher inter-item consistency
What are the different statistical tools that may be used for computing Internal Consistency?
+ KR-20
+ KR-21
+ Cronbach’s Coefficient Alpha
KR-20
used for inter-item consistency of dichotomous items (intelligence tests, personality tests with yes or no options, multiple choice), unequal variances, dichotomous scored
KR-21
used if all the items have the same degree of difficulty (speed tests), equal variances, dichotomous scored
Cronbach’s Coefficient Alpha
used when two halves of the test have unequal variances and on tests containing non-dichotomous items; unequal variances
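A minimal sketch of Cronbach’s coefficient alpha computed from a hypothetical item-score matrix; with dichotomous (0/1) items like these, the same computation corresponds to KR-20:
```python
# Cronbach's alpha from an item-by-person score matrix (hypothetical data).
import numpy as np

scores = np.array([          # rows = testtakers, columns = items
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
])

k = scores.shape[1]                              # number of items
item_variances = scores.var(axis=0, ddof=1)      # variance of each item
total_variance = scores.sum(axis=1).var(ddof=1)  # variance of total scores

alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha: {alpha:.2f}")
```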
Average Proportional Distance
measure used to evaluate the internal consistency of a test that focuses on the degree of difference that exists between item scores
What is an error of Split-Half Reliability?
Item Sampling; Nature of Split
Split-Half Reliability
obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered ONCE
What is split-half reliability useful for?
it is useful when it is impractical or undesirable to assess reliability with two tests or to administer a test twice
How can split-half reliability be done?
One cannot just divide the items in the middle because it might spuriously raise or lower the reliability coefficient, so just randomly assign items or assign odd-numbered items to one half and even-numbered items to the other half
What are the different statistical formulas that may be used for computing Split-Half Reliability?
+ Spearman-Brown Formula
+ Spearman-Brown Prophecy Formula
+ Rulon’s Formula
Spearman-Brown Formula
allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test, as if each half had been the length of the whole test; assumes the halves have equal variances
Spearman-Brown Prophecy Formula
estimates how many more items are needed in order to achieve the target reliability
How is Spearman-Brown Prophecy Formula computed?
multiply the estimate by the original number of items
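A minimal sketch of both Spearman-Brown steps, using assumed values: first correct a half-test correlation up to full-test length, then estimate the lengthening factor needed to reach a target reliability and multiply it by the original number of items:
```python
# Spearman-Brown correction and prophecy formula (hypothetical values).
half_r = 0.70                          # correlation between the two halves
full_r = (2 * half_r) / (1 + half_r)   # reliability of the full-length test
print(f"Spearman-Brown corrected reliability: {full_r:.2f}")

# Prophecy formula: how much longer must the test be for a target reliability?
target = 0.90
n = (target * (1 - full_r)) / (full_r * (1 - target))   # lengthening factor
original_items = 20
print(f"items needed: {round(n * original_items)}")      # multiply by original length
```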
Rulon’s Formula
counterpart of the Spearman-Brown formula; based on the ratio of the variance of the differences between the odd and even splits to the variance of the total (combined odd-even) score
What should the developer do if the split-half reliability is relatively low?
If the reliability of the original test is relatively low, the developer could create new items, clarify the test instructions, or simplify the scoring rules
What are the statistical tools that may be used to compute split-half reliability?
Pearson R or Spearman Rho
What is the error of Inter-Scorer Reliability?
Scorer Differences
Inter-Scorer Reliability
+ the degree of agreement or consistency between two or more scorers with regard to a particular measure
+ evaluated by calculating the percentage of times that two individuals assign the same scores to the performance of the examinees
Variation of Inter-Scorer Reliability
a variation is to have two different examiners test the same client using the same test and then to determine how close their scores or ratings of the person are
What is Inter-Scorer Reliability most used for?
used for coding nonverbal behavior
What are statistical measures that may be used for Inter-Scorer Reliability?
+ Fleiss Kappa
+ Cohen’s Kappa
+ Krippendorff’s Alpha
Fleiss Kappa
determines the level of agreement between TWO or MORE raters when the method of assessment is measured on a CATEGORICAL SCALE
Cohen’s Kappa
two raters only
Krippendorff’s Alpha
two or more raters; based on observed disagreement corrected for disagreement expected by chance
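A minimal sketch of a two-rater agreement check with Cohen’s kappa, using hypothetical categorical ratings and scikit-learn’s cohen_kappa_score:
```python
# Inter-scorer agreement for two raters (hypothetical categorical ratings).
from sklearn.metrics import cohen_kappa_score

rater_1 = ["anxious", "calm", "calm", "anxious", "calm", "anxious"]
rater_2 = ["anxious", "calm", "anxious", "anxious", "calm", "anxious"]

kappa = cohen_kappa_score(rater_1, rater_2)
print(f"Cohen's kappa: {kappa:.2f}")   # agreement corrected for chance
```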
Dynamic
trait, state, or ability presumed to be ever-changing as a function of situational and cognitive experience
Static
barely changing or relatively unchanging
Restriction of Range or Restriction of Variance
if the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be lower
Power Tests
when the time limit is long enough to allow test takers to attempt all items
Speed Tests
generally contains items of a uniform level of difficulty, administered with a time limit
What kind of reliability should be used for speed tests?
Reliability should be based on performance from two independent testing periods, using test-retest, alternate-forms, or split-half reliability
Criterion-Referenced Tests
designed to provide an indication of where a testtaker stands with respect to some variable or criterion
What will happen to the traditional measure of reliability when individual differences decrease?
As individual differences decrease, a traditional measure of reliability would also decrease, regardless of the stability of individual performance
Classical Test Theory
+ states that everyone has a “true score” on a test
+ made up of “true score” and random error
True Score
genuinely reflects an individual’s ability level as measured by a particular test
Domain Sampling Theory
+ estimates the extent to which specific sources of variation under defined conditions are contributing to the test scores
+ considers problem created by using a limited number of items to represent a larger and more complicated construct
+ test reliability is conceived of as an objective measure of how precisely the test score assesses the domain from which the test draws a sample
+ Systematic Error
Generalizability Theory
Domain Sampling Theory
+ based on the idea that a person’s test scores vary from testing to testing because of the variables in the testing situations
+ according to generalizability theory, given the exact same conditions of all the facets in the universe, the exact same test score should be obtained (universe score)
Universe
the test situation
Facet
number of items in the test, amount of review, and the purpose of test administration
Decision Study
developers examine the usefulness of test scores in helping the test user make decisions
Item Response Theory
+ the probability that a person with X ability will be able to perform at a level of Y in a test
+ a system of assumptions about measurement and the extent to which each item measures the trait
What is the focus of Item Response Theory?
item difficulty
What is Item Response Theory also known as?
Latent-Trait Theory
Computer using IRT
+ The computer is used to focus on the range of item difficulty that helps assess an individual’s ability level
+ If you got several easy items correct, the computer will then move to more difficult items
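A minimal sketch of a one-parameter (Rasch) item response function, with hypothetical ability (theta) and difficulty (b) values; the Rasch model is just one common IRT model, used here only as an illustration:
```python
# Probability of a correct response under the Rasch (1PL) model.
import math

def p_correct(theta, b):
    """P(correct) for a person of ability theta on an item of difficulty b."""
    return 1 / (1 + math.exp(-(theta - b)))

print(p_correct(theta=1.0, b=0.0))   # able person, easy item -> high probability
print(p_correct(theta=0.0, b=1.5))   # harder item -> lower probability
```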
Difficulty
attribute of not being easily accomplished, solved, or comprehended
Discrimination
degree to which an item differentiates among people with higher or lower levels of the trait, ability, etc.
Dichotomous
can be answered with only one of two alternative responses
Polytomous
3 or more alternative responses
Standard Error of Measurement
+ provide a measure of the precision of an observed test score
+ index of the amount of inconsistency, or the amount of expected error, in an individual’s score
+ allows one to quantify the extent to which a test provides accurate scores
+ used to estimate or infer the extent to which an observed score deviates from a true score
+ Standard Error of a Score
What is the basic measure of error (SEM)?
Standard deviation of error
What does the SEM provide?
provides an estimate of the amount of error inherent in an observed score or measurement
What does it mean when a test has lower SEM?
Higher reliability
What is SEM used for?
Used to estimate or infer the extent to which an observed score deviates from a true score
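A minimal sketch of the usual SEM formula, SEM = SD × √(1 − r), with assumed test statistics:
```python
# Standard error of measurement from hypothetical test statistics.
import math

sd = 15            # standard deviation of the test scores
reliability = 0.91

sem = sd * math.sqrt(1 - reliability)
print(f"SEM: {sem:.2f}")   # lower SEM = higher reliability
```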
Confidence Interval
Standard Error of Measurement
+ a range or band of test scores that is likely to contain true scores
+ tells us the likelihood that the true score falls within the specified range at the given confidence level
What does it mean when the range is larger?
The larger the range, the higher the confidence
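A minimal sketch of a 95% confidence interval built around a hypothetical observed score using the SEM:
```python
# Confidence interval around an observed score (hypothetical values).
import math

observed = 110
sem = 15 * math.sqrt(1 - 0.91)   # SEM as in the sketch above (~4.5)
z = 1.96                          # z-value for 95% confidence

lower, upper = observed - z * sem, observed + z * sem
print(f"95% CI: {lower:.1f} to {upper:.1f}")  # wider range = higher confidence
```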
Standard Error of the Difference
Standard Error of Measurement
can aid a test user in determining how large a difference should be before it is considered statistically significant
Standard Error of Estimate
Standard Error of Measurement
refers to the standard error of the difference between the predicted and observed values
What can one do if the reliability is low?
If the reliability is low, you can increase the number of items or use factor analysis and item analysis to increase internal consistency
Reliability Estimates
nature of the test will often determine the reliability metric
Types of Reliability Estimates
a) Homogenous (unifactor) or heterogeneous (multifactor)
b) Dynamic (unstable) or static (stable)
c) Range of scores is restricted or not
d) Speed Test or Power Test
e) Criterion or non-Criterion
Test Sensitivity
detects true positive
Test Specificity
detects true negative
Base Rate
proportion of the population that actually possess the characteristic of interest
Selection ratio
no. of hired candidates compared to the no. of applicants
Formula for Selection Ratio
number of hired candidates / total number of candidates
/ = divided by
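A tiny worked example with hypothetical counts:
```python
# Selection ratio: hired candidates divided by total applicants.
hired = 12
applicants = 80

selection_ratio = hired / applicants
print(f"selection ratio: {selection_ratio:.2f}")   # 0.15
```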
Four Possible Hit and Miss Outcomes
- True Positives (Sensitivity)
- True Negatives (Specificity)
- False Positive (Type 1)
- False Negative (Type 2)
True Positives (Sensitivity)
predict success that does occur
True Negatives (Specificity)
predict failure that does occur
False Positive (Type 1)
predicted success that does not occur
False Negative (Type 2)
predicted failure but succeed
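A minimal sketch computing sensitivity and specificity from hypothetical hit-and-miss counts:
```python
# Test sensitivity and specificity from hypothetical outcome counts.
true_pos, false_neg = 40, 10     # predicted success: occurred / did not occur
true_neg, false_pos = 35, 15     # predicted failure: occurred / did not occur

sensitivity = true_pos / (true_pos + false_neg)   # detects true positives
specificity = true_neg / (true_neg + false_pos)   # detects true negatives
print(f"sensitivity: {sensitivity:.2f}, specificity: {specificity:.2f}")
```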
Quartile 1
scored poorly, performed well
Quartile 2
scored well, performed well
Quartile 3
scored well, performed poorly
Quartile 4
scored poorly, performed poorly