Lecture 2 Validity & Reliability (Catherine) Flashcards

Question

What are acceptable Reliabilities for Clinical and Research situations?

Answer 1

Acceptable reliabilities for Clinical settings is: r > 0.85 acceptable Acceptable reliabilities for Research settings is: r > ~0.7 acceptable

Answer 2

Internal Consistency of WAIS: r = 0.887 Internal Consistency of MMPI: r = 0.84 Test-Retest Reliabilities of the WAIS: r = 0.82 Test-Retest Reliabilities of the MMPI: r = 0.74 NB: WAIS test-retest is just outside acceptable limits the MMPI is suseptable to change over time as its a personality inventory used on clinical patients whoa re more likely to change over time

Answer 3

The Test Administrator would be assessing a tests Internal Consistency

Answer 4

- Test-Retest Reliability - Alternate Form Reliability - Inter-scorer Relaibility

Answer 5

- Is the test measuring state or trait? (trait is more enduring) - The range of possible responses (ideally 5-7 responses - 0 - 10 is not ideal as people tend to cluster around the middle) - Speeded tests - towards end of test test taker may not have had time to attempt a number of items, does not mean they would have been incorrect, did not get time to answer

Answer 6

- Quality of items (need to be clear, concise, homogeneous) - Ensure consistent testing conditions - Reduce Test-Retest time intervals - Longer assessments - Develop a robust scoring plan - Test items for reliability & adapt the measure - Ensure Validity

Answer 7

- Internal Validity - External Validity - Test Validity

Answer 8

Confidence in making causal statements about study outcomes

Answer 9

Confidence you can generalise results to people outside of the study

Answer 10

Confidence that what you are measuring truly represents what you think you are measuring

Answer 11

1. Content Validity 2. Criterion-Related Validity 3. Construct Validity

Answer 12

1. Content Validity 2. Criterion-Related Validity 3. Construct Validity

Answer 13

1. Scrutinise test contents 2. Comparing Scores on this test to other tests 3. Perform an analysis of how scores on this test relate to scores on other tests and theories

Answer 14

Face Validity relates to whether the person being assessed believes the test appears to measure what is actually being measured

Answer 15

Construct Validity

Answer 16

- indirectly tests some aspect not perceived by the test-taker (e.g. MMPI asks about icecream as part of a personality assessment) - may result in negative consequences such as poor test taker attitude, or disgruntlement - Some tests have low face validity and others have high face validity

Answer 17

Content Validity Scrutinises the test's content

Answer 18

Content validity is concerned with how well does each item on the test measure what it intends to measure? - Tests should capture all aspects of the target behaviour - e.g. for HR test items should directly relate to the job role we are hiring for

Answer 19

-We use a Content Validity Ratio (CVR - Lawshe, 1975) -We ask N number of experts to rate each item to reflect performance on an item in terms of essential, useful, not necessary -We remove items based on % of people who state the item is not necessary If less than 5% say necessary/essential it should be removed

Answer 20

Criterion-Related Validity relates scores obtained on the current test to other test scores or other measures

Answer 21

Criterion-Related Validity is a judgement of how adequately a test score can be used to infer an individual's most probable standing on some measure of interest. The two varieties of Criterion-Related Validity are: *Concurrent Validity *Predictive Validity

Answer 22

Criterion-Related Validity is interested in how well the test item reflect an individual's actual score on the criterion of interest

Answer 23

Concurrent Criterion-Related Validity -The degree to which the score relates to the criterion measure at that time (measure a new test against a gold standard) Predictive Criterion-Related Validity -Degree to which the score relates to a criterion measure in the future (i.e. uses regression to predict a person's future reading ability)

Answer 24

Concurrent Validity is an index of the degree to which a test score is related to some criterion masuree obtained at the same time

Answer 25

``` Is the criterion: -Relevant -Valid & Reliable -Uncontaminated We need to ensure the test we are comparing is relevant, valid, reliable & uncontaminated, hence we use gold standard tests rather than just random other tests ```

Answer 26

By performing a correlation and comparing how well the outcome of the new test compares with the outcome of a well-known, reliable, well-validated test

Answer 27

Predictive Validity is an index of the degree to which a test score predicts some criterion measure (in the future)

Answer 28

By obtaining test scores now, and criterion measure in the future using multiple predictors. -considering the validity coefficient in the context of corresponding issues including -incremental validity and expectancy data NEEDS work - not clear

Answer 29

The Validity coefficient is a correlation that provides a measure of the relationship between test scores and scores on the criterion measure.

Answer 30

Uses more than one predictor - Additional predictors used in ascertaining criterion-related predictive validity should possess incremental validity. - That is the degree to which an additional predictor explains something about the criterion measure that is not explained already by predictors already in use.

Answer 31

Expectancy Data provides useful information to evaluate the criterion-related validity of a test. Using a score obtained from one test or measure expectancy tables illustrate the likelihood that the test-taker will score within some interval of scores (such as pass or fail).

Answer 32

An expectancy table can be created by a scatterplot according to An expectancy table shows the relationship between scores on e.g. high school exams and relationship to university grade

Answer 33

Construct Validity is concerned with how well inferences drawn from a test score relate to current theories or knowledge (i.e. constructs)

Answer 34

1. Evidence of Homogeneity 2. Evidence of Changes with Age 3. Evidence of Distinct Groups 4. Convergent Evidence**** 5. Discriminant Evidence**** 6. Factor Analysis

Answer 35

- How uniform the test is for measuring a single concept - Correlate sub-sections with the whole test score - Item analysis - How important is homogeneity?

Answer 36

Construct Validity executes a comprehensive analysis of how scores on the test: a. relate to other scores and measures b. can be understood within some theoretical framework for understanding the construct that the test was designed to measure

Answer 37

-Unclear directions -Ambiguity in question terminology -Inadequate time limits -Inappropriate level of difficulty -Poorly constructed test items Test items are inappropriate for planned test outcomes -Tests that are too short -Improper arrangement of items -Identifiable patterns of answers -Administration and Scoring -Nature of the Criterion -Bias -Fairness

Answer 38

- Test biased towards a certain population - Implies a systematic variation in results - Slope versus Intercept Bias - Rating error - overcome by ranking - Halo effect

Answer 39

A Criterion is the standard against which a test or a test score is evaluated. *The Criterion needs to be relevant, valid and uncontaminated.

Answer 40

* An adequate criterion is relevant, i.e. it is pertinent or applicable to the matter in hand. * Evidence should exist that supports the validity of the criterion * A criterion should be uncontaminated, so it should be based, at least in part, on predictor measures. If not, validation study cannot be taken seriously. There is no formal test for criterion contamination

Answer 41

- Age, Culture, Gender - Adjustment to scores - is this fair? - Psychometric techniques for reducing adverse impact of unfairness to some groups

Answer 42

The method developed by Lawshe gauged agreement among raters/judges regarding how essential a particular item is. i. e. is the skill or knowledge measured by this item: - essential - useful (but not essential) - unnecessary

Answer 43

If test scores are obtained about the same time that the criterion measures are obtained, measures of the relationship between the test scores & the criterion provide evidence of concurrent validity.

Answer 44

Measures of the relationship between test scores & a criterion measure obtained at some future time provide an indication of the predictive validity of the test, that is, how accurately scores on the test predict some criterion measure

Answer 45

Two types of statistical evidence is used: * the validity coefficient and * expectancy data

Answer 46

The Validity coefficient is a correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure. *The Pearson Correlation coefficient is typically used to determine the validity between the 2 measures.

Answer 47

restriction or inflation of range, a key issue being whether the range of scores employed is appropriate to the objective of the correlational analysis

Answer 48

Cronbach & Gleser cautioned against the establishment of such a rule, it simply should be high enough to result in the identification & differentiation of test-takers with regard to target ability.

Answer 49

Incremental validity is the degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use. Each measure used as a predictor should have criterion-related predictive validity, possess incremental validity & only be included if they demonstrate something not covered by existing predictors

Answer 50

Expectancy data provides information that can be used in evaluating the criterion-related validity of a test. *Expectancy tables illustrate the likelihood that the test-taker will score within some interval of scores on a criterion measure (e.g. pass/fail)

Answer 51

* Taylor-Russell (1939, 1973, 1974) - seven steps to an expectancy table * Naylor-Shine Tables (1965)

Answer 52

* 7 step procedure was provided * The table can assist in judging the utility of a test by determining the increase over current procedures Limitation: * The relationship between predictor and criterion must be linear * It is difficult to identify the cut off for successful vs unsuccessful using the table

Answer 53

* No need for linear relationship as uses average criterion scores to compare * Obtaining the difference between the means of the selected & unselected groups to derive an index of what the test is adding to already established procedures * Identifies the utility of a test by determining the increase in average score on some criterion measure

Answer 54

With both tables the validity coefficient used must be one obtained by Concurrent Validation procedures

Answer 55

Cronbach & Gleser's Psychological Tests & Personnel Decision (1957, 1965) 1. a classification of decision problems 2. various selection strategies ranging from single-stage to sequential analyses 3. a quantitative analysis of the relationship between test utility, the selection ratio, costing of testing, expected value of outcomes & 4. a recommendation that in some instances job requirements be tailored to the applicants abilities instead of the other way around

Answer 56

Construct Validity is a judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a variable called a construct. *Constructs are observable, presupposed or underlying traits that a test developer may invoke to describe test behaviour or criterion performance

Answer 57

High scorers and low scorers will behave as predicted by the theory NB Construct validity has been viewed as the unifying concept for all validity evidence

Answer 58

* The test simply does not measure the construct | * The theory is sound, but the statistical procedures or their execution was flawed

Answer 59

* The test is homogeneous, measuring a single construct * Test scores increase or decrease as a function of age, the passage of time, or an experimental manipulation as theoretically predicted * Test scores obtained after some event or time (i.e. post-test scores) differ from pre-test scores as theoretically predicted * Test scores obtained by people from distinct groups vary as predicted by the theory * Test scores correlate with scores on other tests in accordance with what would be predicted from a theory that covers the manifestation of the construct in question

Answer 60

Evidence for the construct validity of a particular test may converge from a number of sources, e.g. other tests designed to assess the same or similar construct. Thus, if scores on the test undergoing construct validation tend to correlate highly in the predicted direction with scores on older, more established, already validated tests, this is convergent evidence

Answer 61

A validation coefficient showing little (i.e. a statistically insignificant) relationship between test scores &/or other variables with which scores on the test being construct-validated should NOT theoretically be correlated this provides Discriminant Evidence or discriminant validity

Answer 62

Factor Analysis

Answer 63

* If more than half the panelists indicate the item is essential, it has at least some content validity. * Lawshe recommended that if the amount of agreement observed is more than 5% likely to occur by chance, then the item should be eliminated 1. Negative CVR: fewer than half the panelists indicate essential 2. zero CVR: exactly half the panelists indicate essential 3. Positive CVR: More than half, but not all the panelists indicate essential

Answer 64

That errors in measurement exist in all test. - these errors affect both reliability and validity - the test developers goal is to reduce / minimise error

Answer 65

If a test systematically under predicts or over predicts the performance of a particular group with respect to a criterion, then it exhibits intercept bias. Intercept bias is a term derived from the point where the regression line intersects the Y-Axis

Answer 66

If a test systematically yields significantly different validity coefficients for members of different groups, then it has a slope bias Slope bias is named as the slope of one group's regression line is different in a statistically significant way from the slope of another group's regression line.

Answer 67

A rating error is a judgment resulting from the intentional or unintentional misuse of a rating scale. * Leniency or generosity error (too generous) * Severity error (too harsh) * Central Tendency error (sticks to the middle) * Halo Effect (high ratings in all things due to raters failure to discriminate)

Answer 68

To obtain consistent results that truly reflect the concepts we are trying to obtain.

Answer 69

Fairness, in a psychometric context is the extent to which a test is used in an impartial, just, and equitable way.

Answer 70

The influence of Culture extends to judgements concerning validity of tests and test items

Lecture 2 Validity & Reliability (Catherine) Flashcards

To provide an overview of the content of Lecture 2 (94 cards)