Chapter 6: EVALUATING SELECTION TECHNIQUES AND DECISIONS Flashcards
The extent to which a score from a test or from an evaluation is consistent and free from error.
Reliability
A method in which each of several people takes the same test twice.
test-retest reliability
The scores from the first administration of the test are correlated with scores from the second to determine whether they are similar
test-retest reliability
The extent to which repeated administration of the same test will achieve similar results.
test-retest reliability
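A minimal sketch of how the correlation between two administrations might be computed; the scores and variable names below are hypothetical illustrations, not data from the chapter:

import numpy as np

# Hypothetical scores for the same five people on two administrations of a test
first_administration = np.array([82, 74, 91, 65, 78])
second_administration = np.array([80, 76, 89, 63, 81])

# Test-retest reliability is the correlation between the two sets of scores
r_tt = np.corrcoef(first_administration, second_administration)[0, 1]
print(f"Test-retest reliability: {r_tt:.2f}")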
The test scores are stable across time and not highly susceptible to such random daily conditions as illness, fatigue, stress, or uncomfortable testing conditions
temporal stability
The consistency of test scores across time.
temporal stability
two forms of the same test are constructed
alternate-forms reliability
designed to eliminate any effects that taking one form of the test first may have on scores on the second form.
counterbalancing
The extent to which two forms of the same test are similar
alternate-forms reliability
A method of controlling for order effects by giving half of a sample Test A first, followed by Test B, and giving the other half of the sample Test B first, followed by Test A.
counterbalancing
The extent to which the scores on two forms of a test are similar.
Form stability
consistency with which an applicant responds to items measuring a similar dimension or construct
Internal Reliability
The extent to which similar items are answered in similar ways is referred to as internal consistency and measures ______
item stability
The extent to which similar items are answered in similar ways is referred to as _____ and measures item stability
internal consistency
The extent to which test items measure the same construct.
Item homogeneity
Three statistics used to determine the internal reliability of a test:
- Kuder-Richardson 20
- Spearman-Brown Prophecy Formula
- Coefficient Alpha (Cronbach’s Alpha)
- A form of internal reliability in which the consistency of item responses is determined by comparing scores on half of the items with scores on the other half of the items.
- The easiest method to use, as items on a test are simply split into two groups.
Split-half method
are more popular and accurate methods of determining internal reliability, although they are more complicated to compute
Cronbach’s coefficient alpha and the K-R 20
Used to correct reliability coefficients resulting from the split-half method.
Spearman-Brown prophecy formula
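A common form of the correction, assuming the two halves are of equal length and variance (a standard psychometric formula, not quoted from the chapter):

r_{whole} = \frac{2\, r_{half}}{1 + r_{half}}

where r_{half} is the correlation between scores on the two halves of the test.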
A statistic used to determine internal reliability of tests that use interval or ratio scales.
Coefficient alpha
A statistic used to determine internal reliability of tests that use items with dichotomous answers (yes/no, true/false).
Kuder-Richardson Formula 20 (K-R 20)
used for tests containing dichotomous items (e.g., yes-no, true-false)
K-R 20
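The standard form of the K-R 20 (a general psychometric formula, not quoted from the chapter):

KR\text{-}20 = \frac{k}{k - 1}\left(1 - \frac{\sum p_i q_i}{\sigma_x^2}\right)

where k is the number of items, p_i is the proportion of test takers answering item i correctly, q_i = 1 - p_i, and \sigma_x^2 is the variance of total test scores.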
can be used not only for dichotomous items but also for tests containing interval and ratio (nondichotomous) items such as five-point rating scales
coefficient alpha
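Coefficient alpha is conventionally written as (again a standard formula, not quoted from the chapter):

\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum \sigma_i^2}{\sigma_x^2}\right)

where \sigma_i^2 is the variance of item i; with dichotomous items this reduces to the K-R 20.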
The extent to which two people scoring a test agree on the test score, or the extent to which a test is scored correctly.
Scorer reliability
An issue in projective or subjective tests in which there is no one correct answer; even tests scored with the use of keys can suffer from scorer mistakes.
Scorer reliability
When human judgment of performance is involved, scorer reliability is discussed in terms of
interrater reliability
When evaluating the reliability of a test, two factors must be considered:
- the magnitude of the reliability coefficient
- the people who will be taking the test.
The degree to which inferences from test scores are justified by the evidence.
Validity
The extent to which tests or test items sample the content that they are supposed to measure.
Content validity
In industry, the appropriate content for a test or test battery is determined by the _______
job analysis
The extent to which a test score is related to some measure of job performance.
Criterion validity
- a test is given to a group of employees who are already on the job.
- A form of criterion validity that correlates test scores with measures of job performance for employees currently working for an organization
concurrent validity
- In this design, the test is administered to a group of job applicants who are going to be hired.
- A form of criterion validity in which test scores of applicants are compared at a later date with a measure of job performance.
predictive validity
Difference between concurrent and predictive validity
Concurrent - already on the job.
Predictive - applicants who are going to be hired
A narrow spread of performance scores that makes obtaining a significant validity coefficient more difficult.
restricted range
the extent to which a test found valid for a job in one location is valid for the same job in a different location
validity generalization (VG)
The most theoretical of the validity types.
Construct validity
The extent to which a test actually measures the construct that it purports to measure.
Construct validity
is concerned with inferences about test scores
Construct validity
is concerned with inferences about test construction.
content validity
is usually determined by correlating scores on a test with scores from other tests
Construct validity
A form of validity in which test scores from two contrasting groups “known” to differ on a construct are compared.
Known-group validity
is the extent to which a test appears to be job related.
Face validity
True or False
face-valid tests resulted in high levels of test-taking motivation, which in turn resulted in higher levels of test performance
true
statements that are so general that they can be true of almost anyone.
Barnum Statements
- A book containing information about the reliability and validity of various psychological tests.
- Contains information on over 2,700 psychological tests as well as reviews by test experts.
Mental Measurements Yearbook (MMY)
What edition of the Mental Measurements Yearbook (MMY) is used?
19th edition
A type of test taken on a computer in which the computer adapts the difficulty level of questions asked to the test taker’s success in answering previous questions
Computer-adaptive testing (CAT)
designed to estimate the percentage of future employees who will be successful on the job if an organization uses a particular test
Taylor-Russell tables
Three pieces of information needed to use the Taylor-Russell tables (see the ratios below):
- Criterion validity coefficient
- Selection ratio (the percentage of applicants an organization hires: number hired divided by number of applicants)
- Base rate (Percentage of current employees who are considered successful.)
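Expressed as simple ratios (restating the parenthetical definitions above):

\text{Selection ratio} = \frac{\text{number hired}}{\text{number of applicants}} \qquad \text{Base rate} = \frac{\text{number of successful employees}}{\text{total number of employees}}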
- A utility method that compares the percentage of times a selection decision was accurate with the percentage of successful employees.
- easier to do but less accurate than the Taylor-Russell tables
Proportion of correct decisions (HIT RATE)
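One common way to compute the proportion of correct decisions (a standard presentation, not quoted from the chapter):

\text{Hit rate} = \frac{\text{correctly predicted successes} + \text{correctly predicted failures}}{\text{total number of employees}}

The resulting proportion is then compared with the base rate to judge whether using the test improves decision accuracy.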
Five items of information must be known to use the Brogden-Cronbach-Gleser utility formula (combined as sketched below):
- Number of employees hired per year
- Average tenure
- Test validity
- Standard deviation of performance in dollars
- Mean standardized predictor score of selected applicants
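Putting those five pieces of information together, the utility estimate is usually written along these lines (symbols are illustrative; some versions also subtract the cost of testing):

\text{Savings} = (n)(t)(r)(SD_y)(\bar{Z}_x)

where n = number of employees hired per year, t = average tenure, r = test validity, SD_y = standard deviation of job performance in dollars, and \bar{Z}_x = mean standardized predictor score of the selected applicants.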
- One form of predictive bias
- The test significantly predicts performance for one group and not others.
single-group validity
applicants are rank-ordered on the basis of their test scores
top-down selection
- The names of the top three scorers are given to the person making the hiring decision.
- Often used in the public sector.
rule of three
- A means of reducing adverse impact and increasing flexibility.
- The minimum test score that an applicant must achieve to be considered for hire.
Passing scores
A selection strategy in which applicants must meet or exceed the passing score on more than one selection test
Multiple-cutoff approach
Selection practice of administering one test at a time so that applicants must pass that test before being allowed to take the next test.
Multiple-hurdle approach
- As a compromise between top-down hiring and passing scores, _______ attempts to hire the top test scorers while still allowing some flexibility for affirmative action.
- A statistical technique based on the standard error of measurement that allows similar test scores to be grouped
banding
The number of points that a test score could be off due to test unreliability.
Standard error of measurement (SEM)
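The SEM is typically computed as (standard formula, not quoted from the chapter):

SEM = SD_x \sqrt{1 - r_{xx}}

where SD_x is the standard deviation of test scores and r_{xx} is the test's reliability. In banding, a band is often built from the standard error of the difference between two scores, \sqrt{2} \times SEM, multiplied by 1.96 for a 95% confidence band.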