Lecture 5 Flashcards

1
Q

What is Validity?

A

-how well are we measuring what we are supposed to measure?
-total variance of test scores (σ²X) = construct of interest (σ²CI) + systematic error of measurement (σ²SE) + random error of measurement (σ²RE)
-test scores must be reliable in order to be valid
-validity is about the proportion of variance that can be attributed to the construct of interest (validity = σ²CI / σ²X)
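The decomposition above can be sketched numerically (all variance components are made-up numbers for illustration):

```python
# Hypothetical variance components (made-up numbers for illustration)
var_construct = 6.0   # sigma^2_CI: variance due to the construct of interest
var_sys_error = 1.5   # sigma^2_SE: systematic error of measurement
var_rand_error = 2.5  # sigma^2_RE: random error of measurement

# Total variance of test scores: sigma^2_X = sigma^2_CI + sigma^2_SE + sigma^2_RE
var_total = var_construct + var_sys_error + var_rand_error

# Validity = proportion of total variance attributable to the construct
validity = var_construct / var_total
print(round(validity, 2))
```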

2
Q

What are the types of validity?

A

-face validity
-content validity
-criterion-related validity (multiple subtypes)
-experimental validity
-construct validity

3
Q

What is face validity?

A

-test appears to be assessing what it is supposed to assess
-extent to which a test is subjectively viewed as covering the construct it is supposed to assess
-usefulness: acceptance, cooperation, etc.
-limits: bias, social desirability, etc.
-the only type of validity that is optional

4
Q

What is content validity?

A

-test covers key aspects of the construct it aims to assess (includes a representative sample of target behaviours)
-typically assessed by experts (scholars, clinicians, etc.)

5
Q

What is criterion-related validity?

A

-established via comparison of test scores with “objective” criterion assumed to provide some “true” reflection of the underlying construct
-criterion refers to an external measure or source of information that informs us about the real presence of the construct that we want to assess

6
Q

What are the 2 categories in the first subtype of criterion-related validity?

A

-concurrent: criterion is administered simultaneously
-predictive: criterion is administered later (used in selection, which raises its own problems)

7
Q

What are the 4 categories in the second subtype of criterion-related validity?

A

-congruent: criterion assesses the same construct
-convergent: criterion assesses a related construct (different construct assumed to be related)
-discriminant: criterion assesses a construct known to be: (a) opposite of target construct (negative correlation); (b) unrelated to target construct (no correlation)
-discriminative: criterion is categorical (aim is to predict group membership) [do scores on the test differ between groups of people]

8
Q

What are 2 ways criterion-related (concurrent or predictive) discriminative validity can be assessed?

A

-Mean comparisons
-Chi-Square

9
Q

What are mean comparisons?

A

-groups (serving as the criterion) are compared based on their test scores treated as continuous variables (norm-referenced or not)
-ex: compare engineers' and musicians' scores on a test of "musical abilities" using a t-test
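A minimal sketch of such a mean comparison, using Welch's t statistic computed by hand (all scores are made-up):

```python
from statistics import mean, variance
from math import sqrt

# Hypothetical "musical abilities" scores for two criterion groups (made-up)
engineers = [10, 12, 9, 11, 13, 10]
musicians = [18, 20, 17, 19, 21, 18]

# Welch's t statistic: compares the two group means on the continuous test scores
n1, n2 = len(engineers), len(musicians)
m1, m2 = mean(engineers), mean(musicians)
v1, v2 = variance(engineers), variance(musicians)  # sample variances
t = (m2 - m1) / sqrt(v1 / n1 + v2 / n2)
print(round(t, 2))  # a large t suggests the groups differ on the test
```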

10
Q

What is Chi-Square?

A

-groups (serving as the criterion) are compared based on their test scores treated as categorical variables (criterion-referenced).
-ex: compare frequency of individuals receiving a diagnosis of bipolar disorder based on test scores in a group of psychology students and a group of psychiatric patients
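A minimal hand-computed sketch of such a chi-square comparison (the diagnosis counts are made-up):

```python
# Hypothetical 2x2 contingency table: rows = group, columns = diagnosed yes/no
# e.g., psychology students vs. psychiatric patients, classified via test cutoffs
observed = [[5, 45],    # students:  5 diagnosed, 45 not
            [30, 20]]   # patients: 30 diagnosed, 20 not

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (O - E)^2 / E over all cells,
# where E = (row total * column total) / N
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed[i][j] - expected) ** 2 / expected
print(round(chi2, 2))
```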

11
Q

In which types of criterion-related validity do we expect a strong positive correlation and which formula is used?

A

-congruent and convergent [predictive]
-r²XY = σ²CI / σ²X (squaring the correlation between test and criterion measure → rough indicator of how much of the total variance in your test can be attributed to the construct of interest as captured by that specific criterion measure)
-1 − r²XY = (σ²SE + σ²RE) / σ²X (how much error there is in the test score in total, combining the 2 sources of error [systematic and random])
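A tiny numerical sketch of the two formulas (the correlation value is made-up):

```python
# If the test correlates r = .60 with a congruent criterion (made-up value):
r_xy = 0.60

construct_share = r_xy ** 2     # rough share of variance due to the construct
error_share = 1 - r_xy ** 2     # combined systematic + random error share
print(construct_share, error_share)  # ~0.36 and ~0.64
```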

12
Q

In selection procedures, which type of validity do we look at and using what method & formula?

A

-in selection we work with validity that is predictive [congruent and convergent]
-in selection, validity is assessed using regression (rather than a correlation [because it is unidirectional]).
-Y’ = a + b(X)
-there is always a discrepancy between the observed score on the criterion (Y) and the score that is predicted (Y’) based on the test scores (X), unless validity is perfect (which never happens).
-the difference, or discrepancy, is called the Prediction Error
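The regression and its prediction errors can be sketched as follows (test and criterion scores are made-up):

```python
from statistics import mean

# Hypothetical selection data: test scores (X) and later job-performance ratings (Y)
x = [50, 55, 60, 65, 70, 75]
y = [3.0, 3.4, 3.1, 4.0, 4.2, 4.5]

# Least-squares fit: b = cov(X, Y) / var(X), a = mean(Y) - b * mean(X)
mx, my = mean(x), mean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

# Predicted score Y' = a + b(X); prediction error = Y - Y' for each person
y_pred = [a + b * xi for xi in x]
errors = [yi - ypi for yi, ypi in zip(y, y_pred)]
print([round(e, 2) for e in errors])  # nonzero unless validity is perfect
```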

13
Q

What is the standard error of the estimate?

A

-indicates, on average, how much prediction error there is when we use scores from the test to predict the outcome

14
Q

What are the 2 ways the standard error of the estimate can be calculated?

A

-first method (long): (1) the prediction residuals are estimated: Y’ - Y; (2) the standard deviation of these residuals represent the standard error of the estimate.
-second method (short): σY × √(1 − r²XY) (Y = score on the criterion) [1 − r²XY is the proportion of total variance attributed to measurement error; see card 11]
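Both methods can be checked against each other on made-up data (population SDs are used throughout, so the two values match exactly):

```python
from statistics import mean, pstdev
from math import sqrt

# Hypothetical test scores (X) and criterion scores (Y)
x = [50, 55, 60, 65, 70, 75]
y = [3.0, 3.4, 3.1, 4.0, 4.2, 4.5]

# Least-squares fit: Y' = a + b(X)
mx, my = mean(x), mean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

# Method 1 (long): standard deviation of the prediction residuals Y - Y'
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
see_long = pstdev(residuals)

# Method 2 (short): sigma_Y * sqrt(1 - r^2_XY)
sx, sy = pstdev(x), pstdev(y)
r_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) * sx * sy)
see_short = sy * sqrt(1 - r_xy ** 2)

print(round(see_long, 4), round(see_short, 4))  # the two methods agree
```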

15
Q

How do we calculate Confidence interval?

A

-CI for the predicted score on the criterion (e.g., success on the job):
-Y’ = a + b(X) ± (z)(standard error of the estimate)
-Y’ = a + b(X) ± (1.96 or 2.58)(standard error of the estimate)
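A small sketch of the interval (intercept, slope, and standard error of the estimate are all made-up values):

```python
# Hypothetical regression results and standard error of the estimate (made-up)
a, b = 0.5, 0.05          # intercept and slope from a prior regression
see = 0.30                # standard error of the estimate
x = 60                    # an applicant's test score

y_pred = a + b * x        # predicted criterion score: Y' = a + b(X)
ci_95 = (y_pred - 1.96 * see, y_pred + 1.96 * see)   # 95% confidence interval
ci_99 = (y_pred - 2.58 * see, y_pred + 2.58 * see)   # 99% confidence interval
print(y_pred, ci_95, ci_99)
```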

16
Q

How do we verify the efficacy of a selection process?

A

-we rely on analyses of sensitivity and specificity
-A: true positive (efficient; selected)
-B: false positive (not efficient; selected)
-C: false negative (efficient; not selected)
-D: true negative (not efficient; not selected)

17
Q

What are all the factors that are used to describe the efficacy of a selection process?

A

-sensitivity = A/(A+C)
-specificity = D/(D+B)
-positive predictive power (PPP) = A/(A+B)
-negative predictive power (NPP) = D/(D+C)
-percentage of correct classification: (A+D)/(A+B+C+D) (all of the true / all participants)
-base rate = (A+C)/(A+B+C+D)
-selection rate = (A+B)/(A+B+C+D)
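All of these indices can be computed from the four cells; a sketch, using the table from card 25 (A=20, B=35, C=15, D=25):

```python
# Efficacy indices for a selection process, from the 2x2 table:
# A = true positives, B = false positives, C = false negatives, D = true negatives
def selection_indices(A, B, C, D):
    N = A + B + C + D
    return {
        "sensitivity": A / (A + C),
        "specificity": D / (D + B),
        "PPP": A / (A + B),
        "NPP": D / (D + C),
        "correct_classification": (A + D) / N,
        "base_rate": (A + C) / N,
        "selection_rate": (A + B) / N,
    }

idx = selection_indices(20, 35, 15, 25)
print({k: round(v, 2) for k, v in idx.items()})
```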

18
Q

What is sensitivity?

A

-proportion of cases presenting the characteristic that are correctly identified = A/(A+C) [ability to identify true cases].

19
Q

What is Specificity?

A

-proportion of the cases not presenting the characteristic that are correctly identified = D/(D+B) [ability to exclude true non-cases]

20
Q

What is PPP?

A

-the chances that an individual who receives a positive diagnosis on your measure really suffers from that problem = A/(A+B)

21
Q

What is NPP?

A

-the chances that an individual who receives a negative diagnosis on your measure really does not suffer from that problem = D/(D+C)

22
Q

What is base rate?

A

-proportion of cases presenting the characteristic: (A+C)/(A+B+C+D)
-the higher the base rate, the lower the sensitivity. Because people presenting the characteristic will be excluded anyway (not enough room to admit them all)
-the lower the base rate, the lower the specificity. Because people not presenting the characteristic will need to be selected to meet the selection quotas (to fill the positions).
-in other words, the proportion of everyone who can do it

23
Q

What is selection rate?

A

-proportion of cases that are selected: (A+B)/(A+B+C+D)
-when it is very high, there is no need for any selection (because you have enough room for everyone)
-the higher the selection rate, the lower the specificity (people not presenting the characteristic will be selected to fill the quotas)
-the lower the selection rate, the lower the sensitivity (people presenting the characteristic will be excluded as there is not enough room)
-in other words, how many positions you are filling

24
Q

What is the random selection protocol?

A

-uses the base rate (BR) and selection rate (SR) marginal totals
-compute the remaining marginal totals (2nd column and 2nd row) by subtracting the BR and SR counts from the total sample size
-for A, multiply the SR row total by the BR column total and divide by the total sample size (same logic for B, using the SR row total and the "not efficient" column total)
-OR, once A is known, obtain B and C by subtracting A from the SR and BR totals, and obtain D by subtraction as well
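A sketch of the computation, using the card-25 table (A=20, B=35, C=15, D=25):

```python
# Expected 2x2 cells under a purely random selection protocol,
# built from the marginal totals implied by the base rate and selection rate
def random_protocol(A, B, C, D):
    N = A + B + C + D
    selected, not_selected = A + B, C + D      # row totals (selection rate)
    efficient, not_efficient = A + C, B + D    # column totals (base rate)
    return {
        "A": selected * efficient / N,         # expected true positives
        "B": selected * not_efficient / N,     # expected false positives
        "C": not_selected * efficient / N,     # expected false negatives
        "D": not_selected * not_efficient / N, # expected true negatives
    }

exp = random_protocol(20, 35, 15, 25)
print({k: round(v, 2) for k, v in exp.items()})
```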

25
Q

Calculate all the factors of the efficacy of a selection process and the random selection protocol for the following: A: 20 B: 35 C: 15 D: 25.

A

BR: 37%
SR: 58%
Sensitivity: 57%
Specificity: 42%
Correct classification: 47%
PPP: 36%
NPP: 63%
Random selection protocol: A: 20.26; B: 34.74; C: 14.74; D: 25.26

26
Q

What comparisons can be made in the selection process?

A

-more than one selection process can be compared with one another (a new protocol is contrasted with the previous one) and with a random process
-various selection processes of a similar level of efficacy can be compared in terms of costs-benefits

27
Q

When do we favor specificity over sensitivity?

A

-when there is a risk associated with the selection of a non-case:
–stigmatization or other risks associated with the intervention;
–failure to perform adequately carries high risk (e.g., working in nuclear plants)

28
Q

When do we favor sensitivity over specificity?

A

-when there is a risk associated with the failure to identify people truly presenting the characteristics:
–identification of AIDS among surgeons
–ebola virus
–the risk associated with non-participation is high for people presenting the characteristic (screening for the risk of schizophrenia, cancer)
–rarity of proper candidates

29
Q

What is experimental validity?

A

-the degree to which test scores change as a function of experimental (or age-related or external) changes that should modify them (based either on theory or prior research)
-e.g., assessing states of consciousness

30
Q

What is construct validity?

A

-the degree to which a test measures what it claims to be measuring.
-it is the “ultimate” form of validity that is demonstrated based on the accumulation of evidence related to the previous forms of validity
-plus factor analysis (factor validity) as an additional source of evidence

31
Q

What is factor analysis?

A

-aiming to assess whether the test items do indeed provide a proper representation of the various components of the construct that the test seeks to measure
-it may be presented in a structure matrix separated into different factors (and further split into positive and negative behaviours)
-confirms if the items we wanted grouped are truly grouped

32
Q

How is a factor analysis like a correlation matrix?

A

-factor analysis is similar to a correlation matrix, but it looks at covariance and at which main components can be extracted; if these components match your expectations, that is evidence of factor validity

33
Q

What factors influence the validity of test scores?

A

-whether construct really exists or whether construct has been properly defined and conceptualized in the first place
-reliability and validity of the criterion test
-reliability of the test (lower reliability = lower validity)
-nature of target population and of the validation population
-range of scores and preselection
-time
-shape of the relation: U-shaped or otherwise non-linear; characteristics that may be necessary but not sufficient (step-shaped graph: flat, then rising)