Wk 5 - Validity Flashcards
If the validity coefficient between a test and its criterion measure (where a high test score should predict a high criterion score) is -.97 (minus point nine seven) and is statistically significant, then this probably indicates… (x1)
Because… (x4)
The test could be reliable but is definitely not valid
The negative correlation indicates that the test has an inverse relationship with the criterion,
When it ought to have a positive relationship if it were valid.
However, the high magnitude of the correlation suggests the test is probably reliable (if it were unreliable, the correlation would be much closer to zero)
Factors that may affect a predictive validity coefficient do NOT include… (x1)
Because… (x1)
The mean score on the test (assuming no ceiling or floor effects)
As calculating the correlation coefficient involves standardizing the variables anyway
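A quick sketch of why (simulated data, numbers chosen arbitrarily): adding a constant to every test score changes the mean but leaves Pearson's r untouched, because r standardizes both variables before comparing them.

```python
import numpy as np

rng = np.random.default_rng(42)
test = rng.normal(50, 10, size=200)             # hypothetical test scores
criterion = 0.6 * test + rng.normal(0, 8, 200)  # hypothetical criterion scores

r_original = np.corrcoef(test, criterion)[0, 1]
r_shifted = np.corrcoef(test + 20, criterion)[0, 1]  # raise every score by 20

print(round(r_original, 4), round(r_shifted, 4))  # identical: r ignores the mean
```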
True or false, and why? (x2)
Construct irrelevant variance refers to the variance in a CONSTRUCT that does not covary with TEST scores
False
Construct irrelevant variance refers to the variance in the TEST that does not covary with the CONSTRUCT
(not the other way around, as stated in the question)
True or false, and why? (x2)
If the variances of a test and the construct it is attempting to measure only overlap by a small degree then the test is likely to have low reliability.
False
If the variances of the test and the construct it's measuring overlap only to a small degree, then the test is likely to have low VALIDITY
(not RELIABILITY, as stated in the question; the overlap is indexed by the squared correlation, r², so e.g. r = .30 means only 9% shared variance)
True or false, and why? (x4)
Non-random attrition between two time points in a longitudinal validation study is one of the factors that could potentially compromise the evaluation of the CONCURRENT validity of a test (assuming the test is administered during the initial time point).
False
CONCURRENT validity involves administering both the test and criterion measures AT THE SAME TIME
So if people drop out after the initial time point it won’t matter -
We already have all the data we needed
(though it may affect any evaluation of the test’s PREDICTIVE validity)
True or false, and why? (x2)
In a validation study for a behavioural measure, you discover that self-selection biases in your sample are influencing the spread of scores for the measure. This could compromise the evaluation of the CONCURRENT validity of the test.
True
Anything that affects the spread of scores in a test may affect its correlations with other variables
(which is what we’re analysing when we evaluate the concurrent validity of a test)
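A minimal simulation of the problem (made-up data; the self-selection rule, that only above-average scorers take part, is hypothetical): restricting the spread of test scores attenuates the test-criterion correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
test = rng.normal(0, 1, 5000)                                # hypothetical test scores (z-units)
criterion = 0.5 * test + rng.normal(0, np.sqrt(0.75), 5000)  # true validity of .5

r_full = np.corrcoef(test, criterion)[0, 1]

selected = test > 0  # self-selection: only above-average scorers volunteer
r_restricted = np.corrcoef(test[selected], criterion[selected])[0, 1]

print(round(r_full, 2), round(r_restricted, 2))  # restricted r is noticeably smaller
```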
True or false, and why? (x1)
Evaluating a test by seeing if it does not correlate highly with a construct it is not supposed to be measuring is an example of deviating validity.
False.
It’s an example of discriminant or divergent validity
True or false, and why? (x2)
A factor analysis involves mathematically grouping items according to the similarity of their content.
False
Factor analysis involves mathematically grouping items according to their inter-correlations
Not similarity of content (there’s no way the factor analysis can “know” what the content of the items is)
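A small sketch of this (simulated responses; the factor names are hypothetical): the algorithm only ever sees the numbers, so items group purely by their inter-correlations, not by anything about their wording.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 1000
anxiety = rng.normal(size=n)       # latent factor 1
extraversion = rng.normal(size=n)  # latent factor 2

# Six items: the first three load on factor 1, the last three on factor 2
items = np.column_stack([f + rng.normal(0, 0.5, n)
                         for f in (anxiety, anxiety, anxiety,
                                   extraversion, extraversion, extraversion)])

fa = FactorAnalysis(n_components=2).fit(items)
print(np.round(fa.components_, 2))  # loadings recover the two groups from correlations alone
```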
True or false, and why? (x1)
If a test has poor face validity then this may have implications for the data that the test yields.
True
Poor face validity can lead to problems such as missing data (e.g. test takers who see items as irrelevant may skip them)
True or false, and why? (x2)
Content validity is not important for a university examination as long as that examination is supported by empirically-based validity evidence
False.
Even if we could create a test that discriminated between good and poor students in the course (i.e. it had empirically-based validity),
It would still be a problem if it did not do this by directly measuring knowledge of the course content.
True or false, and why? (x1)
You can test the incremental validity of a test by seeing whether it can predict some relevant criterion measure in isolation from other measures.
False
Incremental validity is about whether a test contributes to predicting some outcome IN ADDITION TO the effect of other measures
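A hedged sketch of the usual approach (hierarchical regression on made-up data; the significance test for the R² change is omitted): fit the criterion on the existing measure alone, then add the new test and look at the gain in R².

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
n = 500
existing = rng.normal(size=n)                    # established predictor
new_test = 0.4 * existing + rng.normal(0, 1, n)  # new test, partly redundant
outcome = 0.5 * existing + 0.3 * new_test + rng.normal(0, 1, n)

X_base = existing.reshape(-1, 1)
X_full = np.column_stack([existing, new_test])

r2_base = LinearRegression().fit(X_base, outcome).score(X_base, outcome)
r2_full = LinearRegression().fit(X_full, outcome).score(X_full, outcome)

print(round(r2_full - r2_base, 3))  # delta R^2: prediction gained beyond the existing measure
```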
True or false, and why? (x2)
If we had an established intervention known to reduce state anxiety then we potentially could use this to test the validity of a new measure of state anxiety.
True
We can use the intervention as part of an experiment, to see whether it reduces scores on the new measure in the way we'd predict if the measure were valid
(compared with some placebo intervention)
True or false, and why? (x2)
It is possible for a test to have excellent reliability but poor validity.
True
You don’t need validity to have good reliability
(your test can be consistent in the scores it produces without measuring what you want it to)
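A toy simulation of this (hypothetical variables): a test that consistently measures the wrong thing shows high test-retest reliability alongside a near-zero validity coefficient.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
irrelevant_trait = rng.normal(size=n)  # stable trait unrelated to the target construct
construct = rng.normal(size=n)         # what we actually want to measure

# The test taps only the irrelevant trait; both administrations share it
time1 = irrelevant_trait + rng.normal(0, 0.3, n)
time2 = irrelevant_trait + rng.normal(0, 0.3, n)

print(round(np.corrcoef(time1, time2)[0, 1], 2))      # test-retest reliability: high (~.9)
print(round(np.corrcoef(time1, construct)[0, 1], 2))  # validity coefficient: ~0
```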
True or false, and why? (x3)
It is possible for a test to have excellent validity but poor reliability.
False
A test needs good reliability to have any chance of being valid
(because the level of reliability places a ceiling on how high your validity coefficient can be).
If your measure is producing wildly inconsistent scores then it’s probably not measuring anything
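The ceiling comes from the classical attenuation inequality (a standard psychometric result): r_xy ≤ √(r_xx × r_yy), where r_xx and r_yy are the reliabilities of the test and criterion. E.g. a test with reliability .49 can correlate at most √.49 = .70 with even a perfectly reliable criterion.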
True or false, and why? (x3)
If the reliability of both a test and a criterion measure is high then this means the correlation between them should also be high.
False
If the reliability of both a test and a criterion measure is high, then the correlation between them is not restricted by measurement error,
However, this doesn’t mean it can’t be small
(the correlation can be high or low, depending on the validity of the test)
True or false, and why? (x2)
Content validity is empirical data supporting the hypothesis that the content of a test is appropriate
False
Content validity involves opinions
And is not generally based on empirical data.
When students complain that, in a course examination, a lecturer did not ask any questions from a particular lecture, they are effectively complaining about the examination having… (x1)
Potentially poor content validity
Why is it not strictly accurate to talk about the validity of a test (hint: one test could be used in more than one context)? (x3)
It’s interpretations of test scores required by proposed uses that are evaluated, not the test itself.
When test scores are used or interpreted in more than one way, each intended interpretation must be validated
Because a test can be used in different contexts, and its validity can change across them
What are constructs? (x2)
Plus examples (x4)
Unobservable, underlying hypothetical traits or characteristics
That we can try and measure indirectly using tests
Intelligence, anxiety, plumbing skill, speeding propensity
What is construct underrepresentation? (x1 plus e.g. x1)
The portion of variance in the construct that is not captured by our test
E.g. self-report assumes insight (into, say, being a slow or fast driver), but the chances of perfect insight are very slim, so there are aspects of the construct the test doesn't capture
What is construct irrelevant variance? (x1 plus e.g. x 1)
Stuff that’s captured by the measurement, but not part of the construct
E.g. in a speeding questionnaire: variance due to interpreting the wording or to social desirability, which is not a reflection of driving speed
What are the similarities (x2) and difference between content and face validity? (x2)
Both are opinion-based, not empirical, but…
Face is how valid the test appears to be, from the perspective of the test taker (usually), while
Content is a judgment (usually by experts) regarding how adequately a measure samples behaviour representative of the universe of behaviour it was designed to sample