Validity Flashcards
Validity
The extent to which a test accurately measures what it is intended to measure
Reliability
The degree to which an assessment tool produces stable & consistent results
Standardisation
Ensures all conditions are as similar as possible for all individuals who are given the test
Validity; CTT
1st period (1900-1950)= content-related validity (content & face validity)
2nd period (1950-1970)= criterion-related validity (concurrent & predictive validity)
3rd period (1970-current)= construct-related validity (convergent & discriminant validity)
Construct validity
Content, criterion & construct
Content-related validity
Systematic review of the test content to determine if items cover a representative sample of the universe of behaviours to be measured & to determine if the choice of items is appropriate & relevant
Global, mostly non-statistical procedure
Content-related validity; panel of experts (raters)
Consultation with experts in the area of the evaluated construct, who analyse the representativeness of the items in relation to the theory on the construct & assess the adequacy of the items for the target population
Average of 5 judges
Agreement rate of 80% expected between judges
Asked to make suggestions of improvement
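A minimal sketch of how the 80% agreement check could be run for a panel of 5 judges. The ratings, the item names and the definition of agreement (share of judges who rate an item as representative) are assumptions made for illustration; other agreement indices exist.

```python
def agreement_rate(ratings):
    """Share of judges who rated an item as representative (1) rather than not (0)."""
    return sum(ratings) / len(ratings)

# Hypothetical ratings from 5 judges for three items (1 = representative, 0 = not).
panel = {
    "item_1": [1, 1, 1, 1, 1],
    "item_2": [1, 1, 1, 1, 0],
    "item_3": [1, 0, 1, 0, 1],
}

THRESHOLD = 0.80  # expected agreement rate between judges

for item, ratings in panel.items():
    rate = agreement_rate(ratings)
    verdict = "keep" if rate >= THRESHOLD else "revise"
    print(f"{item}: {rate:.0%} agreement -> {verdict}")
```

Items that fall below the threshold are the ones for which the judges' suggestions of improvement matter most.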
Content-related validity; pilot study
Procedure seeks to verify whether items have been well understood by target audience
Content validity can be demonstrated statistically by item analysis
Content-related validity; face validity
Not a statistical or numerical technique
Regards whether a test is an apparent measure of its associated criterion
Involves language & layout- the way content is presented
The test should look good to test takers
Content validity procedures
1) setting test goals
2) selection of the universe of behaviours that appear to measure the construct (pool of items)
3) item development
4) item analysis
5) final choice of test items
Criterion-related validity
Extent to which a measure is related to either a present or future outcome
Such evidence is provided by high correlations between a test & well-defined criterion measure (ideal is above 0.5)
Criterion will depend on type of construct evaluated e.g. academic performance or group dynamics
Criterion is a standard on which a test will be compared against
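A small sketch of the correlation check described on this card, assuming a Pearson correlation between test scores and a criterion measure. The scores are invented for illustration, and `pearson_r` is a hypothetical helper written out here rather than taken from a library.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: test scores and a criterion (e.g. academic performance).
test_scores = [12, 15, 9, 20, 17, 11]
criterion = [55, 62, 48, 80, 70, 58]

r = pearson_r(test_scores, criterion)
meets_ideal = r > 0.5  # the card's ideal: correlation above 0.5
print(f"r = {r:.2f}, above 0.5: {meets_ideal}")
```

If the criterion is measured at the same time as the test, this is concurrent evidence; if it is measured later, predictive evidence.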
Criterion-related validity; Concurrent-validity
Derived from assessments of the simultaneous relationship between test & criterion, such as between a learning disability test & school performance
Involves determining the current status of a person in relation to some classification scheme, such as diagnostic categories
Criterion-related validity; predictive validity
The extent to which a score on a scale or test predicts scores on some criterion
Construct-related validity; construct validity
The degree to which a test measures what it claims or purports to be measuring
Whether a scale or test measures construct adequately
From the construct validity, the degree to which a person has a certain characteristic is inferred
Failures in the validation process may stem from the instrument, its administration or the theory
Construct-related validity; convergent validity
Degree to which 2 instruments measuring the same construct are theoretically & empirically related
Ideal correlation is above 0.5
Conceptual confusion
Convergent validity (results) vs criterion validity (technique)
Construct-related validity; discriminant validity
If 2 measures of the same quality show higher correlations, then 2 measures that do not assess the same quality should not
2 tests of unrelated constructs should have low correlations- they should discriminate between 2 qualities that are not related to each other
May also be related to the ability of an instrument to discriminate groups of subjects that have a high magnitude in an attribute against subjects who have low magnitude
Ideal correlation in discriminant validity is below 0.3
Also called divergent validity
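A rough sketch of how the 0.5 and 0.3 rules of thumb on these cards could be applied to a set of correlations. The instrument names and correlation values are hypothetical, and using the absolute value of r is an assumption added here.

```python
CONVERGENT_MIN = 0.5    # same construct: correlation should be above this
DISCRIMINANT_MAX = 0.3  # unrelated constructs: correlation should be below this

def interpret(r, same_construct):
    """Label a correlation as supporting convergent or discriminant validity."""
    size = abs(r)  # assumption: judge the strength of the relation, not its sign
    if same_construct:
        return "convergent evidence" if size > CONVERGENT_MIN else "weak convergent evidence"
    return "discriminant evidence" if size < DISCRIMINANT_MAX else "poor discrimination"

# Hypothetical correlations: (measure A, measure B, r, same construct?)
pairs = [
    ("anxiety scale A", "anxiety scale B", 0.72, True),
    ("anxiety scale A", "vocabulary test", 0.12, False),
    ("anxiety scale A", "stress scale", 0.45, False),
]

for first, second, r, same in pairs:
    print(f"{first} vs {second}: r = {r:.2f} -> {interpret(r, same)}")
```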
Statistical techniques for the study of construct validity via CTT
Correlations between tests
Factor analysis- describes variability among observed, correlated variables in terms of unobserved variables (dimensions/factors)
Structural equation modelling- general, powerful multivariate analysis technique that includes specialised versions of a number of other analysis methods as special cases
IRT; validity
Can be used to investigate any type of test, whether measuring abilities or attitudes
Content validity & proper representation of latent trait under measurement is one of primary concerns of IRT, hence development of specifications matrix
Items are developed via IRT in a systematic way, based on descriptors, rather than in an intuitive way as in CTT
The remaining steps for the test development & therefore content validity are similar to those deployed by CTT
IRT; content validity procedures
1) setting goals
2) identification of dimensions & descriptors
3) specification matrix
4) item development
5) item analysis
6) final choice of test items
IRT; criterion-related validity
Adopting specific stats methods for the study of criterion validity is not common, but non-linear correlations can offer more precision than standard correlation tests
Criterion could be behaviour or another instrument
Full info factor analysis
Most modern technique for study of construct unidimensionality
Does not require computation of intercorrelations between items, since these models are based on individuals' patterns of response rather than on the correlational structure of the multivariate latent response distribution
These models are therefore ‘full info’ models
To attest to construct unidimensionality, an FA oblique rotation technique is used with the extraction of 2 factors in order to verify if they're correlated
Presence of high correlation between first 2 factors indicates that, even if there is a second order factor, only a single latent trait is being assessed, so the assumption of unidimensionality is met
Acceptable correlation is at least 0.5
Test info curve
Shows for which range of theta levels the test is particularly valid
Items above the validity range are too difficult & below the range, too easy
Other types of validity; cross-cultural validity
Corresponds to the adaptation of psychological instruments to other sociocultural contexts
Precautions while translating & adapting tests
Other types of validity; internal validity
Did the experiment really produce a change?
Other types of validity; external validity
Are the results generalisable?
Stats techniques for the study of construct validity via IRT
1) full info factor analysis
2) test info curve
3) item parameters