Final Exam Flashcards
A depression test has been shown to have a strong association with current levels of anxiety. This is an example of which form of validity?
Concurrent Validity
Correlation coefficients range from ____ to _____
-1 to +1
The use of meta-analysis to assess evidence of a test’s adequacy and appropriateness for use in multiple situations and settings is called a _____ study
Validity Generalization
The ____ the error, the _____ the reliability
Higher, lower
Variance shared between two measurements that is attributable to the method used for measurement is:
Shared method variance
Ratio Scales have a _____ while Interval Scales do not
True Zero
Estimates of reliability such as the alternate forms method and the split-half method require that the tests be ____
Parallel
A test in which all items are keyed in the same direction is most vulnerable to _____
Acquiescence bias
A test that compares a test-taker to a reference sample is what kind of test?
Norm-referenced
A depression test has been shown to predict future life satisfaction in multiple research studies. This is an example of what kind of validity?
Predictive validity
According to _____ validity, the structure of a test should match the theory behind that construct
Structural validity
______ reliability is most appropriate to assess reliability in a test that measures traits that are not expected to change from one testing to the next
Test-retest reliability
Factors that are irrelevant to the construct but still affect the test results
Construct irrelevant variance AKA construct contamination
Universal design in test construction refers to facilitating _____ for all test takers in the population
Accessibility
This type of validity gives you the ability to make future predictions from the resulting measurement
Criterion Validity
This type of validity ensures that you are adequately measuring the construct you intend to measure
Construct validity
A range of values containing the true score
Confidence interval
Correlation values are _____ whereas covariance values are not
Standardized
Data that consists of categories
Nominal Data
Data that can be placed in a specific order
Ordinal Data
Rewording items to be more neutral and providing a distraction-free testing environment can help reduce _____ bias
Social desirability
Correlation between two measures that is consistent with theory/expectations (a form of associative validity)
Convergent validity
The three measures of central tendency
Mean, median, mode
Agreement across observers or coders shows ______
Inter-rater reliability
Findings that can be generalized beyond the study and sample have _______ validity
External validity
The process of quantifying variables for the purpose of measuring their occurrence, strength, and frequency
Operationalization
Fairness in testing is an issue of (reliability or validity?)
Validity
A number between -1 and +1 representing the linear association between two variables
Correlation coefficient
The majority of constructs psychologists study have a (relative or absolute) zero
Relative Zero
The degree to which individual scores remain consistent over administrations of the same test (or alternate versions of the test)
Reliability
Concurrent Validity
Theory-consistent correlations at the SAME testing
Difference between ratio scales and interval scales
ratio has true zero value while interval does not
Criterion-referenced test
uses a cutoff score to sort people into groups
Universal Design
being intentional about how the construct is operationalized, taken, and measured so it’s accessible to the greatest number of people
Social Desirability Bias
changing responses in order to appear more socially desirable
Inter-rater reliability
consistency of scores across raters (observers or coders)
Construct underrepresentation AKA construct deficiency
A test does not fully measure a construct; missing important pieces
Construct irrelevant variance AKA construct contamination
A test includes irrelevant factors in the items
Structural Validity
Test structure should match the theory
Factor analysis
uses statistics to identify clusters
Unidimensional
All items correlate (ex: measure of depression)
Multidimensional
Not all items correlate (ex: measure of bipolar)
Response process
Match between the intended process and the process respondents use when completing the measure
Predictive Validity
Theory-consistent correlations at a FUTURE testing
Convergent Validity
A construct’s correlation with other constructs
Discriminant Validity
A construct’s lack of correlation with other constructs
Consequential Validity
Correlation between the intended consequences of the test use, and actual consequences of the test use
Correlation coefficient
an estimate of association/consistency between constructs OR parts of a test
Three ways to evaluate the correlation between two variables
Pearson Correlation coefficient, Spearman’s rho, Kendall tau
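A minimal sketch of the three coefficients, using only the Python standard library. The data (hours, scores) and the function names are made up for illustration, and the Kendall version shown is tau-a, which makes no correction for ties. Because the made-up scores rise steadily but not in a straight line, the two rank-based coefficients reach 1 while Pearson stays slightly below it.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson r: linear association between x and y, from -1 to +1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def ranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall tau-a: (concordant pairs - discordant pairs) / all pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical data: scores always rise with hours, but not linearly.
hours = [1, 2, 3, 4, 5]
scores = [1, 4, 9, 16, 25]

print(pearson(hours, scores))      # about 0.98
print(spearman(hours, scores))     # 1 (up to floating-point rounding)
print(kendall_tau(hours, scores))  # 1.0
```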
Used for measuring internal consistency
Cronbach’s alpha
Alternate forms method
estimate reliability based on consistency of scores across two versions of a test
Test-retest
estimate reliability based on consistency of scores across two separate testings
Used for estimating agreement between two or more measures
Intraclass correlation coefficient
Content validity
How well a test measures a representative sample of subject matter being investigated
Criterion validity
How well a test correlates with an established standard of comparison
Three types of criterion validity
Predictive, concurrent, and retrospective
Construct validity
How well a test measures what it intends to measure
Convergent validity
How well responses on a test relate to responses on a similar test
Two aspects of construct validity
Convergent validity and discriminant validity
Discriminant validity
The degree to which a test diverges from another test that is conceptually unrelated
Reflects how well an assessment instrument predicts an indicator of a given concept
validity coefficient
Multitrait-Multimethod Matrix (MTMM)
Shows correlations among two or more measurement techniques
Four types of measurement scales
Nominal, ordinal, interval, ratio
Continuous variables
have an infinite number of possible values
Discrete variable
limited number of possible values
Bounded variables
measurement scales with a mathematical boundary
Nominal
used for qualitative variables
Ordinal
rank-order quantitative variables
Interval
consistent intervals w/o true zero
Ratio
consistent intervals w/ true zero
Mean
average
Median
Middle number
Mode
Most frequently occurring
Range
Span
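A quick worked example of the four descriptive statistics above, using Python's standard `statistics` module. The scores are made up for illustration.

```python
import statistics

# Hypothetical set of six test scores.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)           # sum / count = 30 / 6 = 5
median = statistics.median(scores)       # middle of the sorted list: (3 + 5) / 2 = 4.0
mode = statistics.mode(scores)           # most frequent value: 3
score_range = max(scores) - min(scores)  # span from lowest to highest: 10 - 2 = 8

print(mean, median, mode, score_range)
```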
Reliability
consistency/stability of test scores over time
Classical Test Theory
Observed score = true score + error
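A small simulation sketch of the equation above. It assumes the usual classical-test-theory reading of reliability as the share of observed-score variance that is true-score variance (that ratio is an addition here, not stated on the card), and every number in it (score mean of 50, standard deviations, sample size, seed) is made up. The function name `simulate_reliability` is hypothetical.

```python
import random
import statistics


def simulate_reliability(error_sd, true_sd=10.0, n=5000, seed=0):
    """Simulate observed = true + error and return var(true) / var(observed)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, true_sd) for _ in range(n)]
    errors = [rng.gauss(0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


# Small error: reliability should land near 100 / (100 + 25) = 0.8.
low_error = simulate_reliability(error_sd=5)
# Large error: reliability should land near 100 / (100 + 225), about 0.31.
high_error = simulate_reliability(error_sd=15)

print(low_error, high_error)
```

The comparison of the two runs illustrates the earlier card: the higher the error, the lower the reliability.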
Test-Retest Reliability
Estimates reliability based on consistency of scores across two separate testings
Alternate forms reliability
Both forms of a test (1) measure the same true amount of a construct and (2) have equal error variance
Conceptual relation between reliability and validity
A test must be reliable in order to be valid, but it does not need to be valid in order to be reliable
Parallel tests
Different forms of a test; used to control for memory/practice effects
Internal Consistency Method
one test at one point in time; each part is treated as a different form
Split-half estimates
– Comparing results from one half of the test to results of the other half
Cronbach’s alpha
– The most widely used method for estimating reliability in psychology
Each item on a test is treated as its own test
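A minimal sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The formula is not written out in these notes, and the response data and function name are made up for illustration.

```python
import statistics

# Hypothetical responses: one row per test taker, one column per item (1-5 scale).
item_scores = [
    [4, 5, 4, 5],
    [2, 3, 3, 2],
    [5, 5, 4, 4],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
]


def cronbach_alpha(rows):
    """Internal consistency: each item is treated as its own small test."""
    k = len(rows[0])                       # number of items
    columns = list(zip(*rows))             # regroup scores item by item
    item_variances = [statistics.variance(col) for col in columns]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


alpha = cronbach_alpha(item_scores)
print(round(alpha, 3))  # 0.958 for this made-up data
```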
Validity
– how well evidence supports the interpretation and intended use of test scores
Content Validity
– How well a test measures a representative sample of a subject matter being investigated
Validity generalization
– A test’s adequacy and appropriateness for use in multiple situations and settings
Construct underrepresentation (aka construct deficiency)
– A test does not fully measure a construct; important pieces are missing
Construct irrelevant variance (aka construct contamination)
– A test includes irrelevant factors in the items
Structural Validity
– Test structure should match the theory
Response Process
– Match between the intended process and the process respondents use when completing the measure
Multi-trait-Multimethod Matrix
– shows correlations among two or more measurement techniques
Concurrent validity
– theory-consistent correlations at the same testing
Predictive validity
– theory-consistent correlations at a future testing
Discriminant Validity
– a construct’s lack of correlations with other constructs
Convergent Validity
– a construct’s correlation with other constructs
Consequential validity
– correlation between the intended consequences and actual consequences of the test use (pros/cons balance)
Relation between bias and validity
more bias = less validity
Malingering
– “faking bad”; appearing more impaired, distressed, challenged, or disturbed
Extremity bias
– picking answers in the extreme rather than the middle
Acquiescence Bias
– agreeing/disagreeing without considering meaning
Especially problematic when all items are keyed in the same direction
May artificially inflate/deflate correlations
Social Desirability Bias
– changing responses in order to appear more socially desirable
May artificially inflate correlations
Prevention-oriented Strategies (Managing Test Context)
Offering anonymity
Limiting the demands of testing
Leading participants to believe bias can be detected
Detection- and Intervention-oriented Strategies (Managing Test Content)
Items/scales embedded in the measure
Useful for desirability tests, extremity tests, acquiescence tests
Prevention-oriented and Effects-oriented Strategies (Specialized Tests)
Keep it simple
Frame items neutrally
Use forced-choice formats
Introduce random/unsystematic measurement
Use balanced scales
Introduce guessing penalties
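A small sketch of why a balanced scale helps with acquiescence bias, assuming a 1-5 agreement scale where some items are reverse-keyed. The function names, the item positions, and the responses are all hypothetical.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Flip a response so that 5 becomes 1, 4 becomes 2, and so on."""
    return scale_max + scale_min - response


def score_scale(responses, reverse_keyed, scale_min=1, scale_max=5):
    """Total score, reverse-scoring the items whose index is in reverse_keyed."""
    total = 0
    for index, response in enumerate(responses):
        if index in reverse_keyed:
            total += reverse_score(response, scale_min, scale_max)
        else:
            total += response
    return total


# A respondent who agrees with everything, regardless of content.
yea_sayer = [5, 5, 5, 5]

# All items keyed in the same direction: agreement alone yields the maximum score.
unbalanced_total = score_scale(yea_sayer, reverse_keyed=set())

# Balanced scale (items at index 1 and 3 are reverse-keyed): agreement cancels out.
balanced_total = score_scale(yea_sayer, reverse_keyed={1, 3})

print(unbalanced_total, balanced_total)  # 20 12
```

On the balanced version the indiscriminate agreer lands at the scale midpoint instead of the maximum, so agreement by itself no longer inflates the score.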
Construct Bias (AKA Measurement Bias)
– Test may have different meaning for different groups
Predictive Bias
– Test may have a different predictive value for different groups
Reliability analysis
– internal consistency coefficients
Rank order
– item difficulty ranking
Item discrimination index
– item-level discrimination
Factor analysis
– internal structure of the test
Differential item functioning analysis
– probability of answering a certain way based on trait levels
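A rough sketch of two of the item-level analyses listed above: ranking items by difficulty and computing an item discrimination index. It assumes items scored 0/1, takes difficulty to be the proportion answering correctly, and uses the common upper-group-minus-lower-group method with 27% groups for discrimination. Those choices, the function names, and the data are assumptions for illustration, not something these notes specify.

```python
# Hypothetical 0/1 responses: one row per test taker, one column per item.
responses = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]


def item_difficulty(rows, item):
    """Proportion of test takers who answered the item correctly."""
    return sum(row[item] for row in rows) / len(rows)


def discrimination_index(rows, item, fraction=0.27):
    """Proportion correct in the top-scoring group minus the bottom-scoring group."""
    ranked = sorted(rows, key=sum, reverse=True)  # highest total scores first
    group_size = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:group_size], ranked[-group_size:]
    p_upper = sum(row[item] for row in upper) / group_size
    p_lower = sum(row[item] for row in lower) / group_size
    return p_upper - p_lower


n_items = len(responses[0])
difficulties = [item_difficulty(responses, i) for i in range(n_items)]

# Rank order: item indices from easiest (most correct answers) to hardest.
rank_order = sorted(range(n_items), key=lambda i: difficulties[i], reverse=True)

print(difficulties)  # [0.7, 0.5, 0.3]
print(rank_order)    # [0, 1, 2]
print(discrimination_index(responses, 0))  # 1.0
```

Comparing the rank order or the discrimination values across groups is one way such statistics could feed into a bias check; how that comparison is done is left open here.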
Universal design
– test is operationalized, taken, and measured so it is accessible to the greatest number of people
Norm-referenced tests
– compares a test taker to a reference sample
Criterion-referenced tests
– uses a cutoff score to sort test takers into groups
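A short sketch contrasting the two kinds of interpretation for the same raw score. The reference sample, the cutoff of 85, and the function names are made up; the z-score and percentile rank are just two common ways of expressing a score relative to a reference sample.

```python
import statistics


def norm_referenced(score, reference_sample):
    """Describe a score relative to a reference sample (z-score, percentile rank)."""
    mean = statistics.mean(reference_sample)
    sd = statistics.pstdev(reference_sample)
    z_score = (score - mean) / sd
    at_or_below = sum(1 for s in reference_sample if s <= score)
    percentile = 100 * at_or_below / len(reference_sample)
    return z_score, percentile


def criterion_referenced(score, cutoff):
    """Sort a test taker into a group using a fixed cutoff score."""
    return "pass" if score >= cutoff else "fail"


reference_sample = [50, 60, 70, 80, 90]  # hypothetical norm group
raw_score = 80

z_score, percentile = norm_referenced(raw_score, reference_sample)
decision = criterion_referenced(raw_score, cutoff=85)

print(round(z_score, 2), percentile, decision)  # 0.71 80.0 fail
```

The same score of 80 looks above average against the norm group yet still fails the fixed cutoff, which is the practical difference between the two test types.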
Correlation vs. Covariance
– correlation values are standardized, covariance values are not
Trait vs. state variables
– state variables change very quickly, while trait variables are more stable over time