Reliability and validity Flashcards
What is reliability?
- the test measures one and only one thing precisely.
How do we measure reliability?
- it has CONSISTENCY ACROSS ITEMS: All items measure the same thing (internal consistency, alternate forms, split-half reliability)
- it has CONSISTENCY ACROSS TIME: The test measures the same thing every time (test-retest)
- it has CONSISTENCY ACROSS OTHER SOURCES: e.g. across raters (inter-rater reliability)
- GENERALIZABILITY (G-theory): looks at all the different sources of error as part of the same analysis, and at the amount of inconsistency due to each source.
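A minimal Python sketch of three of the coefficients named above (Cronbach's alpha for internal consistency, odd-even split-half with the Spearman-Brown correction, and a test-retest correlation). The scores are made up for illustration, and the function names are hypothetical helpers, not part of any library.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def cronbach_alpha(scores):
    """Internal consistency. `scores` holds one list of item scores per person."""
    k = len(scores[0])  # number of items
    item_vars = [variance([person[i] for person in scores]) for i in range(k)]
    total_var = variance([sum(person) for person in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def split_half(scores):
    """Odd-even split-half reliability, stepped up with Spearman-Brown."""
    odd = [sum(person[0::2]) for person in scores]   # items 1, 3, ...
    even = [sum(person[1::2]) for person in scores]  # items 2, 4, ...
    r = pearson_r(odd, even)
    return 2 * r / (1 + r)


# Made-up data: 5 people x 4 items, plus total scores from a later retest.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
retest_totals = [17, 10, 13, 18, 7]

alpha = cronbach_alpha(scores)                     # consistency across items
half = split_half(scores)                          # consistency across items
test_retest = pearson_r([sum(p) for p in scores],  # consistency across time
                        retest_totals)

print(f"alpha={alpha:.2f} split-half={half:.2f} test-retest={test_retest:.2f}")
```

Inter-rater reliability can be sketched the same way by correlating two raters' scores for the same people. G-theory needs a variance-components analysis and is not shown here.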
What is validity?
- whether the test measures what it is supposed to measure
How do we examine validity?
- Evidence from item CONTENT
- Evidence from PROCESS/MANIPULATIONS (e.g. whether the test taker is using heuristics or actually being tested on what the test is trying to measure)
- Evidence from INTERNAL STRUCTURE
- Evidence from RELATIONSHIP TO OTHER VARIABLES: discriminant (doesn't correlate highly with something that the test isn't measuring), convergent (does correlate highly with something that the test is designed to measure), criterion (concurrent + predictive)
- Evidence from CONSEQUENCES OF TEST USE.
What are the test standards?
- recommendations for using and interpreting test scores
- provide a common standard to go by
What is validity?
- validity refers to the extent to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests.
What are some ways of evaluating validity, according to the Test Standards, 2014?
- EVIDENCE (from empirical observations)
- THEORY
- INTERPRETATION (meaning that the test users derive from them)
- USE OF TESTS (outcome to evaluate validity)
What was the criterion-based view of validity?
- that a test is valid for anything with which it correlates
What is the Tripartite view of validity? (1966 Standards)
- CONTENT VALIDITY
- CRITERION VALIDITY (concurrent & predictive)
- CONSTRUCT VALIDITY (convergent & discriminant)
What is the Tripartite + outcomes view of validity? (1985 Standards)
- examines the intended or unintended consequences of the test
What is the 1999 Standards view of validity?
- A unitary form of validity, based on evidence from multiple sources to support an argument for what the test scores actually mean
How does the 1999 Standards view of validity examine validity? And what was their aim?
- A unitary form of validity, based on evidence from multiple sources to support an argument for what test scores actually mean.
- Evidence from the CONTENT of the test
- Evidence from RESPONSE PROCESSES
- Evidence from the INTERNAL STRUCTURE
- Evidence from the RELATIONSHIP TO OTHER VARIABLES
- Evidence regarding the CONSEQUENCES OF TESTING.
What is the 1950s criterion view of validity?
- a test is used to predict an outcome. How well it predicts this outcome is the validity of the test
- conceptualised as a STATIC PROPERTY OF THE MEASURE: a test is either valid or not valid.
- assumes it will always be used for one purpose.
What are some problems with the criterion view of validity?
- there is not always one obvious criterion variable (e.g. criterion for test of self-control? aggression?)
- some tests are used for DIFFERENT purposes, in DIFFERENT GROUPS (e.g. an English reading comprehension test is a valid indicator of 6th grade academic achievement, but a poor indicator of the intelligence of adult migrants from non-English-speaking countries)
- validity depends on the test's purpose and use, and on the characteristics of the test takers
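A minimal Python sketch of the point above: under the criterion view, validity is the correlation between test scores and a criterion, and that correlation can differ by group and purpose. All numbers are invented for illustration, and the group names and criterion measures are assumptions made to mirror the reading comprehension example.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


# Invented (reading test scores, criterion scores) for two groups.
# Criterion for the first group: academic achievement.
# Criterion for the second group: a separate intelligence measure.
groups = {
    "sixth_graders": ([55, 62, 70, 78, 85, 91], [58, 60, 72, 75, 88, 90]),
    "adult_migrants": ([30, 45, 38, 52, 41, 35], [100, 108, 110, 99, 95, 104]),
}

# One validity coefficient per group and purpose, not one per test.
validity = {name: pearson_r(test, criterion)
            for name, (test, criterion) in groups.items()}

for name, r in validity.items():
    print(f"{name}: r = {r:.2f}")
```

With this invented data the same test tracks the criterion closely in one group and barely at all in the other, which is why a single "valid / not valid" label for the test is misleading.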
What are the three components of the Tripartite view?
- Criterion validity:
- concurrent: criterion measured at the same time as the test is administered
- predictive: criterion measured some time after the test is administered
- Content validity: content of the test is both RELEVANT to the domain and REPRESENTATIVE of the domain
- Construct validity:
- convergent: concepts that are THEORETICALLY RELATED demonstrate an EMPIRICAL RELATIONSHIP
- discriminant: concepts that are THEORETICALLY UNRELATED show NO empirical relationship
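A minimal Python sketch of how convergent and discriminant evidence is usually checked: correlate the new measure with a theoretically related measure and with a theoretically unrelated one. The three score lists are invented for illustration, and the variable names are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


# Invented scores for the same 8 people on three measures.
new_scale = [12, 18, 9, 22, 15, 20, 7, 17]        # the test being validated
related_scale = [14, 19, 10, 24, 13, 21, 9, 16]   # theoretically related construct
unrelated_scale = [6, 3, 5, 5, 2, 3, 3, 5]        # theoretically unrelated construct

convergent_r = pearson_r(new_scale, related_scale)      # expected to be high
discriminant_r = pearson_r(new_scale, unrelated_scale)  # expected to be near zero

print(f"convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```

Concurrent and predictive criterion validity use the same correlation; they differ only in when the criterion scores are collected (at the same time as the test, or some time after it).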