Assessment L2 - Reliability & Validity Flashcards
What are sources of assessment error?
- Measurement Error
- Imperfect test validity - every test is associated with some error.
- Sampling Error
- Scoring/Admin Errors
- Patient Variables
- Test score = syndrome + measurement error + premorbid ability + drugs + effort + practice
What is measurement error referring to?
Assessment at a particular time is simply a picture in time - a pathology can change and progress over time.
Trait vs state - measurement can be affected by ‘state’ eg. cognitive abilities will be compromised during a depressed state.
What does sampling error refer to?
Error caused by observing a sample instead of the whole population
What does scoring or administration error refer to?
- intra-rater reliability - degree of agreement among repeated administrations of a diagnostic test performed by a single rater.
- inter-rater reliability - degree of agreement among different raters scoring or administering the same test.
What do patient variables refer to?
- task engagement/motivation to perform well
- educational/occupational/cultural/language/age factors
What is the goal of assessment?
- to maximise TRUE SCORE VARIANCE and minimise ERROR.
- individual’s true score is conceptualised to be the average score in a hypothetical distribution of scores that would be obtained if the individual took the same test an infinite number of times.
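This ‘average over an infinite number of administrations’ idea can be sketched with a quick simulation (all numbers here are made up for illustration):

```python
import random
import statistics

random.seed(42)

TRUE_SCORE = 100   # hypothetical fixed true score
ERROR_SD = 15      # hypothetical spread of random measurement error

def administer_once():
    """One observed score = true score + random measurement error."""
    return TRUE_SCORE + random.gauss(0, ERROR_SD)

# A large (but finite) stand-in for the 'infinite number of times'
scores = [administer_once() for _ in range(100_000)]
mean_observed = statistics.mean(scores)

# The mean of the observed scores converges on the true score
print(round(mean_observed, 1))
```

Any single administration can miss the true score by a wide margin; only the long-run average is stable - which is why minimising error variance matters.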
What are the sources of error in psych testing?
- the CONTEXT - environment, test administrator, scorer, and the reasons for which the test is being taken
- the TEST TAKER - genuine? motivated?
- the TEST ITSELF - unreliable?
What is reliability?
- Consistency in measuring a construct - A TEST IS ONLY AS VALID AS IT IS RELIABLE.
- internal consistency
- test-retest reliability
- alternate forms
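Test-retest reliability from the list above is usually quantified as the Pearson correlation between scores from two administrations. A minimal sketch with simulated scores (the ability and error figures are made up):

```python
import random
from statistics import mean, pstdev

random.seed(0)

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

# 200 simulated test takers: a stable true ability, plus fresh
# measurement error at each administration
true_ability = [random.gauss(100, 15) for _ in range(200)]
time1 = [t + random.gauss(0, 5) for t in true_ability]
time2 = [t + random.gauss(0, 5) for t in true_ability]

retest_r = pearson_r(time1, time2)
print(round(retest_r, 2))  # close to 1 -> consistent measurement over time
```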
What is internal consistency?
How consistent the items within a test are at measuring the overall construct - measured with Cronbach’s alpha.
What are the ‘ground rules’ that are commonly assumed in assessment?
- test administrators and scorers are assumed to carefully select appropriate instruments and suitable test environments, establish good rapport with test takers, and administer and score the tests in accordance with standardised procedures.
- test takers are also assumed to be properly prepared and well motivated to take the tests.
What reliability considerations should you make when choosing a test/instrument?
- determine the potential SOURCES OF ERROR
- examine RELIABILITY DATA available on the instruments, and the types of samples that they were obtained from
- EVALUATE the data on reliability in light of all other attributes - question, normative and validity data, cost and time constraints
- Select the test that promises to produce the MOST reliable scores for the purpose and population at hand.
What is Cronbach’s Alpha?
- A measure of internal consistency, can be described as correlation of the test with itself.
- The alpha coefficient is a function of 2 factors:
1. NUMBER OF ITEMS in a test
2. RATIO OF THE SUM OF THE ITEM VARIANCES (the variability of test takers’ performance across the individual items) TO TOTAL TEST SCORE VARIANCE.
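Those two factors combine in the usual formula alpha = (k / (k - 1)) * (1 - sum of item variances / total score variance). A minimal Python sketch with made-up item data:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding every person's score."""
    k = len(item_scores)                                    # number of items
    totals = [sum(person) for person in zip(*item_scores)]  # total test scores
    sum_item_var = sum(pvariance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - sum_item_var / pvariance(totals))

# Toy data: 4 items answered by 6 people (rows = items, columns = people)
items = [
    [3, 4, 2, 5, 4, 3],
    [3, 5, 1, 4, 4, 2],
    [2, 4, 2, 5, 3, 3],
    [3, 5, 2, 4, 4, 2],
]
print(round(cronbach_alpha(items), 2))  # -> 0.94, high internal consistency
```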
What kind of test is cronbach’s alpha used for usually?
More commonly used for questionnaires than for tests like the WAIS - because on tests with a difficulty gradient, performance on the hard items won’t be consistent with performance on the easy items.
What is cronbach’s alpha conceptually meant to represent?
It is conceptually meant to be the estimate of the reliability equivalent to the average of all of the possible split half coefficients that would result from all possible ways of splitting the test in half.
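This equivalence can be checked numerically. For a 4-item test there are three ways of splitting the items into halves; averaging the three split-half coefficients (using the Flanagan-Rulon formula, which needs no Spearman-Brown correction) reproduces alpha exactly. A sketch with made-up data:

```python
from statistics import pvariance

def cronbach_alpha(items):
    k = len(items)
    totals = [sum(p) for p in zip(*items)]
    return (k / (k - 1)) * (1 - sum(pvariance(i) for i in items) / pvariance(totals))

def split_half(items, half_a):
    """Flanagan-Rulon split-half coefficient for one way of halving the test."""
    half_b = [i for i in range(len(items)) if i not in half_a]
    n = len(items[0])  # number of people
    a = [sum(items[i][p] for i in half_a) for p in range(n)]
    b = [sum(items[i][p] for i in half_b) for p in range(n)]
    total = [x + y for x, y in zip(a, b)]
    return 2 * (1 - (pvariance(a) + pvariance(b)) / pvariance(total))

# Toy data: 4 items answered by 6 people
items = [
    [3, 4, 2, 5, 4, 3],
    [3, 5, 1, 4, 4, 2],
    [2, 4, 2, 5, 3, 3],
    [3, 5, 2, 4, 4, 2],
]
# The three distinct splits of 4 items into halves (the complement is implied)
halves = [(0, 1), (0, 2), (0, 3)]
mean_split = sum(split_half(items, h) for h in halves) / len(halves)

print(round(cronbach_alpha(items), 4), round(mean_split, 4))  # the two match
```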
What kinds of reliability methods are prone to practice effects?
Test-retest and alternate-form reliability -
NOT
split-half and coefficient alpha.
What does a cronbach alpha of less than 0.7 mean?
Low reliability - suggests that the scores one derives from a test may not be very trustworthy.
What cronbach alpha level is said to be ACCEPTABLE??
Over 0.7 at a minimum; 0.8 to be safe.
What is validity?
Simply put - the extent to which a test measures what it is supposed to measure.
Problems with this definition:
- it treats validity as a property of tests, when validity is better understood as a property of score interpretations.
- it implies that a test score measures some construct directly.
- score validity is, to some extent, a function of the test author’s or developer’s understanding of whatever construct they intend to measure.
eg. the Wechsler tests measure what Wechsler conceptualised intelligence to be.
What are the 5 Types of validity?
- Content
- Concurrent
- Predictive
- Construct
- External
what is face validity?
- the superficial appearance of what the test measures from the perspective of the test taker/any naive observer.
- not considered a true form of validity.
- usually used to gather evidence that a particular test measures the construct it purports to measure by establishing high correlations between it and other existing instruments that are meant to assess the same construct.
what is an example of a test that is reliable but not valid.
- to be valid, a test must measure what it’s designed to measure.
- eg. a test that has items like ‘i like to push people around,’ and ‘some people find me threatening’ is not valid if it’s meant to be measuring sociability.
Describe content validity
- it is the degree to which a test measures what it was originally designed or intended to measure
- eg. IQ tests - does an IQ test actually measure IQ, rather than something else such as motivation to succeed in school?
What is concurrent validity?
Degree to which a test can serve as a substitute for another longer, or more costly test. (concerned with convenience)
eg. block design test is used to detect brain damage, and generates similar results to similar neurological tests that are more costly and dangerous.
what is predictive validity?
The degree to which a test accurately predicts what it was originally developed to predict. (concerned with predictive power)
- eg. SAT scores are designed to predict grades and graduation from college.