Assessment - General Flashcards
What is the difference between reliability and validity?
Reliability is whether a test consistently produces the same result.
Validity is whether it measures what it is supposed to measure.
Tests can be reliable (they measure something consistently), but not valid (they don’t measure
what the authors claim or presume).
Tests can never be valid without being reliable.
Tests have degrees of reliability and validity.
Reliability
What are the two reliability tests?
Reliability is whether a test consistently produces the same result. It is about consistency, trustworthiness and precision (final measurement vs error).
Test-retest reliability - whether the test can be re-administered and still give a consistent result.
Internal consistency - how consistent items within a test are at measuring the overall construct. Cronbach’s α is one good way to measure this.
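A minimal sketch of how Cronbach’s α can be computed, using the standard formula α = (k / (k - 1)) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are made up for illustration and do not come from any real test.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with rows = respondents, columns = items."""
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 3 items on a 0-3 scale.
responses = [
    [0, 1, 0],
    [1, 1, 1],
    [2, 2, 1],
    [3, 2, 3],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # 0.94
```

Items that rise and fall together across respondents push α towards 1; unrelated items pull it towards 0.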
Validity
Validity is whether it measures what it is supposed to measure.
This is about the test itself. A test cannot be valid if it is not reliable, but reliability alone does not make a test valid. There are many facets of validity.
What are the five facets of validity?
Content - this may include face validity, but face validity is not one of the facets. Content validity is just whether the content reflects what is intended.
Concurrent - can it be used in place of another test, e.g. can the BDI be used in place of the DASS?
Predictive validity (criterion-related) - does the test accurately predict the criterion it should, e.g. IQ scores and performance at high school.
Construct - does it correlate with other constructs that it should, and does it discriminate from other constructs that it should?
External - is it generalisable, can it be used in other contexts?
What are the main threats to reliability?
The test itself:
- Measurement error
- Test not being valid
The way the test is administered is as important as the test itself.
1. The tester (and bias)
Scoring - inter-rater and intra-rater reliability
2. The person being tested (motivation, effort, culture, language, education, age)
3. The context (e.g. room, weather, testing conditions, timing).
What are the strengths and weaknesses of the DASS?
ONE MEASURE - 3 subscales for stress, anxiety and depression.
Created to indicate DAS (depression, anxiety, stress) in a normal population. Additionally adds a normal level**
Standardized for an Australian Population
Excellent Internal consistency (most over .90).
Sub-scales correlate strongly with similar measures.
Designed to measure the ‘state’ of people
- Distinguishes well between features of depression, physical arousal, and psychological tension and agitation, which may inform treatment.
- *Does not include a suicidality question to identify risk
What are the strengths and weaknesses of the BDI and BAI?
2 separate scales (too specific?)
The BDI has been criticized over whether it is testing state or trait depression. It aims to measure state.
US standardization.
designed according to clinical diagnostic criteria.
BDI has questionable test-retest reliability: .48 to .86 (DASS better)
Cronbach’s alpha only acceptable. (DASS better)
good convergent validity
Includes a suicidality question
What are some issues with self-report questionnaires in general?
Memory: implicated in certain conditions
Mood: (Negative Bias in depression may lead to exaggeration) STATE VS TRAIT
Motivation/ Effort: Assumes people are able and willing to report
Malingering: Faking Good, Faking Bad
Social Desirability, and potentially not wishing to inform the tester due to embarrassment
Culture: potential bias against ethnic subgroups of the population
Environment (noise, heat, light)
Life Factors: Tests do not consider external influences (eg stress, grief etc)
Convergence: Unable to determine if there is convergence between self-report and an informant’s report (close friend or observer)
May be hard to decide on only a 4-point Likert scale
What are some issues with self-report questionnaires in general? Self-awareness
May not have enough personal insight into how they think. (one study showed: evidence)
Assumes that people are able to report their personality accurately (requires self-knowledge)
Self awareness - day to day mood and/or current mood or frequency.
STATE VS TRAIT
What are the four validity scales built into the PAI?
Inconsistency
Infrequency
Negative Impression
Positive Impression
Inconsistency and Infrequency assess deviation from conscientious responding
Negative Impression and Positive Impression assess impression management
Name the ICN scale and describe in detail?
The ICN scale is the inconsistency scale.
Measures respondent consistency. (NB: may also indicate inattention or not caring or low literacy)
10 item pairs, each with related content.
Items within each pair correlate with one another, but there is no overall content theme.
Pairs are positively and negatively worded, and placed far apart on the test.
For example: "I often have trouble sleeping at night" and "I rarely have trouble getting to sleep". There are several of these item pairs.
64T or below = consistent
65T to 72T = some inconsistency *caution
73T or above = inattentive or inconsistent responding
Name the INF scale and describe in detail?
The INF is the infrequency scale. It measures random responding, indifference, carelessness, confusion, reading difficulties.
8 items; very rarely endorsed in the normal population
No theme to the content.
Impact of pathology was minimized (e.g. schizophrenia).
For example: "My favourite hobbies are archery and stamp-collecting"*. "My favourite poet is Raymond Kertezo".
59T or below = appropriate attention and comprehension
60T to 74T = atypical responding *caution (question)
75T or above = invalid
Name the NIM scale and describe in detail?
The Negative Impression scale: measures negative self-presentation and exaggeration. ‘Faking Bad’ *compensation claims *cry for help
9 items that are answered differently by people ‘faking bad’. Clinical populations score worse on these, yet the items are *bizarre and non-clinical.
For example: "I don’t have any good memories from my childhood." "Every once in a while I completely lose my memory."
72T or below = no impact of negative responding
73T to 91T = indicates exaggeration *caution +++ cry for help
Above 91T = malingering; invalid
Name the PIM scale and describe in detail?
Positive Impression Management - measures attempts to make a favourable impression and appear free from faults. ‘Faking Good’
9 items, answered differently by normals and clinical samples. Clinical samples score slightly lower than normals.
Example: "I sometimes complain too much." "Sometimes I am too impatient."
56T or below = no favourable impression making
57T to 67T = portrayed self as relatively free from common faults *caution
68T or above = invalid / interpret with extreme caution
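The cutoffs for the four scales can be collected into one lookup, as a memory aid. This is only a sketch: the band labels ("ok", "caution", "invalid") and the names `CUTOFFS`, `classify` and `review_profile` are invented here, and the numbers are transcribed from these notes rather than checked against the PAI manual. It assumes whole-number T scores and reads the top NIM band as scores above 91T.

```python
# Upper bounds (inclusive) of the first two bands for each scale,
# transcribed from the cutoffs in these notes (not from the PAI manual).
CUTOFFS = {
    "ICN": (64, 72),  # 73T or above: inattentive or inconsistent
    "INF": (59, 74),  # 75T or above: invalid
    "NIM": (72, 91),  # above 91T: malingering, invalid
    "PIM": (56, 67),  # 68T or above: invalid / extreme caution
}


def classify(scale, t_score):
    """Return 'ok', 'caution' or 'invalid' for a whole-number T score."""
    ok_max, caution_max = CUTOFFS[scale]
    if t_score <= ok_max:
        return "ok"
    if t_score <= caution_max:
        return "caution"
    return "invalid"


def review_profile(profile):
    """Classify every validity scale in a {scale: T score} mapping."""
    return {scale: classify(scale, t) for scale, t in profile.items()}


# Hypothetical profile, made up for illustration.
profile = {"ICN": 58, "INF": 62, "NIM": 95, "PIM": 40}
flags = review_profile(profile)
print(flags)
# {'ICN': 'ok', 'INF': 'caution', 'NIM': 'invalid', 'PIM': 'ok'}
```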
Why does Paul Meehl contend that statistical decision making is always better than clinical decision making?
Statistical vs clinical significance? (this is on the exam)
Clinical:
Cognitive bias (memory can only hold so much information, poor probability estimates, processing too fast, cannot consider the complexity of various parameters and hold all that info in mind at one time)
Lack of uniformity
Variability in training, judgement, experience
Statistical:
Standardised and have norms for comparison
multiple parameters and comparison to each other
empirical findings and data
shown to be better 19/20 times
eliminates human error
good tests have additional validity measures built in (detect malingering)
CAVEAT*** How much difference between scores is enough to be really different?
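One common way to approach the caveat is the standard error of measurement, SEM = SD × sqrt(1 - reliability), and the standard error of the difference between two scores, SEdiff = sqrt(2) × SEM. A difference larger than about 1.96 × SEdiff is unlikely to be measurement error alone. This approach is not from these notes; the sketch below assumes both scores come from the same test with the same reliability, and the function names and numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


def critical_difference(sd, reliability, z=1.96):
    """Smallest score difference unlikely to be measurement error alone."""
    sem = standard_error_of_measurement(sd, reliability)
    # Both scores carry error, so the difference has sqrt(2) times the SEM.
    se_diff = math.sqrt(2) * sem
    return z * se_diff


def is_reliable_change(score_1, score_2, sd, reliability, z=1.96):
    """True if the two scores differ by more than the critical difference."""
    return abs(score_2 - score_1) > critical_difference(sd, reliability, z)


# Hypothetical example: T-score metric (SD = 10) and reliability of .90.
threshold = critical_difference(sd=10, reliability=0.90)
print(round(threshold, 1))  # 8.8
print(is_reliable_change(60, 70, sd=10, reliability=0.90))  # True
print(is_reliable_change(60, 65, sd=10, reliability=0.90))  # False
```

Lower reliability widens the threshold, which ties back to the point that interpretation depends on the reliability of the test.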