All psychometric tests Flashcards
What does the PPVT-5 measure?
- measures receptive vocabulary
- assesses knowledge of a broad range of vocabulary words
Where can the PPVT-5 be applied?
- screening for receptive language disorders
- screening for preschool children
- understanding reading difficulties
- assessing word knowledge
What are the two rules applied in the PPVT-5? And can you define these rules?
Basal rule: achieved when an individual has responded correctly to the first three items
Ceiling rule: achieved when an individual has responded incorrectly to six consecutive items
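A minimal Python sketch of how the two rules on this card could be checked against a sequence of item responses. The function names and the list-of-booleans representation (True = correct) are illustrative assumptions, not taken from the PPVT-5 manual.

```python
def basal_met(responses):
    """Basal rule as stated on the card: first three items answered correctly."""
    return len(responses) >= 3 and all(responses[:3])


def ceiling_index(responses, run_length=6):
    """Return the index of the item that completes the first run of
    `run_length` consecutive incorrect answers, or None if no ceiling is reached."""
    streak = 0
    for i, correct in enumerate(responses):
        # A correct answer resets the streak; an incorrect one extends it.
        streak = 0 if correct else streak + 1
        if streak == run_length:
            return i
    return None


# Hypothetical response record: True = correct, False = incorrect.
record = [True, True, True, False, True] + [False] * 6
print(basal_met(record), ceiling_index(record))
```

Testing would stop at the item index returned by `ceiling_index`; a return value of None means the ceiling was not reached in the recorded responses.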
What are the different scoring steps in the PPVT-5?
- Calculate the raw score
- Convert the raw score into a standard score
- Determine the confidence interval
- Determine percentile ranks
- Determine normal curve equivalents
- Determine stanines
- Determine test age equivalents
- Determine growth scale value
What can be said about the reliability and validity of the PPVT-5?
- excellent split-half reliability
- good test-retest reliability
- correlated with relevant variables (e.g., language impairment)
What are limitations of the PPVT-5?
- only receptive knowledge of vocabulary
- educated guess allowed (25% chance)
- normed on individuals with English as dominant language
- does not measure whether the individual can also RETRIEVE a word (just reception)
(Should not be the only criterion for a diagnosis)
What is the purpose of the DVT?
Assessment of attention during rapid visual tracking through accurate selection of target stimuli.
How do you score the DVT?
Total Time = t(page 1) + t(page 2)
Total error = e(page 1) + e(page 2)
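The two sums above can be sketched in Python as follows. The `DvtPage` class, the use of seconds as the time unit, and the choice to count both omission and commission errors toward a page's error total are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DvtPage:
    seconds: float    # completion time for the page (unit assumed)
    omissions: int    # targets that were not crossed out
    commissions: int  # non-targets that were crossed out

    @property
    def errors(self):
        # Assumption: both error types count toward e(page).
        return self.omissions + self.commissions


def dvt_totals(page1, page2):
    """Return (total time, total errors) summed over both pages."""
    total_time = page1.seconds + page2.seconds
    total_errors = page1.errors + page2.errors
    return total_time, total_errors


# Hypothetical protocol values, for illustration only.
print(dvt_totals(DvtPage(183.0, 2, 1), DvtPage(201.5, 0, 3)))
```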
What two kinds of error can occur in the DVT?
Omission error: did not cross out a six.
Commission error: crossed out a wrong number.
What can the DVT be used for?
Assessing attention in the context of drug use, exposure to physical agents, substance abuse, cardiovascular disease, schizophrenia, and stroke rehabilitation.
What can be said about the reliability and the validity of the DVT?
- validity evidence from experimental studies
- high test-retest reliability for time
- alternate-form reliability for time (around r = .90)
- ONLY moderate test-retest reliability for error (r = .66).
What can be said about the norming of the DVT?
- DVT has been standardized and normed on two adult samples.
- BUT no norms are provided.
What are limitations of the DVT?
- practice effects are not well enough studied
- time and error correlation is low
- norm files are not included
What are the 5 domains of the NEO-PI-R?
- Openness
- Conscientiousness
- Extraversion
- Agreeableness
- Neuroticism
What are some applications of the NEO-PI-R?
- Behavioral medicine and health psychology
- Psychological research
- In counseling, clinical psychology and psychiatry
- Vocational counseling and industrial/organizational psychology
What are exclusion criteria of the NEO-PI-R?
- test should not be scored if 41 or more responses are missing
- the validity should be checked (e.g., for acquiescence, nay-saying, random responding)
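The missing-response criterion can be expressed as a short check. This is a sketch only: representing a skipped item as `None` and the function names are assumptions, and the validity checks (acquiescence, nay-saying, random responding) are not modeled here.

```python
MAX_MISSING = 40  # from the card: 41 or more missing responses means do not score


def count_missing(responses):
    """Count items left unanswered (represented here as None)."""
    return sum(1 for r in responses if r is None)


def is_scorable(responses):
    """True if the number of missing responses stays below the exclusion threshold."""
    return count_missing(responses) <= MAX_MISSING


# Hypothetical answer sheet with 41 unanswered items.
sheet = [3] * 199 + [None] * 41
print(count_missing(sheet), is_scorable(sheet))
```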
How do you calculate the scores for the NEO-PI-R?
- Calculate the facet raw scores
- Calculate the domain raw scores
- Profile forms: conversion of raw scores into standard scores.
What can be said about some of the psychometric properties of the NEO-PI-R?
- internal consistency around .9 for domains
- also high test-retest reliability.
- factor structure is an important piece of evidence (confirms the idea of 5 domains)
What are some limitations of the NEO-PI-R?
- Social desirability bias
- accuracy of self-description in S-form (depending on one’s self-concept)
(Should not be used ALONE for a diagnosis)
What is the BDI-2? How many items does it have and what does it measure?
A 21-item self-report instrument for measuring the severity of depression in adults and adolescents aged 13 years or older.
How is the BDI-2 scored?
- sum the ratings for the 21 items
- each item is rated on a 4-point scale ranging from 0 to 3
- hence, a maximum score of 63
When looking at the results - how can the BDI-2 be interpreted?
0-13 minimal depression
14-19 mild depression
20-28 moderate depression
29-63 severe depression.
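The scoring rule and the cut-offs above can be combined in a small Python sketch. The function names are illustrative; the item range (0 to 3), the item count (21) and the score bands are taken from these cards.

```python
def bdi2_total(ratings):
    """Sum the 21 item ratings, each of which must be 0, 1, 2 or 3."""
    if len(ratings) != 21:
        raise ValueError("expected 21 item ratings")
    if any(r not in (0, 1, 2, 3) for r in ratings):
        raise ValueError("each rating must be an integer from 0 to 3")
    return sum(ratings)


def bdi2_severity(total):
    """Map a total score (0-63) to the severity band listed on the card."""
    if not 0 <= total <= 63:
        raise ValueError("total score must be between 0 and 63")
    if total <= 13:
        return "minimal"
    if total <= 19:
        return "mild"
    if total <= 28:
        return "moderate"
    return "severe"


# Hypothetical respondent who rates every item with 1.
total = bdi2_total([1] * 21)
print(total, bdi2_severity(total))
```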
What can be said about the reliability of the BDI-2?
- Test-retest reliability: .93 HIGH (scores are consistent over time)
- Mean Internal consistency: .92 for clinical and .93 for non-clinical HIGH
- HIGH Construct validity: related to Beck Hopelessness Scale
What can be said about the validity of the BDI-2?
- HIGH construct validity: Correlations with other measures of depression
- related to Beck Hopelessness Scale (as example)
What are some of the limitations of the BDI-2?
- Different performance of the test in different settings (no generalizability)
- Self-report nature (results are affected by social desirability).
What does the APM test look at?
- looks at eductive ability
- yields information on people’s ability to forge new, largely non-verbal insights
What four different errors are there in the APM test?
- incomplete solution
- arbitrary lines of reasoning
- Over-determined choices
- Repetitions
What can be said about the reliability of the APM?
- Test-retest reliability .91 (for adults), lower (.86) for 11.5-year-olds.
- Internal consistency HIGH - Set 2 yields .83 to .87 (split-half reliability)
What can be said about the validity of the APM test?
- turned out to be one of the best single measures of g available
- although a measure for eductive ability
What are some limitations of the APM?
- learning effect (repeated testing)
- lack of ecological validity
- over-emphasis of visual-spatial intelligence
- not culturally fair
- performance pressure and anxiety.
What does the Quality of Life Inventory (QOLI) look like?
- measures quality of life (broader perspective)
- not limited to negative aspects
- 16 areas of life
What is special about the Scoring of the QOLI?
- satisfaction ranges from -3 to 3
- importance ranges from 0 to 2
(calculate weighted satisfaction by multiplying importance x satisfaction) - so more important areas carry more weight in the score.
Psychometric Properties QOLI
(Test-retest reliability)
- only okayish
- BUT: test-retest reliability for such tests lower because you want treatments to work and then the scores will hopefully change
Limitations QOLI
- no representative standard & clinical sample
- significant difference found for ethnic and race groups.
- self-report bias - possibly better understood as “perceived” happiness, due to self-report
What is the SON-R 2-8?
- general intelligence test for kids 2-8
- written/ spoken language not required
- six subtests (performance vs. reasoning based)
What are the six subtests of the SON?
- Puzzles,
- Categories
- patterns
- situations
- mosaics
- analogies
What is important to know when administering the SON?
- present the test one by one
- you can provide feedback
- you don’t start with first item but rather with the one matching the age of the child (ascending item difficulty)
How does the scoring work for SON?
- Correct, incorrect, refusal
- sum up points earned for each of the six subtests.
- then look for respective IQ score.
SON psychometric properties
- test-retest reliability: good! (.8)
- overall reliability: very good (.9)
Limitations of the SON
- Cultural Bias!!
- not applicable to kids with visual impairment
- very small norming sample, with only a few kids per age group (as there are a lot of age intervals)
- rather complex
What is the SCID?
- semistructured interview guide for making the major DSM-5 diagnoses.
- based on DSM-5
Purpose of the SCID
- Diagnostics (Evaluation)
- Research - Study population Selection
- characterize study population
Administration of the SCID
- Overview: open-ended overview, collecting background information (history-taking form, “Anamnesebogen”)
- Structured interview guide: going through the sections of the DSM to come to a diagnosis
- Summary Score Sheet
Specialities SCID
- based on self-reports
- BUT an external rating as the clinician is rating!
- you can skip items if certain items are answered with NO → just follow the instruction in the manual
Psychometric Properties SCID
(efficiency, reliability, validity, objectivity)
- “gold standard” in diagnosis of DSM-5 disorders
- objectivity: good, standardized questions
- interrater-reliability: depending on rarity of disorder (rare = inconsistent), check for disorders!
- validity: depending on disorder up to 100%!
Limitations SCID
- categorical approach (boundary cases)
- interrater-reliability is low (especially for rare disorders)
- not culturally fair
- results influenced by memory bias of patients
- stress/ anxiety inducing
What does the AMI do?
- measures Achievement motivation
- 170 questions on the 17 dimensions
Scoring of the AMI
- Assess overall position of scoring
- assess three major clusters of motivational factors
- look at the particularly high and low individual facets
Psychometric Properties AMI
- Test-retest reliability is EXCELLENT! (.94)
- internal consistency is EXCELLENT (.96)
Limitations AMI
- self-reporting bias and social desirability bias, cultural bias
- unclear whether all 17 dimensions really measure achievement motivation (some are contradictory)
- limited scope (situational factors)
What does the STAXI-2 measure?
- measures the experience, expressions and control of anger
- how you feel right now, how you generally feel, and how you generally react
Components of Anger Experience
(STAXI)
- State Anger: how angry you feel RIGHT NOW
- Trait Anger: Disposition of how angry you normally feel
- Expression in & out / Control in & out
Looking at the results from normal vs. psychiatric adults - normative samples (STAXI)
- psychiatric patients significantly higher than normal adults on ALL scales
- Interaction effect group x gender (lowest normal F, highest Clinical F)
- psychiatric patients suppress Anger more frequently & less control of outward expression
- Gender differences: males higher scores
- age effect: state and trait anger decrease with age
Test-retest reliability STAXI
- LOW for State Anger → but that is expected because it is not a stable construct, but rather a highly dynamic one!
- Trait anger around .7 - .8
Limitations STAXI
- little information on the samples
- neglects cultural differences in anger expression (only US individuals)
- very big sample size → not surprising that results are significant!
- Self-report measure: social-desirability; only assesses what person thinks!
Providing feedback in the NEO-PI-R - which T-scores tell us what?
T-score of 56 or above: HIGH
T-score of 45-55: AVERAGE
T-score of 44 or below: LOW
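A short sketch of these feedback bands as a lookup function. It assumes integer T-scores and reads the card's "56" as "56 or above" and "44" as "44 or below"; the function name is illustrative.

```python
def neo_t_band(t_score):
    """Classify an integer NEO-PI-R T-score into the feedback bands on the card."""
    if t_score >= 56:
        return "HIGH"
    if t_score >= 45:
        return "AVERAGE"
    return "LOW"


# Hypothetical domain T-scores for one respondent.
profile = {"Neuroticism": 62, "Extraversion": 50, "Openness": 41}
for domain, t in profile.items():
    print(domain, t, neo_t_band(t))
```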
Calculation of the QOLI
- Calculate the weighted satisfaction ratings by multiplying importance x satisfaction
- Add up all weighted satisfaction ratings
- Count the number of positive and negative weighted satisfaction columns (N?)
- Divide total weighted satisfaction by N (the number of areas)
- Hence, create a raw score
- Find your respective T-score (standardized) in the table
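The raw-score steps above can be sketched in Python. This assumes that N is the number of areas whose weighted satisfaction is non-zero (positive or negative), as the counting step suggests; the function name and input format are illustrative. The final T-score lookup is left out because the norm table is not part of these cards.

```python
def qoli_raw_score(ratings):
    """Compute a QOLI-style raw score from (importance, satisfaction) pairs.

    importance is 0, 1 or 2; satisfaction ranges from -3 to 3.
    Returns None if no area has a non-zero weighted satisfaction.
    """
    weighted = []
    for importance, satisfaction in ratings:
        if importance not in (0, 1, 2):
            raise ValueError("importance must be 0, 1 or 2")
        if not -3 <= satisfaction <= 3:
            raise ValueError("satisfaction must be between -3 and 3")
        weighted.append(importance * satisfaction)

    # Assumption: N counts only areas with a positive or negative weighted rating.
    n = sum(1 for w in weighted if w != 0)
    if n == 0:
        return None
    return sum(weighted) / n


# Hypothetical ratings for four areas: (importance, satisfaction).
print(qoli_raw_score([(2, 3), (1, -2), (0, 1), (2, 1)]))
```

In the example, the weighted ratings are 6, -2, 0 and 2, so the total is 6 and three areas count toward N.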
How is the SCID rated?
For each diagnostic criterion, the clinician makes the following rating:
- (Absent)
NO (statement is false)
+ (Threshold on a continuum?)
YES (dichotomous statement is clearly true)
What does the AMI testing procedure look like?
1-7 Likert scale (4 is neutral)
scales are equally distributed in the questionnaire
What does the scoring look like for the STAXI-2?
- percentile ranks and T-scores provided in normative samples
- 25th-75th percentile: normal range
- above the 75th percentile: anger interferes with normal functioning.
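A minimal sketch of this percentile interpretation. The card only describes the normal range and scores above the 75th percentile, so the label for scores below the 25th percentile is a placeholder assumption, as is the function name.

```python
def staxi2_band(percentile):
    """Interpret a STAXI-2 percentile rank using the ranges on the card."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile > 75:
        return "elevated"
    if percentile >= 25:
        return "normal range"
    # Not described on the card; label chosen for illustration only.
    return "below normal range"


# Hypothetical percentile ranks for three scales.
for scale, pct in {"State Anger": 50, "Trait Anger": 82, "Anger Control": 10}.items():
    print(scale, pct, staxi2_band(pct))
```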