Psychological Assessment Flashcards
what does psychometrics measure?
differences in personality, intelligence, and psychological functioning
what are the seven assumptions of psychological assessments?
-psychological traits and states exist
-psychological traits can be quantified and measured
-test related behaviour predicts non test related behaviour
-tests have strengths and weaknesses
-various sources of error are part of the assessment process
-testing and assessment can be conducted in a fair, unbiased manner
-testing and assessment benefit society
what complex constructs do psychological assessments set out to measure?
mood, intellectual functioning, memory, attitudes
How does psychological testing compare to medical testing in reliability and validity? (Meyer et al., 2001)
-psychological tests are as, or more, reliable than many biological medical tests
-psychological tests are as, or more, valid than many medical tests
How would you describe an intelligent person in the categories of learning, vocabulary and problem solving?
learning: remembering lots of information and grasping things easily
vocabulary: can find the right words quickly
problem solving: applying their knowledge to solve real world problems
what roots does psychological intelligence testing have in history?
19th century: pioneers (Galton, Wundt, and Cattell) objectively measured sensory abilities and reaction times
1905: Alfred Binet and Theodore Simon published the first modern intelligence scale
WW2: screening the intellectual ability of new recruits
David Wechsler (1958) definition of intelligence:
a global concept that involves an individual’s ability to act purposefully, think rationally, and deal effectively with the environment
What is one of the most widely used intelligence scales?
Wechsler Adult Intelligence Scale (WAIS), reportedly used by 90% of clinical psychologists
What is the most recent version of WAIS?
WAIS-IV (the fourth)
What sample was the WAIS standardised on?
2200 people aged between 16 and 90 in the USA
What two things are extended every time the WAIS is restandardised, and what does this mean?
The IQ range and the age range. The wider age range reflects increased longevity, and also a recognition that elderly people are more commonly referred for testing than other age groups
Why must psychologists avoid deviating from the standard instructions of the WAIS?
for the fairest test, because the test norms were obtained using the standard instructions
How is intelligence measured in children ages 6 to 16?
The Wechsler Intelligence Scale for Children (WISC-V)
In what ways are the WAIS and WISC similar?
in structure, subtests, and psychometric properties
How would an elderly person’s raw score differ from that of young or middle-aged individuals on the WAIS?
Scores are age corrected: an elderly person’s raw score will typically be lower than a younger person’s, but someone who is exactly average for their age still has an IQ of 100.
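The age correction described above can be sketched as a standard-score conversion. This is a simplified illustration rather than the actual WAIS scoring procedure: the function name and the age-band norm values are hypothetical, and the IQ scale is assumed to have a mean of 100 and a standard deviation of 15.

```python
def age_corrected_iq(raw_score, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Convert a raw score to a standard score relative to one age band's norms."""
    # Distance from the age-band average, in standard deviation units.
    z = (raw_score - norm_mean) / norm_sd
    return scale_mean + scale_sd * z


# Hypothetical (raw-score mean, SD) norms for two age bands; NOT real WAIS values.
young_norms = (60.0, 10.0)
older_norms = (45.0, 10.0)

# Each person is exactly average for their own age band, so both obtain 100
# even though the older person's raw score is lower.
print(age_corrected_iq(60, *young_norms))  # 100.0
print(age_corrected_iq(45, *older_norms))  # 100.0
```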
How is the WAIS arranged?
consists of 10 core subtests arranged into 4 higher-level indexes, plus an overall IQ
At what age does subtest performance peak?
some subtests, including psychomotor speed, peak at 17/18 years, whereas others, such as vocabulary, peak later
What are the four indexes in the WAIS?
-Verbal comprehension
-Perceptual reasoning
-Working memory
-Processing speed
What skills does the Verbal Comprehension Index measure and how does it do this?
-measures well-consolidated verbal material and verbal reasoning. Vocabulary asks for words to be defined, and Similarities asks how two words are alike to test abstract reasoning.
-Information measures general knowledge and verbal comprehension
What does Perceptual Reasoning measure and how does it do this?
measures perceptual reasoning using three subtests: matrix reasoning, block design, and visual puzzles
What skills does Working Memory measure and how does it do this?
tests the ability to retain and manipulate information.
the digit span test is divided into three sections- digits forward, digits backward, and sequencing
there is also an arithmetic subtest
What skills does Processing Speed measure and how does it do this?
measures psychomotor speed using two subtests: coding and symbol search
What Processing Speed subtest is sensitive to almost any form of cognitive dysfunction?
Coding
What indexes are typically impaired following a head injury?
working memory and processing speed indexes
true or false: reliability is a necessary condition for validity
True
What does a reliability coefficient tell us?
how much of the variability in test scores is true variability and how much is measurement error. It also allows us to form confidence intervals on scores
if the reliability coefficient is 0.8, what percentage of the score variance is measurement error?
20%
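The arithmetic behind the two cards above can be sketched as follows. The function names are illustrative, and the scale standard deviation of 15 is an assumption (the usual IQ metric). Centring the interval on the observed score with the standard error of measurement is one common approach, not the only one.

```python
import math


def error_variance_proportion(reliability):
    """Proportion of observed-score variance attributable to measurement error."""
    return 1.0 - reliability


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(score, sd, reliability, z=1.96):
    """Approximate 95% interval (z = 1.96) centred on the observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return (score - z * sem, score + z * sem)


print(round(error_variance_proportion(0.8) * 100))  # 20
low, high = confidence_interval(score=100, sd=15, reliability=0.8)
print(round(low, 1), round(high, 1))
```

The lower the reliability, the larger the standard error and the wider the interval around any one score.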
What does reliability allow us to do?
quantify the confidence we have in test results and assess whether differences between an individual’s scores are likely to reflect true differences in ability or may have arisen by chance
why are psychologists warned not to reify a test score?
it is only an estimate of an individual’s true ability, mood level, etc.
What is an example of a failure to consider reliability of measures?
Chapman and Chapman (1973): patients with schizophrenia were compared with a healthy control sample on two tasks, and the patients showed a severe deficit on one of the tasks. However, the two tasks were identical except that one version had been made less reliable by shortening the original test
What did Nunnally and Bernstein (1994) generously propose as a reliability coefficient?
0.90
What did Sattler 2001 more recently suggest as a reliability coefficient?
0.70 and above
What are two methods to measure reliability?
Cronbach’s Alpha and Test-retest Reliability
What is Cronbach’s Alpha determined by?
the number of items in the test and the size of correlations between the items
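A minimal sketch of how Cronbach’s alpha can be computed from item-level data, using the usual variance form of the formula. The function name and the response data are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, each holding k item scores."""
    k = len(item_scores[0])
    items = list(zip(*item_scores))  # regroup the scores item by item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(person) for person in item_scores])
    # alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical responses from five people to four items.
responses = [
    [3, 3, 2, 3],
    [1, 0, 1, 1],
    [2, 2, 2, 1],
    [0, 1, 0, 0],
    [3, 2, 3, 3],
]
print(round(cronbach_alpha(responses), 2))
```

When items correlate strongly, the variance of the total score is large relative to the sum of the item variances, which pushes alpha towards 1.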
What makes a test more reliable?
a longer test (more items) with higher correlations between the items
Explain why a vocab test with four words would be less reliable than one with many words?
there is an enormous number of words, and a sample of four is not enough. By chance, some people will do better or worse on four particular words than they would if all words were tested
Are longer tests unconditionally more reliable?
longer tests are only more reliable provided the items in the longer test are as good (as highly correlated with other items) as those in the shorter version
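The trade-off between test length and item quality can be illustrated with the standardised form of alpha, which depends only on the number of items and the mean inter-item correlation. The function name and the correlation values below are chosen purely for illustration.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardised alpha: k * r / (1 + (k - 1) * r)."""
    return (n_items * mean_inter_item_r) / (1 + (n_items - 1) * mean_inter_item_r)


# Same item quality (mean r = 0.3), increasing length: reliability rises.
for k in (4, 20, 40):
    print(k, round(standardized_alpha(k, 0.3), 2))  # 0.63, then 0.9, then 0.94

# A longer test built from poor items (mean r = 0.05) is less reliable
# than a shorter test built from good items.
print(round(standardized_alpha(40, 0.05), 2))  # 0.68
```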
What are short tests developed for and how do they stay reliable?
short tests are developed to save time. Reliability need only fall marginally because poor items (those not highly correlated with other items) are selectively dropped.
What is the reliability of the WAIS-IV subtests in the UK and US?
they are all at or above 0.9
Which subtests tend to be lower, how are these assessed, and why are they lower?
the processing speed subtests, which are assessed using the test-retest method. Partly this is because processing speed is made up of only two components (coding and symbol search)
What is the reliability of composites (e.g. IQ) related to?
it is a function of the reliability of the components (subtests) and the correlations between the components
Under what circumstances do composites have superior reliability to components?
whenever the components are correlated
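A sketch of why correlated components yield a more reliable composite, using the standard formula for the reliability of an unweighted sum of standardised components. The function name and the numbers are illustrative assumptions, not WAIS values.

```python
def composite_reliability(reliabilities, correlations):
    """Reliability of an unweighted sum of standardised components.

    reliabilities: reliability of each component.
    correlations:  correlation for each distinct pair of components.
    """
    k = len(reliabilities)
    error_variance = k - sum(reliabilities)      # each component contributes 1 - r
    composite_variance = k + 2 * sum(correlations)
    return 1 - error_variance / composite_variance


# Two components, each with reliability 0.8.
print(round(composite_reliability([0.8, 0.8], [0.6]), 3))  # 0.875 when correlated
print(round(composite_reliability([0.8, 0.8], [0.0]), 3))  # 0.8 when uncorrelated
```

Correlated components add shared true variance to the composite while their errors do not accumulate in the same way, so the composite ends up more reliable than either part.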
What is the nature of reliability of WAIS-IQ and FSIQ?
the reliability is among the highest of any psychological instrument
FSIQ has a reliability (proportion of true variance) of 0.98. What is the measurement error?
2% measurement error
What does temporal stability refer to?
the extent to which a measure yields consistent scores over time
How is temporal stability tested?
using the test-retest method
What is the test-retest method?
correlation between scores at test and retest
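The test-retest coefficient is an ordinary Pearson correlation between the two sets of scores. A minimal sketch with made-up scores for five people:

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(ss_x * ss_y)


# Hypothetical scores: everyone gains exactly 3 points at retest.
test = [95, 102, 110, 88, 120]
retest = [98, 105, 113, 91, 123]
print(round(pearson_r(test, retest), 2))  # 1.0
```

Because the correlation reflects only the rank ordering and relative spacing of scores, a uniform gain at retest leaves it untouched. This is why a high stability coefficient does not rule out practice effects.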
What was the mean test re-test interval used in stability of WAIS-IV?
22 days
What did Deary et al 2000 find in a study that took place over 66 years?
found a correlation of 0.73 between an IQ test administered at age 11 and again at age 77
why is there normally an interval between administrations?
to avoid inflating the estimate of stability through the test taker’s memory of previous items
What is the stability coefficient for WAIS-IV FSIQ?
0.96
true or false: temporal stability of components will be higher than temporal stability of composite?
false. temporal stability of composite will be higher than for its components
what is a complication when psychologists test whether an individual’s cognitive abilities have genuinely improved or deteriorated?
practice effects
Why are practice effects problematic?
they may create a false impression of recovery or improvement, or mask a deterioration in functioning
Why is the WAIS particularly vulnerable to practice effects?
there are no alternate forms of the WAIS, so the same test has to be administered when retesting an individual
True or false: high test reliability indicates absence of practice effects
False
Which subtests are particularly susceptible to practice effects?
psychomotor and perceptual subtests
Is an identical score at retest a positive sign?
no, this is a cause for concern: practice effects would normally be expected to raise the retest score
How can psychologists deal with practice effects?
they can factor in the expected practice effect when interpreting a person’s retest score
What does HADS (Zigmond and Snaith 1983) stand for?
The Hospital Anxiety and Depression Scale
How is the HADS scored?
Likert scale: each item is rated from 0 to 3
List as many Anxiety items in the HADS as you can recall: (there are 7)
-I feel tense or wound up
-I get a sort of frightening feeling as if something awful is about to happen
-worrying thoughts go through my mind
-I can sit at ease and feel relaxed
-I get a sort of frightening feeling like butterflies in the stomach
-I feel restless as if I have to be on the move
-I get sudden feelings of panic
List as many Depression items in the HADS as you can recall: (there are 7)
-I still enjoy the things I used to enjoy
-I can laugh and see the funny side of things
-I feel cheerful
-I feel as if I am slowed down
-I have lost interest in my appearance
-I look forward with enjoyment to things
-I can enjoy a good book or radio or TV programme
What are the pros of using self-report mood scales?
-they are quick to administer
-they are cheap to administer
-they are generally reliable
-the client/patient directly reports their feelings rather than having them filtered through the ‘lens’ of a clinician
What agreement do psychologists have regarding testing?
we should use multiple indicators when possible, such as self-report and a clinician’s interview. A patient’s self-report scale responses can be raised in the clinical interview
Why are some items in HADS reverse scored?
to counter the effects of acquiescence bias and to encourage respondents to pay attention to the items. It also acts as a check on inattention or lack of motivation
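A minimal sketch of reverse scoring and why it blunts acquiescence. It assumes items are scored 0 to 3 (adjustable through the parameters). Which items are reverse keyed here is hypothetical and does not reflect the actual HADS key.

```python
def reverse_score(raw, min_score=0, max_score=3):
    """Flip an item score so that a high value always means more distress."""
    if not min_score <= raw <= max_score:
        raise ValueError("raw score outside the response range")
    return max_score + min_score - raw


def subscale_total(raw_scores, reversed_items):
    """Sum item scores, flipping the items whose positions are in reversed_items."""
    return sum(
        reverse_score(raw) if index in reversed_items else raw
        for index, raw in enumerate(raw_scores)
    )


# A respondent who picks the highest option on every item (pure acquiescence).
always_agree = [3, 3, 3, 3]
print(subscale_total(always_agree, reversed_items=set()))   # 12 with no reversed items
print(subscale_total(always_agree, reversed_items={1, 3}))  # 6 with half reversed
```

With half of the items reversed, indiscriminate agreement produces a middling total rather than an extreme one.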
How is the reliability of self report mood scales assessed?
Cronbach’s Alpha
Describe the reliability of HADS:
fairly high reliability but not as high as for some other self report scales. Cronbach’s alpha 0.84 for anxiety scale, and 0.78 for depression scale in general population sample. overall reliability 0.87 (Crawford et al., 2009)
Why does HADS not contain items that measure somatic or vegetative symptoms?
it was developed for use in general medical settings so items were chosen so that effects of a medical condition did not masquerade as depression or anxiety
In what way was the attempt to exclude the effects of medical conditions from the HADS flawed?
any major medical problem could lead people to endorse ‘I feel as if I’m slowed down’
Can HADS measure independent dimensions of anxiety and depression?
contrary to Zigmond and Snaith (1983), it cannot
What are Zigmond and Snaith’s original cut-off scores for HADS (Normal, Mild, Moderate, Severe)?
Normal: 0-7
Mild: 8-10
Moderate: 11-15
Severe: 16 or above
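The bands above can be written as a simple lookup. This sketch assumes a subscale total between 0 and 21 (seven items, each scored 0 to 3) and treats a score of 16 as the start of the severe band. The function name is illustrative, and the output is a descriptive label, not a diagnosis.

```python
def hads_band(subscale_total):
    """Map a HADS anxiety or depression subscale total to a descriptive band."""
    if not 0 <= subscale_total <= 21:
        raise ValueError("subscale total must lie between 0 and 21")
    if subscale_total <= 7:
        return "normal"
    if subscale_total <= 10:
        return "mild"
    if subscale_total <= 15:
        return "moderate"
    return "severe"


for total in (7, 8, 11, 16):
    print(total, hads_band(total))
```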
What did clinicians decide about Zigmond and Snaith’s cut-off for ‘mild’, and what is the evidence for this?
the cut-off for mild is very inclusive and should not be used to establish caseness.
Crawford et al. (2001) reported that 33% of a general population sample scored 8 or above on the anxiety scale
What is validity?
Whether a test measures what it claims to measure. A test is shown to be valid for a particular use, population, and time
What is validation?
The process of gathering and evaluating evidence about validity
What are three types of validity that add to pool of evidence?
content validity, criterion-related validity, construct validity
Define Face Validity:
does a test appear to measure what it claims to measure?
Why is face validity a potential problem for some neuropsychological tests?
Tests can look like a child’s game, so adults may stop cooperating when the purpose is not clear
When is face validity undesirable?
in the detection of deception
Define Content Validity:
does the measure adequately sample the domain of interest?
Describe content validity in context of education and depression:
education: does a test sample everything that was taught?
depression: do the items cover all core symptoms?
How can content validity be evaluated?
by having experts write or review items, and by comparing the measure against formal established criteria. For example, does a depression scale cover the list of symptoms required for a diagnosis of depression in the DSM?
What are the top three depression scales when evaluated against the DSM criteria for depression?
1- Hamilton Rating Scale- 7 criteria addressed completely, 2 addressed partially
2-Beck Depression Inventory- 6 criteria addressed completely, 2 partially
3-Zung Self-Rating Depression Scale- 5 criteria addressed completely, 4 partially
What are the DSM-IV criteria for a Major Depressive Episode?
five or more criteria must be present during the same two-week period, and at least one of them must be criterion one or criterion two.
There are 9 Criteria:
-depressed mood most of the day nearly every day
-diminished interest or pleasure in all or almost all activities
-significant weight loss or gain, or a significant change in appetite
-insomnia or hypersomnia nearly every day
-psychomotor agitation or retardation nearly every day
-fatigue or loss of energy nearly every day
-feelings of worthlessness or excessive guilt nearly every day
-diminished ability to think or concentrate or indecisiveness
-recurrent thoughts of death or suicidal ideation or a suicide attempt
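The counting rule can be sketched as below, assuming the DSM-IV requirement that at least five of the nine symptoms are present and that at least one of them is one of the first two (depressed mood or diminished interest). The function name is illustrative. It covers only the symptom count and ignores duration, exclusions, and clinical judgement, so it is a study aid rather than a diagnostic tool.

```python
def meets_mde_symptom_rule(symptoms_present):
    """symptoms_present: nine booleans, in the order the criteria are listed above."""
    if len(symptoms_present) != 9:
        raise ValueError("expected one flag per DSM-IV symptom (9 in total)")
    core_symptom_present = symptoms_present[0] or symptoms_present[1]
    return sum(symptoms_present) >= 5 and core_symptom_present


# Five symptoms including diminished interest, but no reported depressed mood.
no_reported_low_mood = [False, True, True, True, True, True, False, False, False]
print(meets_mde_symptom_rule(no_reported_low_mood))  # True

# Five symptoms, but neither of the first two.
no_core_symptom = [False, False, True, True, True, True, True, False, False]
print(meets_mde_symptom_rule(no_core_symptom))  # False
```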
At what point are individual symptoms of depression not counted towards a diagnosis?
when symptoms are clearly due to a medical condition or when symptoms arise from effects of a substance
How many symptoms in the DSM-IV Depression Criteria are concerned with vegetative or psychomotor aspects?
4 out of 9. traditionally more emphasis has been placed on these in the UK
Do you have to report being depressed for a diagnosis of depression?
No, the criteria can be met without reporting depressed mood, e.g. when a patient does not recognise that they are experiencing depression
In which 3 DSM-IV symptoms of depression would a deviation from the norm in either direction count towards a diagnosis?
Weight, Psychomotor, Sleep
Which depression scales do well in terms of content validity when evaluated against DSM criteria?
Hamilton Rating Scale, and Beck Depression Inventory
How is Hamilton Rating Scale commonly used?
widely used as a clinician’s rating scale, not a self report scale
What issue does PHQ-9 depression scale manage to tackle?
tackles the issue of content validity directly and explicitly. items are designed to index each of the nine DSM symptoms for depression
List as many items from PHQ-9 depression scale (Kroenke et al., 2001) as you can recall:
-little interest or pleasure in doing things
-feeling down, depressed, or hopeless
-trouble falling or staying asleep, or sleeping too much
-feeling tired or having little energy
-poor appetite or overeating
-feeling bad about self or that you are a failure, or that you have let your family down
-trouble concentrating on things such as reading newspaper or watching TV
-moving or speaking so slowly that others have noticed or the opposite being so fidgety or restless that you have been moving around a lot more than usual
-thoughts that you would be better off dead or hurting yourself in some way
Define ecological validity:
refers to the degree to which test performance corresponds to real-world performance
How can ecological validity be assessed?
test scores can be compared with ratings of everyday behaviour for the domain of interest using self and informant questionnaires, clinical rating scales, and observation of simulated tasks
What is executive function?
executive function refers to skills in problem solving, decision making, planning and completion of tasks, and reflecting on activity
What are examples of dysexecutive problems?
difficulties with starting or finishing tasks, planning ahead, making decisions, thinking through problems and forming solutions, behaving appropriately, and controlling emotions such as anger
Describe the Dysexecutive Questionnaire (DEX, Burgess et al., 1998):
20 items measuring behavioural, cognitive, motivational, and emotional changes from pre morbid functioning generating
What can neuropsychological tests of executive functioning be compared to?
Dysexecutive Questionnaire (DEX) scores
What are four putative executive tests (Burgess et al., 1998) that correlate with DEX ratings, and are these significant?
-Phonemic Fluency
-Modified Card Sorting Test
-Six elements
-Cognitive estimates
all the correlations are significant with exception of Cognitive Estimate Task (CET)
Aside from the DEX, what is another measure of dysexecutive problems?
Behavioural Assessment of the Dysexecutive Syndrome (BADS; Wilson et al., 1996)
Aside from Burgess reports, what information provides better correlations with ecological validity?
informant, clinical, parent or carer ratings
What factors can affect ecological validity test results?
environment, limited behaviour observed, compensatory strategies
Why are carer ratings of everyday functioning unreliable?
relatives may be protective or overly negative
Describe construct validity:
the broadest form of validity: does a test measure the construct it is meant to measure?
How can construct validity be assessed?
constructs are unobservable, but tests such as the WAIS are designed to measure them. Researchers assess construct validity by predicting how scores should behave and checking several kinds of evidence: test homogeneity, changes with age, pre-test to post-test changes, differences between distinct groups, convergent and divergent evidence, and factor analysis
What are convergent and divergent evidence?
Convergent: demonstrates that two different measurement methods produce similar results for the same construct
Divergent: demonstrates that a measurement of one construct is distinct from measurements of other constructs
What is factor analysis?
Factor analysis determines the underlying relationships between sets of variables such as test scores
What are factors in factor analysis?
the underlying relationships are called factors, i.e. the constructs (such as intelligence or personality) that the tests measure
How is Factor Analysis used in psychometrics?
FA is used as a data reduction technique. It takes individual tests and the correlations between them. Patterns of scores clustering together suggest they are measuring the same thing
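The idea of scores clustering together can be illustrated with a principal-components extraction from a correlation matrix. This is a close relative of factor analysis that is often used as a first step, not a full factor analysis. The correlation matrix below is hypothetical (two verbal and two spatial tests), and all function names are illustrative.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def normalise(vector):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]


def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    vector = normalise([1.0 / (i + 1) for i in range(len(matrix))])  # arbitrary start
    for _ in range(iterations):
        vector = normalise(mat_vec(matrix, vector))
    eigenvalue = sum(v * w for v, w in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


def component_loadings(corr, n_components=2):
    """Loadings of each test on the first n principal components."""
    n = len(corr)
    residual = [row[:] for row in corr]
    loadings = []
    for _ in range(n_components):
        eigenvalue, vector = leading_component(residual)
        loadings.append([v * math.sqrt(eigenvalue) for v in vector])
        # Remove the extracted component before looking for the next one.
        residual = [
            [residual[i][j] - eigenvalue * vector[i] * vector[j] for j in range(n)]
            for i in range(n)
        ]
    return loadings


# Hypothetical correlations: tests 1-2 are verbal, tests 3-4 are spatial.
correlations = [
    [1.0, 0.7, 0.2, 0.2],
    [0.7, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.7],
    [0.2, 0.2, 0.7, 1.0],
]
for component in component_loadings(correlations):
    print([round(value, 2) for value in component])
```

In this made-up example, all four tests load on the first component, while the second component separates the verbal pair from the spatial pair by sign. That pattern is the kind of clustering described in the card above.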
‘do you like going to parties/socialising/are you the life and soul of a party’ may all be items measuring what construct?
Extraversion
How may researchers use Factor Analysis data?
they can collect data and make specific predictions on how scores should correlate based on their theories
What has factor analysis traditionally been used to study?
Construct Validity
What is Confirmatory Factor Analysis used for?
a relatively recent technique for evaluating the construct validity of psychological tests. it is now widely used.