Study Guide Exam 2 (Assessment and Diagnosis) Flashcards
Norm samples: what they need to be
Representative of the population taking the test
Consistent with that population
Current (must match current generation)
Large enough sample size
Flynn effect
Intelligence increases over successive generations
To stay accurate, intelligence tests must be periodically renormed
Types of norm samples
Nationally representative sample (reflects society as a whole)
Local sample
Clinical sample (compare to people with given diagnosis)
Criminal sample (drawn from offender populations)
Employee sample (used in hiring decisions)
Ungrouped frequency distributions
For each score or criterion, the number of people obtaining that score is listed
Grouped frequency distributions
Scores are grouped into intervals (ex: 90-100) and the number of people whose scores fall in each range is listed
Frequency graphs
Histograms
Mean
Arithmetic average
Median
Point that divides distribution in half
Mode
Most frequent score
Which measure of central tendency to pick
Normal distribution: mean
Skewed distribution: median
Nominal data: mode
Positions of mean and median in positively and negatively skewed distributions
Positively skewed (right skewed): mean is higher than median
Negatively skewed (left skewed): median is higher than mean
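A minimal sketch in Python's standard library illustrating the three measures of central tendency and the mean/median relationship in a right-skewed distribution. The score list is made-up illustrative data, not from the guide.

```python
# Central tendency on a hypothetical right-skewed set of test scores.
from statistics import mean, median, mode

scores = [70, 72, 74, 74, 75, 76, 78, 80, 99]  # one high outlier -> right skew

print(mean(scores))    # arithmetic average, pulled upward by the outlier
print(median(scores))  # middle score (75), resistant to the outlier
print(mode(scores))    # most frequent score (74)
```

Because the outlier drags the mean above the median, this tiny data set behaves like a positively skewed distribution.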
Standard deviations
Average distance of scores from the mean; describes how much scores vary
Raw scores
Number of questions answered correctly on a test
Only used to calculate other scores
Percentile ranks
Percentage of people scoring below
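Under a normal distribution, a percentile rank can be computed from the cumulative distribution function. A sketch using Python's standard library, with the IQ metric (M=100, SD=15) as the example:

```python
# Percentile rank = percentage of the distribution falling below a score.
from statistics import NormalDist

iq_dist = NormalDist(mu=100, sigma=15)
pct_below_115 = iq_dist.cdf(115) * 100  # percent of people scoring below 115
print(round(pct_below_115, 1))
```

A score of 115 sits one SD above the mean, which corresponds to roughly the 84th percentile.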
z scores
M=0
SD=1
t scores
M=50
SD=10
IQ scores
M=100
SD=15
Content sampling error
Difference between sample of items on test and total domain of items
Time sampling error
Random fluctuations in performance over time
Can be due to examinee (fatigue, illness, anxiety, maturation) or due to environment (distractions, temperature)
Interrater differences
When scoring is subjective, different scorers may score answers differently
Test-retest reliability
Administer the same test on 2 occasions
Correlate the scores from both administrations
Sensitive to sampling error
Things to consider surrounding test-retest reliability
Length of interval between testing
Activities during interval (distraction or not)
Carry-over effects from one test to next
Alternate-form reliability
Develop two parallel forms of test
Administer both forms (simultaneously or delayed)
Correlate the scores of the different forms
Sensitive to content sampling error (simultaneous and delayed) and time sampling error (delayed only)
Things to consider surrounding alternate-form reliability
Few tests have alternate forms
Reduction of carry-over effects
Split-half reliability
Administer the test
Divide it into 2 equivalent halves
Correlate the scores for the half tests
Sensitive to content sampling error
Things to consider surrounding split-half reliability
Only 1 administration (no time sampling error)
How to split test up
Short tests have worse reliability
Kuder-Richardson and coefficient (Cronbach’s) alpha
Administer test
Compare each item to all other items
Use KR-20 for dichotomous answers and Cronbach’s alpha for any type of variable
Sensitive to content sampling error and item heterogeneity
Measures internal consistency
Inter-rater reliability
Administer test
2 individuals score test
Calculate agreement between scores
Sensitive to differences between raters
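The simplest agreement index is percent agreement; chance-corrected indices such as Cohen's kappa are also common. A sketch with hypothetical pass/fail ratings from two scorers:

```python
# Inter-rater reliability via percent agreement between two scorers.
rater_a = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass"]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)
print(percent_agreement)  # 5 of 6 ratings agree
```

Percent agreement ignores agreement expected by chance, which is why kappa-type indices are preferred for formal reporting.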
High-stake decision tests: reliability coefficient used
Greater than 0.9 or 0.95
General clinical use: reliability coefficient used
Greater than 0.8
Class tests and screening tests: reliability coefficient used
Greater than 0.7
Content validity
Degree to which the items on the test are representative of the behavior the test was designed to sample
How content validity is determined
Expert judges systematically review the test content
Evaluate item relevance and content coverage
Criterion-related validity
Degree to which the test is effective in estimating performance on an outcome measure
Predictive validity
Form of criterion-related validity
Time interval between test and criterion
Example: ACT and college performance
Concurrent validity
Form of criterion-related validity
Test and criterion are measured at same time
Example: language test and GPA
Construct validity
Degree to which test measures what it is designed to measure
Convergent validity
Form of construct validity
Correlate test scores with tests of same or similar construct to determine
Discriminant validity
Form of construct validity
Correlate test scores with tests of dissimilar construct to determine
Incremental validity
Determines if the test provides a gain over another test
Face validity
Determines if the test appears to measure what it is designed to measure
Not a true form of validity
Problem with tests high in these: can fake them
Type of material that should be used on a matching test
Homogeneous material (all items should relate to a common theme)
Multiple choice tests: what kinds of stems should not be included?
Negatively-stated ones
Unclear ones
Multiple choice tests: how many alternatives should be given?
3-5
Multiple choice tests: what makes a bad alternative?
Long
Grammatically incorrect in question
Implausible
Multiple choice tests: how should placement of correct answer be determined?
Random (otherwise, examinees can detect pattern)
Multiple choice tests, true/false tests, and typical response tests: what kind of wording should be avoided?
“Never” or “always” for all 3
“Usually” for true/false
“All of the above” or “none of the above” for multiple choice
True/false tests: how many ideas per item?
1
True/false tests: what should be the ratio of true to false answers?
1:1
Matching tests: ratio of responses to stems?
More responses than stems (prevents answering the last item by elimination, making it possible to miss only one)
Matching tests: how long should responses and lists be?
Brief
Essay tests and short answer tests: what needs to be created?
Scoring rubric
Essay tests: what kinds of material should be covered?
Objectives that can’t be easily measured with selected-response items
Essay tests: how should grading be done?
Blindly
Short answer tests: how long should answers be?
Questions should be able to be answered in only a few words
Short answer tests: how many correct responses?
1
Short answer tests: for quantitative items, what should be specified?
Desired level of precision
Short answer tests: how many blanks should be included? How long should they be?
Only 1 blank included
Should be long enough to write out answer
Otherwise, becomes dead giveaway
Short answer tests: where should blanks be included?
At the end of the sentence
Typical response tests: what should be covered?
Focus items on experiences (thoughts, feelings, behaviors)
Limit items to a single experience
Typical response tests: what kinds of questions should be avoided?
Items that will be answered universally the same
Leading questions
Typical response tests: how should response scales be constructed?
If neutral option is desired, have odd numbered scale
High numbers shouldn’t always represent the same thing
Options should be labeled, as on a Likert-type scale (e.g., ratings from 0 to 7)
Spearman
Identified a general intelligence “G”
Underlies everything else about you
Cattell-Horn-Carroll
Theory identifying 10 broad types of intelligence
3 abilities incorporated by most definitions of intelligence
Problem solving
Abstract reasoning
Ability to acquire knowledge
Original determination of IQ (used by Binet)
Mental age/chronological age * 100
How IQ is currently determined
Raw score compared to age/grade appropriate norm sample
M=100, SD=15
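The two IQ definitions above can be contrasted in a few lines. All ages and norm-sample values here are invented for illustration:

```python
# Binet's original ratio IQ: mental age relative to chronological age.
mental_age, chronological_age = 10, 8
ratio_iq = (mental_age / chronological_age) * 100
print(ratio_iq)  # 125.0

# Modern deviation IQ: locate the raw score within the age-appropriate
# norm sample, then scale to M=100, SD=15.
raw, norm_mean, norm_sd = 34, 30, 4  # hypothetical norm-sample statistics
deviation_iq = 100 + 15 * ((raw - norm_mean) / norm_sd)
print(deviation_iq)  # 115.0
```

The deviation approach avoids the ratio formula's breakdown in adulthood, where mental age stops increasing with chronological age.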
Why professionals have a love/hate relationship with intelligence tests
Good: reliable and valid (psychometrically sound, predict academic success, fairly stable over time)
Bad: limited (make complex construct into 1 number), misunderstood and overused
Group administered tests: who administers and who scores?
Standardized: anyone can administer (teachers, etc.), but professionals interpret
Group administered tests: content focuses on which skills most?
Verbal skills
Examples of group-administered aptitude tests
Otis-Lennon School Ability Test
American College Test (ACT)
Individually administered tests: how standardized?
Very standardized
No feedback given during testing regarding performance or test
Additional queries only when specified (can only say "Tell me more about that.")
Answers are recorded verbatim
Individually administered tests: starting point
Starting point determined by age/grade
Reversals sometimes needed (person gets 1st question wrong: must back down in level)
Individually administered tests: ending point
Testing ends when person answers 5 questions wrong in a row
Individually administered tests: skills tested
Verbal and performance
3 individually administered IQ tests for adults
Wechsler Adult Intelligence Scale (WAIS; most commonly used)
Stanford-Binet
Woodcock-Johnson Tests of Cognitive Abilities
Child version of Wechsler Adult Intelligence Scale
Wechsler Intelligence Scale for Children (WISC)
WAIS: subtests and index scores
15 subtests combine to make 4 index scores: Verbal Comprehension Index (VCI), Perceptual Reasoning Index (PRI), Working Memory Index (WMI), Processing Speed Index (PSI)
4 index scores combined to make Full Scale IQ score
WAIS: norm set
Older teenagers to elderly
WISC: basics
2-3 hours to administer and score
Administered by professionals
Normed for children elementary-aged to older adolescence
Stanford-Binet: norm set
Young children to elderly
Stanford-Binet: IQ scores
3 composite IQ scores: verbal IQ, nonverbal IQ, full scale IQ
Score range difference between WAIS/WISC and Stanford-Binet
Stanford-Binet: possible to score higher than 160 (not possible for WAIS or WISC)
Woodcock-Johnson: norm set
Young children to elderly
What Woodcock-Johnson is based on
Cattell-Horn-Carroll theory of 10 types of intelligence
Woodcock-Johnson full scale IQ
Based on comprehensive assessment of Cattell-Horn-Carroll abilities
Full scale IQ
Overall, composite IQ (# reported)
What kind of a construct is IQ?
Unitary construct
2 disorders that include intelligence in the criteria
Intellectual disability (IQ less than 70; impairments across multiple domains: occupational, educational, social functioning, activities of daily living)
Learning disorders (discrepancy between intelligence and achievement; math, reading, written expression)
Neither is based on intelligence alone
Response to intervention
Method of preventing struggling students from being placed in special ed
Students are provided regular instruction: progress is monitored
If they don’t progress, they get additional instruction: progress is monitored
Those who still don’t respond receive special education or special education evaluation
Achievement definition
Knowledge in a skill or content domain in which one has received instruction
Aptitude vs. achievement
Aptitude measures cognitive abilities/ knowledge accumulated across life experience
Achievement measures learning due to instruction
Group administered achievement tests
Can be administered by anyone, but interpreted by professionals
Standardized
Items increase in difficulty as exam progresses
Time limits often included
Often focus on verbal skills
Examples of group administered achievement tests
Stanford Achievement Tests
Iowa Tests of Basic Skills (Iowa Basics)
California Achievement Tests
What individually administered achievement tests are used for
Used to determine presence of learning disorders
Standardization of individually administered achievement tests
No feedback given during testing regarding performance or test
Additional queries used only when specified
Answers are recorded verbatim
Examples of individually administered achievement tests
Wechsler Individual Achievement Test
Woodcock-Johnson Tests of Achievement
Wide Range Achievement Test
Wechsler Individual Achievement Test: norm set and areas tested
Normed for young children to elderly
Scores: reading, math, written language (handwriting), oral language
Woodcock-Johnson Tests of Achievement: norm set and areas tested
Normed for young children to elderly
Scores: reading, oral language, math, writing
Wide Range Achievement Test: norm set and areas tested
Normed for young children to elderly
Scores: word reading, reading comprehension, spelling, math
How Wide Range Achievement Test differs from other 2
WRAT is used as a screening test: it takes only 20-30 minutes to administer (others take 1.5-3 hours)
Other examples of achievement tests
School tests (teacher-constructed tests)
Psych GRE
MCAT
Licensing exams (EPPP- psychologists)
Personality
Characteristic way of behaving/thinking across situations
Uses for personality assessments
Diagnosis
Treatment planning
Self-understanding
Identifying children with emotional/behavioral problems
Hiring decisions
Legal questions
Woodworth
Developed first personality test (Personal Data Sheet)
Trait vs. state
Trait: stable internal characteristic, test-retest reliability can be greater than 0.8
State: transient, lower test-retest reliability
Response set
Tendency, often unconscious, to respond in a characteristically negative or positive manner regardless of item content
Test taker bias that affects formal personality assessment
Dissimulation
Faking the test
Increases with face validity
Test taker bias that affects formal personality assessment
Validity scales
Used to detect individuals not responding in an accurate manner on personality assessments
Content rational approach
Similar to process of determining content validity: expert looks at test and decides if it represents what it should be testing
Empirical criterion keying
Large pool of items is administered to 2 groups: clinical group with specific diagnosis and control group
Items that discriminate between groups are retained (may or may not be directly associated with psychopathology- not necessarily face valid)
Minnesota Multiphasic Personality Inventory (MMPI)
Most used personality measure
Developed using empirical criterion keying
Contains validity scales (detect random responding, lying, etc.)
Adequate reliability
10 clinical scales
Hypochondriasis
Clinical scale on MMPI
Somatic complaints
Depression
Clinical scale on MMPI
Pessimism, hopelessness, discouragement
Hysteria
Clinical scale on MMPI
Development of physical symptoms in response to stress
Psychopathic deviate
Clinical scale on MMPI
Difficulty incorporating societal standards and values
Masculinity/femininity
Clinical scale on MMPI
Tendency to reject stereotypical gender norms
Paranoia
Clinical scale on MMPI
Paranoid delusions
Psychasthenia
Clinical scale on MMPI
Anxiety, agitation, discomfort
Schizophrenia
Clinical scale on MMPI
Psychotic symptoms, confusion, disorientation
Hypomania
Clinical scale on MMPI
High energy levels, narcissism, possibly mania
Social introversion
Clinical scale on MMPI
Prefers being alone to being with others
Factor analysis
Statistical approach to personality assessment development
Evaluates the presence/structure of latent constructs
NEO Personality Inventory
Developed using factor analysis
5-factor model (Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness)
Pretty good reliability and validity
Theoretical approach
Match test to theory
Myers-Briggs Type Indicator
Developed using theoretical approach
Based on Jung’s theories
4 scales: introversion (I)/extraversion (E), sensing (S)/intuition (N), thinking (T)/feeling (F), judging (J)/perceiving (P)
Personality is represented by one of 16 four-letter combinations
Millon Clinical Multiaxial Inventory (MCMI)
Developed using theoretical approach
Based on Millon’s theories surrounding personality disorders
2 scales: clinical personality scales and clinical syndrome scales
Good reliability and validity, but high correlations between scales (problem)
Objective personality assessments given to children
Child Behavior Checklist
Barkley Scales (ADHD)
Each test has a version for the parent, a version for the child, and a version for the teacher to fill out
Broad-band vs. symptom measures
Broad-band: lots of info on a variety of topics, allow for a comprehensive view (example: MMPI)
Symptom measure: identify specific symptoms (example: Beck Depression Inventory)
Ink blot test
Examinee is presented with an ambiguous inkblot and asked to identify what they see
Limited validity
Rorschach ink blot test scoring/interpreting
Exner developed most comprehensive system for scoring (including norm set)
Limited validity, though
Apperception tests
Given an ambiguous picture, examinee must make up story
Themes presented in stories tell something about examinee
Have issues with validity
Projective drawings: advantage
Require little verbal abilities/ child friendly
House-tree-person test
Examinee draws a house, a tree, and a person (house: home life and family relationships; tree: deep feelings about self; person: less deep view of self)
Pros and cons of projective tests
Pros: popular in clinical settings, supply rich information, low face validity (harder to fake)
Cons: questionable psychometrics (poor reliability and validity), so should be used with caution
Anatomical dolls
Controversial assessment technique
Used to assess sexual assault in children (watch what child is paying attention to, how child plays with doll, etc.)
Lots of false positives
Hypnosis assisted assessment and drug assisted assessment
Controversial assessment technique
Truth serum (sodium amytal): helps people relax and share difficult information
Hypnosis: helps people relax and remember things
People under hypnosis or sodium amytal are suggestible and prone to forming false memories
Neuropsychology
Study of brain-behavior relationships
Neurology vs. neuropsychology
Neurologist focuses on anatomy and physiology of brain
Neuropsychologist focuses on functional product (behavior and cognition) of CNS dysfunction
Uses of neuropsychology
Identify damaged areas of brain
Identify impairments caused by damage
Assessing brain function
Common referral questions
Traumatic brain injury
Cerebrovascular accidents (example: stroke)
Tumors
Dementia and delirium
Neurological conditions
A thorough neuropsychological assessment includes…
Higher order information processing
Anterior and posterior cortical regions
Presence of specific deficits
Intact functional systems
Affect, personality, behavior
Fixed battery
Comprehensive, standard set of tests administered to everyone
Take a long time to administer (about 10 hours)
Most commonly used fixed battery
Halstead-Reitan Neuropsychological Test Battery for Adults (HRNB)
Flexible battery
Flexible combination of tests to address specific referral question
Brief screeners
Quickly administered tests that provide general information on functioning
Used to determine whether more testing is needed
Example: mini mental status exam
Memory assessments
Memory is impaired in functional and organic disorders (forgetting recent events)
Can be used to discriminate between psychiatric disorders and brain injury (forgetting is common in brain injury but not in psychiatric disorders)
Most commonly used memory test
Wechsler Memory Scale
Continuous performance tests
Used to assess attention (ADHD diagnosis, etc.)
Boring tasks (press a key when an x shows up on the screen, etc.): measure how well person stays with them
Executive function tests
Stroop task: measure ability to ignore reading word (name color of ink only)
Wisconsin card sort: measure adaptability to new rules
Delay discounting: measure ability to delay gratification in order to gain a greater outcome later on
Motor function tests
Grip strength
Finger tapping test
Purdue pegboard (fine motor skills: put pegs on peg board, put washers on pegs)
Sensory functioning tests
Clock drawing test
Facial recognition test
Left-right orientation
Smell identification
Finger orientation
Language functioning tests
Measure ability to develop language skills and ability to use language
Example of language functioning test
Expressive Vocabulary Test
Boston Diagnostic Aphasia Examination
Normative approach to interpretation
Compare current performance against normative standard
Inferences made within context of premorbid ability
Ideographic approach to interpretation
Compare within the individual: compare current scores to previous scores or estimates of premorbid functioning
How to estimate premorbid functioning
Prior testing
Reviewing records
Clinical interview (“What were you like beforehand?”)
Interviewing others
Demographic estimation (assuming that you were average)
Hold tests (tests that are resistant to brain damage, such as vocabulary- scores are used to estimate IQ)
Pattern analysis approach to interpretation
Patterns across tasks differentiate functional/dysfunctional systems
Pathognomonic signs
Signs that are highly indicative of dysfunction
ABCs of behavioral assessment
A: antecedent (what was happening before behavior took place)
B: behavior (what did the person do)
C: consequent (what happened after the behavior took place)
Direct observation
Method of behavioral assessment
Observe behavior in its context (real world)
Analogue assessment
Method of behavioral assessment
Simulate real world events in a therapy setting through role play
Indirect observation
Client monitors observations through self-monitoring (recording behavior) or self-report (remembering what happened after the fact)
Behavioral interview
Clinical interview focusing on ABCs
Relies on self-report
Sources of information for behavioral assessment
Client
Therapist
Parents
Teachers
Spouses
Friends
Pros and cons for behavioral assessment
Pros: direct information, contextual
Cons: labor intensive, reactivity, not everything is observable
Reactivity
Problem with direct observation: behavior changes when being observed
Decreases as observation time increases
Settings for behavioral assessment
School
Home
Therapy setting
Real world is preferable to therapy setting
Formal inventories
Used to enable comparison across people (standardization)
Informants rate behavior on a number of dimensions
Parents, teachers, spouse, child, etc.
Formal inventories: broad-based vs. single domain
Broad based: cover a number of behaviors/disorders (example: Achenbach)
Single domain: assess behavior for 1 disorder (example: Childhood Autism Rating Scale, Barkley Scales- ADHD)
Psychophysiology
Used to record internal behavior/physiological responses
EEG
Used in psychophysiology
Measures brain waves by measuring electrical activity across scalp
GSR (Galvanic skin response)
Used in psychophysiology
Measures sweat
Settings for forensic psychology
Prison (most common)
Police departments
Law firms
Government agencies
Private practice (consultants)
Role of psychologists in court
Provide testimony as an expert witness
Expert witness
Person who possesses knowledge and expertise necessary to assist judge/jury
Objective presentation is goal
Differences between clinical and forensic assessment: purpose
Clinical: purpose is diagnosis and treatment
Forensic: purpose is gaining information for court
Differences between clinical and forensic assessment: participation
Clinical: participation is voluntary
Forensic: participation is involuntary
Differences between clinical and forensic assessment: confidentiality
Clinical: confidentiality
Forensic: no confidentiality
Differences between clinical and forensic assessment: setting
Clinical: office
Forensic: jail
Differences between clinical and forensic assessment: testing attitude
Clinical: positive, genuine
Forensic: hostile, coached (by lawyer; malingering is a big concern)
Not guilty by reason of insanity (NGRI)
At the time of the offense the defendant, by reason of mental illness or mental defect, did not know his/her conduct was wrong
Used in less than 1% of felony cases; successful in about 25%
Results in mandatory hospitalization (prison-based state hospital; stay until person is no longer a danger)
NGRI defense: what assessment involves
Review of case records
Review of mental health history
Clinical interview
Psychological testing
Competency to be sentenced
Criminal is required to understand reason for punishment
If cannot understand reason for punishment, don’t receive it
Rarely contested: most common cases of contesting are capital cases
Mitigation in sentencing
Determining whether circumstances exist that lessen moral culpability
Examples: crime of passion, brain injury causing impulsivity
Evaluate probability of future violence
Juvenile tried as adult
Determining whether to transfer juvenile to adult court
Decision is based on cognitive, emotional, and moral maturity
Capital sentencing and intellectual disability
Execution of people with intellectual disabilities is outlawed
Testing assesses cognitive capacity
Personal injury litigation
Attempt to seek recovery of actual damages (out of pocket costs) and/or punitive damages (grief/emotional distress)
Psychologist must determine the presence of CNS damage, assess emotional injury, quantify the degree of injury, and verify that the injury actually took place
Divorce and child custody
Must determine best interests of children
Assess parent factors and child factors
Civil competency
Determining whether person is able to manage his/her affairs, make medical decisions, and waive rights
Neuropsych testing used
Other civil matters relating to children
Child abuse/neglect investigations
Removing children from the home
Adoption considerations
Admissibility
Expert standing doesn’t guarantee testimony will be accepted
Daubert standard
Expert’s reasoning/methods must be reliable, logical, and scientific
Credible link between reasoning and conclusion
Credibility must be established as well
Third-party observers
Attorneys or other experts may ask to be present during assessment
Issues: standardization procedures, professional standards, test security
Demographic factors that serve as a potential basis for bias
Intelligence scores are often higher for Whites than for Blacks, Hispanics, or Native Americans
Intelligence scores are often higher for Asian Americans than for Whites
Explanations for differences in psychological assessments
Genetic factors
Environmental factors (SES, education, culture)
Gene-environment interaction
Test bias
Bias
Systematic influence that distorts measurement or prediction by test scores (systematic difference in test scores)
Fairness
Moral, philosophical, legal issue
Is it okay that differences across groups exist on assessments?
Offensiveness
Content that is viewed as offensive or demeaning
Inappropriate content
Source of potential bias
Minority children may not have been exposed to the content on the test or the content needed to succeed on it
Inappropriate standardization samples
Source of potential bias
Minorities are underrepresented in standardization samples
Examiner and language bias
Source of potential bias
Most psychologists are White and speak standard English
May intimidate ethnic minorities
Difficulties communicating accurately with minority children
Inequitable social consequences
Objection to testing
Consequences of testing results different for minorities
Perceived as unable to learn, assigned to dead-end jobs, previous discrimination, labeling effects
Measurement of different constructs
Source of potential bias
Tests measure different constructs when used with minorities
Differential predictive validity
Source of potential bias
Valid predictions apply for one group, but not for another
Qualitatively distinct aptitude and personality
Source of potential bias
Minority/majority groups possess qualitatively different aptitude and personality structure
Test development should begin with different definitions for different groups
Cultural loading
Degree of cultural specificity present in the test
Test can be culturally loaded without being culturally biased
Culture free tests
Several attempts have been made to create these, but all have been unsuccessful
Ways to reduce bias on tests
Use minority review panels to look for cultural loading (problem: high disagreement)
Factor analysis: use statistics to determine if questions differ across groups
Assess across groups (does it work for everyone?)
Evidence for cultural bias
Little evidence exists (well-developed, standardized tests show little bias)
General ethics to consider
Stick to referral question
Match test to your purpose
Consider reliability and validity
Understand norm sample
Using testing in context
Use multiple measures to converge on a diagnosis
Attend to behavior observations
Client considerations
Informed consent
Involve client in decisions
Maintain confidentiality
Be sensitive in presenting results
Other considerations
Maintain test security
Don’t practice outside of expertise
Cultural sensitivity