Assessment and Dx Flashcards
Name purposes of psychological assessment.
Screening, measuring specific traits, determination of risk, diagnosis, vocational planning, intervention planning, evaluating treatment outcomes.
Define psychometrics.
Branch of psychology concerned with the quantification and measurement of psychological variables, as well as with the design, analysis, and improvement of measures.
What was the Binet-Simon scale?
The first intelligence test; it was developed to evaluate French schoolchildren for intellectual disability.
What is behavioral assessment and ecological assessment?
Behavioral assessment relies on hypothesis testing; it is used to describe a particular behavior or pattern and to identify what triggers and maintains the behavior.
Ecological assessment uses observational methods to examine physical (e.g., lighting, noise) and psychological variables (e.g., relationships with others) that influence behavior in a given environment. Helps to determine whether individuals behave differently in different environments.
What is standardization?
What is a norm-referenced measure?
What is a criterion-referenced measure?
Standardization: Process of administering a measure to a representative sample and developing norms.
Norm-referenced measure: standardized measure that compares examinee’s performance to a reference population.
Criterion-referenced measure: Examines where a person stands on a particular criterion (skill, knowledge area). A test that determines a person’s level of accomplishment of the material covered on the test according to some reference point (e.g., a licensing exam).
Describe narrative recording assessments, interval recording methods (aka time-sampling), and event-sampling methods.
Narrative recording assessments: a running record of the individual’s behavior throughout the assessment.
Time-sampling methods (interval recording): operationally define target behavior, then record whether target behavior happens during each time interval. Good for frequent behaviors or those with no clear start/end.
Event-sampling methods: operationally define target behavior, then record frequency of behavior during whole observation. Good for less frequent behaviors.
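A minimal Python sketch of the distinction between the two recording methods; all timestamps and interval lengths are hypothetical:

```python
# Hypothetical observation: 60 seconds, with target-behavior onsets recorded in seconds
behavior_times = [3, 8, 27, 44, 46, 58]

# Interval recording (time-sampling): split the observation into 10-s intervals
# and note whether the behavior occurred at all in each interval
intervals = [(start, start + 10) for start in range(0, 60, 10)]
interval_record = [any(lo <= t < hi for t in behavior_times) for lo, hi in intervals]

# Event sampling: count every occurrence across the whole observation
event_count = len(behavior_times)

print(interval_record)  # [True, False, True, False, True, True]
print(event_count)      # 6
```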
Describe rating recordings.
The examiner rates a behavior’s intensity or duration on a Likert scale. Inter-rater reliability tends to be poor because these ratings are subjective.
Describe a functional behavioral assessment.
During an FBA, a problem/target behavior is identified; its antecedents and consequences (the function of the behavior) are identified; an intervention plan is formulated and implemented; and the response to intervention is evaluated, with the plan adjusted accordingly.
Describe assessment centers and work sample methods used in industrial and organizational (I/O) psychology.
In assessment centers, a person is evaluated on job-related skills through behavioral simulation exercises that reflect job content and the types of problems faced on the job. Assessment centers also include other tests, such as cognitive ability tests, personality tests, and job knowledge tests. At the end, raters meet to discuss the candidate.
Work samples are used to assess job potential by asking people to perform tasks that simulate the job (e.g., giving a presentation, role-playing a customer interaction).
Describe classical test theory (CTT).
According to CTT, an individual’s observed score on a measure is composed of their true score plus measurement error. True score is the average score you would achieve given infinite administrations.
X = T + E (observed/raw score = true score + measurement error)
Based on Spearman’s work.
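A minimal simulation of the CTT decomposition, assuming normally distributed error and NumPy; all values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

true_score = 100.0   # the (unobservable) true score, T
error_sd = 5.0       # standard deviation of measurement error, E

# Simulate many hypothetical administrations: X = T + E
observed = true_score + rng.normal(0.0, error_sd, size=100_000)

# Per CTT, the average observed score converges on the true score
print(observed.mean())   # ~100.0
```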
Describe Generalizability Theory (G Theory).
G Theory builds on CTT but goes further to identify specific sources of measurement error (test forms, test items, circumstances of testing, raters). A generalizability study is conducted to quantify individual sources of measurement error and to determine the conditions under which observations will be consistent and applicable across contexts.
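A minimal sketch of a G study for a persons × raters design, decomposing score variance into person, rater, and residual components via two-way ANOVA mean squares; the data and layout are hypothetical:

```python
import numpy as np

# Hypothetical G study: 4 persons each scored by 3 raters (persons x raters design)
scores = np.array([[4, 5, 3],
                   [2, 3, 1],
                   [5, 5, 4],
                   [3, 3, 2]], dtype=float)
n_p, n_r = scores.shape
grand = scores.mean()

# Two-way ANOVA sums of squares (one observation per cell)
ss_p = n_r * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_r = n_p * ((scores.mean(axis=0) - grand) ** 2).sum()
ss_res = ((scores - grand) ** 2).sum() - ss_p - ss_r
ms_p, ms_r = ss_p / (n_p - 1), ss_r / (n_r - 1)
ms_res = ss_res / ((n_p - 1) * (n_r - 1))

# Variance components: true person differences vs. rater leniency vs. residual error
var_person = (ms_p - ms_res) / n_r   # ~1.44
var_rater = (ms_r - ms_res) / n_p    # ~0.56 (raters are a real error source here)
print(var_person, var_rater, ms_res)
```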
What is item response theory (IRT) and what three aspects of items does it focus on? What is the Item Characteristic Curve (ICC) or Item Response Function (IRF)?
IRT focuses on individual test items during test development and measures the relationship between individual items and the construct being measured (the latent trait). It examines item difficulty (classically, the percentage of examinees who answer the item correctly; in IRT proper, the ability level at which a correct answer becomes likely), item discrimination (how well the item distinguishes examinees who do well vs. poorly on the test as a whole), and the probability that the item is answered correctly by guessing.
The ICC (or IRF) plots the probability that a test item is answered correctly (ranging from 0 to 1) against the examinee’s underlying ability on the trait being measured; equivalently, it shows the percentage of individuals in each ability group who answer the item correctly.
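A minimal sketch of an ICC under the three-parameter logistic (3PL) model, one common IRT model; a, b, and c are the conventional discrimination, difficulty, and guessing parameters, and the default values are illustrative:

```python
import numpy as np

def icc_3pl(theta, a=1.0, b=0.0, c=0.25):
    """P(correct) under the 3PL model: c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# Probability of a correct answer rises with ability, from ~c (guessing) toward 1.0
abilities = np.linspace(-3, 3, 7)
print(np.round(icc_3pl(abilities), 2))
```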
What is reliability (as defined by G theory)? What is a reliability coefficient?
What is test-retest reliability?
What is the alternate form reliability coefficient?
G theory defines reliability as the degree to which testing is free from measurement error.
Reliability coefficient = r; describes consistency of scores across contexts; 1.00 is perfect reliability, 0.00 is absence of reliability.
Test-retest reliability: the stability of scores over time; calculated by administering the same test twice and correlating the two sets of scores.
Alternate-form reliability coefficient: two forms of the same test are administered, and examinees’ two scores are correlated. Alternate forms reduce practice effects.
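A minimal sketch of a test-retest reliability coefficient with hypothetical scores; an alternate-form coefficient is computed the same way, with the second administration replaced by the second form:

```python
import numpy as np

# Hypothetical scores for 8 examinees on two administrations of the same test
time1 = np.array([12, 15, 9, 20, 17, 14, 11, 18])
time2 = np.array([13, 14, 10, 19, 18, 15, 10, 17])

# Test-retest reliability: Pearson correlation between the two administrations
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))   # values near 1.00 indicate stable scores over time
```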
What is internal consistency reliability?
What is split-half reliability?
What is inter-item reliability?
What is inter-rater reliability?
Internal consistency reliability: the degree of interrelationship among test items; whether they are consistent and measure the same thing.
Split-half reliability: measures the internal consistency of a test by dividing it into halves (e.g., odd vs. even items) and correlating the two halves; the Spearman-Brown formula is typically applied to correct for the halved test length.
Inter-item reliability: measures the degree of consistency among multiple items measuring the same construct.
Inter-rater reliability: the extent to which independent raters assign the same scores; expressed as a correlation coefficient. If it is high, the researcher knows that trained individuals produce similar scores.
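Internal consistency is commonly quantified with coefficient (Cronbach’s) alpha; a minimal sketch with hypothetical item responses:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance).

    items: 2-D array, rows = examinees, columns = items
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical responses of 5 examinees to 4 items measuring the same construct
data = [[3, 4, 3, 4],
        [2, 2, 3, 2],
        [4, 5, 4, 5],
        [1, 2, 1, 2],
        [3, 3, 4, 3]]
print(round(cronbach_alpha(data), 3))   # ~0.949: items hang together well
```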
What is validity and validity coefficient?
What is face validity?
What is criterion validity?
What is content validity?
What is construct validity?
Validity: the extent to which a test accurately measures what it’s supposed to measure.
Validity coefficient: a correlation coefficient between test scores and a criterion indicator or other measure.
Face validity: extent to which a measure appears appropriate to the examinee; impacts response behaviors.
Criterion validity: index of how well a test correlates with an established standard of comparison (i.e., a criterion). Includes concurrent and predictive.
Content validity: how well a test includes representative information about subject matter/behavior it’s measuring.
Construct validity: extent to which a test measures a trait, concept or other theoretical entity. Includes convergent and discriminant validity.
What is the holy trinity of validity?
Criterion validity, content validity, construct validity.
What is concurrent validity?
What is predictive validity?
Concurrent validity: correlation between measure of interest and established measure (criterion) administered at the same time.
Predictive validity: correlation between measure of interest and a later outcome measure (criterion).
Define sensitivity and specificity.
Sensitivity: the proportion of people who have a trait whom the test accurately identifies as having it.
Specificity: the proportion of people who do not have the trait whom the test accurately identifies as not having it.
These map onto error types: a false positive (Type I error) lowers specificity, and a false negative (Type II error) lowers sensitivity.
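A minimal worked example from a hypothetical 2×2 screening table (all counts are illustrative):

```python
# Hypothetical screening results: counts from a 2x2 classification table
true_positives  = 40   # have the trait, test says yes
false_negatives = 10   # have the trait, test says no  (Type II error)
true_negatives  = 80   # lack the trait, test says no
false_positives = 20   # lack the trait, test says yes (Type I error)

sensitivity = true_positives / (true_positives + false_negatives)   # 40/50 = 0.80
specificity = true_negatives / (true_negatives + false_positives)   # 80/100 = 0.80
print(sensitivity, specificity)
```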
What is a multi-trait multi-method matrix?
Used to establish construct validity. Compares a new measure of a trait with an existing measure of the same trait that uses a different method, as well as with a measure that uses the same method but targets a different trait. Yields convergent validity (high correlations with different-method measures of the same trait: the monotrait-heteromethod coefficients) and divergent (discriminant) validity (the extent to which it does not correlate with measures of other traits, reflected in low heterotrait-monomethod and heterotrait-heteromethod coefficients).
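A toy numeric illustration of how an MTMM matrix is read; two traits (T1, T2) each measured by two methods (M1, M2), with all correlations hypothetical:

```python
import numpy as np

# Hypothetical MTMM correlation matrix, ordered T1-M1, T2-M1, T1-M2, T2-M2
mtmm = np.array([
    [1.00, 0.25, 0.70, 0.20],   # T1-M1
    [0.25, 1.00, 0.22, 0.68],   # T2-M1
    [0.70, 0.22, 1.00, 0.24],   # T1-M2
    [0.20, 0.68, 0.24, 1.00],   # T2-M2
])

# Convergent validity: same trait, different method (monotrait-heteromethod) is high
print("convergent:", mtmm[0, 2], mtmm[1, 3])   # 0.70, 0.68
# Divergent validity: different trait, same method (heterotrait-monomethod) is low
print("divergent:", mtmm[0, 1], mtmm[2, 3])    # 0.25, 0.24
```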
What is test bias?
What is test fairness?
Test bias: when systematic variation/error leads to unequal measurement across groups; the test over- or underestimates performance for members of a specific group.
Test fairness: the extent to which a test is used fairly. According to the Standards for Educational and Psychological Testing, test fairness comprises:
* lack of bias
* equitable treatment in testing process
* equality in outcomes of testing
* opportunity to learn
Broadly describe the psychometric properties of the BDI-II.
Good validity (content, divergent, concurrent) and reliability (test-retest, internal consistency)
What does the State-Trait Anxiety Inventory (STAI) measure? What is state vs trait anxiety?
The State-Trait Anxiety Inventory measures state anxiety (transient, situation-linked anxiety symptoms) as well as trait anxiety (a stable, personality-like disposition toward anxiety).
What are some clinician rating scales for mood used with adults?
- Hamilton Rating Scale for Depression
- Hamilton Rating Scale for Anxiety
What model of intelligence are most current intelligence tests based on?
Cattell-Horn-Carroll theory of intelligence.