Issues of Measurement and Testing Flashcards
What is dimensionality?
Whether an instrument (test) measures a single construct (dimension) or many
What is reliability?
Whether a test measures a construct consistently
What are psychometric properties expressed as?
An index, coefficient or other numerical quantity
What is standardisation?
- The process of establishing norms for a test
- The use of uniform procedures: the same conditions and the same scoring criteria, allowing results to be compared
- The transformation of data into a distribution of standardised scores, often having a mean of 0 and SD of 1
APA Dictionary of Psychology (2023)
What are the 2 common standardisation methods?
Norm-referencing and criterion-referencing
What is norm-referencing?
“Compares the score of an individual with those of other candidates who took the test under similar conditions (norm group)” (Rust, 2007)
- norm group should be representative of whole population
- Allows for meaningful comparisons
What is criterion-referencing?
- Scores compared with some objectively assessed reference point or standard
- Not commonly used in personality testing, but potentially relevant for psychopathology diagnosis
What is a percentile rank?
The percentage of scores in a distribution that fall below a given score
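A minimal sketch of the idea, assuming the "percentage strictly below" definition given above (some textbooks also count half of the tied scores). The function name `percentile_rank` and the scores are made up for illustration.

```python
def percentile_rank(scores, value):
    """Percentage of scores that fall strictly below `value`."""
    below = sum(1 for s in scores if s < value)
    return 100.0 * below / len(scores)


# Hypothetical test scores for ten people.
scores = [12, 15, 15, 18, 20, 22, 25, 27, 30, 34]

# Six of the ten scores are below 25.
print(percentile_rank(scores, 25))  # 60.0
```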
What are z-scores?
A z-score tells you how many standard deviations a score lies above or below the mean
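As a rough illustration, a z-score can be computed as (score - mean) / SD. The sketch below assumes the population SD is wanted (swap in `statistics.stdev` for the sample SD); the helper name `z_scores` and the data are invented for the example.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Convert raw scores to z-scores using the population SD."""
    m = mean(scores)
    sd = pstdev(scores)
    return [(s - m) / sd for s in scores]


# Hypothetical raw scores with mean 5 and SD 2.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(raw))  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

A raw score of 9 is therefore 2 standard deviations above the mean, and the transformed scores have a mean of 0 and SD of 1.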
What is validity?
The degree to which empirical evidence and theoretical rationales support the interpretations based on test scores or other measures (West & Finch, 2007)
What is reliability?
A measure of reproducibility or dependability of measurements, free from error
What is test-retest reliability?
Stability over time/repeatability
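Stability over time is commonly quantified by correlating scores from two administrations of the same test. The sketch below assumes Pearson's r is the chosen index; `pearson_r` and the two score lists are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same six people at time 1 and time 2.
time_1 = [10, 14, 18, 22, 25, 30]
time_2 = [12, 13, 19, 21, 27, 29]
print(round(pearson_r(time_1, time_2), 2))
```

A value close to 1 would suggest that people keep roughly the same rank order across the two occasions.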
What is internal consistency?
Whether all items are measuring the same thing
What is inter-rater reliability?
The degree to which different raters’ scores/codes/ratings are correlated
What factors influence test-retest reliability?
- Characteristics of test takers e.g. illness, tiredness
- Characteristics within tests - poor instructions, complexity
- Differences in conditions - time of day, distractions
- Time gaps - must be a minimum of 3 months
- Difficulty level - floor/ceiling effects
- Sample size and sampling
What’s the most used index of internal consistency?
Cronbach’s alpha - reliability should never be below 0.7
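For reference, a minimal sketch of the standard alpha formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha`, the layout (one list per item) and the responses are assumptions made for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of respondent scores per test item."""
    k = len(items)
    sum_item_vars = sum(variance(col) for col in items)
    # Total score per respondent = sum of their answers across items.
    totals = [sum(resp) for resp in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_vars / variance(totals))


# Hypothetical 3-item scale answered by five respondents.
items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 5],
]
print(round(cronbach_alpha(items), 2))
```

The result can then be compared against the 0.7 rule of thumb.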
How can bias be avoided?
Use several raters and ensure assessments are consistent across them
What are the 2 common ways to measure inter-rater reliability?
Percent agreement and Cohen’s Kappa
What is a percent agreement?
The percentage of items the judges agree on, between 0% and 100% (or between 0 and 1 as a proportion)
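A small sketch of the calculation for two raters, assuming each rater assigns one category per item. The name `percent_agreement` and the ratings are illustrative only.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters give the same rating."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical yes/no codes from two raters on ten items.
rater_a = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

# The raters match on eight of the ten items.
print(percent_agreement(rater_a, rater_b))  # 80.0
```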
What is Cohen’s Kappa?
Calculates the % of items raters agree on, while accounting for the fact that raters may happen to agree on items due to chance
- Ranges between -1 and 1: 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement between raters
- 0.7-0.8 is acceptable
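The chance correction can be sketched as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's category frequencies. The function name `cohens_kappa` and the ratings below are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per item."""
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's category proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    # Undefined (division by zero) if both raters use a single category.
    return (p_o - p_e) / (1 - p_e)


# Hypothetical yes/no codes from two raters on ten items.
rater_a = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

# p_o = 0.80 and p_e = 0.52, so kappa = 0.28 / 0.48.
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.58
```

Here the raters agree on 80% of items, yet kappa is only about 0.58 because roughly half of that agreement would be expected by chance.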
What is construct validity?
The degree to which an assessment tool adequately measures a hypothesised psychological construct
What are the different types of construct validity?
- Face
- Content
- Convergent
- Discriminant
- Predictive
What is face validity?
The extent to which a test appears, at face value, to measure what it claims to measure
What is content validity?
Concerned with a test’s ability to include or represent all of the contents of a particular construct
What is convergent validity?
The extent to which scores from a new test correlate with other measures of the same phenomenon
What is discriminant validity?
Refers to the extent to which a test score does not correlate with the scores of theoretically unrelated measures
What is predictive validity?
Evidence that a test score or other measurement correlates with another variable that can only be assessed at some point after the test has been administered
What features of psychometric tools might influence the degree to which they are valid and reliable?
- Methods of data collection and bias
- Purposes of tests
- Cross-cultural validity
What are the strengths of psychometric tests?
- Objective and scientific way of describing people and their behaviour
- Usually quick and easy
- Allows for statistical analysis
What are the weaknesses of psychometric tests?
- Difficult to make valid and reliable
- Culture bias
What are some sources of inaccuracy and bias in self-report measures?
Extreme responding, acquiescence/dissent bias (agreeing/disagreeing with questionnaire items irrespective of content), social desirability bias (SDB), recall bias, hostility bias
How did Crowne (1964) theorise SDB?
Socially desirable responses reflect a repressive defense against vulnerable self-esteem
What did Zemore find about SDB and when?
Higher SDB scores have been linked with outcomes such as better attendance of drug/alcohol treatment programmes
What is the Lees-Haley Fake Bad Scale (FBS)/MMPI Symptom Validity Scale?
- 43 items in the Minnesota Multiphasic Personality Inventory selected by Lees-Haley (1991) to detect malingering in personal injury claimants
What is impression management?
Ways people attempt to control how they are perceived by others (Goffman, 1959)
How can impression management be mitigated?
- Lie scales to flag who is lying
- Forced choice items
- Inconsistency scales
- Multiple assessment methods (other than self-report)