Unit 4 Flashcards
Personality assessment
the measurement of the individual characteristics of a person.
What makes a personality test good? (or what is the difference b/w an online test and a legitimate test?):
Reliability
Validity
Specified conditions, populations, and cultures the test applies to
Proof that the test is related to certain outcomes
Findings published and peer reviewed in a scientific journal
Results can be replicated
Not readily available online
Reliability
estimate of the consistency of a test. It describes the extent to which test scores are consistent and reproducible with repeated measurements (across time, items, and raters). Reliability is a prerequisite to validity; a measure must be consistent in order to be valid.
Testing for reliability:
Temporal consistency (time)
Internal consistency (items)
Rater consistency (raters)
Temporal consistency (time)
demonstrates test-retest reliability: respondents take the test twice to see if scores are similar. Need to make sure they aren’t just remembering previous answers (memory effect) or performing better because they’ve taken the test before (practice effect). To eliminate those problems, the two test sessions must be far enough apart.
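A quick illustrative sketch (not part of the original notes) of how test-retest reliability is usually quantified: correlate the same respondents’ scores from the two sessions. The scores below are made up.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical scores for the same 6 respondents, tested a few weeks apart.
time_1 = np.array([42, 35, 50, 28, 39, 45])
time_2 = np.array([40, 36, 48, 30, 41, 44])

r, _ = pearsonr(time_1, time_2)   # test-retest reliability coefficient
print(f"Test-retest reliability: r = {r:.2f}")
```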
Internal consistency (items)
demonstrates if the different items of the test give similar results. Earlier tests for this consistency were: parallel-forms reliability (comparing two versions of a test and checking scores for similarity) and split-half reliability (splitting the test in half to see if scores on one half correlated with scores on the other). Now, a statistic is used instead called Cronbach’s Alpha (α): take the correlation b/w scores on two halves of the test, then calculate the average correlation across all possible split halves. It estimates the generalizability of the score from one set of items to another. An alpha of 0.70-0.80 is good; IQ tests are expected to be even higher (0.90-0.95).
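For illustration only (not from the notes): Cronbach’s alpha is commonly computed with the variance-based formula α = k/(k−1) × (1 − Σ item variances / variance of total score), which corresponds to the average-of-all-split-halves idea above. A minimal sketch, assuming a respondents-by-items matrix of made-up Likert responses:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                              # number of items
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses of 5 people to 4 Likert items.
responses = [[4, 5, 4, 5],
             [2, 3, 2, 2],
             [5, 5, 4, 5],
             [3, 3, 3, 4],
             [1, 2, 1, 2]]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```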
Rater consistency (raters)
demonstrates interrater reliability by having two separate judges rate the personality or behaviour of a third person, then finding the average correlation or percentage of agreement. If agreement is too low, the test may be too ambiguous or the judges may not understand what is being rated.
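An illustrative sketch (made-up ratings) of the two agreement measures mentioned above: percentage of exact agreement and the correlation between two judges’ ratings.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical ratings of the same 8 people by two judges on a 1-5 scale.
rater_a = np.array([4, 3, 5, 2, 4, 3, 5, 1])
rater_b = np.array([4, 3, 4, 2, 5, 3, 5, 1])

agreement = np.mean(rater_a == rater_b)   # proportion of exact agreement
r, _ = pearsonr(rater_a, rater_b)         # correlation between the raters

print(f"Exact agreement: {agreement:.0%}, correlation: r = {r:.2f}")
```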
Validity
the extent to which a test measures what it is supposed to measure.
Testing for validity:
Construct validity
Face validity
Criterion validity
Convergent validity
Discriminant validity
Predictive Validity
Construct validity:
Every test aims to measure an underlying concept called a construct, derived from a theory. Construct validity is the extent to which the test successfully measures the theoretical concept it was designed to measure.
Face validity
if the test appears to measure the construct of interest. Useful in two situations: when the cooperation/motivation of the test-taker can affect results (they see the test as relevant/useful, so they take it seriously), or when developing new measures, where researchers give the test to see which items are actually related to the concepts they want to measure. Face validity alone is not enough to determine if a test is valid; other types are needed.
Criterion validity
determines how good a test is by comparing its results to an external standard (criterion), like another personality test or a behavioural outcome.
Convergent validity
if the test is similar to other tests of the same or related constructs
Discriminant validity
if the test does not correlate with measures of unrelated constructs. To prove construct validity, neither of the last two tests alone is sufficient; need to prove that the test does BOTH: converges with similar concepts and discriminates b/w dissimilar ones.
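An illustrative sketch (simulated, not real data) of checking BOTH convergent and discriminant validity: the new test should correlate strongly with an established measure of the same construct and only weakly with a measure of an unrelated construct.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Simulated scores for 100 respondents: a new extraversion scale, an
# established extraversion scale (same construct), and a vocabulary
# test (unrelated construct).
new_scale = rng.normal(size=100)
established_scale = 0.8 * new_scale + rng.normal(scale=0.5, size=100)
vocabulary_test = rng.normal(size=100)

r_convergent, _ = pearsonr(new_scale, established_scale)   # should be high
r_discriminant, _ = pearsonr(new_scale, vocabulary_test)   # should be near zero

print(f"Convergent r = {r_convergent:.2f}, discriminant r = {r_discriminant:.2f}")
```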
Predictive Validity
if the test predicts specific outcomes or behaviours for a person or group who share a characteristic