Chapter 16: Faking Flashcards
Response Sets
define it
Cronbach (1946) “defined a response set as any tendency that might cause a person to consistently give different responses to test items than he/she would have given if the same content was presented in a different form.”
What we don’t know is what portion of the variance is “error” and what portion is reliable variance reflecting “stylistic” personality differences
Response sets must therefore be assessed independently of item content
1. Tendency to gamble
includes guessing, not answering, selecting a neutral response alternative, or being cautious
* Can increase reliability because it produces greater individual differences
* Reduces validity
2. Definition of judgment categories
How do subjects define response categories?
Most tests require the subject to respond using given response categories, such as the Likert response scale. But different subjects give different meanings to the response options, e.g., the meaning assigned to such response categories as “frequently”: Does that mean every day? Six times a day? Once a week?
3. Inclusiveness
When the subject can make as many responses as he or she likes, some individuals make more responses than others. This occurs not only on essay exams, where one person’s answer may be substantially longer, but also on tests such as the Rorschach, where one person may see many more percepts on an inkblot, or the Adjective Check List, where one person may endorse substantially more items as self-descriptive.
4. Bias or acquiescence
The person is more likely to endorse “true” or “yes” responses to dichotomous items
* False-keyed items are more valid
* True-keyed items are less valid
5. Speed versus accuracy
Where speed of response is an important element, the respondent can answer carefully, sacrificing speed, or can answer rapidly, sacrificing accuracy.
6. Response sets on essay tests
Multiple response sets operate here, because they depend on characteristics of the person (e.g., how organized, fluent, or detail-oriented the person is)
how to deal with biases
They are seen as potential threats to reliability and validity.
The following are some of the ways that have been suggested to deal with these biases:
1. Have one or more measures of response bias incorporated in the self-report measure (e.g., MMPI, CPI)
2. Compare (typically correlate) the results of a self-report measure with a measure of bias
3. Determine how susceptible a scale is to faking; ask people to complete using different directions (e.g., fake good, fake bad, standard) and see how the responses differ
Faking
define
“Deliberate systematic distortion of the responses given to test items because the respondent wishes to create a particular impression”
May be made up of two components:
* Emphasis on socially desirable characteristics
* Denial of negative characteristics
Also referred to as “impression management” and “response bias”
Rogers (1984) described patterns of responding:
- Honest responding – sincere attempt to be accurate
- Irrelevant responding – response not relevant to item content (e.g., answer randomly)
- Defensiveness – conscious denial or minimization
- Malingering – conscious fabrication or exaggeration (e.g., to obtain an external incentive such as monetary compensation, avoiding punishment for criminal behavior, or avoiding military duty)
Incidence of Faking
It is difficult to determine how frequently faking occurs because most of the time, we don’t find out about it
There is disagreement on how frequently it does occur
Seems to depend on the sample and the circumstances of the test administration
Research has shown that people can fake when they are instructed to do so
Legal Issues
How often does malingering occur in criminal populations?
What are the criteria?
The malingering of psychosis is of special concern, as a defendant may be found legally insane or incompetent to stand trial based on such a diagnosis
Another issue is that, traditionally, clinicians have worked for the client; if testing is administered, it is a joint effort of clinician and client to help the client. In forensic situations, however, the clinician often assumes a neutral role that may be perceived as adversarial by the defendant – i.e., the client is being tested to determine insanity, not necessarily because the client is to be helped, but because such an evaluation is mandated by the court or the legal proceedings.
Lack of Insight
Three issues of concern –
1. Motivation to distort the results in a particular way, such as faking mental illness or attempting to look more positive than one really is
2. Random responding in a conscious effort to sabotage the testing situation
3. Inaccurate reporting of one’s abilities, beliefs, etc. through lack of insight
Various issues
- Content vs Style – we can distinguish, at least theoretically, between what a person says or does (content) and how a person acts (style)
- Set vs Style – “set” refers to a conscious or unconscious desire to respond in such a way as to create a certain image (e.g., to fake good); “style” refers to a tendency to select some response category a disproportionate amount of the time, independent of the item content
- Is response set error? Is it a meaningful dimension to assess?
Some writers think that response sets in fact represent meaningful dimensions of behavior to be assessed (e.g., Cronbach, 1946, 1950), while others think that response sets need to be corrected for or eliminated from tests
- Faking is more than faking – faking scales can provide information that is useful in itself
In many instances, scales to assess faking also can yield valuable information in their own right. For example, on the CPI the three validity scales can also be useful in interpreting the individual’s personality structure and dynamics. As an example, in high-school males (but not females) random answering on the CPI is related to a lower probability of going on to college, a lower GPA, and a greater likelihood of being perceived as delinquent
Faking good and faking bad
Faking good
* Composed of two components: “self-deceptive enhancement” and “impression management”
* More difficult to detect
* Involves claiming virtue and honesty
* The respondent wants to appear well adjusted and mentally healthy
Faking bad
* Over-endorse symptoms
* Endorse specific symptoms that they think make them look mentally ill
* Appear poorly adjusted or mentally ill
Ways to distort personality tests
- Deliberate faking
- Idealized presentation of oneself as opposed to realistic presentation
- Inaccurate presentation because of lack of insight
Some psychometric issues
Scale-development strategies
1. One group – one instruction
- Calculate the endorsement rate for an item when given to a normal sample with standard instructions
- Find items with low endorsement rates
- High scorers on those items are making unusual claims
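As a sketch, the endorsement-rate bookkeeping behind this strategy might look like the following; the responses and the 30% rarity cutoff are invented for illustration:

```python
# Hypothetical illustration of the "one group - one instruction" strategy:
# flag items with unusually low endorsement rates in a normal sample.

responses = [
    # each row: one respondent's True/False answers to 4 items
    [True,  False, False, False],
    [True,  True,  False, False],
    [True,  False, False, False],
    [True,  True,  False, True],
]

n_items = len(responses[0])
endorsement_rates = [
    sum(person[i] for person in responses) / len(responses)
    for i in range(n_items)
]

# Items endorsed by very few people become candidates for an
# "unusual claims" (infrequency) scale.
RARE_CUTOFF = 0.30  # assumed threshold, not from the chapter
rare_items = [i for i, rate in enumerate(endorsement_rates) if rate <= RARE_CUTOFF]
print(endorsement_rates)  # [1.0, 0.5, 0.0, 0.25]
print(rare_items)         # [2, 3] -- rarely endorsed items
```

A high scorer on the resulting `rare_items` scale is making claims that almost nobody in the normal sample makes.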
2. One group – two instructions
- A sample takes a measure using standard instructions and then again using faking instructions (e.g., generic or specific)
- Items are retained if they show a significant response shift
3. Two group – two instructions
- Psychiatric patients take the measure using standard instructions
- A normal sample takes the measure and is instructed to fake their answers as if they were psychiatric patients
- Find items with differential endorsement
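A minimal sketch of the item-retention step, with invented endorsement rates and an assumed 0.30 difference cutoff:

```python
# "Two group - two instructions": keep items whose endorsement rate
# differs sharply between real patients and instructed fakers.
# Rates and the cutoff are made up for illustration.

patients = {"item_a": 0.70, "item_b": 0.65, "item_c": 0.10}
fakers   = {"item_a": 0.72, "item_b": 0.20, "item_c": 0.80}

DIFF_CUTOFF = 0.30  # assumed minimum difference to retain an item
retained = [
    item for item in patients
    if abs(patients[item] - fakers[item]) >= DIFF_CUTOFF
]
print(retained)  # ['item_b', 'item_c']
```

Items like `item_b` (under-endorsed by fakers) and `item_c` (over-endorsed by fakers) are the ones that discriminate the groups; `item_a`, endorsed equally by both, is dropped.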
Faking psychopathology (type I and II)
- Type I items
May be endorsed by the psychiatric group, but not by malingerers
Have predictive validity, but low or no face validity
Example: “subtle” MMPI items
- Type II items
Endorsed by malingerers, but not by patients
Have face validity, but no predictive validity
Example: “visual hallucinations”
Suppressor Variable
Different techniques and methods have been used to detect faking, but none seem to improve validity
Researchers can develop generic scales or scales specific to the content area
Scales used as a correction
Suppressor variable – removes the variance that is assumed to be irrelevant between a predictor and a criterion
Usually the suppressor is associated with the predictor, and not with the criterion
Research has found that it may not help
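A contrived, noise-free illustration of how a suppressor is supposed to work: the suppressor correlates with the predictor but not with the criterion, and removing its variance sharpens prediction. All numbers are invented, and real data are rarely this clean, which is consistent with the mixed findings above.

```python
import math

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Toy data: the criterion Y is a "true score"; the predictor X is that
# true score plus irrelevant variance E; the suppressor S measures
# only the irrelevant variance.
Y = [1, 2, 3, 4, 5]                 # criterion
E = [1, -2, 0, 2, -1]               # irrelevant variance, orthogonal to Y
X = [y + e for y, e in zip(Y, E)]   # predictor contaminated by E
S = E                               # suppressor: tracks E, unrelated to Y

print(corr(S, Y))        # 0.0 -- suppressor is unrelated to the criterion
print(corr(X, Y) ** 2)   # 0.5 -- X alone explains half the variance in Y

# Subtracting the suppressor removes the irrelevant variance: in this
# noise-free example the corrected predictor reproduces Y exactly.
corrected = [x - s for x, s in zip(X, S)]
print(corr(corrected, Y) ** 2)  # 1.0
```

The point of the sketch is only the pattern of correlations; in practice the gain from partialling out a faking scale is usually far smaller, or absent.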
Stylistic scales
Oblong trapezoids – items assess beliefs about things that can’t be true
How useful are stylistic scales, that is, personality scales that attempt to measure one’s personality style, such as impulsivity?
* In CPI research, overall validity did not differ
* The stylistic scales were the least valid and were not useful predictors
* Using them as corrections did not increase validity
* They might be used as moderators, but this did not work for the CPI
Techniques to Discourage Faking
Intentional distortion
1. Instructions or warnings that distortion can be detected and/or punishment will follow.
2. Use of forced-choice items that are equated on social desirability.
3. The use of subtle vs. obvious items.
4. Use of validity scales.
Disguised titles – simply change the title of the measure to something generic
Filler items – add extra items to the measure to disguise the purpose (not scored)
Forced-choice format
* Lowers reliability – if it lowers too much, then validity will suffer
* May increase construct validity
Developing faking scales
* Approaches
Empirical – differential endorsement
Rational – items selected based on content
Cross-cultural perspective
* Some cultures may be more likely to endorse extreme responses or more positive responses (e.g., Hispanics)
Symptom validity testing
* Present repeated two-alternative, forced-choice, discrimination problems
* If a person has no knowledge, they should answer at a chance level
* If a person tries to answer incorrectly, then their score would be below chance
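The below-chance logic can be made concrete with the binomial distribution; the trial count and score below are hypothetical:

```python
import math

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# With 100 two-alternative forced-choice trials, honest guessing lands
# near 50 correct. Someone *trying* to answer incorrectly falls below.
n_trials = 100
correct = 30  # hypothetical below-chance score

p_below = binom_cdf(correct, n_trials)  # chance of scoring this low or lower
print(p_below < 0.001)  # True -- far too unlikely to be honest guessing
```

A score of 30/100 would arise by chance far less than one time in a thousand, so below-chance performance is taken as evidence of deliberate wrong answering rather than genuine impairment.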
Lin, Y. (2021). Using reliabilities reliably – A deeper look at the reliability of forced-choice assessment scores.
The main findings
The study confirmed that the different estimation methods do not converge in values, even when they are applied to the exact same assessment. Therefore, there
are three things that researchers and practitioners need to keep in mind when working with forced-choice reliabilities:
1. When reporting reliabilities, it is essential to specify the estimation method used.
2. When interpreting reliability estimates, it is important to consider the assumptions and limitations of the estimation method used.
3. When comparing the reliability of scores from different forced-choice assessments, the reliability estimation method should be kept constant.
Related Issues
Does format alter scores?
- Should items be grouped together according to scales or randomly ordered?
- Possible outcomes
Random
Reduce social desirability
If a subject is answering a questionnaire whose intent is not clear, there may be a lack of trust and less motivation to answer honestly.
Creates a higher intellectual demand, because the content shifts back and forth
No impact on discriminant validity
Grouped
Increased internal consistency reliability
Positional response bias
* Mixed results
* Do you have a tendency to think that one option will be correct more often?
Use of discriminant functions
* Research has found some good results in separating faking from non-faking groups
Dissimulation about neuroticism
* Some people have stereotypes about some variables
* Research found that the groups could be differentiated on neuroticism
Can clinicians detect faking?
* Research is mixed
* Book concludes that they are not very good at detecting faking
How important are response sets?
* Response sets – social desirability, acquiescence
* Research has not found much of an effect
* Some have stated that concern for response sets may be overemphasized and not warranted
Some criticisms
* Most studies of these issues use college students
* Research needs to use more realistic groups who have more reason to fake
How effective are instructions to fake?
* People seem to be able to do this
Social Desirability and Assessment Issues
define
“the tendency of subjects to respond to personality-test items in a manner that consistently presents the self in a favorable light” (Domino & Domino, 2006, p. 444).
To determine the level of social desirability for a personality item
- Administer a pool of items
- Ask judges to evaluate the social desirability of each item on a scale (e.g., 1–9); these ratings are highly reliable
- Calculate the mean rating for each item
- Administer the items to a second group with standard instructions
- Compute the proportion of people who endorsed each item
- The proportion of endorsement correlates with rated social desirability
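The procedure amounts to correlating the judges’ mean ratings with the endorsement proportions; the numbers below are invented to illustrate the typically strong positive correlation:

```python
import math

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Judges' mean social-desirability ratings (1-9 scale) for five items,
# and the proportion of a second sample endorsing each item.
# Both vectors are invented for illustration.
judge_mean_rating = [8.2, 6.5, 5.0, 3.1, 1.8]
endorsement_rate = [0.90, 0.70, 0.55, 0.25, 0.10]

r = corr(judge_mean_rating, endorsement_rate)
print(r > 0.9)  # True -- more desirable items are endorsed more often
```

The classic finding this mimics is that endorsement probability rises almost linearly with rated desirability, which is exactly why social desirability is suspected of contaminating self-report scores.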
Meaning of social desirability
Has been interpreted in two ways:
- Contaminant
May invalidate responses to the measure
May be able to compute the amount of variability attributed to it
Can be considered contamination only when social desirability is not related to the construct of interest
Individuals scoring high on a social-desirability scale are assumed to be faking good, and therefore their test scores on the other scales are considered invalid. Thus, self-report scales that correlate highly with social-desirability scales are considered invalid
- Meaningful personality dimension
Correlates with a variety of behaviors
Not a response set, but a personality trait
Individual Differences
There are items which differ on their level of social desirability
Some people will score higher on some variables (e.g., adjustment, conscientiousness)
How can we determine if a person is faking?
* Compare to objective evidence (difficult to obtain)
e.g., for “I’m a good swimmer,” watch the person swim
* Assume high endorsement of socially desirable items is malingering
* Research has not found these scales measure an individual difference
Scales of social desirability
- Edwards scale
39 MMPI items
Unanimously chosen by 10 judges as socially desirable and correlated with the total score of the scale
- Marlowe-Crowne SD scale
33 true-false items
Describe “culturally approved” behaviors with low probability of occurrence
- Jackson’s social desirability scale
From the Personality Research Form (PRF)
20 nonpsychopathological items
Assesses the tendency to describe oneself in desirable terms and to present oneself favorably
- The three scales do NOT intercorrelate highly
- Edwards and Jackson measure “a sense of own general capability”
- Marlowe-Crowne also measures “interpersonal sensitivity”
Contains a “self” and “another” component
Components of social desirability
- There seem to be two dimensions
Self-deception – respondent believes the positive self-report
Impression management – respondent is consciously faking
Reducing social desirability
Five basic suggestions:
1. Use a forced-choice format
2. Use items that are neutral in social desirability
3. Create a situation where the respondent believes faking can be detected (the “bogus pipeline”)
4. Use a lie scale
5. Ignore the issue
Test Anxiety
A person feels apprehensive and worried when taking a test that is evaluative.
It may lower a person’s score on that test
Several measures have been developed to try to measure this concept
* Taylor Manifest Anxiety Scale – composed of MMPI items
* Test Anxiety Questionnaire
* Test Anxiety Scale
* Test Anxiety Scale for Children (TASC)
Five characteristics of test anxiety:
1. The test situation is seen as difficult and threatening
2. The person sees himself or herself as unable to cope with the test
3. The person focuses on the undesirable consequences of being personally inadequate
4. Self-deprecation interferes with possible solutions
5. The person expects and anticipates failure and loss of regard by others
Testwiseness
“Person’s ability to use the characteristics and format of a test or test situation, to obtain a higher score independent of the knowledge that the person has” (Domino & Domino, 2006, p. 457).
Some people have test-taking skills
* Not a general trait
* Not related to intelligence
* Clue-specific
One study found it accounted for 16% of the variance in test scores