Bias, Validity Scales, Methods of Constructing Measurement Instruments, Item Types and Response Scales, Cross-Cultural Translation / Adaptation Flashcards

1
Q

Test bias

A

- The public sometimes has the impression that all assessment instruments are biased (e.g., by age, sex/gender, ethnic group, or clinical group).
- This is sometimes the case, and it is the test user's duty to be aware of it.
Reminder:
Bias = systematic error; it is not random

It is very important not to confuse a difference in means between groups with bias

2
Q

Test bias: differences in means between groups

A

Differences in means between certain groups are not a priori a bias, since some are theoretically/conceptually expected
e.g., in adolescence, few or no differences in means between ethnic groups for behavior problems, but differences by sex/gender
e.g., in adulthood, sex differences in some personality traits, but few or none in adolescence
Normative: compared with the general population

3
Q

Test bias: when is an assessment instrument biased?

A

An assessment instrument is biased “if differences between members of different groups are identified on the basis of characteristics other than those the instrument purports to assess” (Merrell, 2008; Whitcomb, 2017)

In other words, an instrument is biased if its content, procedure, or use systematically favors or disfavors members of one group over another and if this differentiation is irrelevant to the purpose of the instrument

4
Q

How is validity affected directly?

A

As we have seen, the reliability of scores on an assessment instrument can be compromised by various sources of measurement error
- We have also seen that the inferences and interpretations permitted by the scores of an assessment instrument depend on the degree of validity of those scores.
- Validity can be affected directly by (a) response bias on individual items or (b) scale score bias
The presence of bias is a critical issue for both test developers and test users

5
Q

Response bias: heuristics or cognitive biases

A

People who are assessed and asked questions, whether about themselves or as an informant for a third party, are always at risk of being partially biased
For example, in a job interview where a person has to answer a personality questionnaire, would they want to look their best? Or even better than their best?
Even at a basic level, it is now recognized that the human cognitive system is “victimized” by several heuristics or cognitive biases (Kahneman, 2011; Kahneman, Slovic & Tversky, 1982)

6
Q

Heuristics associated with response styles

A

Heuristics: cognitive strategies used to simplify and speed up a decision under uncertainty (Kahneman, 2011)
Sometimes referred to as “mental shortcuts”
They apply to the evaluation/estimation of behavior
Very useful when one does not know the person being evaluated well enough
Can also lead to misjudgment and “stereotyping” of people

7
Q

Four well-known examples of heuristics

A
  1. Representativeness heuristic
    Evaluating a specific characteristic in terms of how well it matches a prototype (e.g., evaluating a child’s attention based on our ADHD prototype)
  2. Availability heuristic
    Rating that is influenced by the things that come most easily (or frequently) to mind for the rater (e.g., children’s aggressive behaviors)
    Things that come to mind more easily are considered more frequent and more representative of reality
  3. Primacy / recency heuristics
    Evaluation that is influenced by the first vs. the last impression of the individual
  4. Affect heuristic
    Assessment colored by the rater’s current emotional and affective state (e.g., a bad mood leads to an estimate of more behavior problems)
8
Q

Response biases directly influence validity

A

Response biases may seem trivial, but they can be very serious because they directly influence the validity of test scores
“Diminished” validity can in turn compromise the quality of the inferences and clinical decisions made about the individual (or group) being assessed

9
Q

Eight major types of response bias (see pictures)

A

1. Extreme responding: responses are extreme
2. Indecision: neutral responses
3. Acquiescence: saying yes to everything
4. Objection: always saying no
5. Social desirability: giving socially acceptable answers, exaggerating the positive
6. Unfavorable impression management (malingering): giving exaggeratedly negative answers
7. Random or careless responding: answering at random
8. Guessing
9. Halo

10
Q

What can be done to prevent or minimize response bias?

A

Three things to do:
1. Manage the assessment situation
Anonymity, minimize frustration, give warnings (i.e., warn that there are validity scales)
2. Manage the content of the tests
Simple items (language level), content-neutral items (i.e., non-suggestive), conceptually clear response options
3. Specialized validity tests or scales

11
Q

Some examples of validity scales

A

All these scales are based on the same principle: very high or extreme scores suggest a potential problem

Indeterminacy scale (e.g., the MMPI-2; Ben-Porath & Tellegen, 2008)
The full MMPI-2 questionnaire has 567 items
Items left unanswered, or given more than one response, are counted and summed
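
The counting rule above can be sketched in a few lines. This is only an illustration, assuming each item's response is stored as the set of options the respondent marked; the function name and data layout are hypothetical and do not come from any published scoring key.

```python
def count_unscorable(responses):
    """Count items that were left blank or marked with more than one option.

    `responses` is a list with one set per item (hypothetical layout):
    an empty set means unanswered, a set with two or more options means
    the item was double-marked.
    """
    return sum(1 for marked in responses if len(marked) != 1)


# Four items: one valid True, one blank, one double-marked, one valid False.
example = [{"T"}, set(), {"T", "F"}, {"F"}]
print(count_unscorable(example))
```

What counts as a "very high" total is set by each instrument's manual and is not encoded here.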

12
Q

Validity scales: social desirability scales

A
  1. Social desirability scales
    Marlowe-Crowne Social Desirability Scale (Crowne & Marlowe, 1960)
    e.g., “I never lie”; “I like everyone I know”; “I have never been angry”
  2. Balanced Inventory of Desirable Responding
    Self-deception: generally honest, but overly positive responses
    Impression management: dishonest responses; positive bias is used to (a) please others or (b) gain an advantage
13
Q

Validity scales: unfavorable impression management scale

A

Unfavorable impression management scale (e.g., the MMPI-2; Ben-Porath & Tellegen, 2008)
Tendency to endorse unlikely negative items (e.g., “I’m no good at anything”; “I have no talent”)
The effect is difficult to distinguish from genuine responses in severe clinical cases (e.g., major depression or depressive personality disorder)

14
Q

Validity scales:
- Extreme response style scale
- Indecision scale

A
  1. Extreme response style scale
    Criteria proposed by the EDC (Parent et al., 2006)
    i.e., choosing the 1st or 7th response option an abnormally high number of times
  2. Indecision scale
    Criteria proposed by the EDC (Parent et al., 2006)
    i.e., choosing the central category, the 4th (middle) response option, an abnormally high number of times
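
Both counts can be sketched together. The example assumes a 7-point response scale coded 1 to 7, as implied by the criteria above; the function name is hypothetical, and the cut-off for "abnormally high" is deliberately left out because the card does not give one.

```python
def response_style_counts(answers, low=1, high=7):
    """Count extreme and midpoint choices on a Likert-type scale.

    Assumes integer answers between `low` and `high` with a single
    central category (4 on a 1-7 scale).
    """
    midpoint = (low + high) // 2
    extreme = sum(1 for a in answers if a in (low, high))
    middle = sum(1 for a in answers if a == midpoint)
    return {"extreme": extreme, "midpoint": middle}


# Six items answered on a 1-7 scale.
print(response_style_counts([1, 7, 4, 4, 3, 7]))
```

Each count would then be compared with whatever threshold the instrument's authors propose.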
15
Q

Validity scales: Variable Response Inconsistency (VRIN)

A
Variable response inconsistency (VRIN)
Sum of the number of item pairs that were answered inconsistently
Similar: “I don’t think before I act” - “I act without thinking about the consequences”
Different: “I don’t think before I act” - “I think carefully before I make decisions”
We give 1 point for each inconsistent pair and calculate a sum
Used to detect random responses (intentional or not) or confusion in a questionnaire
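
The scoring rule can be sketched as follows, assuming True/False answers stored in a dictionary. The item labels, the pair list, and the function name are hypothetical and only mirror the two example pairs on this card; they are not the scoring key of any published instrument.

```python
def vrin_score(answers, pairs):
    """Give 1 point per inconsistently answered item pair and sum them.

    `pairs` holds (item_a, item_b, kind) tuples. A "similar" pair is
    inconsistent when the two answers differ; a "different" pair is
    inconsistent when the two answers are the same.
    """
    score = 0
    for item_a, item_b, kind in pairs:
        same_answer = answers[item_a] == answers[item_b]
        if kind == "similar" and not same_answer:
            score += 1
        elif kind == "different" and same_answer:
            score += 1
    return score


answers = {
    "no_think_before_act": True,      # "I don't think before I act"
    "act_without_thinking": False,    # "I act without thinking about the consequences"
    "think_carefully": True,          # "I think carefully before I make decisions"
}
pairs = [
    ("no_think_before_act", "act_without_thinking", "similar"),
    ("no_think_before_act", "think_carefully", "different"),
]
print(vrin_score(answers, pairs))
```

In this example both pairs are answered inconsistently, so the respondent earns one point for each.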
16
Q

Validity scales: True Response Inconsistency (TRIN)

A

True response inconsistency (TRIN)
Here, only pairs of items that are conceptually different are used
Calculates the sum of the item pairs answered inconsistently “true” minus the sum of the item pairs answered inconsistently “false”
Used to detect inconsistent responses that indicate acquiescence (very high score) or objection (very low score, possibly negative)
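
A minimal sketch of this difference score, assuming True/False answers and a hypothetical list of conceptually different item pairs. It returns only the raw difference described above; any rescaling or cut-off used by a particular instrument is not represented.

```python
def trin_score(answers, different_pairs):
    """Raw TRIN-style score: true-true pairs minus false-false pairs.

    Answering True to both items of a conceptually different pair
    suggests acquiescence; answering False to both suggests objection.
    """
    true_true = sum(1 for a, b in different_pairs if answers[a] and answers[b])
    false_false = sum(
        1 for a, b in different_pairs if not answers[a] and not answers[b]
    )
    return true_true - false_false


pairs = [(1, 2), (3, 4), (5, 6)]
yes_sayer = {1: True, 2: True, 3: True, 4: True, 5: True, 6: False}
no_sayer = {1: False, 2: False, 3: False, 4: False, 5: True, 6: False}
print(trin_score(yes_sayer, pairs), trin_score(no_sayer, pairs))
```

The first respondent gets a positive score (acquiescence) and the second a negative one (objection).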

17
Q

Item and test bias: item bias

A

Item (or indicator) bias
Not differences in scores on the trait, but systematic differences in the probability of responding in a given way to each item individually, once the trait level is controlled for
Also called “differential item functioning”
Compares the probability of endorsing the items of a scale among individuals from different groups who have the same score/level on the trait
Same principle as control variables in predictive studies (e.g., when “controlling for SES”)
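
One common way to make this comparison concrete is the Mantel-Haenszel common odds ratio, which matches respondents on their trait level and compares endorsement odds between groups within each level. The card does not name a method, so this choice, the function name, and the data layout are all assumptions made for illustration.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records, reference="ref"):
    """Common odds ratio of endorsing one item, stratified by trait level.

    `records` holds (group, trait_level, endorsed) tuples (hypothetical
    layout). A value near 1 suggests no differential item functioning;
    returns None when the ratio is undefined.
    """
    # Per trait level: [ref endorsed, ref not, focal endorsed, focal not]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, level, endorsed in records:
        if group == reference:
            cell = 0 if endorsed else 1
        else:
            cell = 2 if endorsed else 3
        tables[level][cell] += 1

    numerator = denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return None if denominator == 0 else numerator / denominator


# Same trait level, but the reference group endorses the item more often.
records = (
    [("ref", 10, True)] * 3 + [("ref", 10, False)]
    + [("focal", 10, True)] + [("focal", 10, False)] * 3
)
print(mantel_haenszel_odds_ratio(records))
```

Here the two groups have the same trait level yet very different endorsement odds, which is the pattern item bias refers to.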

18
Q

Item and test bias: structural bias

A

Structural bias
For a unidimensional instrument, this can mean significant differences in factor loadings between two groups
This is not trivial, since it means that the trait is not measured in the same way in different groups
For a multidimensional instrument, (a) the factor loadings differ and (b) the factor structure is not the same in different groups
e.g., factor analysis reveals three factors for men, but only two for women

19
Q

Item and test bias: criterion bias

A

Criterion (or criterion-referenced) bias
Applies to both concurrent criterion validity (independent criteria and contrasting groups) and predictive validity
e.g., a temperamental trait that predicts later adjustment for one group of children, but not for another
e.g., an IQ test that predicts success for one cultural group, but not for another
Caution: differences between groups in predictive relationships can be expected when they are theoretically justified; in that case, they are not a bias

20
Q

Item and test bias: reliability bias

A

Reliability bias
Reliability estimates are significantly different in different groups
Potentially important for interpretation
If bias is present, the level of confidence one can have in the scale scores varies across groups
Observed group differences in means can then be partly explained by measurement error
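
As an illustration, a reliability estimate can be computed separately for each group and the values compared. The sketch below uses Cronbach's alpha, which is only one possible reliability estimate and is an assumption here, since the card does not name one; the data are invented and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one group.

    `rows` has one list of item scores per respondent. Uses the usual
    formula k/(k-1) * (1 - sum of item variances / variance of totals).
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented two-item data for two groups of three respondents each.
group_a = [[1, 1], [2, 2], [3, 3]]
group_b = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(group_a), 2), round(cronbach_alpha(group_b), 2))
```

A clearly lower value in one group would mean that scores carry more measurement error for that group. Whether a difference is statistically significant requires a formal test that is not shown here.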

21
Q

Item and test bias: over-generalization

A

Although comparing groups by sex/gender, ethnicity, cultural background, clinical group, etc., can be informative for many researchers, it often results in “over-generalization”
Variation between individuals in the same group (intragroup variance) can be enormous (see figure distributions)
As a psychoeducator, one must never lose sight of the fact that the purpose of a psychoeducational assessment is to interpret the scores and make recommendations for ONE particular individual

22
Q

Test construction methods

A

There are a wide variety of tests useful in psychoeducation (Hogan et al., 2017)
Tests of intellectual ability/cognitive skills
Achievement tests
Neuropsychological tests
Measures of personality/temperament
Measures of interests, attitudes, and values
Measures of psychopathology
One major category is often overlooked in psychology books
Measures of environmental constructs

23
Q

CONSTRUCTION OF TESTS

A

In general, professional organizations expect authors to have constructed their instrument in accordance with the criteria listed in the Standards for Educational and Psychological Testing (AERA, APA, & NCME, 2014)
Test construction and validation is a long-term process
Requires revisions before it is fully satisfactory
Can take place over several years, even a few decades

24
Q

Two main methods of test construction

A

Deductive (or rational)
“conclude from propositions taken as premises”
Inductive (or empirical)
“conclude by going from the facts to the law”

25
Q

Two main methods of test construction: deductive (rational) method

A

From a theoretical framework
Scientific theory
Constructs, domains, indicators (the test designers determine them according to the theory)
Clinical theory
We want to answer a practical problem or need
e.g., how to measure PES intervention? How to measure motivation to change?
Advantage: Clear theoretical context, logical consistency (i.e., nomological network often known a priori)

26
Q

Two main methods of test construction: inductive (empirical) method

A

Based on an empirical (or factual, or pragmatic) approach
1. Item analysis / factor analysis: items statistically related to the construct are selected (may also include internal consistency, criterion validity, etc.)
2. Criterion-referenced selection: e.g., only items that differentiate groups are selected (e.g., MMPI-2 Antisociality scale)
However, the approach is never completely empirical: to generate items, there is always an underlying theory, even if it is implicit

27
Q

Inductive (empirical) method: advantages and disadvantages

A

Advantages: greater objectivity and more representative of reality; we verify our understanding of a construct, explicitly supported by data
Disadvantages:
You don’t necessarily get what you want; the data dictate the final outcome (e.g., factor structure)
e.g., the data suggest that anxiety and depression items combine into one factor
Statistics can sometimes distort concepts because of sampling bias
e.g., the statistics suggest eliminating a clinically important aspect, when the discordant results mostly stem from poor sampling or too small a sample

28
Q

Types of questions and response choices

A

There are a host of different types of items and even more different response choices, making it challenging to present and categorize them (Urbina, 2014)
Different items and possible response choices, depending on:
Type of construct being assessed
Specific uses of an instrument
Personal preferences of the authors
Questions can also be presented in several ways
verbally in an interview
visually in a paper and pencil version
visually in a computerized version (on a fixed computer, or with an application on a smart phone, a tablet)
etc.
The most basic distinction is the type of response that is asked of the person being assessed:
(a) constructed response items and (b) selected response items (Urbina, 2014)

29
Q

Response choices or scales: constructed-response items

A

Constructed response items
Also called “essay” or “open-ended” or “free-response” questions
A premise is presented to the test taker, but there is no constraint on a fixed answer choice
There are, however, some rules, so there are (a) long-answer and (b) short-answer open-ended questions

30
Q

Response choices or scales: constructed-response items

A

Constructed response items
Also called “essay” or “open-ended” or “free-response” questions
A premise is presented to the test taker, but there is no constraint on a fixed answer choice
There are, however, some rules, so there are (a) long-answer and (b) short-answer open-ended questions


31
Q

Response choices or scales: constructed-response items (continued)

A

Constructed response items (continued)
An example of a long-answer question would be:
“Describe your usual relationship with your child.”
An example of a short-answer question would be:
“Using no more than 4 or 5 words, complete the following sentence: ‘My usual relationship with my child is __________’”
Constructed-response questions are essential in interviews

32
Q

Response choices or scales: selected-response items

A

Selected-response items
Also called “objective,” “forced-choice,” “multiple-choice,” or “true or false” questions
A premise is presented to the respondent, who is placed under the cognitive constraint of a fixed response choice
This is the most common type of item used in humanities, social sciences, and psychological assessment instruments
More objective, easier to derive a numerical score, more reliable, shorter, etc.

33
Q

Response choices or scales: cognitive tasks involved in responding

A

When a person is asked to answer a selected-response question, he or she must perform four cognitive tasks (Tourangeau, Rips, & Rasinski, 2000)
Comprehension: understanding the relevant content
Retrieval: retrieving from memory the relevant information needed to answer
Judgment: making a judgment based on the retrieved information
Responding: reporting this judgment, based on the available response options

34
Q

Cross-cultural translation / adaptation

A

Crucial issue in Quebec
Very often “home-made translations” are used (1) without any study verifying their psychometric properties and/or (2) without collecting Quebec norms
Sometimes a simple translation is sufficient, but often an adaptation is necessary
Important: understanding the content and the meaning/significance of the items matters more than an exact translation
The principle of a standardized instrument suggests that this is a step that must be taken seriously

35
Q

Six steps of cross-cultural adaptation

A
  1. Translate and adapt items (minimum 2 people)
    Method of choice: back-translation
  2. Independent experts review the translation
  3. Eliminate or adapt items according to their comments
  4. Pilot study with targeted individuals
  5. Empirical validation (detailed evaluation of psychometric properties)
  6. Standardization (establishing norms)
36
Q

Five ways to establish cross-cultural equivalence

A
  1. Semantic equivalence
    Do the items mean the same thing in both languages/cultures?
  2. Content equivalence
    Is each item relevant in both languages/cultures for measuring the construct?
  3. Construct equivalence
    Are the factor loadings similar? Is the factor structure the same? Is the convergent/discriminant (C/D) validity similar?
  4. Criterion equivalence
    Sometimes quite difficult to conclude that there is no criterion equivalence
    e.g., with a measure of parenting practices, a practice related to adjustment problems in only one version (or culture) is not necessarily a problem with the instrument
  5. Reliability equivalence