Part 2 (Final) Flashcards
What is a latent variable
latent variables are variables that are not directly observed but are rather inferred (through a mathematical model) from other variables
What are two things we are expected to measure in an observation
true score: the real/expected influences on our measurements
error: undefined/unexpected influences on our measurements
ideally, there is more true score than error; however, this is often not the case
Why do we want enough questions in our questionnaire when it comes to error
if we have enough questions measuring the construct well, we can overcome error by cancelling out the randomness of overestimation vs underestimation
How can you reduce error
Combining measures
like layering many pictures that each contain only 10% of the information: combined, they give a clearer idea of the true score
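This cancelling-out can be illustrated with a quick simulation (a sketch, not from the course materials): each simulated response is the true score plus random error, and the average of many error-prone items lands much closer to the true score than a single item does.

```python
import random

random.seed(1)

TRUE_SCORE = 50  # the latent quantity we are trying to measure

def one_item():
    # classical model: each response = true score + random error
    return TRUE_SCORE + random.gauss(0, 10)

def combined_measure(n_items):
    # averaging lets overestimates and underestimates cancel out
    return sum(one_item() for _ in range(n_items)) / n_items

trials = 200
err_single = sum(abs(one_item() - TRUE_SCORE) for _ in range(trials)) / trials
err_combined = sum(abs(combined_measure(50) - TRUE_SCORE) for _ in range(trials)) / trials
print(err_single, err_combined)  # the combined measure's average error is far smaller
```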
What is reliability in the context of a questionnaire
In a world where our measurements are all error-prone, identifying patterns across multiple responses becomes critical. This is consistency
i.e. we need the latent variable to be influencing all responses to at least some extent producing a common pattern in the responses
the influence of the latent variable is the true score
any other influence is error
What is the classical measurement model
The classical measurement model has 3 key assumptions
remember: an assumption is something we expect to be true
- The individual items of a questionnaire each have error and true score. The amount of error in any given item varies randomly. The mean error across items is zero (given a sufficient N)
- The error in one item is not correlated with the error in any other
- The error in the items is not correlated with the true score
What is the parallel test model
Extends the classical measurement model with 2 more assumptions to be more practical
- The latent variable influences all items equally
all item-construct correlations are the same
- Each item has the same quantity of random error
the combined influences of all other factors are the same
Each item is true score + error, so if you reduce the amount of true score (latent variable influence) then you would increase the amount of error
Name the 5 assumptions of the parallel test model
- Only random errors
- Errors are not correlated with each other
- Errors are not correlated with the true score
- The latent variable affects all items equally
- The amount of random error for each item is equal
What is the essentially tau-equivalent model
to avoid violating our assumptions it’s helpful to loosen them a bit
- Only random errors
- Errors are not correlated with each other
- Errors are not correlated with true score
- The latent variable affects all items equally only when standardized
differences are due to constants
e.g. all questions are turned into z-scores; we might have different response formats across our questions, and if we do not convert them into z-scores, it might not look like they are all affected equally
- The amount of random error for each item is not necessarily equal
partly a consequence of not standardizing
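As a sketch (Python, not part of the notes), converting items with different response formats to z-scores puts them on a common scale:

```python
from statistics import mean, pstdev

def z_scores(responses):
    # rescale raw responses so the item has mean 0 and SD 1;
    # items with different response formats then share a common metric
    m, s = mean(responses), pstdev(responses)
    return [(x - m) / s for x in responses]

likert = [1, 2, 4, 5, 3]        # hypothetical 1-5 Likert item
analog = [12, 35, 80, 95, 55]   # hypothetical 0-100 analog slider item

print(z_scores(likert))
print(z_scores(analog))
```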
If you are using a mix of Likert-type, analog and dichotomous questions. Which model is better for you?
Essentially Tau-Equivalent Model because less likely to violate our assumptions
What is the congeneric model
The congeneric model is much less strict
- Random error is only preferred, but not necessary
- Errors are preferably not correlated with each other, but can be
- Errors are not correlated with the true score
- The latent variable affects all items in some way
- The amount of random error for each item is not necessarily equal
Compare the models in strictness
- Starting points: classical measurement model + parallel test
- Common & somewhat strict: essentially tau-equivalent + congeneric
- Much less strict: general factor (allows for multiple latent variables in each measure)
What are additional assumptions that correlation analysis will add to our model
- you have interval-level data
probably not true, usually ordinal even if close to interval
- your data follow a normal distribution
not too well, without interval data
- a straight line is the best way to represent the relation
this is probably true
Explain the assumption of linearity
If the line of best fit should be u-shaped, you have a big problem
A straight line fit to these data would be flat, indicating no relation is present
luckily, for reliability, we’re talking about measuring the same construct across two different questions, so it’s quite unlikely for their relation not to follow a straight line
Name the two deviations from normality
Skewness: the presence of a longer than normal tail
Kurtosis: the presence of a taller or wider than normal spread
What does deviation from normality threaten
there are two main categories of deviation from normal (skewness and kurtosis), and they both threaten the validity of all these models
Describe skewness
Either the left or the right tail could be pulled out
Skewness means your distribution has asymmetry
A negative skewness means the left tail is long
A positive skewness means the right tail is long
Describe kurtosis
The peak can be pulled up or pushed down
There are two kinds of kurtosis as well
a negative kurtosis means the curve is flatter
a positive kurtosis means the curve is taller
Kurtosis assumes a symmetrical distribution; something is wrong on both sides of the distribution
How do we assess whether kurtosis and skewness are too much
We assume a normal distribution, but don’t typically get it in practice
there will be some degree of skew
there will be some degree of kurtosis
If either score is beyond +/- 3, it indicates there might be too much of a problem
OR
Multiply the standard error (SE) by 3; if the skewness or kurtosis is bigger than that number, then it's too much
For that rule to make sense, you need a fairly small sample size, since SE decreases as n increases (even though larger samples meet the assumptions better overall)
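Both rules of thumb can be sketched as follows (the SE approximations below are standard textbook formulas, not given in the notes):

```python
import math
from statistics import mean, pstdev

def skewness(xs):
    # standardized third moment: positive -> long right tail
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def excess_kurtosis(xs):
    # standardized fourth moment minus 3: negative -> flatter than normal
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 4 for x in xs) / (len(xs) * s ** 4) - 3

def too_much(stat, se):
    # rule 1: |statistic| beyond +/- 3
    # rule 2: |statistic| bigger than 3 * SE
    return abs(stat) > 3 or abs(stat) > 3 * se

data = [1, 2, 2, 3, 3, 3, 4, 4, 12]  # long right tail
n = len(data)
se_skew = math.sqrt(6 / n)           # approximate SE of skewness
se_kurt = math.sqrt(24 / n)          # approximate SE of kurtosis
print(skewness(data), too_much(skewness(data), se_skew))
```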
How can large error be problematic for reliability
extremely large amounts of error will prevent you from observing any interesting associations (because it’s random)
How do we determine whether we have too much error?
We estimate it by correlating a measure with itself
Internal consistency: whatever is not true score must be error
Name the 5 forms of internal consistency measures
- Cronbach’s alpha
- Split-half
- Test-retest
- Alternate Forms
- Omega
How common is Cronbach’s alpha
Most commonly reported
Easy to use with jamovi (and SPSS)
Briefly describe Split-half
Less common than alpha
Easy to use too (but only manually)
Usually split by odd and even item numbers; correlate the scores from the even questions with the scores from the odd questions
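A manual sketch in Python: correlate odd-item totals with even-item totals. The Spearman-Brown step-up at the end is a standard companion to split-half (not named in the notes); it corrects for each half being only half the questionnaire's length.

```python
from statistics import mean

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def split_half_reliability(responses):
    # responses: one list of item scores per participant
    odd = [sum(r[0::2]) for r in responses]   # items 1, 3, 5, ...
    even = [sum(r[1::2]) for r in responses]  # items 2, 4, 6, ...
    r = pearson(odd, even)
    return 2 * r / (1 + r)  # Spearman-Brown correction

# hypothetical data: 5 participants x 4 items
responses = [[1, 2, 1, 2], [3, 3, 4, 3], [5, 4, 5, 5],
             [2, 2, 1, 2], [4, 5, 4, 4]]
print(split_half_reliability(responses))
```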
Briefly describe Test-retest
Optimal, when possible
Requires twice the resources
only really good for traits, not so good for states
Briefly describe Alternate forms
Even less common
Requires two equivalent measures
make two different versions of the questionnaire that measure exactly the same thing; alternate who gets version A and who gets version B, then administer the other version later. The higher the correlation between the two versions, the more reliable the measure
Briefly describe omega
Newest form
No tau-equivalence or error equivalence is required (items are still true score + error)
based on congeneric model (loose model requiring a single latent variable)
harder to violate its assumptions, but harder to calculate, which is fine with jamovi
more than one calculation for omega
Describe the key assumptions of Cronbach’s alpha
Based on the Essential Tau-equivalence model (depending on how it's calculated), so it is built on several key assumptions
- There is a single latent variable
- Error is always random
if not true, it will affect how reliable we believe our questionnaire to be
- True score influences items equivalently
- Inter-item correlations are equivalent
- All inter-item correlations would be equal in a large enough sample
in an infinitely large sample
- Items would have equal variability in a large enough sample
- More items will estimate true score and error better
How do you interpret a Cronbach's alpha
Like a correlation coefficient, it ranges from -1 to 1, but you do not want to see a negative value
For research purposes, a good alpha is a = .70 or better
For other purposes, a good alpha is a = .80 or better (and it may need to be even higher for diagnosis)
We won’t hit 1 since we are going to violate the assumption to some extent
Can be thought of as the proportion of true score that made its way through the measure
if a = 0.8 → 80% true score is captured by the measure
how can you reach a large alpha score
Either high inter-item correlation OR a large correction (many items of equivalent value)
Alpha tends to increase when you add more items of equivalent value
What will likely happen if you make your questionnaire too long
while it might seem like good reliability can be achieved simply by making very long questionnaires, this isn’t really true
Remember, when we make our questionnaires too long, people lose interest in our questions and that changes the average inter-item correlation
as k increases, rbar will most likely decrease
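The trade-off shows up in the standardized alpha formula, alpha = k*rbar / (1 + (k-1)*rbar) (a standard formula, not spelled out in the notes):

```python
def standardized_alpha(k, rbar):
    # standardized Cronbach's alpha from the number of items (k)
    # and the average inter-item correlation (rbar)
    return k * rbar / (1 + (k - 1) * rbar)

# more items of equivalent value push alpha up...
print(round(standardized_alpha(5, 0.30), 2))   # 0.68
print(round(standardized_alpha(10, 0.30), 2))  # 0.81
# ...but if a very long questionnaire bores people and rbar drops,
# the gain is wiped out
print(round(standardized_alpha(20, 0.15), 2))  # 0.78
```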
What is a common mistake with reliability measures
You can never forget that reliability is NOT a property of a measure itself
How much true score is captured by a measure depends on the sample of people who completed it
some samples may interpret items differently, affecting error
It’s not sufficient to simply quote a previously published reliability statistic - you can start from there, but verify for your sample
generational understanding of the wording can affect you sample, people change across time thus affecting reliability
How can you resolve problems with a low alpha
When alpha is lower than we would like, or even if it’s not, you can consider improving it by looking at whether to drop bad items
Whenever an item is not really ‘equivalent’ to the others, as alpha assumes, then dropping it will change alpha
if the dropped item has a low inter-item correlation, alpha goes up
if the dropped item has a high inter-item correlation, alpha goes down
But a questionnaire with more “bad” items can still be better overall than one with a small number of ‘good’ items
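The item-dropping check can be sketched like this (a hypothetical illustration using the standard variance-based alpha formula, with made-up data):

```python
from statistics import pvariance

def cronbach_alpha(items):
    # items: one list of scores per item, aligned by participant
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(pvariance(i) for i in items)
                          / pvariance(totals))

def alpha_if_dropped(items):
    # recompute alpha with each item removed in turn
    return [cronbach_alpha(items[:i] + items[i + 1:])
            for i in range(len(items))]

# three coherent items plus one 'bad' uncorrelated item
items = [[1, 2, 3, 4, 5, 6],
         [2, 2, 3, 4, 5, 5],
         [1, 3, 3, 4, 4, 6],
         [3, 1, 4, 1, 3, 2]]  # low inter-item correlation
print(cronbach_alpha(items))       # alpha with all four items
print(alpha_if_dropped(items)[3])  # alpha rises when the bad item is dropped
```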
Why is omega a better alternative than alpha
McDonald’s Omega is a better choice in almost every case (possibly all) and should be the go-to reliability statistic from now on
Based on the congeneric model; its restrictions are easier to meet
- interpretation is the same as for alpha (easy replacement)
- not all items need to be positively correlated (but still a good idea)
- not all correlations need to be equivalent
- when none of alpha’s assumptions are violated, it gives the exact same results; when they are, it is less of an underestimation
What is validity
the state of being valid; the degree to which we are measuring whatever it was that we wanted to measure
like reliability, this isn’t simply a property of the measure itself
for a measure to have validity, we must consider whether it’s being used in the correct context
must be sure we are using it with the correct type of people
What are the three main forms of validity we should assess
Content validity: key issue is deciding what should be included
Criterion validity: key issue is deciding whether the measure ‘works’
usually applied in a predictive sense, can I predict the outcome
Construct validity: key issue is complex
Content and construct validity are always important; criterion validity only in certain contexts
What is content validity
This form is easy to understand, but hard to be sure you’ve achieved it
to be sure you’ve represented the full range of possible content, you need to have an excellent understanding of what needs to be asked
what are all the key thoughts, behaviours, skills, etc.
have you represented their relative importance properly?
how much of each aspect goes into the total score for the measure?
Example: the midterms for this course try to address the full range of important topics discussed
How can you assess content validity
Successful representation of all the critical aspects of the construct is usually determined by two things
adherence to the prior literature
expert review of your measure
What is criterion validity
This form requires you to evaluate how well scores on your measure match with an accepted ‘gold standard’ or tangible outcome assessment.
For example
delinquency assessment: correlate your measure with the number of offences committed
job suitability: correlate your measure with ratings of job performance
Graduate Record Exam: correlate your measure with grad school GPA
What is construct validity
Two separate definitions
An overall category that encompasses all other forms of validity
Assessing the relation of your measure with other theoretically-relevant measures
What are the two sub categories of construct validity
convergent validity: do you have expected relations to theoretically related constructs?
discriminant validity: do you have the expected lack of relation to theoretically unrelated constructs?
What is convergent validity
To properly establish convergent validity, you need to be sure the related constructs do not themselves contain facets of the construct you’re measuring.
For example:
Mindfulness should show a positive correlation with the Openness to Experience subconstruct of the Big5 personality measure and a negative correlation with Neuroticism
ADHD should show positive correlations with Memory Failures, and negative correlations with Mindfulness
convergent means a relationship: either positive or negative while discriminant should show NO relationship
What is discriminant validity
to properly establish discriminant validity, you need to have good reasons for choosing the potentially unrelated (or very weakly related) constructs. For example:
Intelligence is only weakly related to Artistic Ability - this is a theoretically relevant, not random choice
Memory Failures show no relation to Internal Locus of Control - this could be important as both LOC and MF are typically related to Depression
What is face validity
The overlooked middle child (by psychometricians) of validity addresses the question of whether the measure seems to measure the right thing
It is critical to at least consider whether there are other reasonable interpretations
Outside of psychometrics, this form is frequently used as it’s the easiest to claim and not based on correlation
The basic idea is: any reasonable person would say you measured the right thing
What is internal validity
Strongly related to the specific ways of establishing validity. It’s simply the extent to which you can account for other plausible explanations
You need to either rule out alternatives, or somehow argue they’re not plausible after all
Error, particularly biased responding, will again be a concern as it increases the likelihood of threats to internal validity
Name 6 types of systematic error
- order effects: practice, fatigue, boredom, context effects, etc.
- motivation: we could have incomplete data, changed context effects
can lead to more response sets
- distraction: similar issues to a lack of motivation
- sampling bias: when we haven’t sampled the right people
- maturation: mainly problematic for predictive criterion validity
- instrumentation: when our questions get less useful over time
What does systematic error affect
Some common sources of systematic error affect only validity
Reliability is concerned with random error, while validity is also concerned with systematic error
How should we measure validity
Aside from face validity, we use correlations between our measure and another measure to quantify and demonstrate validity
We need to anticipate more modest correlations for validity than we would for reliability, however
the expected size should be based on prior literature and theory
an r up to .10 is good for showing discriminant validity
an r of about .60 is very good for showing convergent validity; values between .2 and .7 are the usual cutoffs
an r of about .85 or higher is usually very bad for showing convergent validity, too close to 1 and would be hard to say they are actually two different concepts
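These rough cutoffs could be encoded in a hypothetical helper (the thresholds come from the notes; in practice the expected size should come from prior literature and theory):

```python
def judge_convergent(r):
    # hypothetical interpretation of a convergent-validity correlation
    if abs(r) >= 0.85:
        return "too close to 1 - hard to argue these are two constructs"
    if abs(r) >= 0.2:
        return "supports convergent validity"
    return "too weak to support convergent validity"

def judge_discriminant(r):
    # an r up to .10 is good for showing discriminant validity
    if abs(r) <= 0.10:
        return "supports discriminant validity"
    return "relation too strong for discriminant validity"

print(judge_convergent(0.60))    # supports convergent validity
print(judge_discriminant(0.05))  # supports discriminant validity
```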
If systematic error increases, what happens to validity measures
convergent validity
you are likely to over- or underestimate the correlation
discriminant validity
you are likely to get a higher correlation than you would want
What are major flaws of Brown & Ryan’s questionnaire
all of the items are reverse scored
The MAAS items capture only one part of the full construct of mindfulness
In practice, how much error should we expect
More often than not, we have much less true score than error
What does it mean to have “enough signal to noise”
Ideally, you need enough true score (signal) to overcome the randomness of error (noise)
What is another name for the Essentially Tau-equivalent Model
also called essentially equivalent to true score model
How is alpha calculated
there are many ways to calculate it. It is often based on the essentially tau-equivalent model, but jamovi uses a mix with the parallel test model
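One common computation is the variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) (a standard form, not spelled out in the notes; the data below are made up):

```python
from statistics import pvariance

def cronbach_alpha(items):
    # items: one list of scores per item, aligned by participant
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# hypothetical data: three items answered by six participants
items = [[1, 2, 3, 4, 5, 6],
         [2, 2, 3, 4, 5, 5],
         [1, 3, 3, 4, 4, 6]]
print(cronbach_alpha(items))  # high: the items move together
```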
In a negatively and positively skewed distribution, where can you find the measures of central tendency
from left to right
Negatively skewed: mean, median, mode
Positively skewed: mode, median, mean
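A quick check with a small positively skewed sample (hypothetical data; the ordering holds for typical unimodal skewed distributions):

```python
from statistics import mean, median, mode

data = [1, 1, 1, 2, 2, 3, 4, 5, 9]  # long right tail -> positive skew

# from left to right: mode, then median, then mean
print(mode(data), median(data), mean(data))
```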
Why is it bad to have too much skewness
High levels of skew are a source of error that will impact the accuracy of our correlations
What is a caveat of the kurtosis measurement
It’s possible to calculate a big positive or negative kurtosis measure without a normally shaped distribution
What is a platykurtic distribution
flat distribution
What is a leptokurtic distribution
too skinny distribution
The different threats of systematic error are likely to compromise which type of validity
internal validity