Midterm 2 - Chpt. 5 Flashcards
Operational Definitions (in Measurement):
- A concrete way to measure an abstract concept
Quality of operational definitions evaluated by (2):
- Reliability
– Is your measure consistent?
- Construct Validity
– Are you measuring what you hope you’re measuring?
– Accuracy
How are concepts helpful?
Ways to evaluate operational definitions
Especially measurement instruments (e.g., scales, surveys, coding schemes)
Check that reliability & validity have been demonstrated
Which kinds of designs are typically most concerned with demonstrating reliability & validity?
A) Correlational/survey/quasi-experimental designs
B) Experimental designs
4 components to evaluating operational definitions:
- Reliability
- Construct Validity
- Internal Validity
- External Validity
Reliability:
- Does it measure the construct with little error?
- Is it a stable & consistent measure?
Construct Validity
Are we measuring what we think we’re measuring?
Internal Validity
Can we infer causality?
External Validity
Can we generalize our findings beyond this group/setting?
Reliability - True Scores
Each participant has a true score
- That’s the target, but we can’t observe it
Must rely on measurement, which has “measurement error” (deviation from the target)
A measure is considered reliable if it has relatively little measurement error
What’s the first concern with any measure?
Reliability is your first concern with any measure
If it isn’t measuring a thing consistently, then validity (accuracy) is not even an issue
Types of Reliability:
- Test-retest reliability
- Internal consistency reliability
- Inter-rater reliability
Test-Retest Reliability
Is a participant’s score consistent across time?
- EX: an extrovert should still score as an extrovert at retest (out socializing, not staying in)
Positive linear relationship/correlation between the two testings
- Rule of thumb: minimum r = +0.80
*For relatively stable constructs
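As a sketch (not from the course): test-retest reliability is just the correlation between scores at two testings. The simulation below invents extraversion scores using the true-score idea (observed score = true score + measurement error) and checks them against the r ≥ +0.80 rule of thumb; all numbers are made up for illustration.

```python
import random

random.seed(0)

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Invented extraversion data: each observed score = true score + random error
true_scores = [random.gauss(50, 10) for _ in range(200)]
time1 = [t + random.gauss(0, 3) for t in true_scores]  # first testing
time2 = [t + random.gauss(0, 3) for t in true_scores]  # retest later

# Little measurement error relative to true-score spread -> high r
r = pearson_r(time1, time2)
print(round(r, 2), r >= 0.80)
```

Raising the error standard deviation (e.g., from 3 to 15) drives r below the cutoff, which is the flashcard's point: more measurement error means less reliability.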
Internal Consistency Reliability
Is a participant’s score on this construct similar across items that aim to measure related aspects of the construct?
- Items = questions (“is talkative”, “is full of energy”, “is rarely shy”)
From text:
- Split-half reliability
- Cronbach’s alpha
- Item-total correlations
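A quick sketch of Cronbach's alpha, one of the internal-consistency indices named above. The formula is alpha = k/(k-1) × (1 − Σ item variances / total-score variance); the 1-to-5 ratings below are invented responses to the three example items:

```python
def variance(xs):
    """Sample variance of a list of scores."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """items: one list of participant scores per questionnaire item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each participant's total
    return (k / (k - 1)) * (1 - sum(variance(i) for i in items) / variance(totals))

# Invented 1-5 ratings from five participants on the three example items
items = [
    [5, 4, 2, 3, 1],  # "is talkative"
    [5, 5, 2, 3, 1],  # "is full of energy"
    [4, 4, 1, 3, 2],  # "is rarely shy"
]
print(round(cronbach_alpha(items), 2))  # → 0.96
```

The alpha is high here because each participant answers the three items similarly, i.e., the items hang together as measures of one construct.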
Inter-rater Reliability:
How similar are a participant’s scores when measured by different raters?
Relevant when behaviour is observed or texts are coded by multiple “raters”
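One common inter-rater index (not named in the card, so treat this as an illustrative addition) is Cohen's kappa: agreement between two raters on categorical codes, corrected for agreement expected by chance. The behaviour codes below are invented:

```python
def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters' categorical codes."""
    n = len(rater1)
    # Proportion of cases the raters actually agree on
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's category frequencies
    categories = set(rater1) | set(rater2)
    expected = sum((rater1.count(c) / n) * (rater2.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

# Invented codes from two raters observing the same five behaviours
r1 = ["aggressive", "prosocial", "prosocial", "aggressive", "neutral"]
r2 = ["aggressive", "prosocial", "neutral", "aggressive", "neutral"]
print(round(cohens_kappa(r1, r2), 2))  # → 0.71
```

Kappa of 1 means perfect agreement; 0 means agreement no better than chance, even though raw percent agreement (4/5 = 80% here) can look deceptively high.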
Validity
- Are you measuring what you hope you’re measuring?
- Accuracy
- Is it measuring what it is supposed to measure?
Components of Construct Validity
- Face Validity
- Content Validity
Face Validity
Look at each item.
Does it look like it’s assessing loneliness?
- If yes, then high face validity
Usually happens, but not a requirement of measures.
Alternative to FV:
– Give a whole bunch of items to a large group and see what predicts loneliness (don’t care why)
Content Validity
Look at the whole measure.
Is it capturing all the important parts of what it means to be lonely, and nothing more?
Theoretical question
Can be debated!
Predictive Validity
Predicts future, conceptually related behaviours
- Q: Do people with high scores on your measure at T1 go on to do relevant behaviours at T2?
Concurrent Validity
Able to distinguish between theoretically relevant behaviours
- Q: Do people with high scores on your measure behave in ways you’d expect them to behave if they were high on this construct?
- Constructs that are supposed to be related, ARE related
Types of Construct Validity - Behaviours
- Predictive
- Concurrent
Types of Construct Validity - Other Constructs
- Convergent
- Discriminant
Convergent Validity
Related to scores on measures of similar constructs
- Q: Do people with high scores on your measure have high (or low) scores on measures of related constructs (i.e., high correlation)?
- Do scores on your happiness measure correlate with other established measures of happiness?