Chapter 5 - Measurement Flashcards
Which variables are measured in non-experimental vs. experimental designs?
- non-experimental: all variables measured
- experimental: DV measured only
types of scales
- qualitative variable
- quantitative variable
qualitative variable
- things with no numerical values (words, concepts, communication, etc.)
- ex. Nominal scales
nominal scales
assigned numerical values are meaningless (ex. Gender, sexuality, religious affiliation)
quantitative variable
- things with meaningful numbers that can be placed on numerical scales
- ex. Ordinal scale, Interval scale, Ratio scales
ordinal scales
when things have a meaningful order (ex. Socioeconomic status, ranking a set of pictures)
interval scales
variables indicate order and the difference between each number is equivalent (ex. Temperature)
ratio scales
have all the characteristics of ordinal and interval scales, but the 0 is meaningful because it indicates a complete absence of something, so you can take the difference between numbers and form a ratio (ex. Using kelvin to measure energy – 0 K is an absolute absence of all heat/energy; it's equivalent to about −273 degrees Celsius)
4 things to consider when constructing a measure
- Reliability: Does it measure the construct with little error? Is it a stable measure? (observed score = true score + error; we want to minimize the effect of the error)
- Construct validity: Are we measuring what we think we’re measuring?
- Internal validity: Can we infer causality?
- External Validity: Can we generalize our findings beyond this group and setting?
3 types of reliability
- test-retest reliability
- inter-rater reliability
- internal consistency reliability
test-retest reliability
- how consistent is the measure across time?
- Evaluated using the Pearson correlation coefficient (r)
- A larger positive correlation indicates higher test-retest reliability
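As a sketch of the idea, the Pearson r for test-retest reliability can be computed from two administrations of the same measure; the scores below are hypothetical, invented for illustration.

```python
import numpy as np

# Hypothetical scores for 6 participants on the same measure,
# administered twice (e.g., two weeks apart).
time1 = np.array([12, 18, 9, 22, 15, 11])
time2 = np.array([13, 17, 10, 21, 16, 12])

# Pearson correlation coefficient (r) between the two administrations;
# a larger positive r indicates higher test-retest reliability.
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))
```

Participants keep roughly the same rank order across the two sessions, so r comes out close to +1.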
inter-rater reliability
- how consistent is the measure when different people are rating it?
- Evaluated using the Intraclass correlation coefficient (ICC) and/or Cohen’s Kappa
- The ICC only goes from 0-1 (no negative values) with 1 being complete agreement and 0 being no agreement
- Other uses of inter-rater reliability: behavioural coding; thematic/content coding (used when you ask an open-ended question and need to figure out the overarching themes in the responses); archival research; personality measures
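A minimal sketch of Cohen's kappa, computed by hand for two hypothetical raters coding the same 10 responses into three categories (the ratings are invented for illustration). Kappa corrects the observed agreement for the agreement expected by chance alone.

```python
# Two hypothetical raters coding the same 10 responses into categories 0, 1, 2.
rater_a = [0, 1, 2, 1, 0, 2, 1, 0, 1, 2]
rater_b = [0, 1, 2, 1, 0, 1, 1, 0, 1, 2]

categories = sorted(set(rater_a) | set(rater_b))
n = len(rater_a)

# Observed agreement: proportion of items both raters coded identically.
p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: from each rater's marginal category proportions.
p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)

# Cohen's kappa = (observed - chance) / (1 - chance).
kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 3))
```

Here the raters agree on 9 of 10 items (p_o = 0.9) while chance agreement is 0.35, so kappa ≈ 0.846: strong agreement beyond chance.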
internal consistency reliability
- Are scores similar across different questions?
- Evaluated using Cronbach’s alpha (α), which combines the inter-item correlations
- α ≥ 0.9 is ideal, 0.6 to 0.9 is good, and less than 0.6 isn’t very good
- Evaluated using item-total correlation
- Used on multiple choice exams
construct validity
- Does content of measure reflect the meaning of the construct? (ex. face validity, content validity)
- How does this measure relate to other measures and behaviours? (ex. predictive validity, concurrent validity, convergent validity, discriminant validity)
face validity
- Look at each item
- Does it look like it’s assessing risk-taking?
- Usually present, but not a requirement of measures