Chapter 3 Flashcards
Types of scale
Reliability
Validity
Three ways to measure behavior
Observational
Physiological
Self-Report
The term "scale" refers to
- How a participant responded to a particular question (e.g., How much do you enjoy reading?)
- Several questions that measure the same idea, where the responses are combined to give a total score, e.g., an extraversion score (multi-item questionnaire)
Example: How much do you like reading? How much do you like reading fictional books?
- Types of scale
Different types of scales that correspond to these questions
Nominal
Ordinal
Interval
Ratio
Nominal (name) Scale
Answers to questions that relate to a performance or characteristic with no quantitative value. Examples: What is your hair color? Where do you live? Are you a coffee drinker? Do you own a computer?
What does nominal data tell us?
1. Rates (incidences) of responses
Example: 52% of my sample reported being coffee drinkers
2. Whether there are differences on another measure based on the nominal responses
Example: Are VCU students more likely to report being coffee drinkers than JMU students?
1=VCU students
2=JMU students
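The two uses of nominal data above (overall rates and group comparisons) can be sketched in a few lines of Python. The responses below are made up purely for illustration, and `coffee_rate` is a hypothetical helper, not something from the chapter.

```python
from collections import Counter

# Hypothetical nominal responses: (school, "Are you a coffee drinker?")
responses = [
    ("VCU", "yes"), ("VCU", "yes"), ("VCU", "no"), ("VCU", "yes"),
    ("JMU", "no"), ("JMU", "yes"), ("JMU", "no"), ("JMU", "no"),
]

def coffee_rate(data, school=None):
    """Return the percentage of 'yes' answers, optionally within one school."""
    answers = [drinks for s, drinks in data if school is None or s == school]
    counts = Counter(answers)
    return 100 * counts["yes"] / len(answers)

overall_rate = coffee_rate(responses)      # use 1: rate for the whole sample
vcu_rate = coffee_rate(responses, "VCU")   # use 2: rate within group 1
jmu_rate = coffee_rate(responses, "JMU")   # use 2: rate within group 2
print(overall_rate, vcu_rate, jmu_rate)
```

Note that the only arithmetic done on nominal data here is counting: the labels themselves are never added or averaged.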
Ordinal (order) Scales
Responses tell us a relative ranking order (rank).
Example: How do you feel today? 1 = very unhappy, 2 = unhappy, 3 = ok, 4 = happy, 5 = very happy
**Doesn’t tell us how much difference there is between ranks, only which one is larger
Interval (space in-between) Scale
Numeric scales that tell us the order of the values and the difference between them (scores).
**No true zero: zero doesn’t mean the absence of something
Example: Temperature (in °F or °C)
The difference between 40 and 50 degrees is a measurable 10 degrees
Other examples:
-Time
-Dates
-Sea levels
Ratio Scale
Includes everything: (numbers)
- meaningful distance between numbers
- true zero point: zero means none of the quantity is present
- numbers represent actual amounts, not labels
Examples of ratio
weight
test scores
income level
height
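A small sketch of why the true zero matters. The values are arbitrary, and the temperature and weight conversions are just convenient illustrations: on an interval scale a ratio changes when the unit changes, while on a ratio scale it does not.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: the same two temperatures expressed in two units.
ratio_in_c = 20 / 10                    # looks like "twice as hot"
ratio_in_f = c_to_f(20) / c_to_f(10)    # 68 / 50, no longer 2

# Ratio scale: the same two weights expressed in two units.
KG_TO_LB = 2.20462  # approximate conversion factor
ratio_in_kg = 20 / 10
ratio_in_lb = (20 * KG_TO_LB) / (10 * KG_TO_LB)

print(ratio_in_c, ratio_in_f)    # the two ratios disagree
print(ratio_in_kg, ratio_in_lb)  # the two ratios agree
```

Because 0 degrees is an arbitrary point rather than "no temperature", saying 20 degrees is twice as warm as 10 degrees is not meaningful, whereas 20 kg really is twice as heavy as 10 kg.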
Reliability and Validity
Important to study due to error variance
helps us determine how much we should trust our measures
Error variance (4 main causes)
- Individual differences
- Situational factors
  - room temperature
  - mood of participants
  - experimenter’s personality
- Characteristics of measures
  - understanding of the question
  - reactivity
- Mistakes
  - coding answers
  - distractions (e.g., sneezing while counting taps or eye blinks)
Reliability
refers to whether we get similar answers every time we measure
Remember lucky 7 (and one 3)
Reliability of a measure
Consistency of a measuring technique
Reliability = systematic variance / total variance
70% or greater is considered reliable
Correlation coefficient (type of effect size)
- magnitude ranges from 0 to 1
- sign can be + or -
- the higher the absolute value, the more strongly the two variables are related
**Squaring the coefficient gives us the proportion of total variance that is systematically related to the measurement.
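The two calculations on these cards can be sketched as follows. The variance numbers and r values are arbitrary, the function names are hypothetical, and the sketch assumes that total variance is simply systematic variance plus error variance.

```python
def reliability(systematic_variance, error_variance):
    """Proportion of total variance that is systematic.

    Assumes total variance = systematic variance + error variance.
    """
    total_variance = systematic_variance + error_variance
    return systematic_variance / total_variance

def shared_variance(r):
    """Square a correlation coefficient to get the proportion of shared variance."""
    return r ** 2

rel = reliability(systematic_variance=7.0, error_variance=3.0)
print(f"reliability = {rel:.2f}")

# The sign of r disappears when squaring; only the strength remains.
for r in (-0.5, 0.3, 0.9):
    print(f"r = {r:+.1f} -> r squared = {shared_variance(r):.2f}")
```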
Lucky 7’s (and one 3)
1. Systematic variance / total variance ≥ 70%
2. A person is tested multiple times and the correlation between the scores is ≥ .7
3. Each item-total correlation is .3 or higher with the other items
4. Cronbach’s alpha is at least .7
Types of reliability
test-retest
inter-item
inter-rater
Test-retest reliability
means that a person should score about the same each time they are measured
How to calculate test-retest reliability: measure each person on two occasions and look at the correlation between the two sets of scores
**Good test-retest would be a correlation of ≥.7
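A minimal sketch of that calculation, using a hand-written Pearson correlation so nothing beyond the standard library is needed. The scores are invented for illustration, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same six people on two occasions.
time1 = [12, 15, 9, 20, 17, 11]
time2 = [13, 14, 10, 19, 18, 10]

test_retest_r = pearson_r(time1, time2)
is_reliable = test_retest_r >= 0.7   # the "lucky 7" cutoff
print(round(test_retest_r, 2), is_reliable)
```

Each position in the two lists must belong to the same person; the correlation is high here because people who scored high the first time also scored high the second time.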
Interitem reliability
- refers to how consistent the questions (items) are with each other
- use when there is a multi-item questionnaire to help us measure the behavior and characteristics
- where we average all the question responses to obtain a single score
Calculating Interitem reliability
Look at the item-total correlations
- items should correlate at least moderately with each other if they measure the same thing
- each item’s correlation with the other items should be .3 or higher
- also uses Cronbach’s alpha coefficient, which needs to be at least .7
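Both checks can be sketched in Python. The questionnaire data are invented, the helper names are hypothetical, and two assumptions are made: each item is correlated with the sum of the remaining items (a "corrected" item-total correlation), and alpha is computed with the common formula k/(k-1) × (1 − sum of item variances / variance of total scores).

```python
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: 5 respondents (rows) answering 4 items (columns), 1-5 scale.
rows = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
items = [list(col) for col in zip(*rows)]   # one list of scores per item
totals = [sum(row) for row in rows]         # total score per respondent

# Item-total correlation: each item vs. the sum of the other items.
item_total = []
for item in items:
    rest = [t - score for t, score in zip(totals, item)]
    item_total.append(pearson_r(item, rest))

# Cronbach's alpha from item variances and total-score variance.
k = len(items)
alpha = (k / (k - 1)) * (1 - sum(variance(i) for i in items) / variance(totals))

print([round(r, 2) for r in item_total])
print(round(alpha, 2))
```

An item whose correlation fell below .3 would be a candidate for rewording or removal, and alpha would be compared against the .7 cutoff.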
Interrater reliability
- used when people (coders) observe participants and code their behavior
- the amount of agreement between coders
- want the level of agreement to be high
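The cards do not say which agreement statistic to use, so the sketch below assumes the simplest one, percent agreement: the share of observations that two coders placed in the same category. The codes are invented and `percent_agreement` is a hypothetical helper.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of observations that two coders coded identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must rate the same observations")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)

# Hypothetical codes from two observers watching the same 10 behaviors.
coder_a = ["aggressive", "neutral", "neutral", "friendly", "aggressive",
           "friendly", "neutral", "friendly", "neutral", "aggressive"]
coder_b = ["aggressive", "neutral", "friendly", "friendly", "aggressive",
           "friendly", "neutral", "friendly", "neutral", "aggressive"]

agreement = percent_agreement(coder_a, coder_b)
print(agreement)
```

The coders disagree on only one of the ten behaviors here, which is the kind of high agreement the card asks for.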
Increasing reliability
- measure participants in the same environment
- make questions clear
- train observers (judges)
- minimize coding errors
Validity
measuring what we want to measure
Example: if we want to know a foreign person’s ability to speak English, we would give them the TOEFL test.
Types of Validity
Face validity
Construct validity
Criterion-related validity
Face Validity
it looks like we are measuring what we want.
Example: How much do you prefer iPhones over Galaxy phones?
Problem: social desirability
Construct Validity
Examines whether a measure correlates with other measures the way it should.
(established by looking at numerous studies that use the test being evaluated.)
Example: What makes a person happy?
Measures with correlation coefficients
convergent validity
divergent validity
convergent validity
Should have high correlation with other measures that are similar
Example: If I said that friendliness was highly correlated with lots of volunteer time, that would make sense.
divergent validity
Should have low correlation with items that are different
Example: If friendliness was highly correlated with how many times that person ate pizza, that wouldn’t make sense; the two should be uncorrelated.
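The friendliness example can be sketched with two correlations. All numbers are made up so that the pattern is easy to see, and `pearson_r` is a hypothetical helper.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for six people.
friendliness = [1, 2, 3, 4, 5, 6]       # score on the measure being evaluated
volunteer_hours = [0, 2, 3, 5, 6, 9]    # a related measure
pizza_per_month = [3, 1, 4, 4, 1, 3]    # an unrelated measure

convergent_r = pearson_r(friendliness, volunteer_hours)   # should be high
divergent_r = pearson_r(friendliness, pizza_per_month)    # should be near zero
print(round(convergent_r, 2), round(divergent_r, 2))
```

A high first correlation together with a near-zero second one is the pattern that supports construct validity.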
Criterion-related validity
Lets us know whether we can predict a behavioral outcome from a measure
Example: GRE scores predict whether you will do well in grad school
Criterion-Related Validity Example
- let’s say we measure someone’s physical fitness and find that they are in good physical condition.
- What might we predict in the short run? In the long run?
Criterion-Related Validity Example
Short Term (concurrent validity)
Can probably run a mile faster than average
Can do push-ups and/or pull-ups
Can bench press at least ½ their body weight
Long Term (predictive validity)
Less likely to have health problems
Fewer visits to the doctor
Less chance of type 2 diabetes