chapter 3/week 3 Flashcards
the measurement of behavior
what do you start with first in the research process
research question
(concise, specific, and testable question)
variables
Concepts are converted into variables by translating or mapping them into a set of values
In experimental language there are dependent (DV) and independent (IV) variables
DV
The variable that serves as our primary focus, that we’re trying to describe, predict, or explain, is the dependent variable – denoted by “y”
IV
The variable that serves as a predictor or hypothesized cause (the variable we manipulate in an experiment) is called an independent variable, denoted by “x”
variables with regression-type models
In correlational language with regression-type models we label IV “predictor variable” and DV “outcome or criterion variable”
Predictors are not independent variables: they are not manipulated, so we cannot conclude that they cause a change in the outcome variable, but they can help us explain some of the variance in the outcome variable
Operational Definitions
How do you measure it?
Precisely how the concept is measured or manipulated in a study
Concrete, situation-specific, observable terms
Specificity of the construct helps us better communicate what we mean in scientific communication and replication
We can operationally define a concept in many different ways
Measures used in behavioral research:
Observational measures
Physiological measures
Self-report measures
Observational measures
Involve the direct observation of behavior
Researchers can either directly observe or use audio and video recordings
Ex: depression – facial affect; content analysis of speech patterns
Self-report measures
Involve people’s replies to questionnaires and interviews
Can measure:
Thoughts (cognitive self-reports)
Feelings (affective self-reports)
Actions (behavioral self-reports)
Physiological and Neuroscientific measures
Involve the measurement of internal processes that are not directly observable
Involves the use of specialized equipment to measure heart rate, brain activity, hormonal changes, and other responses
Ex: depression – laterality of EEG brain wave activity
Psychometrics
the field devoted to the study of psychological measurement
Converging operations
using several measurement approaches to measure a particular variable
Scales of Measurement definition
properties of a measure that reflect the degree to which scores obtained on that measure reflect the characteristics of real numbers
Scales of Measurement list
scales:
nominal
ordinal
interval
ratio
variable breakdown
variable –> qualitative –> nominal or ordinal
variable –> quantitative –> interval or ratio
nominal scale
the numbers that are assigned to participants’ behaviors or characteristics are essentially labels
Categorical variable
Qualitative classification
No mathematical operations
Pie chart used
Example:
Gender, marital status, blood type, favorite color, nationality
ordinal scale
involves the rank ordering of a set of scores that reflect participants’ behaviors or characteristics
The rank ordering of people’s behaviors or characteristics
The intervals between the ranks are not necessarily equal
No mathematical operations
Example:
Educational level, olympic medals, pain scale, movie reviews
interval scale
type where a smallest value does not exist, 0 is not possible, or 0 does not represent the absence of the quantity being measured
- Equal differences: the difference between any two consecutive values is the same
- No true zero point: zero does not mean “none”
Addition and subtraction are allowed, but you cannot multiply or divide
Example:
IQ scores, SAT scores, temperature (in Celsius or Fahrenheit)
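As a rough illustration of why ratios are not meaningful on an interval scale, the sketch below converts Celsius readings to Fahrenheit. The function name c_to_f and the sample temperatures are made up for this example. Equal differences stay equal after the conversion, but the ratio between two readings changes, so "twice as hot" has no fixed meaning without a true zero.

```python
def c_to_f(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Hypothetical readings, chosen only for illustration.
low_c, mid_c, high_c = 10.0, 20.0, 30.0
low_f, mid_f, high_f = c_to_f(low_c), c_to_f(mid_c), c_to_f(high_c)

# Equal intervals survive the change of units: both gaps are the same size.
gaps_equal_c = (mid_c - low_c) == (high_c - mid_c)
gaps_equal_f = (mid_f - low_f) == (high_f - mid_f)

# Ratios do not survive: 20 is "twice" 10 in Celsius but not in Fahrenheit.
ratio_c = mid_c / low_c
ratio_f = mid_f / low_f

print(gaps_equal_c, gaps_equal_f)
print(ratio_c, round(ratio_f, 2))
```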
ratio scale
involves real numbers that can be added, subtracted, multiplied, and divided; Type where 0 is the smallest meaningful value, 0 can be attained, and 0 represents absence of what is being measured
- Most advanced level of measurement: intervals are equal
- There is a true zero point: zero means zero
All mathematical operations are allowed
Example:
Weight, number of errors in a test, annual income, scores in a game
Importance of scales of measurement
Determines the amount of information provided by a particular measure
Determines the kinds of statistical analyses that can be performed on the data
Measurement Error
equation
Observed score = true score + measurement error
observed score
the score you actually obtained in your study, under your conditions
true score
the score that the participant would have obtained if the measure were perfect and were able to measure without error
measurement error
factors that distort the true score
Sources of Measurement Error
- participant transient states
- participant stable attributes
- situational factors
- characteristics of the measure
- data entry error
Participant Transient States
Temporary, unstable state of the participant
The participant may be having a bad day, be in a bad mood, be in poor health, or be tired or just anxious at the time of the measurement
Participant Stable Attributes
Enduring traits of the participant, such as illiteracy, paranoia, or oppositional personality
Situational Factors
Characteristics of the researcher or the lab, time, and conditions of the place
Characteristics of the Measure
Measurement fatigue – long, difficult, or painful measures
Data entry error
Mistakes in recording a participant’s score
Ways to Decrease Measurement Error
reliability
validity
total variance equation
Total variance = true-score variance + error variance
reliability
the consistency or dependability of the measure
the proportion of the total variance that is associated with participants’ true scores
Reliability = true-score variance / total variance
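The equations on these cards can be checked with a small worked example. This is only a sketch: the true scores and errors below are invented, and in real research the true scores are never observed directly, which is why reliability has to be estimated. The errors were chosen to be uncorrelated with the true scores, the assumption under which total variance splits cleanly into the two parts.

```python
from statistics import pvariance

# Hypothetical true scores and measurement errors for five participants.
# The errors are uncorrelated with the true scores by construction.
true_scores = [10.0, 12.0, 14.0, 16.0, 18.0]
errors = [1.0, -1.0, 0.0, -1.0, 1.0]

# Observed score = true score + measurement error
observed = [t + e for t, e in zip(true_scores, errors)]

true_var = pvariance(true_scores)
error_var = pvariance(errors)
total_var = pvariance(observed)

# Reliability = true-score variance / total variance
reliability = true_var / total_var

print(observed)
print(true_var, error_var, total_var)
print(round(reliability, 3))
```

With these numbers the true-score variance and error variance add up to the total variance, and most of the total variance is true-score variance, so reliability is high.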
Assessing Reliability
Researchers estimate reliability by assessing the extent to which two or more measurements of the same behavior, object, or event yield similar scores
Most common ways to measure reliability:
Test-retest reliability
Inter-item reliability (internal consistency)
Inter-rater reliability
Correlation Coefficients
Researchers usually use a correlation coefficient to make those estimates
Correlation coefficient – expresses the strength of the relationship between two measures
- Can range from -1.00 to +1.00
- Correlation of .00 indicates no relationship between the variables
The sign indicates whether the relationship between the variables is positive or negative (inverse)
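A minimal sketch of how a correlation coefficient can be computed by hand, assuming the Pearson coefficient is the one meant here (the card does not name a specific coefficient). The helper name pearson_r and all scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores, for illustration only.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 71, 80, 87]   # rises with hours: positive r
errors_made = [9, 7, 6, 3, 1]       # falls with hours: negative (inverse) r

r_pos = pearson_r(hours_studied, exam_score)
r_neg = pearson_r(hours_studied, errors_made)
print(round(r_pos, 2), round(r_neg, 2))
```

The magnitude of each value reflects how strong the relationship is, and the sign shows its direction.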
Test-Retest Reliability
consistency of participants’ responses on a measure
Administer measure on two separate occasions
Examine the correlation between the scores obtained on the two occasions
Correlation > 0.70 indicates acceptable reliability
Useful only if the attribute being measured should not change over time
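The test-retest procedure could be sketched as follows, assuming a Pearson correlation is used and treating 0.70 as the rule-of-thumb cutoff from the card. The scores for the two occasions are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for six participants on the same measure,
# administered on two separate occasions.
time_1 = [12, 18, 25, 31, 22, 15]
time_2 = [14, 17, 27, 29, 24, 13]

test_retest_r = pearson_r(time_1, time_2)
acceptable = test_retest_r > 0.70   # rule-of-thumb cutoff, not a strict rule

print(round(test_retest_r, 2), acceptable)
```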
Inter-item reliability
assesses the degree of consistency among the items on a scale
Tells us whether all of the items on a scale are measuring the same thing. If not, summing scores across all the items creates measurement error and lowers reliability
indices of it:
- item-total correlation
- split-half reliability
- cronbach’s alpha coefficient (α)
Item-total correlation
the correlation between a particular item and the sum of all the other items on the scale
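One way to compute item-total correlations, shown as a sketch with made-up responses (rows are participants, columns are items). The function names are hypothetical. Each item is correlated with the sum of the other items, as the definition above describes, so an item is never correlated with itself.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def item_total_correlations(responses):
    """Correlate each item with the sum of all the other items."""
    n_items = len(responses[0])
    result = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]
        result.append(pearson_r(item, rest))
    return result

# Hypothetical 4-item scale answered by five participants.
# The fourth item was invented so that it does not track the others.
responses = [
    [1, 2, 1, 4],
    [2, 2, 3, 2],
    [3, 4, 3, 5],
    [4, 4, 5, 1],
    [5, 5, 4, 3],
]

item_total = item_total_correlations(responses)
print([round(r, 2) for r in item_total])
```

In this invented data set the fourth item correlates negatively with the rest of the scale, which would flag it as a candidate for revision or removal.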
Split-half reliability
divide the items on a scale into two sets and examine the correlation between the sets
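A sketch of a split-half check, assuming an odd-even split of the items (one of several possible ways to divide a scale) and invented responses. Rows are participants and columns are items.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical 4-item scale answered by five participants.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]

# Sum the 1st and 3rd items for one half, the 2nd and 4th for the other.
odd_half = [row[0] + row[2] for row in responses]
even_half = [row[1] + row[3] for row in responses]

split_half_r = pearson_r(odd_half, even_half)
print(odd_half, even_half, round(split_half_r, 2))
```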
Cronbach’s alpha coefficient (α)
equivalent to the average of all possible split-half reliabilities
- Most frequently used
- Adequate inter-item reliability if α exceeds 0.70
- .85 – ideal
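Cronbach's alpha is usually computed from item variances rather than by averaging split halves. The sketch below uses the standard variance-based formula, α = k/(k−1) × (1 − sum of item variances / variance of total scores), which does not appear on the card. The function name and the responses are hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for rows of participants by columns of items."""
    k = len(responses[0])
    # Variance of each item across participants (sample variance).
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of each participant's total score across the scale.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 4-item scale answered by five participants.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]

alpha = cronbach_alpha(responses)
adequate = alpha > 0.70   # rule-of-thumb cutoff from the card
print(round(alpha, 2), adequate)
```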
Inter-rater Reliability
the consistency among two or more researchers who observe and record participants’ behavior
Examine the degree of agreement among two or more people who observe and record participants’ behavior
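The simplest index of agreement between two observers is the proportion of observations they coded the same way. The sketch below assumes that index and uses invented codes. Percent agreement does not correct for agreement that would occur by chance, so it is only a rough first check.

```python
# Hypothetical codes assigned by two raters to the same eight observations.
rater_a = ["smile", "frown", "smile", "neutral",
           "smile", "frown", "neutral", "smile"]
rater_b = ["smile", "frown", "neutral", "neutral",
           "smile", "frown", "smile", "smile"]

# Count the observations on which both raters assigned the same code.
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)

print(agreements, percent_agreement)
```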
Ways of increasing the reliability of behavioral measure:
Standardize administration of the measure
Clarify instructions and questions
Train observers
Minimize errors in coding and entering data
validity
how accurately the measure captures what it is supposed to measure (its goals)
the degree to which a measurement procedure actually measures what it is intended to measure rather than measuring something else (or nothing at all)
To what extent does the variability in scores on the measure reflect variability in the characteristic or behavior we are trying to assess?
ex:
Face validity, construct validity, convergent validity, discriminant validity, criterion-related validity, concurrent validity, predictive validity
validity chart
validity –> face validity and construct validity
construct validity –> convergent validity and discriminant validity and criterion-related validity
criterion-related validity –> concurrent validity and predictive validity
face validity
the extent to which a measure appears to assess what it’s supposed to capture
- Just because something has face validity doesn’t mean that it is valid
- Many measures without face validity are valid
- Some measurements are designed to lack face validity so as to disguise the purpose of the test
construct validity
the extent to which a measure of a hypothetical construct relates as it should to other measures
Hypothetical constructs
entities that cannot be directly observed but are inferred on the basis of empirical evidence
Ex:
Intelligence, motivation, self-esteem, attachment style
3 ways to Assess Construct Validity
convergent validity
discriminant validity
criterion-related validity
Convergent Validity
a measure correlates with other measures that it should correlate with
Embarrassability should be positively correlated with shyness but negatively correlated with self-confidence
discriminant validity
a measure does NOT correlate with other measures that it should not correlate with
Embarrassability should not correlate with IQ
criterion-related validity
the extent to which a measure allows us to distinguish among participants on the basis of a particular behavioral criterion
Researchers examine whether behavioral outcomes are related to scores on the measure as expected
Two Forms of Criterion-Related Validity
concurrent validity
predictive validity
concurrent validity
scores on a measure are related as expected to a criterion that is assessed at the time the measure is administered
Example: an embarrassability scale (administered today) predicts stage fright in the current situation
predictive validity
scores on a measure are related as expected to a criterion that is assessed in the future
Example: an embarrassability scale (administered today) predicts whether students sign up for public speaking classes next semester
Test bias
occurs when a particular measure is not equally valid for everyone
- The question is not whether various groups score differently on the test
- Rather, test bias is present when the validity of a measure is lower for some groups than for others