1.10 Measurement Flashcards
#Measurement Process. Specification of observable referents for concepts of interest?
Concept/Construct (Conceptual definition) -> Variable (Operational definition)
Conceptual Definitions?
Self-esteem: A person’s overall evaluation of his/her own worth, value, or importance.
Social support: Aid, assistance, or support that is offered in a social relationship and intended to be helpful.
Operational Definitions?
-Procedures for assigning cases to values/categories of variables.
-Specifies the activities needed to measure the variable.
+How the data will be obtained
+What questions will be asked or observations will be made
+What the response categories are
+Any other instructions needed
Approaches to Operationalizing Measured Variables?
- Self-reports
- Observations
- Archival records
Composite Measures?
Measures that combine several indicators into a single index or scale.
Example: Rosenberg Self-Esteem Scale
Self-Esteem?
- Conceptual definition: A person’s overall evaluation of his/her own worth, value, or importance.
- Operational definition: Respondent’s self-report of self-esteem based on the Rosenberg Self-Esteem Scale (1965)
#Variables/Levels of Measurement?
- Nominal
- Ordinal
- Interval
- Ratio
Nominal?ex
- Values include two or more non-overlapping (mutually exclusive) and exhaustive categories that have no mathematical relation to each other.
- Ex: gender; race/ethnicity; college major
Ordinal?ex
- Values indicate a rank ordering of the categories.
- Ex: social class (low, middle, high); self-rated health status (very good, good, bad, very bad)
Interval?ex
- Values indicate categories that are ordered and separated by equal intervals, but there is no true zero.
- Ex: temperature (Fahrenheit); IQ
Ratio?ex
- Same properties as interval and there is an absolute (non-arbitrary) zero. All mathematical operations are possible.
- Ex: number of years of schooling; income (dollars)
Measurement Error?
Observed Value = True Value + Error
Observed Value = True Value + Systematic Error + Random Error
Kinds of Measurement Error?
- Systematic error
- Random error
Systematic error?ex
- Due to recurring, systematic factors.
- Error tends to lean in one direction.
- Causes systematic distortion (bias) in measurement
- Example: social desirability bias is the tendency of respondents to answer questions in a manner that will be viewed favorably by others. It can take the form of over-reporting good behavior or under-reporting bad behavior.
Random error?ex
- Due to chance factors
- Unrelated to true differences in the concept being measured.
- Error goes in all directions.
- Presence, direction, and extent are unpredictable
- Example: ambiguous items, fatigue
Judging the Adequacy/Goodness of Measurement?ex
-Reliability
Does the operational definition measure something with consistency and stability?
Is X always X?
-Validity
Does the operational definition measure what it is supposed to measure?
Is X really X?
Effect of Random and Systematic Error on Reliability and Validity?
- Random error reduces the reliability of measures and, because unreliable measures are not valid, their validity as well.
- Systematic error affects the validity of measures, but not the reliability.
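The two claims above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true scores, the constant bias of 5 points, and the noise level are all made-up values, and `administer` and `pearson` are hypothetical helper names, not part of any standard instrument.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

def mean(values):
    return sum(values) / len(values)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_scores = [rng.gauss(50, 10) for _ in range(2000)]  # assumed true values

def administer(bias, noise_sd):
    """Observed value = true value + systematic error + random error."""
    return [t + bias + rng.gauss(0, noise_sd) for t in true_scores]

# Systematic error only: every score is pushed up by the same amount.
biased_1 = administer(bias=5, noise_sd=0)
biased_2 = administer(bias=5, noise_sd=0)

# Random error only: scores scatter in both directions around the truth.
noisy_1 = administer(bias=0, noise_sd=10)
noisy_2 = administer(bias=0, noise_sd=10)

biased_retest_r = pearson(biased_1, biased_2)         # consistency across occasions
biased_shift = mean(biased_1) - mean(true_scores)     # average distortion
noisy_retest_r = pearson(noisy_1, noisy_2)
noisy_shift = mean(noisy_1) - mean(true_scores)

print("systematic: retest r =", round(biased_retest_r, 3), "shift =", round(biased_shift, 2))
print("random:     retest r =", round(noisy_retest_r, 3), "shift =", round(noisy_shift, 2))
```

In this sketch the biased measure is perfectly consistent across the two administrations yet wrong by a constant amount (reliable but not valid), while the noisy measure is right on average but inconsistent from one administration to the next.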
Linkage between Reliability and Validity?
- A measure can be reliable and valid.
- A measure can be reliable but not valid.
- A measure cannot be unreliable and valid.
- Reliability is a necessary but not sufficient criterion for validity.
Reliability Assessment?
-Stability: Test-retest reliability
-Equivalence (for multi-item scale measures)
Split-half reliability
Internal consistency reliability
-Equivalence (for multiple raters/coders/observers): Intercoder reliability
Reliability Assessment?
For each type: what is assessed, how it is assessed, and an example.
- Test-Retest Reliability
- Split-Half Reliability
- Internal Consistency Reliability
- Intercoder Reliability
Test-Retest Reliability? assess?ex
- Stability: Does a measure provide the same scores when administered on 2 separate occasions to the same group of respondents?
- Compute correlation coefficient for 2 sets of scores
- Administer a questionnaire measuring parenting style to a group of parents, then re-administer it one month later. Compute the correlation coefficient between the two sets of scores.
Split-Half Reliability?assess?ex
- Equivalence: Do 2 halves of a multi-item scale provide similar scores?
- Compute the correlation coefficient between scores on the 2 halves (items randomly assigned to each half)
- Administer a 30-item scale measuring parenting style, randomly divide into 2 15-item scales, then compute the correlation between the scales.
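A minimal sketch of the split-half procedure follows. The response data are simulated (each item is an assumed latent trait plus noise), and the sample size, seed, and helper name `pearson` are illustrative assumptions rather than anything from a real parenting-style scale.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

rng = random.Random(42)  # fixed seed so the random split is repeatable
N_RESPONDENTS, N_ITEMS = 200, 30

# Hypothetical data: one row per respondent, one column per item.
responses = []
for _ in range(N_RESPONDENTS):
    trait = rng.gauss(0, 1)  # the respondent's (unobserved) true level
    responses.append([trait + rng.gauss(0, 1) for _ in range(N_ITEMS)])

# Randomly divide the 30 items into two 15-item halves.
item_ids = list(range(N_ITEMS))
rng.shuffle(item_ids)
half_a, half_b = item_ids[:15], item_ids[15:]

# Score each respondent on each half, then correlate the two half scores.
scores_a = [sum(row[i] for i in half_a) for row in responses]
scores_b = [sum(row[i] for i in half_b) for row in responses]
split_half_r = pearson(scores_a, scores_b)

print("split-half correlation:", round(split_half_r, 3))
```

Because every simulated item taps the same trait, the two halves correlate highly; with real data a low correlation would suggest the halves are not equivalent.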
Internal Consistency Reliability?assess?ex
- Equivalence: To what extent are the items in a multi-item scale homogeneous (i.e., measuring the same concept)?
- Compute coefficient alpha (e.g., Cronbach’s alpha)
- Administer a multi-item scale measuring parenting style and compute Cronbach’s alpha.
Intercoder Reliability?assess?ex
- Equivalence: To what extent do 2 coders provide the same scores when using the same instrument or measure?
- Compute percentage agreement (or kappa coefficients) between each pair of observers.
- Have social workers observe and code the same parent-child interaction and compare their agreement in coding the interaction.
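Percentage agreement and a kappa coefficient can be sketched for one pair of coders as below. Cohen's kappa is used here as one common kappa coefficient; the codes ("warm", "neutral", "harsh") and the ten coded segments are invented for illustration.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of units on which the two coders assigned the same code."""
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance.

    Undefined (division by zero) when chance agreement equals 1,
    e.g. when both coders use a single category throughout.
    """
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    categories = set(coder_a) | set(coder_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes for the same 10 segments of a parent-child interaction.
coder_1 = ["warm", "warm", "neutral", "harsh", "warm",
           "neutral", "harsh", "warm", "neutral", "warm"]
coder_2 = ["warm", "neutral", "neutral", "harsh", "warm",
           "neutral", "warm", "warm", "neutral", "warm"]

agreement = percent_agreement(coder_1, coder_2)
kappa = cohens_kappa(coder_1, coder_2)
print(round(agreement, 3))  # 0.8
print(round(kappa, 3))      # 0.672
```

Kappa is lower than raw agreement because some matches would occur by chance alone, given how often each coder uses each category.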
Cronbach’s alpha?
Cronbach’s alpha can be written as a function of the number of test items and the average inter-correlation among the items.
Cronbach’s alpha formula?
- alpha = (K * r-bar) / (1 + (K - 1) * r-bar), where K is the number of items and r-bar is the average inter-item correlation.
- Holding r-bar constant (and positive), increasing the number of items increases Cronbach’s alpha.
- Holding K constant, Cronbach’s alpha increases as the average inter-item correlation increases.
- Cronbach’s alpha reflects how consistently respondents answer the different items.
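The relationship between K, r-bar, and alpha can be checked numerically. The sketch below uses the standardized form of the formula, which depends only on those two quantities; the function name `standardized_alpha` and the example values are illustrative assumptions.

```python
def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha: (K * r_bar) / (1 + (K - 1) * r_bar).

    k      -- number of items (at least 2)
    mean_r -- average inter-item correlation
    """
    if k < 2:
        raise ValueError("alpha needs at least two items")
    return (k * mean_r) / (1 + (k - 1) * mean_r)

alpha_5_items = standardized_alpha(5, 0.3)
alpha_10_items = standardized_alpha(10, 0.3)      # more items, same r-bar
alpha_10_higher_r = standardized_alpha(10, 0.5)   # same items, higher r-bar

print(round(alpha_5_items, 3))      # 0.682
print(round(alpha_10_items, 3))     # 0.811
print(round(alpha_10_higher_r, 3))  # 0.909
```

Doubling the number of items from 5 to 10 at the same average correlation raises alpha, and raising the average correlation at a fixed number of items raises it further.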
Improving Reliability?
-Make sure the measure is clearly understood
+Preliminary interviews
+Pretesting
+Focus groups
-Check that the instructions to respondents/interviewers are clear.
-In multi-item scales, assess and remove items that do not hang together with other items.
-Add more items to a multi-item scale.
Validity Assessment?
-Face Validity
Judgment that a measure appears to measure what it is intended to measure.
-Content Validity
Judgment that a measure’s items cover the universe of things that represent that content.
-Criterion Validity
Concerned with the ability of an index measure to predict a criterion measure.
-Construct Validity
Concerned with the theoretical relationships between a measure and other measures.
Content Validity Concept: Parenting Style
Should include all of the following dimensions:
- Warmth
- Involvement
- Discipline
- Expectations
- Monitoring
Criterion Validity?
- One may wish to devise measures that will identify children with learning disabilities or determine a person’s ability to fly an airplane or drive a car.
- The trait or behavior is called a criterion, and validation is a matter of how well scores on the measure correlate with the criterion of interest
#Criterion Validity? Measure -> Criterion
-SAT test -> College performance
-Driving practices test -> # Tickets received over 5 years
-Musical ability -> Foreign language aptitude
Criterion Validity? characteristics
- Practical use
- Problems exist
Criterion Validity? explain
-Practical use: The interest is in whether the measure can predict the criterion, not in what the measure means or why it is related to the criterion.
-So if accuracy in gun shooting correlated with success in college, then gun shooting would be a valid measure for predicting success in college.
-Problems exist.
+By what standards do you choose the criterion?
+What if no reasonable criterion exists?
+What if the criterion exists but practical problems prevent using it?
Construct Validity?
- Interested in the meaning of the concepts being measured
- The meaning of any concept is implied by its theoretical relations to other concepts
- The validation process begins by examining the theory underlying the concept being measured
Construct Validity? 4
- Correlations with related variables
- Convergent validity
- Discriminant validity
- Known-groups validity
Construct Validity
Example: Aggression
- The concept of aggression generally implies destructive or punitive behavior directed toward other persons or objects
- The self-reported measure of aggression includes 6 items on a 5-point scale ranging from strongly disagree (0) to strongly agree (4). The total score ranges from 0 to 24, with higher values indicating greater aggression. Sample items are:
(a) Whoever insults me or my family is asking for a fight.
(b) I can think of no good reason for ever hitting anyone.
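Scoring such a scale can be sketched as follows. One assumption is added here that the source does not state: because item (b) is worded against aggression, it is treated as reverse-keyed so that higher totals still mean greater aggression. The item positions and the `REVERSE_KEYED` set are hypothetical.

```python
N_ITEMS = 6
MAX_POINT = 4          # 0 = strongly disagree ... 4 = strongly agree
REVERSE_KEYED = {1}    # assumption: position of an item worded like (b)

def aggression_score(responses):
    """Sum 6 responses (each 0-4) into a 0-24 total, flipping reverse-keyed items."""
    if len(responses) != N_ITEMS:
        raise ValueError("expected one response per item")
    if any(not 0 <= r <= MAX_POINT for r in responses):
        raise ValueError("responses must be on the 0-4 scale")
    return sum(
        MAX_POINT - r if i in REVERSE_KEYED else r
        for i, r in enumerate(responses)
    )

most_aggressive = aggression_score([4, 0, 4, 4, 4, 4])
least_aggressive = aggression_score([0, 4, 0, 0, 0, 0])
midpoint = aggression_score([2, 2, 2, 2, 2, 2])
print(most_aggressive, least_aggressive, midpoint)  # 24 0 12
```

Without the reverse-keying step, strongly agreeing with a statement like (b) would raise the total even though it expresses low aggression.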
1.Correlations with related variables?
The variable should correlate (+/-) with other theoretically related variables. Aggression?
Social competence: -.78
Peer rejection: .69
Association with deviant peers: .72
2.Convergent Validity
The variable should be correlated with other measures of the same construct measured in different ways. Aggression?
Teacher reports: .80
Parent reports: .75
Observation measure of dyads engaged in task: .78
3.Discriminant Validity?
The variable should have low or moderate correlations with variables from which it should theoretically differ. Aggression?
Assertiveness: .50
Anxiety: .40
Other examples: Liking vs. love
4. Known-Groups Validity?
Responses to the variable should differ as expected when tested on 2 groups that should differ in their responses based on what is known about them. Aggression (Mean)*?
Juvenile offenders: 22
Students assigned to detention: 18
Students in regular classroom: 9
*Range = 0-24; higher scores mean more aggressive behavior