Week 3 Flashcards
What are the four major levels of measurement?
Nominal; ordinal; interval; ratio
What are the two main indicators of the quality of measurement?
Reliability and validity
Define ‘level of measurement’
Level of measurement describes the relationship between numerical values on a measure.
Describe nominal level of measurement:
Measuring a variable by assigning a number arbitrarily in order to name it numerically so that it might be distinguished from other objects (e.g., players' jersey numbers)
Explain ordinal level of measurement
Measuring a variable using ranking (e.g., finishing order in a race)
Explain interval level of measurement
Measuring a variable on a scale where the distance between numbers is interpretable (e.g., temperature in degrees Celsius)
Explain ratio level of measurement
Measuring a variable on a scale where the distance between numbers is interpretable and there is an absolute zero value (e.g., weight, where zero means a complete absence of the attribute)
Why is level of measurement important?
- It helps you decide how to interpret the data from the variable
- It helps you decide what statistical analysis is appropriate on the values that were assigned.
There are two criteria for evaluating the quality of measurement. Name both and explain them.
Reliability: the consistency of measurement
Validity: The accuracy with which a theoretical construct is translated into an actual measure.
How can you infer the degree of reliability?
Does the observation provide the same results each time?
Explain true score theory:
True score theory maintains that every observable score is the sum of two components: the true ability of the respondent on that measure; and random error
What’s a ‘true score’?
Essentially, the score that a person would have received if the measurement were perfectly accurate, i.e., free of error
Why is true score theory important?
- It is a simple yet powerful model for measurement
- It is the foundation of reliability theory
- It can be used in computer simulations as the basis for generating observed scores with certain known properties (see the sketch below).
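For illustration, here is a minimal Python sketch of such a simulation of X = T + e; the means, standard deviations, and sample size are invented values, not from the text.

```python
# Minimal sketch of true score theory as a simulation: X = T + e.
# All distribution parameters below are made-up illustrative values.
import numpy as np

rng = np.random.default_rng(0)

n = 10_000
true = rng.normal(50, 10, n)    # T: the respondents' true scores
error = rng.normal(0, 5, n)     # e: random error, mean zero by assumption
observed = true + error         # X: the observed scores

# Reliability is the ratio var(T) / var(X); with these parameters the
# theoretical value is 100 / (100 + 25) = 0.8.
print(true.var(ddof=1) / observed.var(ddof=1))
```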
What if some errors are not random, but systematic?
One way to deal with this is to revise the simple true score model by dividing the error component into two subcomponents, random error and systematic error
What is ‘random error’?
Random error is a component or part of the value of a measure that varies entirely by chance.
What is ‘systematic error’?
Systematic error is a component of an observed score that consistently affects the responses in the distribution.
What’s the difference between random error and systematic error?
Unlike random error, systematic errors tend to be either positive or negative consistently; because of this, systematic error is sometimes considered to be bias in measurement
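A small sketch (with invented numbers) makes the contrast concrete under the revised model X = T + e_random + e_systematic: random error widens the spread of scores but averages out, while a systematic error shifts every score the same way.

```python
# Contrast of error types under the revised true score model:
# X = T + e_random + e_systematic. All values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
true = rng.normal(50, 10, 10_000)

random_only = true + rng.normal(0, 5, true.size)  # random error only
with_bias = random_only + 3.0                     # plus a constant +3 bias

print(true.mean(), random_only.mean())  # both ~50: random error averages out
print(with_bias.mean())                 # ~53: systematic error shifts the mean
```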
How can you reduce measurement errors?
- Pilot test your instruments and get feedback from respondents
- Train the interviewers or observers thoroughly
- Double-check the data for your study thoroughly
- Use statistical procedures to adjust for measurement error (see the attenuation sketch after this list)
- Use multiple measures of the same construct
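One classic example of such a statistical adjustment is the correction for attenuation, which divides an observed correlation by the square root of the product of the two measures' reliabilities. The sketch below uses invented numbers purely to show the arithmetic.

```python
# A sketch of one such statistical adjustment: the classic correction for
# attenuation, r_true = r_xy / sqrt(rel_x * rel_y). Numbers are invented.
def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Estimate the correlation between true scores from observed quantities."""
    return r_xy / (rel_x * rel_y) ** 0.5

# An observed correlation of .42 between two measures with reliabilities
# of .80 and .70 corresponds to a corrected correlation of about .56.
print(round(correct_for_attenuation(0.42, 0.80, 0.70), 3))
```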
Name and explain the four types of reliability
1. Inter-rater or inter-observer reliability is used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
2. Test-retest reliability is used to assess the consistency of an observation from one time to another (see the sketch after this list).
3. Parallel-forms reliability is used to assess the consistency of the results of two tests constructed in the same way from the same content domain.
4. Internal consistency reliability is used to assess the consistency of results across items within a test.
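As a concrete illustration of type 2, test-retest reliability is commonly estimated as the Pearson correlation between two administrations of the same measure; the scores below are fabricated.

```python
# Sketch of a test-retest reliability estimate: the Pearson correlation
# between two administrations of the same measure. Scores are fabricated.
import numpy as np

time1 = np.array([12, 15, 9, 20, 17, 11, 14, 18])   # first administration
time2 = np.array([13, 14, 10, 19, 18, 12, 13, 17])  # same people, retested

print(round(np.corrcoef(time1, time2)[0, 1], 3))
```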
Explain Cohen’s Kappa
Cohen’s Kappa was introduced to avoid the problem of chance agreement: it is a statistical estimate of inter-rater reliability that is more robust than percent agreement because it adjusts for the probability that some agreement is due to random chance.
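A minimal hand-rolled sketch, with made-up ratings, shows the formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
# Hand-rolled Cohen's Kappa for two raters; the ratings are invented.
from collections import Counter

rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]

n = len(rater_a)
p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement

# Expected chance agreement: the probability that both raters independently
# pick the same category, summed over categories.
count_a, count_b = Counter(rater_a), Counter(rater_b)
p_e = sum((count_a[c] / n) * (count_b[c] / n) for c in count_a | count_b)

kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 3))  # 0.5 for these ratings (p_o = .75, p_e = .5)
```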
Explain Cronbach’s Alpha
Cronbach’s Alpha takes all possible split halves into account: it is mathematically equivalent to the average of all possible split-half estimates.
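In its usual variance form, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A small sketch with an invented score matrix:

```python
# Sketch of Cronbach's Alpha in its variance form; the 5x4 score matrix
# (rows = respondents, columns = items) is invented for illustration.
import numpy as np

items = np.array([
    [3, 4, 3, 4],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
])

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)       # variance of each item
total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale

alpha = k / (k - 1) * (1 - item_vars.sum() / total_var)
print(round(alpha, 3))
```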
There are 4 different internal consistency measures. Name them and explain them shortly
- Average inter-item correlation uses all of the items on your instrument that are designed to measure the same construct.
- The average item-total correlation involves computing a total score across the set of items on a measure and treating that total score as though it were another item, thereby obtaining all of the item-to-total score correlations.
- In split-half reliability, you randomly divide all items that measure the same construct into two sets; the estimate is the correlation between the total scores of the two randomly selected halves of the same multi-item test or measure (see the sketch after this list).
- Cronbach’s Alpha
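A combined sketch of the first and third estimates on a single simulated item matrix (all data generated, not real); the split-half correlation is stepped up to full test length with the Spearman-Brown formula 2r / (1 + r).

```python
# Average inter-item correlation and split-half reliability on simulated
# items that share a common true score; all data are generated, not real.
import numpy as np

rng = np.random.default_rng(2)
true = rng.normal(0, 1, 200)
items = np.column_stack([true + rng.normal(0, 1, 200) for _ in range(6)])

# Average inter-item correlation: mean of the off-diagonal correlations.
corr = np.corrcoef(items, rowvar=False)
k = corr.shape[0]
avg_r = (corr.sum() - k) / (k * (k - 1))

# Split-half: correlate the totals of two random halves of the items, then
# apply Spearman-Brown (2r / (1 + r)) to estimate full-length reliability.
order = rng.permutation(k)
r_half = np.corrcoef(items[:, order[:k // 2]].sum(axis=1),
                     items[:, order[k // 2:]].sum(axis=1))[0, 1]
print(round(avg_r, 3), round(2 * r_half / (1 + r_half), 3))
```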
Define Construct validity:
Overarching category of validity that contributes to the quality of measurement, with all of the other measurement validity labels falling beneath it.
Construct validity is an assessment of how well your actual programs or measures reflect your ideas or theories.
Construct validity is all about representation, and it can be viewed as a truth-in-labeling issue.
There are various validity types, name them.
- Construct validity
- Translation validity
- Face validity
- Content validity
- Criterion-related validity
- Predictive validity
- Concurrent validity
- Convergent validity and discriminant validity
Shortly explain construct validity
The approximate truth of the conclusion or inference that your operationalization accurately reflects its construct
Shortly explain translation validity
Focuses on whether the operationalization is a good translation of the construct
Shortly explain face validity
Face validity checks whether, on its face, the operationalization seems like a good translation of the construct.
Explain content validity
In content validity, you check the operationalization against the relevant content domain for the construct.
Explain criterion-related validity
Examines whether the operationalization or the implementation of the construct performs the way it should according to some criterion
Explain predictive validity
Based on the idea that your measure is able to predict what it theoretically should be able to predict
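In practice this is often summarized as a validity coefficient: the correlation between scores on the measure and a later criterion it should predict. The sketch below uses fabricated numbers (e.g., an entrance test and later GPA).

```python
# Sketch of a predictive validity coefficient; both arrays are fabricated
# (e.g., an entrance test administered now, grade-point average a year later).
import numpy as np

test_scores = np.array([55, 62, 47, 70, 66, 51, 59, 73])
later_gpa = np.array([2.8, 3.1, 2.4, 3.6, 3.2, 2.6, 3.0, 3.7])

print(round(np.corrcoef(test_scores, later_gpa)[0, 1], 3))
```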
Explain concurrent validity
About an operationalization’s ability to distinguish between groups that it should theoretically be able to distinguish between