Chapter 5 Flashcards
measurement
Measurement involves thinking about how to translate abstract concepts into something we can observe consistently and accurately. No measure is perfect: no matter how precisely we attempt to measure something, it is always possible to improve on the accuracy.
level of measurement
The relationship between numerical values on a measure. There are different levels of measurement (nominal, ordinal, interval, ratio) that determine how you can treat the measure when analyzing it. For instance, it makes sense to compute the average of an interval or ratio variable, but not of a nominal or ordinal one.
nominal level of measurement
Measuring a variable by assigning a number arbitrarily in order to name it numerically so that it might be distinguished from other objects. The jersey numbers in most sports are measured at a nominal level. No ordering of the cases is implied.
ordinal level of measurement
Measuring a variable using rankings. Class rank is a variable measured at an ordinal level. Attributes can be rank-ordered, but the distances between attributes do not have any meaning.
interval level of measurement
Measuring a variable on a scale where the distance between numbers is interpretable. For instance, temperature in Fahrenheit or Celsius is measured on an interval level.
ratio level of measurement
Measuring a variable on a scale where the distance between numbers is interpretable and there is a meaningful absolute zero value. For example, weight is a ratio measurement. This means that you can construct a meaningful fraction (or ratio) with a ratio variable.
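A quick sketch, with made-up weights, of why ratios are meaningful at this level but not at the interval level:

```python
# Hypothetical weights in kilograms -- a ratio-level variable with a true zero.
light, heavy = 50.0, 100.0

# Because 0 kg means "no weight at all", this ratio is meaningful:
ratio = heavy / light
print(ratio)  # 2.0 -- 100 kg really is twice as heavy as 50 kg

# Contrast: 20 degrees C is not "twice as hot" as 10 degrees C, because
# 0 degrees C is an arbitrary zero point (interval level, not ratio).
```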
the hierarchy of levels of measurement
At lower levels of measurement, assumptions tend to be less restrictive and data analyses tend to be less sensitive. In general, it is desirable to have a higher level of measurement (such as interval or ratio) rather than a lower one (such as nominal or ordinal).
importance of level of measurement
- Knowing the level of measurement helps you decide how to interpret the data from that variable.
- Knowing the level of measurement helps you decide what statistical analysis is appropriate on the values that were assigned.
the two key criteria for evaluating the quality of measurement
Reliability and validity.
reliability
Reliability is the consistency or stability of an observation. Does the observation provide the same results each time? The foundation of reliability is based on the true score theory of measurement.
true score theory
A theory that maintains that an observed score is the sum of two components: true ability (or the true level) of the respondent; and random error. It assumes that any observation is composed of the true value plus some random error value.
true score
The true score is essentially the score that a person would have received if the score were perfectly accurate.
equation of true score theory
X = T + e
X = observed score
T = true score
e = error score, caused by different factors
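A minimal numeric sketch of the equation, with made-up values:

```python
# X = T + e for one hypothetical respondent.
T = 80.0   # true score (never directly observable)
e = -3.5   # random error on this occasion (e.g., noise, fatigue, luck)
X = T + e  # the score we actually observe
print(X)   # 76.5
```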
random error
A component or part of the value of a measure that varies entirely by chance. Random error adds noise to a measure and obscures the true value. Random error is caused by any factors that randomly affect measurement of the variable across the sample. Random errors tend to balance out on average. It adds variability but does not affect the average performance for the group.
systematic error
A component of an observed score that consistently affects the responses in the distribution. Systematic error is caused by any factors that consistently affect measurement of the variable across the sample. Systematic errors tend to be either positive or negative consistently; because of this, systematic error is sometimes considered to be bias in measurement.
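A small simulation (all numbers made up) contrasting the two error types: random error adds spread but leaves the group average near the true score, while systematic error shifts every observation in the same direction.

```python
import random

random.seed(42)
T = 75.0  # assumed true score for everyone, to keep the sketch simple

# Random error only: e varies by chance around zero.
random_only = [T + random.gauss(0, 5) for _ in range(10_000)]

# Systematic error: a constant bias of +4 affects every observation.
biased = [x + 4.0 for x in random_only]

mean_random = sum(random_only) / len(random_only)
mean_biased = sum(biased) / len(biased)
print(round(mean_random, 1))  # close to 75.0 -- random error balances out
print(round(mean_biased, 1))  # close to 79.0 -- systematic error does not
```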
reducing measurement error
- Pilot test your instruments to get feedback from your respondents regarding how easy or hard the measure was, and information about how the testing environment affected their performance.
- Train the people who collect data thoroughly so that they aren't inadvertently introducing error.
- Double-check the data thoroughly to avoid omissions or duplication.
- Use statistical procedures to adjust for measurement error.
- Use multiple measures of the same construct to enable triangulation.
triangulation
Combining multiple independent measures to get at a more accurate estimate of a variable.
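A sketch of the idea, assuming three equally noisy, independent measures of the same construct: averaging them yields a steadier estimate than any single measure alone.

```python
import random

random.seed(1)
T = 50.0  # assumed true value of the construct

# Three independent measures, each with its own random error.
measures = [[T + random.gauss(0, 4) for _ in range(1_000)] for _ in range(3)]

# Triangulated estimate: average the three measures case by case.
combined = [sum(vals) / 3 for vals in zip(*measures)]

def spread(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

# The combined measure has noticeably less random noise (roughly 1/sqrt(3)).
print(spread(measures[0]) > spread(combined))  # True
```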
reliability
In research, the term reliability means repeatability or consistency. A measure is considered reliable if it would give you the same observation over and over again. Reliability is a ratio or fraction: True level on the measure / The entire observed measure (with error included). Reliability will always range between 0 and 1. A reliability of .5 means that about half of the variance of the observed score is attributable to true score and half is attributable to error.
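The ratio can be illustrated with a toy simulation (all numbers made up): when the error variance equals the true-score variance, reliability comes out near .5.

```python
import random

random.seed(0)

# Simulated true scores, plus observed scores X = T + e.
true_scores = [random.gauss(100, 15) for _ in range(10_000)]
observed = [t + random.gauss(0, 15) for t in true_scores]  # equal error spread

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = variance of the true scores / variance of the observed measure.
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 2))  # close to 0.5
```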
variance
The variance is a measure of the spread or distribution of a set of scores. It’s the sum of the squared deviations of the scores from their mean, divided by the number of scores.
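The definition in code, on a small made-up set of scores:

```python
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(scores) / len(scores)  # 5.0
# Sum of squared deviations from the mean, divided by the number of scores:
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
print(variance)  # 4.0
```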
inter-rater or inter-observer reliability
The degree of agreement or correlation between the ratings or codings of two independent raters or observers of the same phenomenon. For a categorical measure, you can calculate the percentage of agreement between raters. Problem: percent agreement doesn't take into account the extent of agreement that would occur by chance (Cohen's Kappa adjusts for this). When the measure is a continuous one, calculate the correlation between the ratings of the two observers. Agreement can be improved through effective communication between raters.
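Percent agreement for a categorical measure is simple to compute; here with two hypothetical raters coding ten cases:

```python
# Codings of the same 10 cases by two independent raters (made-up data).
rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater_b = ["yes", "no", "no",  "yes", "no", "yes", "yes", "no", "yes", "yes"]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)
print(percent_agreement)  # 0.8 -- the raters agree on 8 of 10 cases
```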
test re-test reliability
The correlation between scores on the same test or measure at two successive time points. You estimate test-retest reliability when you administer the same test to the same (or a similar) sample on two different occasions. The shorter the time gap, the higher the correlation tends to be.
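Estimating test-retest reliability amounts to correlating the two administrations; a sketch with made-up scores for five people:

```python
# Scores for the same five people on two occasions (hypothetical data).
time1 = [10.0, 12.0, 14.0, 16.0, 18.0]
time2 = [11.0, 13.0, 13.0, 17.0, 19.0]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

retest_reliability = pearson(time1, time2)
print(round(retest_reliability, 2))  # 0.96 -- highly consistent over time
```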
parallel-forms reliability
The correlation between two versions of the same test or measure that were constructed in the same way, usually by randomly selecting items from a common test question pool.
internal consistency reliability
A correlation that assesses the degree to which items on the same multi-item instrument are interrelated. The most common forms of internal consistency reliability are the average inter-item correlation, the average item-total correlation, the split half correlation and Cronbach’s Alpha.
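As one example, Cronbach's Alpha can be computed from the item variances and the variance of the total score; a sketch using a made-up 4-person, 3-item instrument:

```python
# Hypothetical responses: 4 people x 3 items, all on the same scale.
items = [
    [3.0, 4.0, 3.0],
    [5.0, 5.0, 4.0],
    [2.0, 2.0, 3.0],
    [4.0, 5.0, 5.0],
]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

k = len(items[0])  # number of items
item_vars = [variance([row[i] for row in items]) for i in range(k)]
totals = [sum(row) for row in items]  # each person's total score

# Alpha rises as items covary more (total variance exceeds sum of item variances).
alpha = (k / (k - 1)) * (1 - sum(item_vars) / variance(totals))
print(round(alpha, 2))  # 0.91 for this toy data
```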
Cohen’s Kappa
A statistical estimate of inter-rater agreement or reliability that is more robust than percent
agreement because it adjusts for the probability that some agreement is due to
random chance.
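A sketch of the Kappa computation on made-up binary codings: the observed agreement is compared against the agreement expected by chance from each rater's marginal proportions.

```python
from collections import Counter

# Hypothetical codings of 20 cases by two raters ("y"/"n").
rater_a = ["y"] * 10 + ["y"] * 2 + ["n"] * 3 + ["n"] * 5
rater_b = ["y"] * 10 + ["n"] * 2 + ["y"] * 3 + ["n"] * 5

n = len(rater_a)
p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement

# Chance agreement expected from each rater's marginal proportions.
counts_a, counts_b = Counter(rater_a), Counter(rater_b)
p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)

kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 2))  # 0.47 -- lower than the raw 0.75 agreement
```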