Chapter 5: Identifying Good Measurement Flashcards

Question

Correlation coefficient (r)

Answer 1

A single number that indicates how close the dots are to the line of best fit on a scatterplot. The sign of r (positive or negative) indicates the direction of the slope. The value of r always falls between -1 and 1. The closer r is to -1 or 1, the stronger the relationship. If there is no relationship, r will be .00 or close to it.
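
As a rough illustration of how r behaves, here is a minimal sketch that computes the Pearson correlation by hand. The function name `pearson_r` and the data are made up for this example and do not come from the chapter.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of each point's deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: dots that fall exactly on a rising or a falling line
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]
score_down = [10, 8, 6, 4, 2]

r_up = pearson_r(hours, score_up)      # perfect positive relationship
r_down = pearson_r(hours, score_down)  # perfect negative relationship
```

When the dots are scattered more loosely around the line, the same function returns a value nearer to zero.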

Answer 2

The direction of the slope of the line of best fit. It can be positive, negative, or zero

Answer 3

The relationship between variables is considered to be strong when the dots in a scatterplot are close to the line

Answer 4

To assess this, we would measure the same set of participants on that measure at least twice. We would record each person’s score at Time 1 and Time 2 (around 2 months apart) and calculate r. If r is positive and strong (.5 or higher), test-retest reliability is good. If r is positive but weak, we know that participants’ scores changed between Time 1 and Time 2.

Answer 5

A low r is a sign of poor reliability only if we are measuring something that should stay the same over time. IQ, for example, should stay the same over a span of two months. For something like seasonal stress, r will be low because the construct itself changes over time, so a low r there does not signal poor reliability.

Answer 6

To test this, we would ask two observers to rate the same participants at the same time and then compute r. If r is positive and strong (.70 or higher), reliability is good. If r is positive but weak, reliability is low. A negative correlation is rare but would indicate a problem with the observers.

Answer 7

The correlation coefficient r can be used to evaluate interrater reliability when the observers are rating a quantitative variable. A statistic called kappa is more appropriate when observers are rating a categorical variable. A kappa close to 1 means that the raters agree.
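
The card does not say which kappa statistic is meant. The sketch below assumes Cohen's kappa for two raters, which compares the observed agreement with the agreement expected by chance. The function name and the ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same category
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_e = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of ten participants by two observers (they disagree on the last two)
rater_a = ["happy", "happy", "sad", "sad", "happy", "sad", "happy", "sad", "happy", "sad"]
rater_b = ["happy", "happy", "sad", "sad", "happy", "sad", "happy", "sad", "sad", "happy"]

kappa = cohens_kappa(rater_a, rater_b)
```

Here the raters agree on 8 of 10 cases and chance agreement is .50, so kappa works out to about .60. Identical ratings would give a kappa of 1.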

Answer 8

Internal reliability is relevant for measures that use multiple items or observations to get at the same construct. If a scale has 5 items that ask roughly the same thing in different words, a participant should answer all of the items consistently.

Answer 9

Researchers ask the participants to answer all of the items. Then they compute the correlation between every item and every other item. The average inter-item correlation (AIC) is the average of all of these correlations; an AIC from .15 to .50 means that the items go well together. Researchers also compute Cronbach’s alpha, which mathematically combines the AIC and the number of items in the scale. The closer alpha is to 1, the better the scale’s reliability.
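
One common way to combine the AIC and the number of items is the standardized form of Cronbach's alpha, alpha = k * AIC / (1 + (k - 1) * AIC), where k is the number of items. The card does not give a formula, so treat this as one possible reading. The helper names and item scores below are hypothetical.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_inter_item_correlation(items):
    """Mean of the correlations between every pair of items."""
    pairs = list(combinations(items, 2))
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)

def standardized_alpha(aic, k):
    """Standardized Cronbach's alpha from the AIC and the number of items k."""
    return (k * aic) / (1 + (k - 1) * aic)

# Hypothetical scale: three items, each answered by the same five participants
item1 = [1, 2, 3, 4, 5]
item2 = [2, 1, 4, 3, 5]
item3 = [1, 3, 2, 5, 4]
items = [item1, item2, item3]

aic = average_inter_item_correlation(items)
alpha = standardized_alpha(aic, len(items))
```

With the AIC held constant, adding items raises alpha, which is why the number of items enters the formula.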

Answer 10

How well a measure captures the conceptual variable it was intended to measure

Answer 11

Validity and reliability are separate concepts. For example, a bathroom scale might tell an adult that they weigh 50 pounds every time they step on it. The scale is reliable (consistent) but not valid (the measurement isn’t accurate). Reliability is necessary for validity: a measure can be less valid than it is reliable, but it can’t be more valid than it is reliable. If a measure doesn’t correlate with itself, how can it be more strongly associated with some other variable?

Answer 12

Face and content validity

Answer 13

Criterion, convergent, and discriminant validity

Answer 14

Abstract concepts include happiness, intelligence, stress, and self-esteem. There is no way to measure directly how happy someone is, although we can estimate it in multiple ways. We can learn whether an operationalization is measuring our construct by collecting a variety of data and evaluating it in light of our theory about the construct.

Answer 15

A measure has face validity if it is subjectively considered to be a plausible operationalization of the conceptual variable in question. Measures with this validity align well with the conceptual definition. For example, head circumference would have high face validity as a measure of hat size but low face validity as a measure of intelligence.

Answer 16

A measure must capture all parts of a defined construct. For example, a conceptual definition of intelligence could be the ability to “reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience”. To have adequate content validity, an operationalization of intelligence should include questions or items to assess each of these 7 components.

Answer 17

Evaluates whether the measure under consideration is associated with a concrete behavioral outcome that it should be associated with, according to the conceptual definition. Criterion validity is important for self-report measures because the correlation can indicate how well people’s self-reports predict their actual behavior.

Answer 18

1. Correlational evidence for criterion validity
2. Known-groups evidence for criterion validity

Answer 19

For example, a sales company is choosing between aptitude test A and aptitude test B. Both have face and content validity, but do they correlate with the key behavior, success in sales? The company can collect data to find out how well each aptitude test correlates with sales success. Both tests are given to all current sales representatives, and each representative’s number of sales is recorded. Two scatterplots are made: one for aptitude test A and sales, and one for aptitude test B and sales. In this example, aptitude test A has the stronger correlation, so we can conclude that test A has better criterion validity as a measure of selling ability.

Answer 20

Another way to gather evidence for criterion validity, in which researchers see whether scores on the measure can discriminate among two or more groups whose behavior is already confirmed. For example, to validate salivary cortisol as a measure of stress, a researcher could compare the salivary cortisol levels of two groups of people: those who are about to give a speech in front of a classroom and those who are in the audience. If salivary cortisol is a valid measure of stress, people in the stress group (public speaking) should have higher cortisol levels than those in the audience.
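
The known-groups logic can be sketched as a simple comparison of group means. The cortisol readings below are invented and in arbitrary units, and the function name is hypothetical. A real study would also test whether the difference is statistically reliable.

```python
from statistics import mean

# Hypothetical salivary cortisol readings (arbitrary units, made-up values)
speakers = [18.2, 21.5, 19.8, 23.1, 20.4]   # about to give a speech
audience = [11.3, 12.8, 10.9, 13.5, 12.0]   # sitting in the audience

def known_groups_difference(stressed, unstressed):
    """Mean score of the stressed group minus mean score of the unstressed group."""
    return mean(stressed) - mean(unstressed)

difference = known_groups_difference(speakers, audience)

# A positive difference is the pattern expected if cortisol is a valid measure of stress
supports_validity = difference > 0
```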

Answer 21

An example is the Beck Depression Inventory (BDI). This is a 21-item self-report scale on which participants circle one of 4 choices for each item. The scores are added to get a total from 0 to 63. Participants completed the inventory. Then psychiatrists conducted clinical interviews to determine whether each person was depressed and, if so, how severely. The average BDI score of the known group of depressed people was higher than the average score of the known group of people who were not depressed. BDI scores also correlated with the level of depression.

Answer 22

If the BDI really quantifies depression, it should be correlated with other self-report measures of depression. A strong positive correlation between the 2 scores provides evidence for the convergent validity of the BDI. Convergent validity evidence can also come from measures of similar constructs, not just the same one. BDI scores were also strongly negatively correlated with a score quantifying psychological well-being. The strong negative correlation makes sense because people who are depressed are also expected to have lower levels of well-being.

Answer 23

The BDI should not correlate strongly with measures of constructs that are very different from depression; it should show discriminant validity with them. We would not expect the BDI to be strongly correlated with a measure of perceived physical health problems, for example. We would expect the BDI to be much more strongly correlated with similar constructs than with constructs that aren’t similar. For example, many developmental disorders have similar symptoms, and we wouldn’t want a screening instrument to diagnose a child with autism when they actually have a speech delay. It’s not necessary to establish discriminant validity with random other variables. We want to focus on other variables that are “near neighbors” of the one being evaluated.

Answer 24

Convergent validity and discriminant validity: the patterns of correlations with measures of theoretically similar and dissimilar constructs. Convergent and discriminant validity are usually evaluated together, as a pattern of correlations among self-report measures. A measurement should have higher correlations (higher r values) with similar traits (convergent validity) than it does with dissimilar traits (discriminant validity).
