Ch5 Good Measurement Flashcards
Conceptual variables
(Constructs) abstract theoretical concepts a researcher intends to study
Operational variable
How a variable will be measured or manipulated
How to operationalize conceptual variables
First define construct of interest
Then create an operational definition-
think about how you could quantify the construct / turn it into a #
3 types of measures
1) self report
2) observational measure
3) physiological measure
Self-report
+ can children do self reports?
Variable operationalized by recording people’s answers to questions about themselves in a questionnaire or interview
In research with children, self-reports are sometimes replaced with parent or teacher reports
Example of self-report
Diener’s 5 item scale
+ ladder of life
Both self report measures of life satisfaction
How did Ed Diener operationalize “subjective well-being”
+ how did most people score?
created a 5-item questionnaire about life satisfaction, rated on a 7-point scale
(1 = strongly disagree, 7 = strongly agree)
most people scored above 20 (the neutral point)
observational measures
(also called behavioral measures)
operationalizing a variable by recording observable behaviors or physical traces of a behavior
example of observational measure
happiness- how often someone smiles
allergies- how often someone sneezes
can intelligence be observationally measured?
intelligence tests can be considered observational measures because the people who administer the test in person are observing intelligent behaviors (such as solving a puzzle)
physiological measure
operationalizing a variable by recording biological data
often requires equipment to amplify, record, analyze biological data
examples of physiological measure
brain activity
heart rate
fMRI brain scanning during wins vs. losses at Rock Paper Scissors
which operationalization is best?
a single construct can be operationalized in any of the 3 ways; there is no single best one
what matters is that the different ways of measuring show similar patterns of results
which type of measure is mistakenly considered most accurate?
physiological, but it has to be corroborated with other measures
example of corroborating physiological measures with other measures
- to use fMRI scans to study intelligence and brain efficiency, participants’ intelligence first had to be established via an IQ test (a behavioral measure)
- fMRI as a measure of happiness could only work by first asking participants how happy they feel (self-report)
how many levels must variables have
at least two, to allow for change
how can levels of operational variables be coded?
using different scales of measurement
categorical variables (nominal variables)
levels of the variables are qualitatively distinct categories
(categorized by name only)
researchers may number the levels for data entry (1 = male, 2 = female), but the numbers have no quantitative meaning (1 isn’t higher than 2)
Quantitative variables
levels are coded with meaningful #’s
example of categorical variables
sex, species
example of quantitative variables
height, weight, IQ scores
does Diener’s scale of subjective well-being use categorical or quantitative variables? why?
quantitative, because the numbers have meaning- a score of 35 is higher than 7
types of quantitative variables
- ordinal scale
- interval scale
- ratio scale
ordinal scale
#’s represent a ranked order, with unequal intervals between levels
example of ordinal scale
places in a race- 1st is faster than 2nd, but we don’t know by how much
interval scale
#’s represent equal distances between levels + there’s no true zero point (zero doesn’t mean ‘nothing’- 0° does not mean no temperature)
what kind of scale do most questionnaires use?
including Diener’s SWB
interval scales
example of interval scale
IQ scores (100 to 105 is the same distance as 105 to 110)
since there’s no true zero in interval scales, what can’t researchers say about them that can be said about ratio scales?
can’t say that something is “twice” or “three times” as much as something else
ratio scale
#’s represent equal intervals and there IS a true zero point (zero means “none”)
example of ratio scale
height, distance traveled
exam scores (because zero means “nothing correct”)
reliability
how consistent results/ scores of a measure are
validity
is the operationalization measuring what it’s supposed to? - how accurate is it?
types of reliability
- test-retest reliability
- interrater reliability
- internal reliability
what do researchers do before deciding on a measure? Why?
they collect (or review others’) data before deciding how to operationalize something
in order to see if it is reliable- that it will yield consistent patterns of results
test-retest reliability
refers to whether scores are consistent every time the measure is used (time 1, time 2)
example of test-retest reliability
IQ tests should have similar results at beginning (time 1) and end (time 2) of a semester
what kind of operationalizations can test-retest reliability apply to?
self-report, observational, and physiological measures
when is test-retest reliability most relevant?
when a construct is expected to be relatively stable- it’s not expected to change over time
example of when test-retest reliability is NOT relevant
happy mood– expected to change over time
interrater reliability
refers to consistency of scores no matter who is measuring the variable
example of interrater reliability
two observers measure and record how often a child smiles during an hour- results should be consistent
internal reliability (internal consistency)
pattern of answers in self-report should be consistent no matter how a question is phrased
for what kind of measure is internal reliability relevant?
self-report scales with multiple items only
example of internal reliability
in Diener’s scale, all the differently phrased questions measure the same construct, so answers to them should be consistent
statistical devices for data analysis
- scatterplots
- correlation coefficient r
what kind of claim is evidence for reliability an example of?
association claim- of one time with another, one coder with another, one version of a question and another
how are correlations used to document reliability?
use head circumference to explain
test-retest: measure head twice, two different times
interrater: have two different people measure
measurements should be the same/similar with some measurement error
(internal reliability doesn’t apply- head circumference isn’t a multi-item self-report)
what does interrater agreement look like on a scatterplot?
an upward or downward slope, with points clustered close to the line drawn through them
what does interrater disagreement look like on a scatterplot?
points are scattered farther from the line
the correlation coefficient “r”
a single # that describes how close dots on a scatterplot are to a line drawn through them
in what ways can scatterplots differ?
- slope direction (negative, positive, or zero slope)
- strength of relationship (dots lying closer to the line indicate a stronger relationship)
how does the slope act when “r” is positive? negative?
when slope is positive, r is positive
when slope is negative, r is negative
what is the range of the value of “r”
falls between 1.0 and -1.0
what is the value of r when the relationship is strong? weak?
when relationship is strong, r is close to 1.0 or -1.0
1.0 indicates the strongest possible positive relationship
-1.0 indicates the strongest possible negative relationship
when relationship is weak, r is close to 0
using “r” in test-retest reliability- how can you tell if test-retest reliability is good or poor?
measure same participants twice, then compute value of r
if value for r is strong and positive (.5 or above) then test-retest reliability is good.
if r is positive but weak, then it means the score changed between time 1 and time 2- poor test-retest reliability
using “r” in interrater reliability - how to tell if interrater reliability is strong
two observers rate same participant at the same time, then compute r
if value for r is strong and positive (0.7 or above) interrater reliability is strong
if weak and positive, interrater reliability is poor, cannot trust observers’ ratings
negative r would indicate terrible interrater reliability
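a minimal sketch (Python, made-up scores) of computing r for test-retest or interrater reliability:
```python
from scipy.stats import pearsonr

# made-up data: the same 6 participants measured at two times
# (for interrater reliability, the two lists would be rater A's and rater B's ratings instead)
time1 = [28, 15, 22, 31, 19, 25]
time2 = [27, 17, 21, 30, 20, 26]

r, p = pearsonr(time1, time2)
print(f"r = {r:.2f}")  # a strong positive r suggests good test-retest (or interrater) reliability
```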
if interrater reliability is weak, what can be done?
either retrain coders or
refine operational definition
when in interrater reliability should you use ‘r’ and when should you use kappa?
use r when rating quantitative variable
if the variable is categorical, the correlation coefficient ‘kappa’ is used
kappa
the correlation coefficient used in interrater reliability
measures the extent to which two raters place participants into the same categories
works like r in that a value of 1.0 means the raters are in perfect agreement
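a minimal sketch of computing kappa by hand (Python, made-up category codes); the idea: kappa = (observed agreement - chance agreement) / (1 - chance agreement):
```python
from collections import Counter

# made-up data: two raters classify the same 8 children as smiling ("S") or not ("N")
rater_a = ["S", "S", "N", "S", "N", "N", "S", "S"]
rater_b = ["S", "S", "N", "N", "N", "N", "S", "S"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # proportion of actual agreement

# chance agreement: probability the raters land on the same category by coincidence
count_a, count_b = Counter(rater_a), Counter(rater_b)
chance = sum((count_a[c] / n) * (count_b[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (observed - chance) / (1 - chance)
print(f"kappa = {kappa:.2f}")  # values near 1.0 mean the raters agree beyond chance
```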
when is r used in internal reliability?
using r is relevant in internal reliability for measures that use multiple items (questions) to approach the same construct
how can you tell if a set of items has internal consistency?
set of items has internal consistency if its items correlate strongly with one another
meaning you can average across those items to get a single overall score for each participant
cronbach’s alpha (coefficient alpha)
a correlation-based statistic to see if measurement scale has internal reliability
closer to 1.0 = better reliability- 0.7 or above is considered good
.9 or above means q’s may be redundant
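a minimal sketch of coefficient alpha computed straight from its formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) (Python, made-up responses to a 5-item scale):
```python
import numpy as np

# made-up data: rows = participants, columns = the 5 items of a scale (1-7 responses)
items = np.array([
    [6, 5, 6, 7, 5],
    [3, 4, 3, 2, 4],
    [5, 5, 6, 5, 6],
    [2, 3, 2, 2, 3],
    [7, 6, 7, 6, 7],
])

k = items.shape[1]                           # number of items
item_vars = items.var(axis=0, ddof=1).sum()  # sum of each item's variance
total_var = items.sum(axis=1).var(ddof=1)    # variance of participants' total scores

alpha = (k / (k - 1)) * (1 - item_vars / total_var)
print(f"alpha = {alpha:.2f}")  # .70 or above is usually considered good internal reliability
```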
if a set of items has good internal reliability, what can researchers do? what if it doesn’t?
if good reliability, researchers can combine items
if poor, researchers must revise items, or select only items that correlate strongly
what types of reliabilities are used for self-report?
internal and test-retest. interrater is unnecessary because there are no observers/ coders, the subject is evaluating themselves
what are reliability and validity important in establishing?
construct validity- because they show us that our chosen measure for the construct is consistent and accurate (measures what it’s supposed to)
Is head measurement as a measurement for intelligence reliable? valid?
reliable, because measurement would be consistent
not valid as an intelligence test
how can we know if indirect operational measures of a construct are really measuring that construct?
by collecting a variety of data
examples of indirect measures of happiness
- wellbeing inventory
- daily smile rate
- stress hormone levels
Can you say definitively that a measure is or isn’t valid?
no, it’s a matter of degree
ask: what is the weight of evidence in favor of this measure’s validity?
subjective ways to assess validity
- face validity
- content validity
face validity
looks like what we want it to measure
example of face validity
head circumference has good face validity for hat size,
low face validity for intelligence
how do researchers check face validity?
+ example
generally by consulting experts
ex: asking a panel of personality psychologists about how reasonable Diener’s SWB scale is for measuring happiness
content validity
a way to see if our measure contains all the parts the theory says it should contain
must capture all parts of a defined construct
example of content validity
conceptual definition of intelligence contains many parts (plan, ability to reason, learn quickly, etc)
to have good content validity, an operationalization of intelligence should include items to assess each component-
this is why IQ tests have sections
empirical ways to assess validity
- criterion validity
- convergent validity
- discriminant validity
what is the point of empirical ways to assess validity
to make sure the measurement is associated with something it theoretically should be associated with
criterion validity
whether the measure is related to a concrete behavioral outcome it should be related to
example of criterion validity
a test to predict the aptitude of job applicants-
when applicants’ test scores correlate strongly with their later sales performance, criterion validity is high (points lie close to the line)
which type of measure is criterion validity important for? why?
criterion validity is especially important for self-report measures
because the correlation can indicate how well people’s self reports predict their actual behavior
what is criterion validity assessed using?
typically represented by a correlation coefficient
but can also be assessed with a known-groups paradigm
known-groups paradigm
examines whether scores on the measure can distinguish among a set of groups whose behavior is already well understood
example of known-groups paradigm
salivary cortisol levels-
measure cortisol in people about to give a speech and in people sitting in the audience
we know giving a speech is stressful
if salivary cortisol is a valid measure of stress, levels should be higher among those giving the speech
BDI
Beck Depression Inventory- a 21-item self-report scale that asks about major symptoms of depression
how was the known- groups paradigm used to test the BDI?
known-groups paradigm was used to test the criterion validity of the BDI by giving it to two groups
one not depressed
one diagnosed as depressed by 4 psychiatrists
depressed people scored higher (closer to the maximum of 63), so criterion validity was established
- the known groups were also used to calibrate what counts as low, medium, and high scores
how did Diener’s SWB use known-groups paradigm?
in a review article, SWB scale averages from various studies
college students scored much higher than prisoners- such known-groups patterns provide strong evidence for criterion validity
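a minimal sketch of a known-groups check (Python, made-up scores): compare groups whose standing on the construct is already known:
```python
from scipy.stats import ttest_ind

# made-up BDI-style scores for a group already diagnosed as depressed vs. a nondepressed group
depressed = [31, 27, 35, 29, 33, 26]
nondepressed = [8, 11, 6, 9, 12, 7]

t, p = ttest_ind(depressed, nondepressed)
print(f"t = {t:.2f}, p = {p:.4f}")
# if the measure has criterion validity, the known depressed group should score clearly higher
```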
convergent validity
whether a measure correlates strongly with other measures of the same construct
example of convergent validity
to see if the BDI quantified depression, researchers had adults complete the BDI along with another self-report measure of depression (the CES-D)
scores were strongly correlated (r was .68)
Can you definitively establish validity?
no single definitive outcome will establish validity
the validity of all the measures involved (the BDI and CES-D, for example) has to be established with evidence
eventually may be satisfied that a measure is valid after evaluating the WEIGHT and PATTERN of the evidence
which validity do many researchers think best predicts actual behaviors?
criterion validity
can similar (not same) constructs be used to establish convergent validity?
yes- for example SWB scores were used to establish convergent validity for the BDI- scores had negative correlation (r = -.65)
Discriminant validity
a measure should correlate less strongly with measures of different constructs
sometimes helpful in differentiating similar diagnoses
usually not relevant to establish between something completely unrelated- should be something similar but different
example of discriminant validity
- BDI and Physical health problems weakly correlated (r = .16), evidence for discriminant validity
- whether a child has autism or only a language delay
- a scale to diagnose learning disabilities shouldn’t correlate strongly with IQ
what type of measures are convergent and discriminant validity usually used for? how does it help?
convergent and discriminant validity are usually evaluated together as a pattern of correlations among self-report measures
no strict rule for what the correlations should be; the overall pattern helps show whether the operationalization measures what it’s supposed to
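a minimal sketch of inspecting that pattern (Python/pandas, made-up scores on the BDI, a second depression scale, and a physical-health checklist):
```python
import pandas as pd

# made-up scores for the same participants on three measures
scores = pd.DataFrame({
    "BDI":    [30, 12, 25, 8, 18, 22],
    "CES_D":  [28, 10, 24, 9, 17, 20],  # same construct: should correlate strongly (convergent)
    "health": [5, 3, 2, 4, 6, 3],       # different construct: should correlate weakly (discriminant)
})

print(scores.corr())  # look at the overall pattern of correlations, not any single r
```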
can a measure be more reliable than valid? more valid than reliable?
a measure can be more reliable than valid, but not the other way around-
needs to be consistent with itself in order to be strongly associated with something else
reliability is a necessary condition for validity but it is not sufficient
when you read a research study, ask about the measures:
did the researchers collect evidence that their measures have construct validity?
if they didn’t do it themselves, did they review construct validity evidence of others?
where in journal articles will you find reliability and validity info?
methods section