Ch5 Good Measurement Flashcards

1
Q

Conceptual variables

A

(also called constructs) abstract theoretical concepts

2
Q

Operational variable

A

How a variable will be measured or manipulated

3
Q

How to operationalize conceptual variables

A

First define the construct of interest

Then create an operational definition-

think how you could quantify the construct/ turn it into a #

4
Q

3 types of measures

A

1) self-report measure
2) observational measure
3) physiological measure

5
Q

Self-report

+ can children do self reports?

A

Variable operationalized by recording people’s answers to Q’s about themselves in a questionnaire or interview

In research with children, self-reports are sometimes replaced w parent or teacher reports

6
Q

Example of self-report

A

Diener’s 5-item scale

+ Ladder of Life

Both self-report measures of life satisfaction

7
Q

How did Ed Diener operationalize “subjective well-being”

+ how did most score

A

created a 5-item questionnaire about life satisfaction, rated on a 7-point scale

(1- strongly disagree, 7- strongly agree)

most scored above 20 (neutral)

8
Q

observational measures

(also called behavioral measures)

A

operationalizing a variable by recording observable behaviors or physical traces of a behavior

9
Q

example of observational measure

A

happiness- how often someone smiles

allergies- how often someone sneezes

10
Q

can intelligence be observationally measured?

A

intelligence tests can be considered observational measures b/c the people who administer the test in person are observing intelligent behaviors (such as solving a puzzle)

11
Q

physiological measure

A

operationalizes a variable by recording biological data

often requires equipment to amplify, record, and analyze the biological data

12
Q

examples of physiological measure

A

brain activity
heart rate

fMRI brain scanning of wins vs. losses at Rock Paper Scissors

13
Q

which operationalization is best?

A

a single construct can be operationalized in any of the 3 ways- there is no best one

just important that different ways of measuring show similar patterns of results

14
Q

which type of measure is mistakenly considered most accurate?

A

physiological, but it has to be corroborated with other measures

15
Q

example of corroborating physiological measures with other measures

A
  • to use fMRI scans to relate intelligence to brain efficiency, participant intelligence first had to be established via IQ test (behavioral measure)
  • fMRI to measure happiness could only work by first asking participants how happy they feel (self-report)
16
Q

how many levels must variables have

A

at least two, to allow for change

17
Q

how can levels of operational variables be coded?

A

using different scales of measurement

18
Q

categorical variables (nominal variables)

A

levels of the variable are qualitatively distinct categories

(categorized by name only)

researchers may assign #’s to levels for data entry (1 = male, 2 = female), but the numbers have no quantitative meaning (1 isn’t higher than 2)

19
Q

Quantitative variables

A

levels are coded with meaningful #’s

20
Q

example of categorical variables

A

sex, species

21
Q

example of quantitative variables

A

height, weight, IQ scores

Diener’s scale of subjective well-being

22
Q

does Diener’s scale of subjective well-being use categorical or quantitative variables? why?

A

quantitative, because the numbers have meaning- a score of 35 is higher than a score of 7

23
Q

types of quantitative variables

A
  • ordinal scale
  • interval scale
  • ratio scale
24
Q

ordinal scale

A

#’s represent a ranked order, with unequal intervals between levels

25
Q

example of ordinal scale

A

places in a race- 1st is faster than 2nd, but we don’t know by how much

26
Q

interval scale

A

#’s represent equal distances between levels + there’s no true zero point (zero doesn’t mean ‘nothing’- 0° does not mean no temperature)

27
Q

what kind of scale do most questionnaires use?

including Diener’s SWB

A

interval scales

28
Q

example of interval scale

A

IQ scores (100 to 105 is the same distance as 105 to 110)

29
Q

if there’s no true zero in interval scales, what can’t researchers say that can be said about ratio scales?

A

can’t say that something is “twice” or “three times” as much as something else

30
Q

ratio scale

A

#’s represent equal intervals and there IS a true zero point (zero means “none”)

31
Q

example of ratio scale

A

height, distance traveled

exam scores (because zero means “nothing correct”)

32
Q

reliability

A

how consistent results/ scores of a measure are

33
Q

validity

A

is the operationalization measuring what it’s supposed to? - how accurate is it?

34
Q

types of reliability

A
  • test-retest reliability
  • interrater reliability
  • internal reliability
35
Q

what do researchers do before deciding on a measure? Why?

A

they collect (or review others’) data before deciding how to operationalize something

in order to see if it is reliable- that it will yield consistent patterns of results

36
Q

test-retest reliability

A

refers to whether scores are consistent every time the measure is used (time 1, time 2)

37
Q

example of test-retest reliability

A

IQ tests should have similar results at beginning (time 1) and end (time 2) of a semester

38
Q

what kind of operationalizations can test-retest reliability apply to?

A

self-report, observational, and physiological measures

39
Q

when is test-retest reliability most relevant?

A

when a construct is expected to be relatively stable- it’s not expected to change over time

40
Q

example of when test-retest reliability is NOT relevant

A

happy mood– expected to change over time

41
Q

interrater reliability

A

refers to consistency of scores no matter who is measuring the variable

42
Q

example of interrater reliability

A

two observers measure and record how often a child smiles during an hour- results should be consistent

43
Q

internal reliability (internal consistency)

A

pattern of answers in self-report should be consistent no matter how a question is phrased

44
Q

for what kind of measure is internal reliability relevant?

A

relevant only for self-report scales with multiple items

45
Q

example of internal reliability

A

in Diener’s scale, all the different q’s measure the same construct, so answers should correlate

46
Q

statistical devices for data analysis

A
  • scatterplots
  • correlation coefficient r

47
Q

what kind of claim is evidence for reliability an example of?

A

association claim- of one time with another, one coder with another, one version of a question and another

48
Q

how are correlations used to document reliability?

use head circumference to explain

A

test-retest: measure head twice, two different times

interrater: have two different people measure

measurements should be the same/similar with some measurement error

(self-report doesn’t apply)

49
Q

what does interrater agreement look like on a scatterplot?

A

points are close to the slope line

50
Q

what does interrater disagreement look like on a scatterplot?

A

points are further from slope line

51
Q

the correlation coefficient “r”

A

a single # that describes how close dots on a scatterplot are to a line drawn through them
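As a minimal sketch (invented scores, not data from the chapter), r can be computed with NumPy’s corrcoef:

```python
# Hypothetical sketch: Pearson's r for two measures of the same 6 people.
# All numbers are made up for illustration.
import numpy as np

well_being = np.array([28, 15, 33, 9, 22, 25])     # self-report scale totals
smiles_per_day = np.array([14, 6, 18, 3, 11, 12])  # observational measure

# corrcoef returns a 2x2 correlation matrix; element [0, 1] is r for the pair
r = np.corrcoef(well_being, smiles_per_day)[0, 1]
```

Here the dots would hug the line closely, so r comes out near +1.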

52
Q

in what ways can scatterplots differ?

A
  • slope direction (negative, positive, or zero slope)
  • strength of relationship (dots lying closer to the slope line indicates a stronger relationship)

53
Q

how does the slope act when “r” is positive? negative?

A

when slope is positive, r is positive

when slope is negative, r is negative

54
Q

what is the range of the value of “r”

A

falls between -1.0 and 1.0

55
Q

what is the value of r when the relationship is strong? weak?

A

when relationship is strong, r is close to 1.0 or -1.0

1 indicates a strong positive slope

-1 indicates a strong negative slope

when relationship is weak, r is close to 0

56
Q

using “r” in test-retest reliability- how can you tell if test-retest reliability is good or poor?

A

measure same participants twice, then compute value of r

if value for r is strong and positive (.5 or above) then test-retest reliability is good.

if r is positive but weak, then it means the score changed between time 1 and time 2- poor test-retest reliability
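The procedure above can be sketched in Python (hypothetical scores; the .5 cutoff is just the rule of thumb from this card):

```python
# Hypothetical test-retest check: same 6 participants, two time points.
import numpy as np

time1 = np.array([100, 112, 95, 120, 104, 88])  # scores at time 1
time2 = np.array([ 98, 115, 97, 118, 101, 90])  # scores at time 2

r = np.corrcoef(time1, time2)[0, 1]
good_test_retest = bool(r >= 0.5)  # strong positive r -> good reliability
```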

57
Q

using “r” in interrater reliability - how to tell if interrater reliability is strong

A

two observers rate same participant at the same time, then compute r

if value for r is strong and positive (0.7 or above) interrater reliability is strong

if weak and positive, interrater reliability is poor, cannot trust observers’ ratings

negative r would indicate terrible interrater reliability

58
Q

if interrater reliability is weak, what can be done?

A

either retrain coders or

refine operational definition

59
Q

when in interrater reliability should you use ‘r’ and when should you use kappa?

A

use r when rating quantitative variable

if the variable is categorical, the correlation coefficient ‘kappa’ is used

60
Q

kappa

A

the correlation coefficient used in interrater reliability

measures the extent to which two raters place participants into the same categories

works like r in that 1.0 means raters are in complete agreement
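A hedged sketch of the computation (this is Cohen’s kappa, worked by hand on invented category codes):

```python
# Hypothetical: two raters each code 6 participants into categories.
from collections import Counter

rater1 = ["smile", "frown", "smile", "neutral", "smile", "frown"]
rater2 = ["smile", "frown", "neutral", "neutral", "smile", "frown"]

n = len(rater1)
observed = sum(a == b for a, b in zip(rater1, rater2)) / n  # raw agreement

# agreement expected by chance, from each rater's category frequencies
c1, c2 = Counter(rater1), Counter(rater2)
expected = sum(c1[c] * c2[c] for c in set(rater1) | set(rater2)) / n**2

# kappa corrects observed agreement for chance agreement
kappa = (observed - expected) / (1 - expected)
```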

61
Q

when is r used in internal reliability?

A

using r is relevant in internal reliability for measures that use multiple items (questions) to approach the same construct

62
Q

how can you tell if a set of items has internal consistency?

A

set of items has internal consistency if its items correlate strongly with one another

so that you can average across those items to get a single overall score for each participant

63
Q

cronbach’s alpha (coefficient alpha)

A

a correlation-based statistic to see if measurement scale has internal reliability

closer to 1.0 = better reliability- 0.7 or above is considered good

.9 or above means q’s may be redundant
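The standard formula, alpha = (k/(k-1)) * (1 - sum of item variances / variance of total scores), can be sketched with invented responses:

```python
# Hypothetical: 5 participants (rows) answer a 5-item scale (columns).
import numpy as np

scores = np.array([
    [5, 4, 5, 4, 5],
    [3, 3, 2, 3, 3],
    [6, 6, 7, 6, 5],
    [2, 1, 2, 2, 1],
    [4, 5, 4, 4, 4],
])

k = scores.shape[1]                          # number of items
item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
total_var = scores.sum(axis=1).var(ddof=1)   # variance of participants' totals

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

These invented items correlate strongly, so alpha lands well above the .7 rule of thumb.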

64
Q

if a set of items have good internal reliability, what can researchers do? what if they don’t?

A

if good reliability, researchers can combine items

if poor, researchers must revise items, or select only items that correlate strongly

65
Q

what types of reliabilities are used for self-report?

A

internal and test-retest. interrater is unnecessary because there are no observers/ coders, the subject is evaluating themselves

66
Q

what are reliability and validity important in establishing?

A

construct validity- because they show us that our chosen measure for the construct is consistent and accurate (measures what it’s supposed to)

67
Q

Is head measurement as a measurement for intelligence reliable? valid?

A

reliable, because measurement would be consistent

not valid as an intelligence test

68
Q

how can we know if indirect operational measures of a construct are really measuring that construct?

A

by collecting a variety of data

69
Q

examples of indirect measures of happiness

A
  • wellbeing inventory
  • daily smile rate
  • stress hormone levels
70
Q

Can you say evidence for construct validity is or isn’t valid?

A

no, it’s a matter of degree

ask: what is the weight of evidence in favor of this measure’s validity?

71
Q

subjective ways to assess validity

A
  • face validity
  • content validity

72
Q

face validity

A

looks like what we want it to measure

73
Q

example of face validity

A

head circumference has good face validity for hat size,

low face validity for intelligence

74
Q

how do researchers check face validity?

+ example

A

generally by consulting experts

ex: asking a panel of personality psychologists about how reasonable Diener’s SWB scale is for measuring happiness

75
Q

content validity

A

a way to see if our measure contains all the parts the theory says it should contain

must capture all parts of a defined construct

76
Q

example of content validity

A

conceptual definition of intelligence contains many parts (the ability to plan, reason, learn quickly, etc.)

to have good content validity, an operationalization of intelligence should include items to assess each component-

this is why IQ tests have sections

77
Q

empirical ways to assess validity

A
  • criterion validity
  • convergent validity
  • discriminant validity
78
Q

what is the point of empirical ways to assess validity

A

to make sure measurement is associated w something it theoretically should be associated with

79
Q

criterion validity

A

whether the measure is related to a concrete behavioral outcome it should be related to

80
Q

example of criterion validity

A

a test meant to predict the aptitude of job applicants-

when there is a strong correlation between scores on the test and later sales performance, criterion validity is high (points close to the slope line)

81
Q

which type of measure is criterion validity important for? why?

A

criterion validity is especially important for self-report measures

because the correlation can indicate how well people’s self reports predict their actual behavior

82
Q

what is criterion validity assessed using?

A

typically represented by a correlation coefficient

but can also be assessed with a known-groups paradigm

83
Q

known-groups paradigm

A

examines whether scores on the measure can distinguish among a set of groups whose behavior is already well understood

84
Q

example of known-groups paradigm

A

salivary cortisol levels-

measure those about to give a speech and those in the audience

we know giving a speech is stressful

if salivary cortisol is a valid measure of stress, levels will be higher among those giving the speech

85
Q

BDI

A

Beck Depression Inventory- a 21-item self-report scale that asks about major symptoms of depression

86
Q

how was the known- groups paradigm used to test the BDI?

A

known-groups paradigm was used to test the criterion validity of the BDI by giving it to two groups

one not depressed

one diagnosed as depressed by 4 psychiatrists

depressed people scored higher (closer to 63, the maximum), criterion validity established

  • also used to calibrate which scores count as low, medium, and high
87
Q

how did Diener’s SWB use known-groups paradigm?

A

in a review article, SWB scale averages from various studies were compared

college students scored much higher than prisoners- such known-groups patterns provide strong evidence for criterion validity

88
Q

convergent validity

A

whether a measure correlates strongly with other measures of the same construct

89
Q

example of convergent validity

A

to see if the BDI quantified depression- adults completed the BDI along w another self-report measure of depression (the CES-D)

scores were strongly correlated (r = .68)

90
Q

Can you definitively establish validity?

A

no single definitive outcome will establish validity

validity of all parts/measures (BDI and CES-D for example) have to be established with evidence

eventually may be satisfied that a measure is valid after evaluating the WEIGHT and PATTERN of the evidence

91
Q

which validity do many researchers think best predicts actual behaviors?

A

criterion validity

92
Q

can similar (not same) constructs be used to establish convergent validity?

A

yes- for example SWB scores were used to establish convergent validity for the BDI- scores had negative correlation (r = -.65)

93
Q

Discriminant validity

A

a measure should correlate less strongly with measures of different constructs

sometimes helpful in differentiating similar diagnoses

usually not relevant to establish between something completely unrelated- should be something similar but different

94
Q

example of discriminant validity

A
  • BDI and physical health problems were weakly correlated (r = .16), evidence for discriminant validity
  • whether a child has autism or only a language delay
  • scale to diagnose learning disabilities shouldn’t correlate with IQ
95
Q

what type of measures are convergent and discriminant validity usually used for? how does it help?

A

convergent and discriminant validity are usually evaluated together as a pattern of correlations among self-report measures

no strict rule for what the correlations should be- the overall pattern helps to see if the operationalization measures what it’s supposed to

96
Q

can a measure be more reliable than valid? more valid than reliable?

A

a measure can be more reliable than valid, but not the other way around-

needs to be consistent with itself in order to be strongly associated with something else

reliability is a necessary condition for validity but it is not sufficient

97
Q

when you read a research study, ask about the measures:

A

did the researchers collect evidence that their measures have construct validity?

if they didn’t do it themselves, did they review construct validity evidence of others?

98
Q

where in journal articles will you find reliability and validity info?

A

methods section