Research Design, Statistics, & Test Construction Flashcards

1
Q

Quasi-experimental design

A

at least one IV is manipulated, but there is no random assignment of participants (typically because they are already in pre-existing groups)

2
Q

Within-subjects design

A

groups compared are correlated or related; three conditions lead to this: repeated measures of same participants, subjects matched prior to assignment to groups, subjects have an inherent relationship (e.g., twins)

3
Q

Latin square

A

most sophisticated form of counterbalancing subjects in a repeated measures design
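A minimal sketch of one common way to build a Latin square, assuming a simple cyclic rotation of the condition order. The function name `latin_square` and the condition labels are illustrative and not from the source.

```python
def latin_square(conditions):
    """Build a cyclic Latin square: row i is the condition list rotated by i.

    Each condition appears exactly once in every row (one group's sequence)
    and exactly once in every column (ordinal position).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


# Three hypothetical treatment conditions, one order per participant group.
for order in latin_square(["A", "B", "C"]):
    print(order)
```

Each printed row is the order of conditions given to one group of participants.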

4
Q

Mixed design

A

includes groups that are both independent and correlated (e.g., patients randomly assigned to two different treatment groups and measured before and after treatment)

5
Q

Idiographic

A

refers to single subject approaches (single or few participants studied intensely); includes AB, ABAB, multiple baseline, simultaneous treatment, and changing criterion designs

6
Q

Nomothetic

A

group approaches to research design (as opposed to single subject)

7
Q

Autocorrelation

A

effect of measuring same person repeatedly; results in highly correlated data; problem of single subject design

8
Q

AB design

A

baseline condition (A) followed by treatment condition (B); most significant problem is threat of history (difficult to determine whether intervention or other event caused change)

9
Q

ABAB design

A

baseline (A) and treatment (B) alternated in ABAB sequence; protects against threat of history; two potential problems: failure of behavior to return to baseline, issues of ethics with removing effective treatment

10
Q

Multiple baseline design

A

treatment is applied sequentially or consecutively across subjects, situations, or behaviors

11
Q

Simultaneous (alternating) treatment design

A

two or more interventions implemented concurrently during the treatment phase that are balanced and varied across time of day

12
Q

Changing criterion design

A

attempt is made to change behavior in increments to match a changing criterion (e.g., slowly reducing number of cups of coffee)

13
Q

Momentary time sampling

A

simply recording whether target behavior is present or absent at moment that time interval ends

14
Q

Whole-interval sampling

A

scoring target behavior positively only if exhibited for full duration of time interval
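A rough sketch contrasting whole-interval sampling with momentary time sampling, assuming the observation record is a list of moment-by-moment booleans (True = behavior present). The function names and the sample record are hypothetical.

```python
def momentary_time_sampling(observed, interval):
    """Score each interval by whether the behavior is present at its final moment."""
    return [observed[start + interval - 1]
            for start in range(0, len(observed) - interval + 1, interval)]


def whole_interval_sampling(observed, interval):
    """Score an interval positively only if the behavior lasted the full interval."""
    return [all(observed[start:start + interval])
            for start in range(0, len(observed) - interval + 1, interval)]


# Hypothetical second-by-second record, scored in 5-second intervals.
record = [True] * 5 + [False, False, False, False, True] + [True, True, False, True, True]
print(momentary_time_sampling(record, 5))  # [True, True, True]
print(whole_interval_sampling(record, 5))  # [True, False, False]
```

The same record yields different scores under the two methods, since whole-interval sampling is the stricter criterion.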

15
Q

Analogue research

A

evaluates treatment under conditions that only resemble or approximate clinical situations; typically for less severe conditions; tight experimental control but limited generalizability (e.g., grad student clinicians using manual)

16
Q

Clinical trials

A

outcome investigations conducted in clinical settings; often involve methodological compromises and sacrifices

17
Q

Cross-sequential research

A

also called cohort-sequential research; takes several cross sections and follows them over briefer periods of time

18
Q

Stratified random sampling

A

population is first divided into strata (e.g., age levels, income levels, ethnic groups), and then a random sample of equal size from each stratum is selected

19
Q

Proportional sampling

A

individuals are randomly selected in proportion to their representation in the general population

20
Q

Systematic sampling

A

selecting every kth element after a random start, e.g., if 100 out of 1000 persons are needed, every tenth person is selected; needs to be arranged in such a way that it is not biased
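The procedure can be sketched as follows, assuming the population is held in an ordered list. The function name `systematic_sample` and the seeded generator are illustrative choices, not part of the source.

```python
import random


def systematic_sample(population, sample_size, rng=None):
    """Select every kth element after a random start within the first k elements."""
    if sample_size < 1 or sample_size > len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    rng = rng or random.Random()
    k = len(population) // sample_size  # sampling interval
    start = rng.randrange(k)            # random start in [0, k)
    return population[start::k][:sample_size]


# The card's example: 100 out of 1000 persons, so every tenth person is selected.
people = list(range(1000))
sample = systematic_sample(people, 100, random.Random(0))
print(len(sample), sample[1] - sample[0])  # 100 10
```

If the list has a periodic ordering that lines up with k, the sample will be biased, which is why the arrangement of the list matters.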

21
Q

Cluster sampling

A

identifying naturally occurring groups of subjects (clusters) and randomly selecting certain clusters (e.g., classes or departments at a university, or schools within a particular school district)

22
Q

History

A

threat to internal validity; incidents that intervene between measuring points, either in or outside of the experimental situation; best control is a control group

23
Q

Maturation

A

threat to internal validity; factors that affect the subjects’ performance because of the passing of time (fatigue, maturing); best control is a control group

24
Q

Testing or test practice

A

threat to internal validity; occurs when familiarity with testing affects scores on repeated testing; best control is Solomon Four-Group design

25
Q

Solomon Four-Group design

A

control for testing threats to validity; divide subjects into four groups: measured pre- and post- and get intervention; measured pre- and post- and don’t get intervention, measured post and gets intervention, measured post and does not get intervention

26
Q

Instrumentation

A

threat to internal validity; changes in observers or the calibration of equipment; control group corrects for this

27
Q

Statistical regression

A

threat to internal validity; tendency for extreme scores (scores very much above or below the mean) to become less extreme (closer to the mean) on retesting, even without any type of intervention; control group controls for this

28
Q

Selection bias

A

threat to internal validity; caused by non-random assignment; best avoided with random assignment

29
Q

Attrition or experimental mortality

A

threat to internal validity; differential loss of subjects from the groups; to assess for this, compare subjects who drop out with those who remain using t-tests on relevant variables

30
Q

Diffusion

A

threat to internal validity; occurs when the no-treatment group gets some of the treatment; difficult to eliminate completely, but tighter control over experimental situation can help

31
Q

Construct validity

A

refers to factors other than the desired specifics of our intervention that result in differences; often lumped under threats to external validity; not measuring what you think you are measuring

32
Q

Attention and contact with clients

A

threat to construct validity; difficult to tell whether changes are due to treatment or attention

33
Q

Experimenter expectancies

A

threat to construct validity; cues or clues transmitted to the subjects by the experimenter; Rosenthal effect; can be controlled by masking experimenter to conditions

34
Q

Rosenthal effect

A

refers to experimenter expectancies

35
Q

Demand characteristics

A

threat to construct validity; factors in the procedures that suggest how the subject should behave; control by masking subjects to their condition

36
Q

John Henry effect

A

threat to construct validity; occurs when persons in a control group try harder than usual in the spirit of competition with the experimental group; control by making sure experimental and control groups do not know about each other and, if that is not possible, by not giving the groups any sense of competition

37
Q

Threats to external validity

A

interfere with generalizability of effects

38
Q

Sample characteristics

A

threat to external validity; difference between sample and population

39
Q

Stimulus characteristics

A

threat to external validity; features of the study with which the intervention is associated (e.g., research assessing memory functioning in the laboratory may not be generalizable to memory functioning in naturalistic settings)

40
Q

Contextual characteristics

A

threat to external validity; conditions in which intervention is embedded; e.g., reactivity

41
Q

Reactivity

A

subjects behave in a certain way just because they are participating in research and being observed

42
Q

Low power

A

threat to statistical conclusion validity; diminished ability to find significant results; small sample size and inadequate interventions can contribute

43
Q

Unreliability of measures

A

threat to statistical conclusion validity; unreliable outcome measure

44
Q

Variability in procedures

A

threat to statistical conclusion validity; inconsistency in treatment procedures; especially of concern in psychotherapy outcome research

45
Q

Subject heterogeneity

A

threat to statistical conclusion validity; subject heterogeneity makes it more difficult to find significant differences between groups

46
Q

Varies directly with

A

as one variable increases so does the other (e.g., a varies directly with b in a = b/c)

47
Q

Varies indirectly with

A

as one variable increases the other decreases (e.g., a varies indirectly with c in a = b/c)

48
Q

Ordinal data

A

involve tallying people to see which ordered category a person falls into (e.g., Likert scale, SES, percentile rank); group means cannot be calculated

49
Q

Interval data

A

involve obtaining numerical scores for each person, where the score values have equal intervals but no absolute zero (any zero point is arbitrary; e.g., IQ test, T-score, temperature); group means can be calculated

50
Q

Ratio data

A

involve obtaining numerical scores for each person, where the score values have equal intervals and an absolute zero (e.g., score on EPPP, money in bank, weight, number of children)

51
Q

Standard deviation

A

average deviation (or spread) from the mean in a given set of scores

52
Q

Variance

A

standard deviation squared

53
Q

Positive skew

A

higher proportion of scores in the lower range of values (mode has lowest value, mean highest)

54
Q

Negative skew

A

higher proportion of scores in the higher ranges of values (mean has lowest value, mode highest)

55
Q

Kurtosis

A

refers to how peaked a distribution is

56
Q

Leptokurtic

A

distribution with a very sharp peak

57
Q

Platykurtic

A

distribution that is very flat

58
Q

Criterion-referenced or domain-referenced score

A

example is percentage correct

59
Q

Norm-referenced score

A

provides information on how person performed relative to group

60
Q

Standard scores

A

based on standard deviation from the sample

61
Q

Z-scores

A

standard scores that correspond directly to standard deviation units; transforming into Z-scores does not normalize a distribution (exact same distribution shape); z score = (score - mean)/SD
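A short worked example of the z-score formula on this card. The scale values (mean 100, SD 15) are assumed for illustration only.

```python
def z_score(score, mean, sd):
    """Convert a raw score to standard deviation units: (score - mean) / SD."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (score - mean) / sd


# Hypothetical scale with mean 100 and SD 15.
print(z_score(115, 100, 15))  # 1.0
print(z_score(85, 100, 15))   # -1.0
```

Because the transformation is linear, the shape of the distribution is unchanged.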

62
Q

Z-scores and percentile ranks

A

-3 = .1, -2 = 2.5, -1 = 16, 0 = 50, 1 = 84, 2 = 97.5, 3 = 99.9
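For comparison with the rounded figures on the card, the percentile ranks under a normal curve can be computed from the error function. This is a sketch that assumes scores are normally distributed; `percentile_rank` is an illustrative name.

```python
import math


def percentile_rank(z):
    """Percentage of a normal distribution that falls below a given z-score."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Print the percentile rank for each whole z-score from -3 to 3.
for z in range(-3, 4):
    print(z, round(percentile_rank(z), 1))
```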

63
Q

Parameters

A

population values

64
Q

Statistics

A

sample values

65
Q

Standard error of the mean

A

average amount of deviation of sample means from the population mean; equal to population SD divided by square root of sample size
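A brief worked example of the formula on this card (population SD divided by the square root of sample size). The SD and sample sizes are assumed values.

```python
import math


def standard_error_of_mean(population_sd, sample_size):
    """Population SD divided by the square root of the sample size."""
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    return population_sd / math.sqrt(sample_size)


# Hypothetical population SD of 15: a larger sample gives a smaller standard error.
print(standard_error_of_mean(15, 25))   # 3.0
print(standard_error_of_mean(15, 100))  # 1.5
```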

66
Q

Central limit theorem

A

states that, assuming an infinite number of equal-sized samples (of large enough size) are drawn from the population and the means of these samples are plotted, a normal distribution of means will result; the mean of the means (the grand mean) will equal the population mean, and the standard deviation of the means will equal the standard deviation of the population divided by the square root of sample size (the standard error of the mean); allows researcher to calculate whether the obtained mean is most likely due to treatment or experimental effects, or to chance

67
Q

Rejection region

A

also called region of unlikely values; contains values the researcher is unlikely to obtain by chance; size corresponds to alpha level

68
Q

Two factors that contribute to conclusions about statistical significance

A

treatment effects and chance (sampling error)

69
Q

Type I error

A

incorrectly reject null hypothesis; likelihood directly corresponds to size of alpha

70
Q

Type II error

A

incorrectly retain (fail to reject) a false null hypothesis; likelihood corresponds to beta

71
Q

Beta

A

provides probability of making Type II error

72
Q

Power

A

ability to correctly reject null hypothesis; increased when sample size is large, magnitude of intervention is large, random error is small, statistical test is parametric, test is one-tailed; inversely related to beta (power = 1- beta); direct relationship with alpha

73
Q

Parametric test

A

three assumptions must be met: data are interval or ratio, homoscedasticity, normally distributed

74
Q

Nonparametric test

A

used for nominal or ordinal DV

75
Q

Statistic for testing differences, more than one DV

A

MANOVA

76
Q

Statistics for testing differences, interval or ratio DV

A

t-test, ANOVA

77
Q

Statistics for testing differences, nominal or ordinal DV

A

Chi-Square, Mann-Whitney, Wilcoxon

78
Q

Homoscedasticity

A

similar variability or standard deviations in the different groups

79
Q

Assumption for Chi-Square test

A

independence of observations

80
Q

Degrees of freedom

A

number of possible variations in outcomes that can be obtained

81
Q

Degrees of freedom for single sample chi-square

A

df = # of columns - 1

82
Q

Degrees of freedom for multiple sample chi-square

A

df = (# rows - 1)(# columns - 1)

83
Q

Degrees of freedom for single sample t-test

A

df = N - 1

84
Q

Degrees of freedom for matched or correlated samples t-test

A

df = # of pairs - 1

85
Q

Degrees of freedom for independent samples t-test

A

df = N - 2

86
Q

Degrees of freedom total for ANOVA

A

df = N - 1

87
Q

Degrees of freedom within for ANOVA

A

df within = df total - df between

88
Q

Degrees of freedom between for ANOVA

A

df between = # of groups - 1

89
Q

Expected frequency for Chi-Square when data are given in each cell

A

expected frequency for any cell = (sum of row * sum of column)/N
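A minimal sketch of the expected-frequency formula applied to a whole contingency table. The observed counts are made up for illustration.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table of observed counts (N = 100).
observed = [[10, 20], [30, 40]]
print(expected_frequencies(observed))  # [[12.0, 18.0], [28.0, 42.0]]
```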

90
Q

F ratio

A

F ratio = Mean Square between groups/Mean square within groups; typically significant as it gets above 2.0
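The F ratio and its degrees of freedom can be traced through a small hand-computed one-way ANOVA. This is an illustrative sketch with made-up scores, not a substitute for a statistics package.

```python
def one_way_anova(groups):
    """Return (df_between, df_within, F) for a one-way ANOVA on lists of scores."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1        # number of groups - 1
    df_total = n_total - 1              # N - 1
    df_within = df_total - df_between   # df total minus df between

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return df_between, df_within, ms_between / ms_within


# Three hypothetical groups of three scores each.
df_b, df_w, f_ratio = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(df_b, df_w, round(f_ratio, 2))  # 2 6 13.0
```

Whether a given F is significant depends on both degrees of freedom and the chosen alpha, so it should be checked against an F table or software.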

91
Q

Mean Square

A

measure of average variability

92
Q

ANOVA post-hoc tests in order of most to least protection against Type I error

A

in order of most to least conservative: Scheffé, Tukey, Duncan/Dunnett/Newman-Keuls, Fisher’s LSD

93
Q

Two-way ANOVA

A

when groups are being compared on two IVs; permits analysis of main effects and interaction effects; when interaction is significant, main effects must be interpreted in context of interaction effect

94
Q

Examine whether there are interaction effects in ANOVA table

A

add up diagonals in each individual 2x2 set of squares; unequal diagonal sums suggest an interaction
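The diagonal check can be sketched for a single 2x2 table of cell means. The function name and the cell means are hypothetical, and a nonzero difference only suggests an interaction; its significance still has to be tested in the ANOVA.

```python
def interaction_contrast(cell_means):
    """Difference between the two diagonal sums of a 2x2 table of cell means.

    Zero means the effect of one IV is the same at both levels of the other
    (no interaction in the means).
    """
    (a, b), (c, d) = cell_means
    return (a + d) - (b + c)


print(interaction_contrast([[10, 20], [20, 30]]))  # 0   (diagonals equal)
print(interaction_contrast([[10, 20], [20, 10]]))  # -20 (diagonals differ)
```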

95
Q

MANOVA

A

used when there are multiple DVs

96
Q

Coefficient of determination

A

calculated by squaring correlation coefficient; represents amount of variability in Y that is shared with or explained by X
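A small worked example: compute Pearson r by hand, then square it. The paired scores are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3), round(r ** 2, 3))  # 0.775 0.6
```

Here about 60% of the variability in Y is shared with X.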

97
Q

Assumptions of bivariate correlations

A

linear relationship between X and Y, homoscedasticity, unrestricted range of scores on X and Y

98
Q

Bivariate correlation coefficient for two interval/ratio variables

A

Pearson r

99
Q

Bivariate correlation coefficient for two ordinal variables

A

Spearman’s rho, Kendall’s tau

100
Q

Bivariate correlation coefficient for interval/ratio and true dichotomy

A

point-biserial

101
Q

Bivariate correlation coefficient for interval/ratio and artificial dichotomy

A

biserial

102
Q

Bivariate correlation coefficient for two true dichotomies

A

Phi

103
Q

Bivariate correlation coefficient for two artificial dichotomies

A

tetrachoric

104
Q

Coefficient for curvilinear relationship between X and Y

A

Eta

105
Q

Zero-order correlation

A

examines relationship between X and Y when it is believed there are no extraneous variables affecting the relationship

106
Q

Partial correlation

A

also called first-order correlation; examines the relationship between the predictor and the criterion with the effect of a third variable removed that is thought to be affecting both variables

107
Q

Part correlation

A

also called a semi-partial correlation; examines the relationship between the predictor and the criterion with the influence of a third variable removed from only one of the original variables

108
Q

Multivariate tests

A

involve several predictors and one or more criteria (DVs)

109
Q

Multiple R

A

multiple correlation; correlation between two or more IVs and one DV, where Y is always interval or ratio data, and at least one X is interval or ratio data

110
Q

Coefficient of multiple determination

A

obtained by squaring multiple R; index of the amount of variability in the criterion (Y) that is accounted for by the combination of all the predictors (Xs)

111
Q

Multiple regression

A

Has multiple predictors

112
Q

Multicollinearity

A

problem that occurs in a multiple regression equation when the predictors are highly correlated with one another, and therefore essentially redundant

113
Q

Stepwise regression

A

computer-generated; in forward regression, the computer adds predictor variables one at a time, starting with the predictor that has the highest correlation with criterion outcome; in backward regression, predictor variables are removed one at a time, starting with the variable that contributes the least to criterion outcome; allows for fewest possible predictors

114
Q

Hierarchical regression

A

researcher controls regression analysis, adding variables in order consistent with theory

115
Q

Canonical R

A

correlation between two or more IVs and two or more DVs; evaluate relationship between two sets of variables

116
Q

Discriminant function analysis

A

special case of multiple regression; two or more predictors and one criterion that is nominal (rather than interval or ratio); allows prediction of group membership

117
Q

Loglinear analysis

A

sometimes referred to as logit analysis; used to predict a categorical criterion based on categorical predictors

118
Q

Approaches for causal modeling

A

not correlations and regressions; path analysis and SEM

119
Q

Path analysis

A

applies multiple regression techniques to testing a model that specifies causal links among variables; relies on researcher having developed theoretically-based causal model; straight arrows denote causal relationships, curved denote correlations; path coefficients are analyzed to see if the pattern predicted by the model has emerged

120
Q

Factor analysis

A

test of structure that extracts as many significant factors from set of data as possible

121
Q

Characteristic root

A

another name for eigenvalues for factors (indicate strength of factors); less than 1.0 usually not interpreted

122
Q

Factor loadings interpreted

A

equal to or exceed 0.30

123
Q

Orthogonal rotation

A

factor rotation in which axes remain perpendicular; results in factors with no correlation

124
Q

Communality

A

how much of a test’s variability is explained by the combination of all the factors; can be calculated in orthogonal rotation; factor loadings squared and added together
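A brief sketch of the calculation, assuming orthogonal (uncorrelated) factors. The loadings are made-up values.

```python
def communality(loadings):
    """Sum of a test's squared factor loadings (orthogonal factors assumed)."""
    return sum(loading ** 2 for loading in loadings)


# Hypothetical test loading .60 on Factor I and .30 on Factor II.
print(round(communality([0.6, 0.3]), 2))  # 0.45
```

In this example the two factors together explain 45% of the test's variability.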

125
Q

Oblique rotation

A

factor rotation in which angle between axes is non-perpendicular and factors are correlated

126
Q

Principal components analysis

A

subtype of factor analysis; trying to extract factors and there is no empirical or theoretical guidance on the values of the communalities; results in a few uncorrelated factors called components; no prior hypotheses

127
Q

(Principal) factor analysis

A

communality values ascertained before analysis

128
Q

Classical test theory

A

also called true score model; total variability = true score variability + error variability

129
Q

Reliability

A

proportion of true score variability; often symbolized as rxx or rtt; minimum acceptable is 0.80

130
Q

Content sampling error

A

occurs when a test, by chance, has items that do or do not tap into a test-taker’s knowledge base

131
Q

Time sampling error

A

occurs when a test is given at different points in time and scores differ because of factors related to passage of time

132
Q

Test heterogeneity error

A

occurs when a test has heterogeneous items tapping more than one domain

133
Q

Factors affecting reliability

A

number of items, homogeneity of items, range of scores, ability to guess

134
Q

Four estimates of reliability

A

test-retest reliability, parallel forms reliability, internal consistency reliability, interrater reliability

135
Q

Coefficient of stability

A

expression of test-retest reliability

136
Q

Coefficient of equivalence

A

expression of parallel forms reliability

137
Q

Spearman-Brown prophecy formula

A

used when calculating split-half reliability; tells us how much more reliable the test would be if it were longer
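The card does not give the formula. The sketch below uses the standard Spearman-Brown form, r_new = n * r / (1 + (n - 1) * r), where n is the factor by which the test is lengthened. The split-half value of .60 is assumed for illustration.

```python
def spearman_brown(reliability, length_factor=2.0):
    """Estimated reliability of a test lengthened by `length_factor`."""
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical split-half correlation of .60, corrected to full test length (n = 2).
print(round(spearman_brown(0.6), 2))  # 0.75
```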

138
Q

Split-half reliability and speeded tests

A

split-half reliability inappropriate for speeded tests because only easy items included; preferred test of reliability is alternate forms

139
Q

Power tests

A

have items that are of varying difficulty level, and subjects are provided sufficient time to complete them all

140
Q

Kuder-Richardson (KR-20 and KR-21) and Cronbach’s coefficient alpha

A

sophisticated forms of internal consistency reliability; involve analysis of the correlation of each item with every other item in the test; calculated by taking the mean of the correlation coefficients for every possible split-half; KR-20 (items vary in difficulty) and KR-21 (items of consistent difficulty) are used when items are scored dichotomously, coefficient alpha when they are not

141
Q

Standard error of measurement

A

standard deviation of a theoretically normal distribution of test scores obtained by one individual on equivalent tests; assumed to be consistent across all persons; Smeas = SDx * square root (1 - rxx); ranges from 0 (perfectly reliable test) to the standard deviation of the test (not at all reliable test)
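A worked example of the formula SDx * square root (1 - rxx), including the two boundary cases the card describes. The SD of 15 and the reliability of .84 are assumed values.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SD of the test multiplied by the square root of (1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


print(round(standard_error_of_measurement(15, 0.84), 2))  # 6.0
print(standard_error_of_measurement(15, 1.0))             # 0.0  (perfectly reliable)
print(standard_error_of_measurement(15, 0.0))             # 15.0 (not at all reliable)
```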

142
Q

Content validity

A

how adequately a test samples a particular content area; quantified by asking a panel of experts if each item is essential, useful/not essential, or not necessary, yet no numerical validity coefficient is derived

143
Q

Criterion-related validity

A

how adequately a test score can be used to infer, predict, or estimate criterion outcome; calculated by using a Pearson r to correlate the test scores (also known as predictor scores) with criterion scores (also known as outcome scores)

144
Q

Concurrent validity

A

subtype of criterion-related validity; predictor and criterion measured and correlated at about the same time

145
Q

Predictive validity

A

subtype of criterion-related validity; delay between the measurement of the predictor and the criterion

146
Q

Standard error of the estimate

A

average amount of error in estimating each person’s criterion score; standard deviation of a theoretically normal distribution of criterion scores obtained by one person measured repeatedly; Sest = SDy * square root (1 - rxy^2); ranges from 0 to value of standard deviation of criterion
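A worked example of the formula SDy * square root (1 - rxy squared). The criterion SD of 10 and the validity coefficient of .60 are assumed for illustration.

```python
import math


def standard_error_of_estimate(sd_criterion, validity):
    """SD of the criterion multiplied by the square root of (1 - validity squared)."""
    if not -1.0 <= validity <= 1.0:
        raise ValueError("validity must be between -1 and 1")
    return sd_criterion * math.sqrt(1.0 - validity ** 2)


print(round(standard_error_of_estimate(10, 0.6), 2))  # 8.0
print(standard_error_of_estimate(10, 0.0))            # 10.0 (no predictive value)
```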

147
Q

Applications of criterion-related validity coefficient for prediction

A

expectancy tables, Taylor-Russell tables, decision-making theory

148
Q

Expectancy tables

A

list the probability that a person’s criterion score will fall in a specified range based on the range in which that person’s predictor score fell

149
Q

Taylor-Russell tables

A

numerically describe the amount of improvement that occurs in selection decisions when a predictor test is introduced

150
Q

Selection ratio

A

proportion of available openings to number of applicants

151
Q

Incremental validity

A

amount of improvement in success rate that results from using predictor test (e.g., if proportion of successful improves from base rate of .4 to .65, incremental validity is .25 or 25%)

152
Q

Three variables that affect incremental validity

A

criterion-related validity coefficient of the predictor test, the company’s base rate, and the selection ratio

153
Q

Decision-making theory

A

takes the predictions of performance that were based on the predictor tests and compares them with the actual criterion outcome

154
Q

Item difficulty setting formula

A

(1.0 + probability of getting item by chance)/2.0
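A quick worked example of this formula for two common item formats. The function name is illustrative.

```python
def optimal_item_difficulty(chance):
    """Target difficulty level: halfway between chance performance and 1.0."""
    return (1.0 + chance) / 2.0


print(optimal_item_difficulty(0.25))  # 0.625 (four-option multiple choice)
print(optimal_item_difficulty(0.5))   # 0.75  (true/false)
```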

155
Q

Item validity

A

correlation between item score and criterion score

156
Q

Item-characteristic curve

A

plot of the relationship between item performance and total score

157
Q

Item response theory

A

used to calculate to what extent a specific item on a test correlates with an underlying construct; treats a subject’s performance on a test item as representing the degree to which the subject has a latent trait

158
Q

Factors affecting criterion-related validity

A

range of scores, reliability of the predictor, reliability of the criterion, criterion contamination

159
Q

Relationship of reliability of predictor and criterion-related validity

A

test must have some reliability to be valid, but a reliable test is not necessarily a valid test; reliability sets the ceiling for validity (the validity coefficient cannot exceed the square root of the reliability coefficient), so the validity coefficient can be numerically greater than the reliability coefficient

160
Q

Correction for attenuation

A

calculates how much higher validity would be if predictor and criterion were both perfectly reliable
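The card does not state the formula. The sketch below uses the standard correction, which divides the observed validity coefficient by the square root of the product of the two reliabilities. All input values are assumed for illustration.

```python
import math


def correct_for_attenuation(validity, predictor_reliability, criterion_reliability):
    """Estimated validity if predictor and criterion were both perfectly reliable."""
    if predictor_reliability <= 0 or criterion_reliability <= 0:
        raise ValueError("reliabilities must be positive")
    return validity / math.sqrt(predictor_reliability * criterion_reliability)


# Hypothetical values: rxy = .30, rxx = .81, ryy = .64.
print(round(correct_for_attenuation(0.3, 0.81, 0.64), 3))  # 0.417
```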

161
Q

Criterion contamination

A

occurs with subjectively-scored criterion outcomes when the rater is informed of subjects’ predictor scores before assigning them criterion ratings

162
Q

Ways evidence of construct validity can be obtained

A

factor analysis or multi-trait, multi-method matrix

163
Q

Multi-trait, multi-method matrix

A

table with information about convergent and divergent validity, both of which are necessary for construct validity