Chapter 3: Construct and Data-Evaluation Validity Flashcards

1
Q

What is construct validity?

A

Construct validity asks why a difference occurred, that is, what was responsible for the superiority of the experimental group over a control group.

2
Q

What is a construct?

A

An underlying concept that is considered to be the basis for, or the reason that, the experimental manipulation had an effect.

3
Q

What is the difference between internal validity and construct validity?

A

Internal validity focuses on whether an intervention/experimental manipulation is responsible for change or whether other factors can plausibly account for the effect. In contrast, construct validity addresses the presumed cause or the explanation of the causal relation between the intervention or experimental manipulation and the outcome.

4
Q

What does it mean when an experiment is confounded?

A

There is a possibility that another variable co-varied with the intervention (i.e., changed along with, or was embedded in, the experimental manipulation).
- That confound could, in whole or in part, be responsible for the results.
- Some component other than the one of interest to the investigator might be embedded in the intervention and account for the findings.

5
Q

Why is construct validity intriguing?

A

Because it is at the interface of methodology (e.g. conducting well-controlled, careful studies) and substantive understanding (e.g. theory and evidence about what actually explains the phenomenon of interest).

6
Q

How might attention and contact with the clients be a threat to construct validity?

A

Attention and contact accorded to the client in the experimental group or differential attention across experimental and control groups might be the basis for group differences and threaten construct validity.
- This threat is salient in the context of intervention research, since the intervention may have exerted its influence because of the attention provided rather than because of special characteristics unique to the intervention.

7
Q

What is a placebo?

A

A substance that has no active pharmacological properties that would be expected to produce change for the problem to which it is applied.

8
Q

What is an active placebo?

A

A placebo that mimics some of the side effects of real medications.
- It does not have properties that alter the condition (e.g., depression).

9
Q

When do expectations for improvement need to be controlled?

A

If an investigator wishes to draw conclusions about the specific effects of the intervention.

10
Q

What is a double-blind study?

A

Both parties (staff and participants) are unaware of who received the real drug/treatment.
- Note that this doesn’t fully prevent the threat to construct validity, given that staff might easily identify who is in the medication group based on comments by the patients about side effects or a different experience.

11
Q

When does construct validity need to be controlled for?

A

When researchers want to conclude why the intervention achieved its effects.
- Attention, contact and expectations must be ruled out as rival interpretations.

12
Q

How is narrow stimulus sampling a threat to both external and construct validity?

A
  1. If the investigator wishes to generalize to other stimulus conditions, then the narrow range of stimulus conditions is a threat to external validity.
  2. If the investigator wishes to explain why a change occurred, then the problem is one of construct validity because the investigator cannot separate the construct of interest from the conditions of its delivery.
13
Q

What are unintentional experimenter expectancy effects?

A

The expectancies, beliefs, and desires about the results on the part of the experimenter influence how the subjects perform.
- Expectancies may lead to changes in tone of voice, posture, facial expressions, delivery of instructions, and adherence to the prescribed procedures and hence influence how participants respond.
- They are a threat to construct validity if they provide a plausible rival interpretation of the effects otherwise attributed to the experimental manipulation.

14
Q

Why is the notion of experimenter expectancies, as a threat to validity, infrequently invoked in lab settings?

A
  1. Expectancies currently are not a plausible explanation in many lab studies.
    • Procedures may be automated across all subjects and conditions, and hence there is consistency and fairly strong control of what is presented to the subject.
  2. In many lab paradigms, expectancies are not likely candidates for influencing the specificity of a finding that is sought.
  3. How experimenter expectancies exert their influence is unclear.
    • Research assistants can be trained to do the procedure, but they might prime participants unconsciously.
15
Q

What are cues of the experimental situation?

A

Seemingly ancillary factors associated with the experimental manipulation.
- Also referred to as the demand characteristics of the experimental situation.

16
Q

What are demand characteristics?

A

They include sources of influence such as information conveyed to prospective subjects prior to their arrival in the experiment, instructions, procedures, and any other features of the experiment.
- These features may seem incidental, but they “pull”, promote, or prompt behavior in the subjects. Thus, changes in the subjects could be due to demand characteristics rather than the experimental manipulation.

17
Q

Should demand characteristics be viewed in an all-or-nothing way?

A

No; see the chewing gum experiment: the study found an effect of gum chewing independent of demand characteristics, but also an effect of what subjects were told to expect.

18
Q

What is the first step in managing construct validity?

A

To consider the basic threats and whether they can emerge as the study is completed.

19
Q

How can you address whether the new treatment and treatment as usual group generated the same level of expectancies for improvement?

A

Either in pilot work or during the study, obtain some measure of the extent to which participants expect improvement once they learn about their treatment condition.
- At the end of the study, one can see if expectations for improvement differ between the conditions and also correlate expectations at the beginning of treatment with therapeutic change.

20
Q

How can you manage the possibility that experimenter expectancies influence the results?

A
  1. Provide a standard expectation or statement to experimenters who run the subjects so that they at least hear a consistent message from the investigator.
    • This statement is not about what the hypotheses are; it might be a speech that conveys the importance of running the subjects correctly through the conditions or how the findings will be important no matter how they come out.
  2. As with the attention and expectations of subjects, one can measure through a questionnaire what the experimenters' beliefs are, see whether those expectations differ among experimenters, and correlate expectations with outcomes to see if they are related.
21
Q

What are three ways to control for demand characteristics?

A
  1. Post experimental inquiry
    • Ask subjects at the end of an experiment about their perceptions about the purpose.
  2. Pre-inquiry
    • Subjects are exposed to the rationale, instructions, and procedures (and asked to imagine what subjects would do) but are not actually run through the study itself. They are then asked to respond to the measures.
  3. Simulators
    • Subjects are asked to act as if they have received the procedures and then to deceive assessors who do not know whether they have been exposed to the actual procedures.
22
Q

What is the post experimental inquiry?

A

Focuses on asking subjects about the purposes of the experiment and the performance that is expected of them.
- If subjects are aware of the purpose of the experiment and the performance expected of them, they can more readily comply with the demands of performance.
- Thus, their responses may be more a function of the information about the experiment than the manipulation itself.

23
Q

What is pre-inquiry?

A

Subjects are not actually run through the procedures in the usual way, but they are asked to imagine themselves in the situation to which subjects would be exposed.
- These subjects may see the equipment that will be used, hear the rationale or instructions that will be provided, and receive all of the information that will be presented to the subject without actually going through the procedures.
- After exposing the subject to the explanations of the procedures and the materials to be used in an experiment, the subjects are asked to complete the assessment devices as if they actually had been exposed to the intervention.
- Pre-inquiry research can inform the investigator in advance of conducting further investigations whether demand characteristics operate in the direction of expected results derived from actually running the subjects.
- If pre-inquiry data and experimental data are dissimilar, this suggests that the cues of the experimental situation alone are not likely to explain the findings obtained from actually being exposed to the experimental condition.

24
Q

How can the use of simulators help evaluate demand characteristics?

A

Simulators are subjects who are asked to act as if they received the experimental condition/intervention even though they have not.
- They are then run through the assessment procedures of the investigation by an experimenter who is blind as to who is a simulator and who is a real subject.
- Simulators are instructed to guess what real subjects might do who are exposed to the intervention and then to deceive a blind experimenter. If simulators can act as real subjects on the assessment devices, this means that demand characteristics could account for the results.

25
Q

What does it mean if data from the post experimental inquiry, pre-inquiry, or simulators and data from real subjects who completed the experiment are similar?

A

The data are consistent with a demand characteristics interpretation.
- Note that this does not inherently mean that demand characteristics account for the results.

26
Q

How can one prevent the threat of narrow sampling?

A
  1. Vary the stimulus conditions used to present the experimental manipulation, e.g., by using two or more versions of case materials, vignettes, brief movies, or other stimulus materials rather than a single version.
  2. Supplement the single assistant running the study with at least one more.
27
Q

What are the two components of construct validity?

A
  1. What is the IV?
    • Emphasizes the fact that the IV may be confounded with or embedded in other conditions that influence and account for the findings.
  2. Why did that lead to change?
    • Emphasizes the related issue of interpreting what led to the performance on the dependent measures.
28
Q

What is data-evaluation validity?

A

Those facets of the evaluation that influence the conclusions that we reach about the experimental condition and its effect.

29
Q

What two standpoints is statistical evaluation often taught from?

A
  1. Understanding the tests themselves and their bases.
    • This facet emphasizes what the tests accomplish and the formulae and derivations of the tests.
  2. The computational aspects of statistical tests.
    • Concrete application of the tests to datasets, use of software, and interpretation of the findings are emphasized.
30
Q

What is a third facet of statistical evaluation that might be considered at a higher level of abstraction?

A

The role of statistical evaluation in relation to research design and threats to validity.
- Data-evaluation validity reflects this level of concern with that evaluation and often is the Achilles’ heel of research.

31
Q

What is the null hypothesis?

A

There are no differences between groups.

32
Q

What is the probability level (alpha)?

A

The criterion for our decision making.
- This also fixes the risk of concluding erroneously that there is a difference when there is in fact not one.

33
Q

What is effect size?

A

The magnitude of the difference between two (or more) conditions or groups, expressed in standard deviation units.

34
Q

What is the pooled standard deviation?

A

It is obtained by combining both groups as if they were one group and computing the standard deviation of that larger, combined group.

35
Q

How can you express the effect size?

A

ES = (m1 - m2) / s, where m1 and m2 are the group means and s is the pooled standard deviation.
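
A minimal sketch of this computation in Python (assuming NumPy; the function name cohens_d and the two small samples are hypothetical, and the conventional pooled-SD formula is used):

    import numpy as np

    def cohens_d(group1, group2):
        """ES = (m1 - m2) / pooled standard deviation."""
        m1, m2 = np.mean(group1), np.mean(group2)
        n1, n2 = len(group1), len(group2)
        # Pool the variability of both groups, weighting each by its degrees of freedom.
        s_pooled = np.sqrt(((n1 - 1) * np.var(group1, ddof=1) +
                            (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2))
        return (m1 - m2) / s_pooled

    treated = [12, 15, 14, 16, 13, 17]   # hypothetical outcome scores
    control = [10, 11, 13, 9, 12, 11]
    print(cohens_d(treated, control))    # about 2.1 standard deviation units for these made-up data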

36
Q

What is the importance of ES in a study?

A

The more methodological problems that exist within the study, the smaller the ES and the lower the likelihood of showing statistically significant effects.

37
Q

How can we influence ES as investigators?

A
  1. If one looks at the ES formula, the numerator includes the difference between means of the groups included in the study.
    • So one way to influence ES in a study is to be very thoughtful about what groups are included. As a general rule, select different levels of the variable of interest that are most likely to make the means quite different in relation to your hypotheses.
    • The first way to increase ES and the likelihood of obtaining statistically significant results is to use conditions (groups) that are as discrepant as possible in likely outcomes within the constraints of your hypotheses.
  2. ES can be greatly influenced and controlled by attending to the denominator of the ES formula (the measure of variability).
    • We can influence the ES by reducing variability in procedures to minimize the error term that is the denominator in the ES equation.
    • The larger the variability (denominator), the smaller the ES for a constant difference between means (numerator).
38
Q

What is power?

A

The probability of rejecting the H0 when H0 is false.

39
Q

What is the threat of low statistical power to data-evaluation validity?

A

When power is weak, the likelihood that the investigator will conclude there are no differences between groups is increased.
- The conclusion of “no difference” thus might be due to low power, rather than an absence of the difference between groups.

40
Q

What implications does weak power have?

A
  1. It slows theoretical and empirical advances and utilizes resources that might be more wisely used elsewhere.
  2. Ethical implications
    • Is it ethical to subject participants to any procedures as part of an investigation if that investigation has little likelihood of detecting a difference, even if there is one?
41
Q

What is the problem with subject heterogeneity?

A

The greater the heterogeneity or diversity of subject characteristics, the less likelihood of detecting a difference between conditions.
- Critical part of this statement: it is assumed that subjects are heterogeneous on a characteristic that is related to the effects of the IV (e.g., subjects in a study of cognitive behavior will have a range of shoe sizes, but that heterogeneity does not matter because shoe size does not affect cognitive behavior).

42
Q

What does heterogeneity of the sample mean?

A

That there will be greater variability in subjects’ reactions to the measures and to the intervention.
- This variability will be reflected in the denominator for evaluating ES.
- Again, the greater that variability, the lower the ES for a given difference between means and the less likely the difference in means will be significant.

43
Q

What is the range?

A

The highest minus the lowest value on a given measure.
- The larger the range, the more variability.

44
Q

What does variability in the procedures lead to?

A

It increases the standard deviation in the ES formula, which can dilute or weaken the effect and make significant differences between means more difficult to detect.

45
Q

What variability in procedures might take place in studies where interventions are evaluated?

A

Variability in patients' adherence to the various conditions can introduce biases that threaten experimental validity.
- At the end of the treatment trial, there may be no differences among the treatment conditions. This might be due to diffusion of treatment (threat to internal validity).
- In addition, the variability in implementation of a given treatment is a huge additional threat. The variability is very likely to contribute to a no difference finding.

46
Q

What is reliability?

A

The extent to which a measure assesses the characteristic of interest in a consistent fashion.

47
Q

Why might performance on a measure vary widely from item to item within the measure?

A

Because items are not equally clear or consistent in what they measure and hence performance may vary widely from occasion to occasion.
- To the extent that the measure is unreliable, a greater portion of the subject’s score is due to unsystematic and random variation.

48
Q

Why are unreliable measures problematic?

A

Because unreliability is reflected in the denominator of the ES formula, the obtained ES is likely to be lower than it would be if more reliable measures were used.
- As unreliability of the measure increases (error), the likelihood of detecting a statistically significant effect decreases.

49
Q

How can you check reliability?

A
  1. Evaluate the reliability of the measures before the primary data analyses to see the extent to which one can be assured that error is relatively small.
  2. Use multiple measures of a construct, check to see that they are related, and then combine them statistically.
    • This can be done by placing all measures on a standard (z) score and then adding the scores together, as in the sketch below.
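A minimal sketch of that combination in Python (assuming NumPy; the measure names and data are hypothetical):

    import numpy as np

    def composite_score(*measures):
        # Standardize each measure (z-scores), then sum across measures for each subject.
        zs = [(m - np.mean(m)) / np.std(m, ddof=1) for m in map(np.asarray, measures)]
        return np.sum(zs, axis=0)

    anxiety_self_report = [20, 25, 30, 22]   # hypothetical scores, same 4 subjects
    anxiety_clinician   = [18, 27, 33, 21]
    print(composite_score(anxiety_self_report, anxiety_clinician))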
50
Q

What is the problem of restricted range?

A

For example, having a measure with a total possible score of only 3.
- This would not spread out the groups sufficiently to show an effect.
- The restricted range relates to the numerator of the ES formula.
- Solution: Design a measure that can range from some low score to a higher score so that differences can be detected OR even better, use >10 items rather than just 1.

51
Q

What is the problem when investigators create their own scale with one or a few items?

A
  1. There are no validity data to know what those items measure, no matter what they “seem” to measure.
  2. There are no reliability data to suggest that whatever is being measured is done with any consistency.
  3. The very restricted range (variation) of possible scores may interfere with demonstrating group differences when such differences exist in the underlying construct.
52
Q

What are unintentional errors in recording or calculating the data?

A
  1. Inaccurately perceiving what the subject has done.
  2. Arithmetic mistakes.
  3. Errors in transposing data from one format to another.
  4. Similar sources of distortion that can be systematic (scores or characteristics of subjects were miscoded or recoded in the same direction) or unsystematic (errors are random or show no pattern).
53
Q

Why are data recording errors, even if they happen in a small proportion of studies, still of concern?

A
  1. There are multiple opportunities for error in data recording, analysis, and reporting. Their cumulative impact is unknown.
  2. Errors that appear to be careless or random more often than not are in the direction of the investigator’s hypothesis.
54
Q

How might an investigator selectively report results?

A
  1. The measures that did not yield significant outcomes may just be dropped from the study.
  2. What was the primary or main measure of the study may be shifted in light of a look at the statistical analyses.
    • The researcher replaces the original outcome measure with the one or ones that came out to be significant.
55
Q

Why is selective reporting a threat to data-evaluation validity?

A

The findings might be seen as very different if all of the data were represented and if the final analyses remained true to the original predictions by keeping the main variables as the main variables.

56
Q

What is the file drawer problem?

A

The possibility that the published studies represent a biased sample of all studies that have been completed for a given hypothesis.

57
Q

What is the problem with completing multiple statistical tests?

A

The more tests that are performed, the more likely it is that a difference will be found, even if there are no true differences between conditions.
- When there are multiple comparisons, the overall alpha is greater than .05, and increasingly so as the number of tests grows.
- The risk of a chance finding across several statistical tests (the experiment-wise error rate) is much greater, which can lead to misleading conclusions about group differences.
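A minimal sketch of how that experiment-wise risk grows (illustrative arithmetic in Python, assuming independent tests each run at alpha = .05):

    alpha = 0.05
    for k in (1, 5, 10, 20):
        # Probability of at least one false positive across k independent tests.
        experiment_wise = 1 - (1 - alpha) ** k
        print(f"{k:>2} tests -> chance of at least one false positive = {experiment_wise:.2f}")
    # Roughly .05, .23, .40, and .64 for 1, 5, 10, and 20 tests.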

58
Q

When is the threat of running multiple tests exacerbated?

A

When investigators conduct scores of tests with varied permutations of the data.

59
Q

How should investigators compare two effects?

A

Researchers should report the statistical significance of the difference between the effects, rather than the difference between their significance levels.

60
Q

How can you determine the value of the statistical power?

A

It is a function of the criterion for statistical significance (alpha), the size of the sample (N), and the differences that exist between groups (ES).
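
A minimal sketch of that relation (a rough normal approximation for a two-group comparison, in Python with SciPy; the function name approx_power and the example values are hypothetical, and a dedicated power-analysis tool would be preferable in practice):

    from scipy.stats import norm

    def approx_power(es, n_per_group, alpha=0.05):
        z_crit = norm.ppf(1 - alpha / 2)            # two-tailed critical value
        noncentrality = es * (n_per_group / 2) ** 0.5
        return norm.cdf(noncentrality - z_crit)     # P(reject H0 | H0 is false)

    for n in (20, 64, 128):
        print(n, round(approx_power(es=0.5, n_per_group=n), 2))
    # Larger N, larger ES, or a more lenient alpha each increase power.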

61
Q

How can you manage the threat of subject heterogeneity?

A
  1. By selecting a homogeneous sample.
    • In general, the decision of what variables to consider and how to limit the variation in the sample is based on theory or research of the effects of these and related variables on the measures of interest.
    • If in doubt, one might select a relatively homogeneous set of subjects as a conservative way of addressing a threat.
  2. Choose heterogeneous samples on purpose but ensure that the impact or effect of selected subject characteristics can be evaluated statistically in the design.
62
Q

Where does variability often come from?

A

From imprecision in the script or protocol that the experimenter should follow in the experiment.
- The script refers to the specific activities, tasks, and instructions that the experimenter administers.

63
Q

What is the loose protocol effect and what problems arise from it?

A

Failure to specify in detail the rationale, script, and activities of the experimenter.
1. Lack of specificity of procedures means that the investigator does not know what actually was done with the subjects and hence cannot convey the procedures to other investigators.
- The study cannot be repeated either by the original investigator or by others due to lack of important details.
2. The prospect of inconsistency among different experimenters when two or more experimenters are used to run the experiment.
- The procedures may vary systematically from experimenter to experimenter in terms of what is said to the subject, the general atmosphere that is provided, and other features.

64
Q

How can you ensure that experimental procedures are conducted in a consistent fashion?

A

By training experimenters together (if there are no standardized videos/audios etc.).
- During training, experimenters can practice conducting the experiment on each other or on the investigator as subjects and see how the procedures are to be performed.
- By having experimenters practice and receive feedback together, relatively homogeneous behavior during the actual experiment is more readily assured.

65
Q

What is a procedure to examine and sustain consistency of performance among experimenters?

A

Including confederates who, after participating in the experiment, discuss with the investigator what was done, how it was done, and so on.

66
Q

Why is it good to encourage experimenters to report instances where they inadvertently deviated from the script?

A

This will help the investigator monitor the sorts of inconsistencies that transpire.

67
Q

Why, when in doubt, should you err on the side of sampling more homogeneously rather than heterogeneously?

A

The first task is to provide a strong test of your hypotheses (rather than generalizability) and from the perspective of data-evaluation validity, this is better.
- If the hypotheses are supported, then extend to other samples, settings, and conditions.

68
Q

What is a representative sample?

A

A sample that is maximally heterogeneous, with large variability.

69
Q

How can you increase precision in your study?

A

By holding constant the potential sources of influence on subjects’ behavior other than the IV.
- Conditions are held constant if they are identical or very nearly so across subjects and experimental conditions.
- By reducing/removing sources of variation, a more powerful test of the IV is provided.