Research Design and Statistics Flashcards by Erica Goodman

[internal validity]
A valid causal inference requires satisfaction
of three criteria: (a) statistical association, (b) temporal precedence, and (c) ________

nonspuriousness

Spurious causes are ________.

threats to internal validity

External validity is the extent to which the causal association can be ___________
to or across variations in study instances

generalized

As an example of a threat to ___________, researchers sometimes use inadequate labels to describe study instances (e.g., labeling a treatment “progressive relaxation” when the treatment has many additional therapeutic components).

Construct Validity

Randomized experiments are considered the gold standard for assessing ________.

causality

In _________ trials, an intervention’s effects are examined under real-world conditions.
Such trials often take place outside of academic settings (e.g., community
mental health centers).

effectiveness

In ______ trials, an intervention’s effects are examined under ideal circumstances,
particularly with respect to treatment implementation.

efficacy

In _________ analyses, researchers
analyze outcome data from participants as a function of their original group assignment,
regardless of their level of exposure to treatment. The analysis is intended
to provide a conservative (and real-world) estimate of the treatment effect because
it is based on cases exposed to varying levels of treatment.

intent-to-treat

Note: missing data is a core problem for these analyses.

Single-case experiments are often designed to increase _________.

internal validity

The _______ design is a single-case design that alternates the baseline (A) phase
(intervention absent) with an intervention (B) phase (intervention present). The
outcome of interest is assessed on multiple occasions within each phase.

ABAB

In _________ designs, replication of an effect is sought over multiple baselines,
which can reflect different behaviors, settings, and/or children (just to name
a few).

multiple baseline

Although inferential statistical procedures can be used to analyze data from single-case
experiments, it is more common for clinicians to rely on ____________ of the data, often supplemented with descriptive statistics.

visual inspection

Clinicians can examine _______ by comparing the average frequency of the outcome across different phases of the experiment (e.g., during
the A vs. B phases).

mean changes

Clinicians can also examine ________, in which they compare the last data point in an immediately prior phase to the first data point in
an immediately subsequent phase. If the latency of response is hypothesized to
be immediate (e.g., the behavior will decrease dramatically as soon as the intervention
is implemented), one might predict dramatic level changes between adjacent
(baseline-intervention) phases.

level shifts

Clinicians can also examine _______ (or functional
form) changes by examining the rate of behavior change in different phases. For
example, the behavior might increase in a fairly linear (i.e., constant) manner during
the initial A phase and become fairly stable during the initial B phase.

slope
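
The three summaries described in the cards above (mean changes, level shifts, and slope) can be sketched numerically. This is only a minimal illustration, assuming each phase is a list of outcome counts in session order; the function names and the data are hypothetical and do not come from the source.

```python
from statistics import mean


def mean_change(prior_phase, next_phase):
    """Average outcome in the later phase minus the average in the earlier phase."""
    return mean(next_phase) - mean(prior_phase)


def level_shift(prior_phase, next_phase):
    """First data point of the later phase minus last data point of the earlier phase."""
    return next_phase[0] - prior_phase[-1]


def slope(phase):
    """Least-squares slope of the outcome on session index (0, 1, 2, ...)."""
    sessions = list(range(len(phase)))
    x_bar, y_bar = mean(sessions), mean(phase)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(sessions, phase))
    den = sum((x - x_bar) ** 2 for x in sessions)
    return num / den


# Hypothetical data: behavior counts per session in an A phase and a B phase.
baseline = [8, 9, 10, 11]
intervention = [4, 4, 4, 4]

print(mean_change(baseline, intervention))   # average B minus average A
print(level_shift(baseline, intervention))   # change at the phase boundary
print(slope(baseline), slope(intervention))  # rate of change within each phase
```

In this made-up series the behavior rises steadily during baseline and is flat during the intervention, which mirrors the pattern the slope card describes.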

Quasi-experimental studies are experiments that lack _____ of units
to conditions.

random assignment

_____________ (also called passive observational studies) are conducted
when the researcher is not actively manipulating anything (like exposure to an
intervention).

Correlational studies

________ designs compare a group of participants who possess a certain characteristic
(e.g., diagnosis of attention deficit hyperactivity disorder [ADHD]) with
a group of participants who do not possess the characteristic.

Case–control

In __________, an intact group (i.e., cohort) is followed over time to examine the
emergence of, and/or change in, some outcome of interest. These designs are classified
as longitudinal (also known as prospective) because individuals are assessed
on at least two occasions.

cohort designs

If the multiple cohorts also differ in their age or some other salient developmental marker at the study’s inception, the study is called a __________
design. Such designs allow for the study of a longer developmental period over fewer
years of data collection because several developmental cohorts (e.g., toddlers, preschoolers,
and school-aged children) are embedded in the study.

cross-sequential

______ is a threat to validity when naturally occurring changes are mistaken
for an intervention effect—when symptoms remit because of the passage of time
rather than the effects of an intervention.

Maturation

____ is a threat to validity when some event (or constellation of events) occurs
during the study and impacts the results in a manner mistaken for an intervention
effect (e.g., a patient starts exercising during the study, and the exercise rather than the treatment relieves the depression).

History

____ (also known as regression to the mean) occurs when
extreme scores tend to revert toward the mean on a subsequent evaluation.
Statistical regression is more plausible in single-group studies in which extreme
performers (e.g., severely depressed individuals) make up the study sample.

Statistical regression

______ is a threat to validity when the pattern of participant drop-out impacts
the results in a way interpreted as an intervention effect.

Attrition

____ is a threat to validity when exposing individuals to the pretest changes them in ways that might be mistaken for an intervention effect.

Testing

______ is a threat to validity when the measurement tool changes and impacts the results in a manner mistaken for an intervention effect.

Instrumentation

____ occurs in multiple-group studies when systematic differences among intervention groups can be mistaken for an intervention effect.

Selection

The reliability of a measure is viewed as the ratio of true score variance to ______. (True score variance corresponds to consistency or dependability, concepts that are often invoked in discussions of reliability.)

total variance.
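
The ratio on this card can be written out directly. The sketch below assumes the classical test theory decomposition, in which total variance is the sum of true score variance and error variance; the function name and the numbers are illustrative only.

```python
def reliability(true_score_variance, error_variance):
    """Reliability as true score variance divided by total variance.

    Assumption: total variance = true score variance + error variance
    (classical test theory).
    """
    total_variance = true_score_variance + error_variance
    return true_score_variance / total_variance


# Hypothetical values: 80 units of true score variance, 20 of error variance.
print(reliability(80.0, 20.0))
```

With no error variance the ratio is 1, and it falls toward 0 as error variance comes to dominate the total.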

Kuder–Richardson Formula 20 (often abbreviated KR-20) can be used when the items are ________.

dichotomous
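
As a rough illustration of how KR-20 is computed from dichotomous (0/1) items, the sketch below uses the standard formula k/(k − 1) × (1 − Σpq / total score variance). It assumes the population variance of total scores (some texts use the sample variance instead); the function name and the response data are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of respondents, each a list of 0/1 item scores."""
    n_people = len(responses)
    k = len(responses[0])
    totals = [sum(person) for person in responses]
    total_var = pvariance(totals)  # assumption: population variance of totals
    sum_pq = 0.0
    for item in range(k):
        p = sum(person[item] for person in responses) / n_people  # proportion scoring 1
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_var)


# Hypothetical responses: four people answering three right/wrong items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(responses))
```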

______—also known as internal structure—is the extent to which the structure of the measure is consistent with the theorized factor structure of the construct.

Structural validity

A ____ matrix can be used to evaluate convergent and discriminant validity.

multitrait–multimethod

In ____, three primary decisions involve (a) choosing a method of factor extraction, (b) choosing a method of factor rotation, and (c) deciding on the number of factors to retain.

EFA

In _______ rotations, the factors are assumed to be uncorrelated. In oblique rotations, the factors are assumed to be correlated—often the case in psychology.

orthogonal

Several _______ can help with this last decision—including the chi-square test, root mean square error of approximation (RMSEA), and standardized root mean square residual (SRMR). All fit indices quantify (albeit in slightly different ways) how well the model-implied covariance matrix reproduces the estimated population covariance matrix of the analysis variables.

fit indices [CFA]

One of the advantages of conducting statistical analyses on _______ (as in structural equation modeling) is the gain in statistical power that results when measurement error is removed from the constructs of interest.

latent variables

The ___ is used when (a) data are ordinal or (b) data are interval or ratio but the distribution is highly skewed (this measure is less affected by skewness than the mean).

median

In _______, the mean and median are identical. In symmetrical distributions with a single mode, the mean, median, and mode are identical.

symmetrical distributions

The interquartile range captures the middle 50% of the distribution and is computed by subtracting the 25th percentile (first quartile) from the _____th percentile (third quartile).

75

The standard deviation (SD) captures the average distance of scores from the ________.

mean

z = (x − M) / _______.

SD (the standard deviation)

Properties of the normal z distribution: when a z-score conversion is used, the resulting z distribution has the following properties: (a) the mean is ___; (b) the standard deviation is 1; and (c) each z score represents the position of the score in relation to the mean, in standard deviation units. For example, a z score of −1.27 denotes a score exactly 1.27 standard deviations below the mean.

0

Percentiles can be computed by converting T to z: (T − 50) / ______.

10
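
The z-score and T-score conversions on the preceding cards can be chained to obtain a percentile. This is a minimal sketch that assumes scores are normally distributed and that T scores have a mean of 50 and a standard deviation of 10; the function names are illustrative.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Position of x relative to the mean, in standard deviation units."""
    return (x - mean) / sd


def t_to_z(t):
    """Convert a T score (assumed mean 50, SD 10) to a z score."""
    return (t - 50) / 10


def percentile_from_z(z):
    """Percentage of a standard normal distribution falling at or below z."""
    return 100 * NormalDist().cdf(z)


print(z_score(115, 100, 15))          # an IQ-style score one SD above the mean
print(percentile_from_z(t_to_z(60)))  # percentile for a T score of 60
```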

__________: (a) make more distributional assumptions (e.g., that the distribution is normal), (b) assume data are measured on an interval or ratio scale, (c) are conducted on actual data (as opposed to on ranks derived from data), and (d) allow researchers to test more specific hypotheses about the populations from which they are drawn.

parametric statistics

The null hypothesis specifies that the two ________ are equal (μt = μc).

population means

When the sample data would occur relatively infrequently assuming the null hypothesis (e.g., the data would occur less than 5% of the time if the null hypothesis were true), the null hypothesis is rejected.

p < .05 (sig level; alpha)

Results are not declared statistically significant even though the null hypothesis is false.

Type II error (Beta)

Results are declared statistically significant even though the null hypothesis is true.

Type I error (alpha)

_______ is the probability of correctly rejecting a false null hypothesis (i.e., finding an effect when one exists in the population)

Statistical power (1 − Beta)

What helps increase power?

Increase N; increase alpha; directional hypotheses; large effects; more reliable measures
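
Several items on this list can be checked with a small power calculation. The sketch below computes the power of a single-sample z test analytically; it assumes a known population SD, a standardized effect size d, and (for the one-tailed case) that the effect lies in the predicted direction. It does not model measurement reliability, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_power(effect_size, n, alpha=0.05, one_tailed=False):
    """Power of a single-sample z test for a standardized effect size d."""
    shift = effect_size * sqrt(n)  # expected z statistic when the effect is real
    if one_tailed:
        z_crit = STD_NORMAL.inv_cdf(1 - alpha)
        return 1 - STD_NORMAL.cdf(z_crit - shift)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    upper = 1 - STD_NORMAL.cdf(z_crit - shift)  # rejections in the upper tail
    lower = STD_NORMAL.cdf(-z_crit - shift)     # rejections in the lower tail
    return upper + lower


base = z_test_power(0.5, 16)
print(base)
print(z_test_power(0.5, 64))                   # larger N
print(z_test_power(0.5, 16, alpha=0.10))       # larger alpha
print(z_test_power(0.5, 16, one_tailed=True))  # directional hypothesis
print(z_test_power(0.8, 16))                   # larger effect
```

Each of the four variations yields higher power than the baseline call, in line with the card.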

When the 95% confidence interval does not contain zero the results are:

significant

An example of _________: (a) no longer meeting diagnostic criteria and (b) scoring two standard errors below a pretest score on one of the primary study outcomes.

Clinical significance (vs statistical)

A two-tailed (nondirectional) hypothesis means:

The test leaves open the possibility that the sample mean will be either larger or smaller than the population mean (e.g., a population mean of 100).

A single-sample z test is used when

The population SD is known (e.g., intelligence tests).

A single-sample t test is used when

The population SD is unknown.

A relevant effect size is C______, which is a standardized mean difference and is computed as the ratio of the difference between the two sample means to the pooled standard deviation.

Cohen’s d (.2 small, .5 med, .8 large).
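
The computation described on this card (difference between the two sample means divided by the pooled standard deviation) can be sketched as follows. The pooled SD here weights each group's sample variance by its degrees of freedom, which is the usual convention; the function name and the data are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


# Hypothetical scores for two small groups.
treatment = [2, 4, 6]
control = [1, 3, 5]
print(cohens_d(treatment, control))
```

For these made-up scores the means differ by 1 point and the pooled SD is 2, which corresponds to a medium effect by the guidelines on the card.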

Omega squared (ω²) reflects the proportion of variance in the outcome that is explained by the factors. Interpretive guidelines for omega squared are small = 0.01, medium = 0.06, and large = 0.15.

Effect size used in ANOVA

In models for which there are three or more levels of a factor (e.g., low, medium, and high levels of stress), the test of the factor’s main effect is an _______ statistical test.

omnibus

Omnibus tests are typically followed by a series of additional tests (e.g., comparing each pair of means). These more focused contrasts are often referred to as _____.

post hoc tests

Conducting multiple statistical tests raises the familywise type I error rate associated with the full set of analyses, so we use ______.

Corrections (e.g., Bonferroni, Tukey)
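
To see why corrections are needed, the familywise error rate can be computed for a set of tests. The sketch below assumes the tests are independent, which is a simplification, and shows the Bonferroni per-test alpha alongside it; the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha under the Bonferroni correction."""
    return alpha / n_tests


print(familywise_error_rate(0.05, 10))  # uncorrected: roughly 0.40
corrected = bonferroni_alpha(0.05, 10)
print(corrected)                        # per-test alpha after correction
print(familywise_error_rate(corrected, 10))  # stays at or below 0.05
```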

A ____________ can be used to contrast two or more treatment groups (e.g., CBT, interpersonal therapy [IPT], and control).

one-way between-subjects ANOVA

A ___________ can be used to examine a single cohort’s symptom levels over two or more assessments (e.g., pretest, posttest, and follow-up measures for individuals exposed to a single intervention).

one-way within-subjects ANOVA

What test design is this an example of: one might be interested in examining whether ADHD diagnosis (present or absent) and testing environment (quiet or noisy room) have an impact on performance.

two-way between-subjects ANOVA

What test design is this an example of: exposing a group of children with ADHD to two levels of a psychostimulant drug dose factor (e.g., 5 and 10 mg) crossed with two levels of a testing-environment factor (e.g., quiet and noisy rooms). In other words, all children would be observed under all four study conditions (e.g., 10 mg, quiet room) and performance would be the dependent variable

two-way within-subjects ANOVA

_________ are used in two primary manners: (a) to increase statistical power in randomized experiments (when the covariate is uncorrelated with intervention conditions, but correlated with the dependent variable); and (b) to control for possible confounding influences (i.e., controlling for variables associated with both intervention conditions and the dependent variable) in nonrandomized designs.

ANCOVAs

The multivariate ANOVA model (i.e., MANOVA model) allows for _______ to be analyzed in a single model.

multiple dependent variables

In MANOVA, the actual analysis is performed on an optimized _________ (one that maximizes between-group differences while minimizing within-group differences). A number of test statistics are generated by MANOVA (Pillai’s trace, Wilks’ lambda, Hotelling’s trace, and Roy’s largest root).

linear combination of the multiple dependent variables

Effect size is estimated by computing the square of the correlation (i.e., the coefficient of determination), which is the proportion of shared variance between the two variables. Common interpretive guidelines for r² are as follows: small = .01, medium = .09, and large = .25.

Correlation effect size (r²)

Spearman’s rank correlation coefficient and Kendall’s tau coefficient are both nonparametric tests that are used when responses on the two variables are ________.

rank ordered

Meaning of the unstandardized linear regression coefficient (b):

For every one-unit increase in X, Y changes by b units.
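
The interpretation of b can be verified with a small least-squares fit. This is a minimal sketch for a single predictor; the function name and the data are hypothetical, chosen so the line fits exactly.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
a, b = fit_line(xs, ys)

# Predicted Y at x = 2 and x = 3 differ by exactly b.
print(a, b, (a + b * 3) - (a + b * 2))
```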

(Hierarchical regression analysis is sometimes confused with stepwise regression analysis, which is an _______ approach to predictor entry used more often in exploratory analyses.)

atheoretical

Models that do not include interaction effects are referred to as _______.

additive effects models

The Mann–Whitney Test (also called the Mann–Whitney–Wilcoxon Test) is a nonparametric alternative to the _________

independent samples t-test

The Kruskal–Wallis Test is a nonparametric alternative to the ______.

between-group ANOVA

The Wilcoxon Signed Ranks Test is a nonparametric alternative to the __________.

paired samples t-test.

Testing for moderation is the same as testing for a _________ between a predictor and a moderator. For example, a researcher might test whether an intervention effect is moderated by participant sex to see whether the effects of the intervention on the outcome are stronger for men or for women.

statistical interaction

A _______ is the mechanism through which a distal predictor operates in influencing an outcome.

mediator

________ centers on understanding participants’ lived experiences and emphasizes subjective experience (e.g., understanding personal knowledge, motivations, and perspectives).

Phenomenology

The ultimate goal of ______ is to develop a theory (“grounded” in data) about a concept of interest. This approach is used when current theory is lacking, nonexistent, or incomplete.

grounded theory

Qualitative research data are often analyzed using __________, a process of identifying and analyzing patterns or themes within data. This can occur either deductively (i.e., starting from a particular theory or hypothesis) or inductively (as in grounded theory).

thematic analysis

1) triangulation (the use of multiple, varied sources of data, methods, and researchers in order to corroborate results), 2) audits (the use of an external consultant to complete an independent analysis), and 3) member checking (having participants in the study review and provide feedback on the credibility of findings).

Checking reliability and validity in qualitative research

The first step in ________ is to identify and engage stakeholders (e.g., administrators, staff, clients, etc.). Next, a needs assessment might be conducted to assess the relative priority of the needs, or “problems,” of a specific population in order to determine where resources should be allocated

Program Evaluation

Formative evaluations provide information to make needed changes early on, whereas _____ determine a program’s success once delivered.

summative evaluations

CBA examines the balance of resources/costs spent on a program compared to the benefits to answer the question “have resources been well spent on this program?” This results in a benefit/cost ratio, the worth of a program’s outcomes divided by the program’s costs, which can then be compared to alternative programs. CBA is controversial in part because it assigns monetary values to the benefits arising from a program.

Cost-benefit Analysis

_______ sampling is when researchers collect data from individuals with specific characteristics.

Purposive

_______ sampling, a type of purposive sampling, involves participants inviting others to participate in the study.

Snowball

______ sampling uses incentives to overcome possible biases that result from snowball and other chain-referral sampling methods.

Respondent-driven

Unlike _______ variables, which provide a causal explanation for the relationship between variables, moderator variables affect the strength of the relationship.

mediator

_________ is useful for studying behaviors that occur infrequently, have a long duration, or leave a permanent record or other product (e.g., a completed worksheet or test).

Event sampling

_______ is an alternative to behavioral sampling and is used when a goal of the study is to observe a behavior in a number of settings. It helps increase the generalizability of a study's findings.

Situational sampling

• the independent variable (experimental variance);
• systematic error (error due to extraneous variables); and
• random error (error due to random fluctuations in subjects, experimental conditions, methods of measurement, etc.).

three factors that can cause variability in the study's dependent variable.

When a researcher includes an extraneous variable as an independent variable in a study, the extraneous variable is also known as a ________ variable.

moderator

________ is a particularly useful method in quasi-experimental research in which subjects cannot be randomly assigned to treatment groups.

Statistical control (such as using covariates)

A study has internal validity when it allows an investigator to determine if there is a ____ relationship between independent and dependent variables.

causal

Fatigue, boredom, hunger, and physical and cognitive development are potential ______ effects that can limit a study's internal validity.

maturational; The best way to control maturation is to include more than one group in the study and randomly assign subjects to groups.

History is controlled by including more than one group in the study and _________ subjects to groups.

randomly assigning

The threat of _______ can be controlled by administering the DV measure only once as a posttest, by designing the measure in a way that minimizes memory and practice effects, or by including at least two groups in the study with all groups completing the pre- and posttests so that any difference between groups on the posttest cannot be attributed to the pretest.

Testing

_______ is controlled by including more than one group in the study and ensuring that all groups are subject to the same instrumentation effects, by using the same measuring devices and procedures with all subjects, and by making sure that measuring devices and procedures do not change during the course of the study.

Instrumentation

The _________ threat is avoided by not including only extreme scorers in the study or by including more than one group and ensuring that all groups consist of subjects who are similarly extreme.

statistical regression

______ is difficult to control, but pretesting can help determine if dropouts and non-dropouts differ with regard to their initial status on the DV.

Attrition

Population validity is generalization to other people and ecological validity is generalization to other ______.

settings

When a study's results have been contaminated by _______, they cannot be generalized to people who have not been pretested. Pretest sensitization is controlled by not administering a pretest or by using the Solomon four-group design, which allows an investigator to measure the impact of pretesting on both the external and internal validity of a research study.

pretest sensitization

An interaction between _________ is often a problem when subjects are volunteers because volunteers tend to be more motivated than non-volunteers and, consequently, might be more responsive to the IV. In this situation, the study's results apply to volunteers but can't be generalized to other people. The best way to eliminate this threat is to ensure that the sample is representative of the population of interest.

selection and treatment

Research participants may respond to an independent variable in a particular way simply because they know their behavior is being observed, and this is known as _____.

reactivity

The behavior of subjects can also be altered by __________, which are cues in the experimental setting that inform subjects of the purpose of the study or suggest what behaviors are expected of them.

demand characteristics

______ can be controlled by using deception, unobtrusive (nonreactive) measures, or a single- or double-blind technique. When using a single-blind technique, subjects do not know which treatment group they have been assigned to; in a double-blind study, neither the subjects nor the experimenter knows which group subjects have been assigned to.

Reactivity

When a study involves exposing each subject to two or more levels of an independent variable (i.e., when the study utilizes a within-subjects design), the effects of one level of the independent variable can be affected by previous exposure to another level.

(Order Effects, Carryover Effects)

_______ designs are considered inappropriate when withdrawal of a treatment during the course of a research study would be unethical (e.g., when the treatment has successfully eliminated a self-injurious behavior).

reversal (ABAB)

Because of its insensitivity to "outliers," the _____ is a useful measure of central tendency when a distribution contains one or a few extreme scores.

median

One advantage of the _____ is that, of the three measures of central tendency, it is least susceptible to sampling fluctuations.

mean

• Regardless of the shape of the distribution of individual scores in the population, as the sample size increases, the sampling distribution of the mean approaches a normal distribution.
• The mean of the sampling distribution of the mean is equal to the population mean.
• The standard deviation of the sampling distribution of the mean (the standard error) is equal to the population standard deviation divided by the square root of the sample size: σ_M = σ / √N.

Central Limit theorem
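
The properties on this card can be checked with a short simulation. The sketch below assumes a skewed population (an exponential distribution with mean 1 and SD 1) and draws repeated samples of size 30; the function names, the seed, and the sample sizes are illustrative choices, not part of the source.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def standard_error(population_sd, n):
    """SD of the sampling distribution of the mean: sigma / sqrt(N)."""
    return population_sd / sqrt(n)


def simulate_sample_means(n, n_samples, seed=0):
    """Means of repeated samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]


sample_means = simulate_sample_means(n=30, n_samples=2000)
print(mean(sample_means))         # close to the population mean of 1
print(pstdev(sample_means))       # close to the theoretical standard error
print(standard_error(1.0, 30))    # sigma / sqrt(N)
```

Even though individual scores are strongly skewed, the simulated sample means cluster around the population mean with a spread near σ / √N.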

The ________ is the foundation of inferential statistics. It is the sampling distribution that enables a researcher to make inferences about the relationship between variables in the population based on obtained sample data.

sampling distribution

A _____ error is more likely when alpha is low, the sample size is small, and the independent variable is not administered in sufficient intensity.

Type II

One-tailed tests (when appropriate) and parametric rather than nonparametric tests help increase _____.

Power

The most effective way to maximize the robustness of a parametric test is to have an equal number of _______.

subjects in each group

Numerator and denominator of F statistic are:

Mean Square Between / Mean Square Within
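
The ratio on this card can be computed by hand for a one-way between-subjects design. This is a minimal sketch with hypothetical data; the function name is illustrative, and no p value is computed.

```python
from statistics import mean


def one_way_f(groups):
    """F = mean square between / mean square within for a one-way design."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_f(groups))
```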

ANCOVA and randomized block ANOVA serve to decrease ___________ variability in order to create a stronger test.

within group

Put another way, the ___________ indicates the proportion of variability in Y that is explained by, or accounted for by the variability in X. For example, if the correlation coefficient for sales success and product knowledge is .60, then 36% (.60 squared= .36) of variability in sales success is accounted for by product knowledge.

squared correlation coefficient
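
The arithmetic in this card (.60 squared = .36) generalizes to any correlation. The sketch below computes Pearson's r from raw scores and squares it; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


def shared_variance(r):
    """Proportion of variability in Y accounted for by X (r squared)."""
    return r ** 2


print(shared_variance(0.60))  # the sales example from the card

# Hypothetical paired scores.
knowledge = [1, 2, 3, 4]
sales = [1, 3, 2, 4]
r = pearson_r(knowledge, sales)
print(r, shared_variance(r))
```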

If the subscript contains two different letters or numbers (e.g., "xy"), it represents the correlation between two different variables. When the subscript contains the same letters or numbers (e.g., "xx"), it is a _________.

reliability coefficient

Canonical correlation is an extension of multiple regression that is used when two or more continuous predictors are to be used to predict status on _______.

two or more continuous criteria

________ analysis is also known as discriminant analysis and is the appropriate technique when two or more continuous predictors will be used to predict or estimate a person's status on a single discrete (nominal) criterion.

Discriminant function; examines "hit rate" or number of correct classifications

Research Design and Statistics Flashcards

(120 cards)