Statistics Flashcards

1
Q

Cluster sampling

A

Cluster sampling involves selecting units or groups of individuals (e.g., schools, hospitals, clinics) from the population.

It contrasts with simple random sampling and stratified random sampling, which involve selecting individuals from the population.

2
Q

Probability Sampling

A

When using probability sampling, each element in the target population has a known chance of being selected for inclusion in the sample.

Methods of probability sampling include:

  • simple random sampling,
  • stratified random sampling, and
  • cluster sampling.
3
Q

Non-Parametric Tests

A

Nonparametric tests are inferential statistical tests used to analyze nominal or ordinal data (or interval or ratio data when the assumptions for a parametric test have not been met). They include:

  • chi-square test
  • Mann-Whitney U test
  • Wilcoxon matched-pairs test
4
Q

Benefits of Parametric Tests

A

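An advantage of parametric tests is that they are more “powerful” than nonparametric tests; that is, they are more likely to detect an effect that actually exists, provided their assumptions are met.

Parametric tests include the Student’s t-test and the analysis of variance (ANOVA).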

5
Q

Parametric Tests

A

Parametric tests are inferential statistical tests that are used when the data to be analyzed represent an interval or ratio scale and when certain assumptions about the population distribution(s) have been met - i.e., when scores on the variable of interest are normally distributed and when there is homoscedasticity (population variances are equal).

6
Q

Normal Curve/Areas Under The Normal Curve

A

In a normal distribution,

  • about 68% of observations fall between the scores that are plus and minus one standard deviation from the mean,
  • about 95% between the scores that are plus and minus two standard deviations from the mean, and
  • about 99.7% between the scores that are plus and minus three standard deviations from the mean.
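
The areas listed above can be checked numerically. The following is only an illustrative sketch: it assumes a standard normal distribution and uses Python's `math.erf`; the helper name `area_within` is made up for this example.

```python
import math

def area_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    # For a standard normal variable Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within plus/minus {k} SD: {area_within(k):.4f}")
# prints 0.6827, 0.9545, and 0.9973 for k = 1, 2, 3
```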
7
Q

Experimentwise Error Rate

A

The experimentwise error rate (also known as the familywise error rate) is the probability of making at least one Type I error across all of the statistical comparisons in a study. (A Type I error is rejecting the null hypothesis when it is actually true, i.e., claiming an “effect” when there is no effect.)

As the number of statistical comparisons in a study increases, the experimentwise error rate increases.
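
How quickly the rate grows can be sketched with a short calculation. This example rests on an assumption that the card does not state: the comparisons are independent and each is tested at the same alpha, in which case the rate is 1 - (1 - alpha)^k for k comparisons. The function name is hypothetical.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Probability of at least one Type I error across several comparisons."""
    # Assumes independent comparisons, each tested at the same alpha.
    return 1 - (1 - alpha) ** comparisons

for k in (1, 3, 10):
    print(k, round(familywise_error_rate(0.05, k), 3))
# prints 0.05, 0.143, and 0.401 for 1, 3, and 10 comparisons
```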

8
Q

Mixed (Split Plot) ANOVA

A

The mixed ANOVA is a type of factorial ANOVA that is used when a study includes at least one between-groups independent variable and one within-subjects independent variable.

9
Q

Cross-Validation/Shrinkage

A

Cross-validation refers to validating a correlation coefficient (e.g., a criterion-related validity coefficient) on a new sample. Because the same chance factors operating in the original sample are not operating in the subsequent sample, the correlation coefficient tends to “shrink” on cross-validation. In terms of the multiple correlation coefficient (R), shrinkage is greatest when the original sample is small and the number of predictors is large.

10
Q

One-Way ANOVA F Ratio

A

The one-way ANOVA yields an F-ratio that indicates if any group means are significantly different. The F-ratio represents a measure of treatment effects plus error divided by a measure of error only (MSB/MSW). When the treatment has had an effect, the F-ratio is expected to be larger than 1.0; when it has had no effect, the ratio is expected to be close to 1.0.
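
The MSB/MSW ratio can be illustrated with a small hand calculation. This is a minimal sketch with made-up scores; the function name `f_ratio` and the data are assumptions for illustration, and no significance test is performed.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F-ratio: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-groups sum of squares: treatment effects plus error.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: error only.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msb = ss_between / (k - 1)
    msw = ss_within / (n_total - k)
    return msb / msw

# Hypothetical scores for three treatment groups.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]
print(f_ratio(groups))  # MSB = 27, MSW = 1, so F = 27.0
```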

11
Q

One-Way ANOVA

A

The one-way ANOVA is a parametric statistical test used to compare the means of two or more groups when a study includes one IV and one DV that is measured on an interval or ratio scale.

12
Q

Trend Analysis

A

Trend analysis is a type of analysis of variance that is used to assess linear and nonlinear trends when the independent variable is quantitative.

13
Q

Sampling Distribution

How is it Used

A

The sampling distribution is used in inferential statistics to determine how likely it is to obtain a particular sample mean given:

  • the population mean
  • the population standard deviation
  • the sample size
  • the level of significance
14
Q

Standard Error of the Mean

A

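The standard error of the mean is the standard deviation of the sampling distribution of the mean. It is equal to the population standard deviation divided by the square root of the sample size.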
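
As a quick worked example of this formula (the function name and the numbers are hypothetical):

```python
import math

def standard_error_of_mean(population_sd: float, n: int) -> float:
    """Population standard deviation divided by the square root of the sample size."""
    return population_sd / math.sqrt(n)

print(standard_error_of_mean(15, 25))   # 15 / 5  = 3.0
print(standard_error_of_mean(15, 100))  # 15 / 10 = 1.5
```

Quadrupling the sample size halves the standard error.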

15
Q

Sampling Distribution

Shape, Equal To

A
  • The sampling distribution of the mean is normally shaped (approximately so when the sample size is sufficiently large).
  • Its mean is equal to the population mean.
16
Q

Sampling Distribution of the Mean

Definition

A

The sampling distribution of the mean is the distribution of sample means that would be obtained if an infinite number of equal-size samples were randomly selected from the population and the mean for each sample was calculated.
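
The definition can be approximated by simulation. The sketch below is illustrative only: it substitutes 2,000 samples for the "infinite number" in the definition and assumes a hypothetical population of scores uniformly distributed between 0 and 1 (mean 0.5, standard deviation about 0.289). The seed, sample size, and variable names are arbitrary choices.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable
SAMPLE_SIZE = 25
NUM_SAMPLES = 2000       # a finite stand-in for "an infinite number" of samples

# Draw equal-size random samples and record the mean of each one.
sample_means = [
    mean(rng.random() for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = mean(sample_means)   # should be close to the population mean
sd_of_means = pstdev(sample_means)   # should be close to population SD / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f}")
```

With these settings, the mean of the sample means should land near 0.5 and their standard deviation near 0.289 / 5, or about 0.058.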

17
Q

Dependent Variables

A

The dependent variable (DV) is the variable that is believed to be affected by the independent variable and is observed and measured.

18
Q

Independent Variables

A

The independent variable (IV) is the variable that is believed to have an effect on the dependent variable and is varied or manipulated by the researcher in an experimental research study.

Each independent variable in a study must have at least two levels.

19
Q

Scales Of Measurement

A
  • nominal
  • ordinal
  • interval
  • ratio

A nominal scale yields “frequency data” (the frequency of observations in each nominal category). Ordinal, interval, and ratio scales provide scale values or scores.
20
Q

negatively skewed distribution

A

In a negatively skewed distribution, the majority of scores are on the high side of the distribution, but a few are on the low (negative) side. The mode is greater than the median, which, in turn, is greater than the mean.

21
Q

positively skewed distribution

A

In a positively skewed distribution, most scores are on the low side of the distribution, but a few scores are on the high (positive) side. The mean is greater than the median, which, in turn, is greater than the mode.

22
Q

Skewed Distributions

A

Skewed distributions are asymmetrical distributions in which the majority of scores are located on one side of the distribution.

23
Q

Random Assignment

A

Random assignment involves randomly assigning subjects to treatment groups and is sometimes referred to as “randomization.”

It is considered the “hallmark” of true experimental research because it enables an investigator to conclude that any observed effect of an IV on the DV is due to the IV rather than to error.

(Random assignment must not be confused with random selection, which refers to randomly selecting subjects from the population.)
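
One simple way to carry out random assignment is to shuffle the subject list and deal subjects out to groups in turn. This is a minimal sketch, not a prescribed procedure; the function name, group labels, and subject IDs are hypothetical.

```python
import random

def randomly_assign(subjects, group_names, seed=None):
    """Shuffle subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)        # seed only to make the example repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for i, subject in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(subject)
    return groups

subjects = list(range(1, 21))        # hypothetical subject IDs 1..20
groups = randomly_assign(subjects, ["treatment", "control"], seed=1)
for name, members in groups.items():
    print(name, sorted(members))
```

Every subject has the same chance of ending up in either group, and the groups come out equal in size.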

24
Q

Mode

A

The mode is the most frequently occurring score or category, and it is used as a measure of central tendency for nominal variables or variables that are being treated as nominal variables.

25
Q

Median

A

The median is the middle score in a distribution when scores have been ordered from lowest to highest. It is used with ordinal data (and with interval and ratio data when the distribution is skewed or contains one or a few outliers).

26
Q

Mean

A

The mean is the arithmetic average of a set of scores, and it can be used when scores represent an interval or ratio scale.

27
Q

Measures of Central Tendency

A

The mean, median, and mode are the most commonly used measures of central tendency.
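
All three measures are available in Python's standard `statistics` module. The scores below are made up for illustration and were chosen to be positively skewed, so the mean exceeds the median, which exceeds the mode.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: most are low, a few are high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

print("mean:  ", mean(scores))    # 4.5
print("median:", median(scores))  # 3.0
print("mode:  ", mode(scores))    # 2
```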

28
Q

Size of Rejection Region is defined by

A

alpha

29
Q

Retention Region

A

The retention region is the region of a sampling distribution that contains the values that are likely to be obtained simply as the result of sampling error. When an inferential statistical test indicates that an obtained sample value is in the retention region, the null hypothesis is retained and the alternative hypothesis is rejected.

The size of the retention region is equal to one minus alpha.

30
Q

Rejection Region

A

The rejection region of a sampling distribution contains the sample values (e.g., means) that are unlikely to be obtained simply as the result of sampling error. When an inferential statistical test indicates that the obtained sample value falls in the rejection region, the null hypothesis is rejected and the alternative hypothesis is retained.

31
Q

Statistical Power

A

Statistical power refers to the probability of rejecting a false null hypothesis.
Power cannot be directly controlled but is increased by:

  • using a large sample
  • maximizing the effects of the IV
  • increasing the size of alpha
  • reducing error
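
The influence of each factor can be sketched for one simple case. The example assumes an upper-tailed, one-sample z-test with a known population standard deviation, which is a choice made for this illustration rather than something the card specifies; the function name and all numbers are hypothetical.

```python
from statistics import NormalDist

def z_test_power(effect: float, sd: float, n: int, alpha: float) -> float:
    """Power of an upper-tailed one-sample z-test (assumed setup)."""
    z = NormalDist()                     # standard normal distribution
    critical = z.inv_cdf(1 - alpha)      # cutoff where the rejection region starts
    shift = effect / (sd / n ** 0.5)     # true effect in standard-error units
    return 1 - z.cdf(critical - shift)   # P(reject the null | the null is false)

baseline = z_test_power(effect=2.0, sd=10.0, n=25, alpha=0.05)
print(f"baseline:       {baseline:.3f}")
print(f"larger sample:  {z_test_power(2.0, 10.0, 100, 0.05):.3f}")
print(f"larger effect:  {z_test_power(4.0, 10.0, 25, 0.05):.3f}")
print(f"larger alpha:   {z_test_power(2.0, 10.0, 25, 0.10):.3f}")
print(f"less error:     {z_test_power(2.0, 5.0, 25, 0.05):.3f}")
```

Each of the four variations yields higher power than the baseline.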
32
Q

pretest sensitization

A

Pretest sensitization occurs when pretesting affects how subjects react to the treatment.

It is a threat to external validity.

33
Q

reactivity

A

Reactivity occurs when subjects respond differently to a treatment because they know they are participating in a research study.

It is a threat to external validity.

34
Q

reducing multiple treatment interference

A

Counterbalancing can be used to control multiple treatment interference. It involves administering the levels of the IV to different groups of subjects in different orders.
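
A sketch of how the orders might be generated, assuming complete counterbalancing (every possible order is used once) and three hypothetical levels labeled A, B, and C:

```python
from itertools import permutations

levels = ["A", "B", "C"]                # hypothetical levels of the IV
orders = list(permutations(levels))     # every possible order of the levels

# Each group of subjects receives the levels in a different order.
for group_number, order in enumerate(orders, start=1):
    print(f"group {group_number}: {' -> '.join(order)}")
```

With three levels this yields six groups, and each level appears in each ordinal position equally often.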

35
Q

multiple treatment interference

A

Multiple treatment interference occurs when subjects receive more than one level of an IV.

It is a threat to external validity.

36
Q

External Validity (Pretest Sensitization, Reactivity, Multiple Treatment Interference)

A

External validity refers to the degree to which a study’s results can be generalized to other people, settings, conditions, etc.

37
Q

Selection

A

Selection threatens internal validity when groups differ at the beginning of the study because of the way subjects were assigned to groups. It is a potential threat whenever subjects are not randomly assigned to groups.

threat to internal validity

38
Q

Statistical regression

A

Statistical regression is a threat when subjects are selected to participate because of their extreme status on the DV or a measure that correlates with the DV and refers to the tendency of extreme scores to “regress to the mean” on retesting.

threat to internal validity

39
Q

History (validity threat)

A

History is a threat when an event that is external to the research study affects subjects’ performance on the DV in a systematic way.

threat to internal validity

40
Q

Maturation

A

Maturation is one threat to internal validity and occurs when a physical or psychological process or event occurs as the result of the passage of time (e.g., increasing fatigue, decreasing motivation) and has a systematic effect on subjects’ status on the DV.

41
Q

Internal Validity

A

Internal validity refers to the degree to which a research study allows an investigator to conclude that observed variability in a dependent variable is due to the independent variable rather than to other factors.

  • Maturation
  • History
  • Statistical Regression
  • Selection
42
Q

least squares criterion

A

The least squares criterion is used to locate the regression line so that the amount of error in prediction is minimized.

43
Q

Regression Analysis

A

Regression analysis is used to predict a score on one criterion based on the person’s obtained score on one predictor. It involves identifying the location of the regression line (“line of best fit”) and using the equation for that line, the regression equation, to make predictions.
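
For a single predictor, the least squares line can be computed directly. The sketch below is illustrative: the data and function name are made up, and the slope formula used is the standard one that minimizes the sum of squared prediction errors.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line that minimizes squared prediction error."""
    x_bar, y_bar = mean(xs), mean(ys)
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical predictor (x) and criterion (y) scores.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, intercept = least_squares_line(xs, ys)
predicted = intercept + slope * 6.0      # predicted criterion score for x = 6
print(f"y' = {intercept:.1f} + {slope:.1f}x; prediction for x = 6: {predicted:.1f}")
# prints y' = 2.2 + 0.6x; prediction for x = 6: 5.8
```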

44
Q

Standard Deviation

A

The standard deviation is a measure of dispersion (variability) of scores around the mean of the distribution. It is the square root of the variance and is calculated by dividing the sum of the squared deviation scores by N (or N - 1) and taking the square root of the result.
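
The calculation can be followed step by step in a short sketch. The scores and the function name are hypothetical; the `sample` flag switches the divisor between N and N - 1.

```python
import math

def standard_deviation(scores, sample=False):
    """Square root of the variance; divides by N, or by N - 1 if sample=True."""
    n = len(scores)
    m = sum(scores) / n
    sum_of_squares = sum((x - m) ** 2 for x in scores)   # squared deviation scores
    divisor = n - 1 if sample else n
    return math.sqrt(sum_of_squares / divisor)

scores = [2, 4, 4, 4, 5, 5, 7, 9]                 # mean = 5, sum of squares = 32
print(standard_deviation(scores))                 # sqrt(32 / 8) = 2.0
print(round(standard_deviation(scores, sample=True), 3))  # sqrt(32 / 7), about 2.138
```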

45
Q

Effect Size Measurement

A
  • Cohen’s d, which indicates the difference between two groups in terms of standard deviation units
  • eta squared, which indicates the percent of variance in the dependent variable that is accounted for by variance in the independent variable
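
Both measures can be computed by hand for a small example. The sketch below makes two assumptions not stated on the card: Cohen's d uses the pooled standard deviation of the two groups (one common convention), and eta squared is computed as the between-groups sum of squares divided by the total sum of squares. The data and function names are made up.

```python
from statistics import mean, variance

def cohens_d(group1, group2):
    """Mean difference in pooled-standard-deviation units (assumed convention)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / pooled_var ** 0.5

def eta_squared(groups):
    """Proportion of total variability accounted for by group membership."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

treatment = [6.0, 7.0, 8.0, 9.0, 10.0]   # hypothetical scores
control = [4.0, 5.0, 6.0, 7.0, 8.0]

print(round(cohens_d(treatment, control), 3))       # 2 / sqrt(2.5), about 1.265
print(round(eta_squared([treatment, control]), 3))  # 10 / 30, about 0.333
```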
46
Q

Effect Size

A

An effect size is a measure of the magnitude of the relationship between independent and dependent variables and is useful for interpreting the relationship’s clinical or practical significance (e.g., for comparing the clinical effectiveness of two or more treatments).

47
Q

Shared Variability

A

A correlation coefficient for two or more variables can be squared to obtain a measure of shared variability. For example, if the correlation between X and Y is .50, this means that 25% of variability in Y is shared with (or is accounted for by) variability in X.

48
Q

Cluster Analysis

A

Cluster analysis is a multivariate technique that is used to group people or objects into a smaller number of mutually exclusive and exhaustive subgroups (clusters) based on their similarities - i.e., to group people or objects so that the identified subgroups have within-group homogeneity and between-group heterogeneity.

49
Q

Mediating Variable

A

Mediating variables explain or account for the relationship between independent and dependent variables.

As an example, authoritative parenting may have positive effects on academic achievement because authoritative parenting leads to high self-efficacy beliefs (the mediator) which, in turn, leads to a high level of academic achievement.

50
Q

Moderator Variables

A

Moderator variables affect the strength or direction of the relationship between independent and dependent variables. If a treatment is more effective for reducing cigarette smoking for men than for women, gender is a moderator variable.

51
Q

MANOVA (Multivariate Analysis of Variance)

A

The MANOVA is a form of the ANOVA that is used when a study includes one or more IVs and two or more DVs that are each measured on an interval or ratio scale.

Use of the MANOVA helps reduce the experimentwise error rate and increases power by simultaneously analyzing the effects of the IV(s) on all of the DVs.

52
Q

Type II Error

A

False negative

A Type II error occurs when a false null hypothesis is retained. The probability of making a Type II error is equal to beta (which is usually unknown).

53
Q

Type I Error

A

False positive

A Type I error occurs when a true null hypothesis is rejected. The probability of making a Type I error is equal to alpha, which is set by the investigator prior to collecting or analyzing the data.

54
Q

LISREL

A

LISREL is a structural equation (causal) modeling technique that is used to verify a predefined causal model or theory. It is more complex than path analysis, and it allows two-way (non-recursive) paths and takes into account observed variables, the latent traits they are believed to measure, and the effects of measurement error.

55
Q

Event sampling

A

Event sampling is a method of behavioral sampling that is useful for behaviors that are rare or that leave a permanent product. It involves recording each occurrence of a behavior during a predefined or preselected event.

56
Q

Interval Recording

A

Interval recording is a method of behavioral sampling that involves dividing a period of time into discrete intervals and recording whether the behavior occurs in each interval. It is particularly useful for behaviors that have no clear beginning or end.

57
Q

Alpha

A

Alpha determines the probability of rejecting the null hypothesis when it is true; i.e., the probability of making a Type I error. The value of alpha is set by the experimenter prior to collecting or analyzing the data. In psychological research, alpha is commonly set at .01 or .05.

58
Q

alternative hypothesis

A

The alternative hypothesis is the opposite of the null hypothesis and is expressed in a way that implies that the independent variable does have an effect.

59
Q

Null Hypotheses

A

The null hypothesis is stated in a way that implies that the independent variable does not have an effect on the dependent variable.

60
Q

Randomized Block ANOVA

A

The randomized block ANOVA is the appropriate statistical test when blocking has been used as a method for controlling an extraneous variable (i.e., when the extraneous variable is treated as an independent variable).

It allows an investigator to statistically analyze the main and interaction effects of the extraneous variable.

61
Q

AB, ABA, ABAB

A

The AB design includes a single baseline phase and a single treatment phase. The reversal designs include, at a minimum, two baseline phases and one treatment phase (e.g., an ABA or ABAB design), with the treatment being withdrawn (“reversed”) during the second and subsequent baseline phases.

62
Q

Multiple Baseline

A

Use of the multiple-baseline design involves sequentially applying a treatment to different “baselines” (e.g., to different behaviors, settings, tasks, or subjects).

63
Q

Single Subject Designs

A

Single-subject designs include at least one A (baseline) and one B (treatment) phase and include multiple measurements of the DV at regular intervals during each phase.

64
Q

Experimental Research (True and Quasi-Experimental)

A

Experimental research involves conducting an empirical study to test hypotheses about the relationships between independent and dependent variables.

A true experimental study permits greater control over experimental conditions, and its “hallmark” is random assignment to groups.

A quasi-experimental study permits less control.

65
Q

Within-Subjects Designs

A

Within-subjects designs involve the same subjects receiving different levels of the IV at different times.

Comparisons are made “within subjects” instead of between groups.

66
Q

Interaction effect

A

An interaction refers to the effects of one IV at different levels of another IV; that is, the effect of one IV depends on the level of the other.

67
Q

Main effect

A

A main effect is the effect of a single IV on the DV.

68
Q

Factorial Design (Main And Interaction Effects)

A

Factorial designs are research designs that include two or more “factors” (independent variables).

They permit the analysis of main and interaction effects.

69
Q

Multiple Sample Chi-Square

A

The multiple-sample chi-square test is used when the study includes two or more variables. (When counting variables for the chi-square test, independent and dependent variables are both included.)

71
Q

Single-Sample Chi-Square

A

The single-sample chi-square test is used when the study includes one variable.

72
Q

Chi-Square Test (Single-Sample And Multiple-Sample)

A

The chi-square test is a nonparametric statistical test that is used with nominal data (or data that are being treated as nominal data) - i.e., when the data to be compared are frequencies in each category.
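
The test statistic itself is a sum over categories of (observed - expected)² / expected. The sketch below shows a single-sample case with made-up frequencies and assumes, for illustration, that the expected frequencies are equal across categories; the function name is hypothetical, and no significance lookup is performed.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical frequencies: 100 people choosing among four categories.
observed = [20, 30, 25, 25]
expected = [25, 25, 25, 25]   # assumes equal expected frequencies

print(chi_square(observed, expected))  # (25 + 25 + 0 + 0) / 25 = 2.0
```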

73
Q

multicollinearity

A

Ideally, predictors included in a multiple regression equation will have low correlations with each other and high correlations with the criterion. High correlations between predictors are referred to as multicollinearity.

74
Q

Multiple Regression Output

A

The output of multiple regression is a multiple correlation coefficient (R) and a multiple regression equation.

75
Q

Multiple Regression/Multicollinearity

A

Multiple regression is a multivariate technique that is used for predicting a score on a continuous criterion based on performance on two or more continuous and/or discrete predictors.

76
Q

Discriminant Function Analysis

A

Discriminant function analysis is the appropriate multivariate technique when two or more continuous predictors will be used to predict or estimate a person’s status on a single discrete (nominal) criterion. (For example, a doctor could perform a discriminant analysis to classify patients as being at high or low risk for stroke, using age and weight as predictors.)

77
Q

Path Analysis

A

Path analysis is a structural equation (causal) modeling technique that is used to verify a pre-defined causal model or theory. It involves translating the theory into a path diagram, collecting data on the variables of interest (the observed variables), and calculating and interpreting path coefficients.

78
Q

Random Error

A

Random error is error that is unpredictable (random). Sampling error and measurement error are types of random error.

79
Q

Systematic Error/Extraneous Variables

A

Systematic error is predictable error. Extraneous (confounding) variables are a source of systematic error that affects the relationship between independent and dependent variables.

80
Q

Mixed Designs

A

Mixed designs are a type of factorial design in which at least one IV is a between-groups variable and one IV is a within-subjects variable.

81
Q

Factorial ANOVA

A

The factorial ANOVA is the appropriate statistical test when a study includes two or more IVs (i.e., when the study has used a factorial design) and a single DV that is measured on an interval or ratio scale. It is also referred to as a two-way ANOVA, three-way ANOVA, etc., with the words “two” and “three” referring to the number of IVs.
