Statistics Flashcards
Definition
one type of inferential statistic, used to determine whether there is a significant difference between the means of two groups. As a parametric test, it assumes the dependent variable is approximately normally distributed
T test
What is the internal and external validity rated for a quasi-experimental study?
Internal: medium
External: medium
What is a two-way ANOVA?
An ANOVA with 2 factors (IVs)
What are contrasts?
Planned comparisons that break down the between-treatment variability (the variability explained by the experimental manipulation, i.e., due to participants being assigned to different groups) into specific comparisons between groups
What do you do if homogeneity is violated in an ANOVA?
- If sample sizes are large and equal, ANOVA can handle the violation
- If sample sizes are small or unequal, use the Brown-Forsythe or Welch F-ratio (and their associated p and df; Welch is more powerful) instead of the regular F-ratio
Define
T test
one type of inferential statistic, used to determine whether there is a significant difference between the means of two groups. As a parametric test, it assumes the dependent variable is approximately normally distributed
What are some threats to external validity?
- Generalising across participants or subjects
- Generalising across features of a study
- Generalising across features of the measures
Definition
the validity of applying the conclusions of a scientific study outside the context of that study. It is the extent to which the results of a study can be generalized to and across other situations, people, stimuli, and times
External validity
Definition
F = variance between sample means / variance expected by chance
F statistic
t-Tests are “difference tests”. Used to compare mean differences between up to ____groups or conditions
t-Tests are “difference tests”. Used to compare mean differences between up to two groups or conditions
Define
Type I error
the rejection of a true null hypothesis (also known as a “false positive” finding or conclusion)
Definition
a collection of statistical models and their associated estimation procedures (such as the “variation” among and between groups) used to analyze the differences among group means in a sample
ANOVA
What do you use in a post hoc test if there are unequal variances?
Games-Howell
Define
Confounding variable
factors other than the independent variable that may cause a result
What does within treatment variance look like on this graph?
(graph not shown)
What is the experiment-wise error rate of an analysis that uses 3 comparisons at an alpha level of .05?
αEW = 1 - (1 - αTW)^c
αEW = 1 - (1 - .05)^3
αEW = 1 - .95^3
αEW ≈ .14
There is a 14% chance of committing at least one Type I error
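As a quick check of this arithmetic, here is a minimal Python sketch; the alpha level and number of comparisons are the illustrative values from this card:

```python
# Experiment-wise error rate: alpha_EW = 1 - (1 - alpha_TW)^c
alpha_tw = 0.05   # test-wise alpha (per comparison)
c = 3             # number of comparisons

alpha_ew = 1 - (1 - alpha_tw) ** c
print(round(alpha_ew, 2))  # 0.14 -> ~14% chance of at least one Type I error
```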
What test is used to assess normality?
Shapiro-Wilk
Definition
A design that looks a bit like an experimental design but lacks the key ingredient – random assignment
Quasi-experiment design
What is the internal and external validity rated for an experimental study?
Internal: high
External: low
Definition
A type of research used to assess changes over an extended period of time
Developmental research
How do you minimise participant attrition?
- Increase sample size and measure/ compare participants who do/don’t withdraw
What does a positive t-value tell you?
That the mean for condition/sample 1 is higher than the mean for condition/sample 2
What is the non-parametric alternative to a one-way repeated-measures ANOVA?
Friedman’s test
How do you minimise environmental variables?
- Standard experimental procedures, setting, and experimenter
What are threats to both internal and external validity?
- Experimenter bias
- Demand characteristics and participant reactivity
How do you calculate the F ratio using sum of squares and degrees of freedom?
Mean squared deviation (MS) = Sum of Squares (SS) / df
F = MSBETWEEN / MSWITHIN
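A minimal Python sketch of this calculation; the sums of squares and degrees of freedom below are hypothetical values chosen only to illustrate the arithmetic:

```python
# F-ratio from sums of squares (SS) and degrees of freedom (df)
ss_between, df_between = 120.0, 2    # hypothetical values
ss_within, df_within = 60.0, 12      # hypothetical values

ms_between = ss_between / df_between   # mean square between = SS / df
ms_within = ss_within / df_within      # mean square within  = SS / df

f_ratio = ms_between / ms_within
print(f_ratio)  # 12.0
```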
What is the biggest threat to internal validity?
Confounding variables
Definition
factors other than the independent variable that may cause a result
Confounding variable
Why choose repeated-measures?
- A repeated-measures ANOVA uses a single sample, with the same set of individuals measured in all of the different treatment conditions
- Thus, one of the characteristics of a repeated-measures (aka within-subjects) design is that it eliminates variance caused by individual differences
- Individual differences are those participant characteristics that vary from one person to another and may influence the measurement that you obtain for each person
- e.g., age, gender, etc.
What is the formula used to determine the experiment-wise error rate?
αEW = 1 - (1 - αTW)^c
Where c = number of comparisons
During a post hoc test, if assumptions are met and sample sizes are equal what do you use?
Tukey’s HSD
What are examples of non-experimental designs?
Observational, cross-sectional or longitudinal studies
In a one-way repeated-measure ANOVA, what is the F ratio made up of?
F = (treatment effect + other effect) / other effect
Numerator = between treatment variance
Denominator = within treatment variance
Individual differences are not included due to repeated measures (the same participants are measured in every condition)
What is the internal and external validity rated for an non-experimental study?
Internal: low
External: high
Define
Experiment-wise error rate (αEW)
The probability of making at least one Type I error amongst a series of comparisons
Define
ANOVA
a collection of statistical models and their associated estimation procedures (such as the “variation” among and between groups) used to analyze the differences among group means in a sample
What type of follow up test do you use if there is a specific hypothesis? What about when there is no hypothesis?
Specific hypothesis: Planned comparisons
No hypothesis: Post hoc tests
How do you minimise generalising across features of a study?
- Conduct naturalistic research
- Switch from a between-subjects to a within-subjects or matched-subjects design.
- Replicate study in different setting/with different experimenter
What do contrast 1 and contrast 2 test?
(figure not shown)
What effect size value is used for ANOVA? How do you calculate it?
Eta-squared (η2)
η2 = SSbetween / SStotal
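A minimal Python sketch of the eta-squared calculation; the sums of squares are hypothetical values used only for illustration:

```python
# Eta-squared effect size for ANOVA: SS_between / SS_total
ss_between = 120.0                   # hypothetical value
ss_within = 60.0                     # hypothetical value
ss_total = ss_between + ss_within    # SS_total = SS_between + SS_within

eta_squared = ss_between / ss_total
print(round(eta_squared, 3))  # 0.667
```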
How do you minimise generalising across features of the measures?
- Use multiple response measures (e.g., self-report, observation, physiological).
- Systematically vary time of measurement as an IV and measure effect on DV and other IV.
What does a negative t-value tell you?
That the mean for condition/sample 1 is lower than the mean for condition/sample 2
What do you do if sphericity is violated?
If assumption is violated, apply a correction factor (of epsilon) to the degrees of freedom – This will in turn adjust the p value for the ANOVA
If Greenhouse-Geisser epsilon is less than .75 then use Greenhouse-Geisser
If Greenhouse-Geisser epsilon is greater than .75 then use Huynh-Feldt epsilon correction
What value of the MANOVA should you report in most situations?
Pillai’s trace
For the F ratio to be reliable and valid, what assumptions must be met?
- Independence of observations
- The observations within each sample must be independent
- Interval/ratio level of data
- Normality
- Populations must be normally distributed as determined by Shapiro-Wilk
- Homogeneity of variance
How do you minimise time-related variables?
- Add control group for comparison purposes
- Switch from a within-subjects to a between- or matched-subjects design
- Control/limit time between testing
- Counterbalance order of presentation of conditions across participants
What are the key elements of an experiment?
- Manipulation of the independent variable (IV) to create two or more treatment conditions (levels).
- Measurement of a dependent variable (DV) to obtain a set of scores for each treatment condition (level)
- Comparison of the DV scores for each treatment condition (level)
- Control of all other (extraneous) variables to ensure that they do not confound the relationship between IV and DV.
- Random assignment of participants to each condition so that the groups can be considered truly equivalent.
Definition
The alpha level used for each comparison
Test-wise error rate (αTW)
Distributions like this would violate which assumption?
(distribution plots not shown)
Homogeneity
Definition
the ratio of the between group variance to the within group variance
F-ratio
Define
Internal validity
the extent to which a piece of evidence supports a claim about cause and effect, within the context of a particular study
If there are 4 time points, how many orthogonal contrasts are there?
k = 4
orthogonal contrasts = k - 1
Therefore, there are 3 orthogonal contrasts
What are two reasons for between treatment variance?
- Treatment effects: the differences are caused by the treatment(s)
- Chance: the differences are simply due to chance
What Shapiro-Wilk result suggests the normality assumption is met?
p > .05: the shape is not significantly different from normal, thus the normality assumption is met
How do you report one-way independent measures ANOVA test results?
- F(dfbetween, dfwithin) = value, p = value
- e.g., F(2, 12) = 23.49, p < .001
With an F-ratio of around 1 what does that suggest? Why?
With an F-ratio of around 1 we would conclude that there is no treatment effect, because the variance between treatments is about the same as the variance expected by chance, so the numerator and denominator of the ratio are roughly equal
Definition
A type of ANOVA used to determine whether three or more group means are different where the participants are the same in each group
One-way repeated-measures ANOVA
True or False:
Repeated-measures designs are powerful
True
Repeated-measures designs are powerful because they remove individual differences
η2 = .059 is what size effect?
Medium
In terms of the F-ratio for a repeated measures design, the variance between treatments (the numerator) does/does not contain any individual differences
In terms of the F-ratio for a repeated measures design, the variance between treatments (the numerator) does not contain any individual differences
During post hoc tests, if sample sizes are slightly different then use _________ procedure because it has greatest power, but if sample sizes are very different use _________
During post hoc tests, if sample sizes are slightly different then use Gabriel’s procedure because it has greatest power, but if sample sizes are very different use Hochberg’s GT2
Definition
the rejection of a true null hypothesis (also known as a “false positive” finding or conclusion)
Type I error
Definition
A test used to determine whether there are any statistically significant differences between the means of two or more independent (unrelated) groups
One-way independent measures ANOVA
Define
External validity
the validity of applying the conclusions of a scientific study outside the context of that study. It is the extent to which the results of a study can be generalized to and across other situations, people, stimuli, and times
Why is an ANOVA preferred over t-tests for more than 2 groups?
Each time you run a hypothesis test, you run the risk of committing a Type I error; running multiple t-tests inflates the experiment-wise error rate, whereas a single ANOVA controls it
Define
Developmental research
A type of research used to assess changes over an extended period of time
True or False:
ANOVA tests only non-directional hypotheses
True
What do you use to calculate the effect size for a post hoc test?
Cohen’s d
d = mean difference / SD
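A minimal Python sketch of the Cohen’s d calculation; the means and standard deviation are hypothetical values used only for illustration:

```python
# Cohen's d for a pairwise comparison: mean difference / SD
mean_1, mean_2 = 24.0, 20.0   # hypothetical group means
sd = 5.0                      # hypothetical (pooled) standard deviation

d = (mean_1 - mean_2) / sd
print(d)  # 0.8
```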
True or False:
You should not perform planned comparisons as well as post hoc tests for a one-way repeated-measures ANOVA
True
True or False:
These are the results of an ANOVA
(output table not shown)
False;
This table relates to MANOVA rather than ANOVA
Definition
any variables that you are not intentionally studying in your experiment or test
Extraneous variables
What do you do if normality is violated in an ANOVA?
- If sample sizes are large and equal…
- ANOVA can handle normality violation
- If sample sizes are small or not equal…
- Transform your data
- Run a Kruskal-Wallis test as the nonparametric alternative to a one-way independent-measures ANOVA
What extra assumption is required for repeated-measures? Why?
- Same participants in all conditions
- Therefore, scores across conditions will correlate
- Violates assumption of independence!
- Because of this, an additional assumption is required for repeated-measures ANOVA – namely, sphericity
- Put crudely, the assumption of sphericity means that the correlation between treatment levels should be the same
- Actually, it assumes that the variances of the differences between treatment levels are equal
Definition
a type of experimental design thought to be the most accurate form of experimental research; it supports or refutes a hypothesis using statistical analysis
True experimental research
How would the sample populations relate to each other if the null hypothesis was rejected in an ANOVA?
At least one population (treatment) mean would differ from the others; the sample populations would not all be equal to each other
What is the 4 step process of hypothesis testing using an ANOVA?
- State the hypotheses (H0 and H1)
- Decide when to reject H0
- Calculate the test statistic. In this case, the F ratio
- Make a decision about H0 (reject/don’t reject)
How do you deal with individual differences in a repeated-measures ANOVA?
- The individual differences are automatically removed from the numerator because the design uses the same subjects in all treatments, but we must also remove them from the denominator
- Remove individual differences from the denominator by measuring the variance within treatments and then subtracting the individual differences
- The result is a measure of unsystematic error variance that does not include any individual differences
What are the threats to internal validity?
- Environmental variables
- Individual differences
- Time-related variables
- Participant attrition
- Communication between groups
ANOVA simply tests the null hypothesis that all group means are equal, and therefore a significant result merely tells you that at least one group’s mean is different from another.
How do you make more specific comparisons?
- Post-hoc tests
- No specific hypotheses at outset; Compare each group to each other but use a smaller α to limit type I error rate
- Planned comparisons
- Specific hypotheses at outset; make specific comparisons by breaking down the between treatment variance (total variance accounted for by model) into its component parts
What does a large F-ratio indicate?
The differences between treatments are greater than chance
Define
Extraneous variables
any variables that you are not intentionally studying in your experiment or test
For the following research question would you use an ANOVA or a t-Test?
Are sufferers of depression who receive any form of treatment (i.e., medication, exercise, or a combination of medication and exercise) less depressed than people who do not receive any treatment?
ANOVA (more than 2 groups)
What can between treatment variability be broken down to?
This variability can be further broken down to test specific hypotheses about which groups might differ from one another
We break down the variance according to hypotheses made a priori (before the experiment)
Providing that the hypotheses are independent of one another, the experiment-wise Type I error rate will be controlled
Define
Quasi-experiment design
A design that looks a bit like an experimental design but lacks the key ingredient – random assignment
Which sections are ANOVA and which are planned contrasts?
(output not shown)
What is the H0 and H1 hypotheses for ANOVA?
H0: There really are no differences between the populations (or treatments). The observed differences between samples are due to chance (sampling error)
H1: The differences between the sample means represent real differences between the populations (or treatments). That is, at least one of the treatments really does have a different mean, and the sample data accurately reflect these differences
How do you minimise experimenter bias?
- Conduct a double-blind study (i.e., neither participant nor experimenter know which condition the participant is in)
In ANOVA, an independent variable (IV) is called a _____
Each (treatment) condition of a factor is called a _______
In ANOVA, an independent variable (IV) is called a factor
Each (treatment) condition of a factor is called a level
What does between treatment variance look like on this graph?
(graph not shown)
How do you minimise individual differences?
- Create equivalent groups using random assignment, holding constant, or matching
- Switch from a between-subjects to a within-subjects or matched-subjects design
What type of experimental design is this? Why?
For example, researchers take data from two different schools that are expected to be similar. An intervention is tested in one school and not the other. The pretest-posttest change is then compared between schools
Quasi-experimental
This is quasi-experimental because participants (students) were not randomly assigned. There may indeed be some small differences between the groups
What is a partial eta squared?
An eta squared with the effects of individual differences removed from the denominator
What test is used to test sphericity? When is sphericity met?
Mauchly’s test
The sphericity assumption is met if the variances of the differences between conditions are roughly equal. Therefore the assumption is met when p > .05
In a one-way independent-measures ANOVA, what is the F-ratio made up of?
F = (treatment effect + individual differences + other error) / (individual differences + other error)
Numerator = variability between treatments
Denominator = variability within treatments
Define
True experimental research
a type of experimental design thought to be the most accurate form of experimental research; it supports or refutes a hypothesis using statistical analysis
Define
F-ratio
the ratio of the between group variance to the within group variance
Definition
The probability of making at least one Type I error amongst a series of comparisons
Experiment-wise error rate (αEW)
What Levene result suggests the homogeneity of variance assumption is met?
If p > .05, the variances are not significantly different from one another, thus the homogeneity of variance assumption is met
Why is a quasi-experiment not a true experiment?
- The independent variable was not experimentally manipulated (i.e., pre-existing levels are selected and compared); or
- The participants were not randomly assigned to conditions (e.g., groups were selected for analysis after the fact).
How do you minimise demand characteristics and participant reactivity?
- Switch from a within-subjects to a between- or matched-subjects design
- Conduct a blind study
- Use measures which do not explicitly refer to construct being measured
True or False:
Partial eta squared is interpreted the same as eta squared
True
Definition
the extent to which a piece of evidence supports a claim about cause and effect, within the context of a particular study
Internal validity
How do you choose an alpha level that would control the experiment-wise error rate?
Bonferroni
αTW = αEW (desired) / number of tests
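A minimal Python sketch of the Bonferroni adjustment; the desired experiment-wise alpha and number of tests are illustrative values:

```python
# Bonferroni: per-test alpha needed to keep the experiment-wise error rate at .05
alpha_ew_desired = 0.05
n_tests = 3                   # hypothetical number of comparisons

alpha_tw = alpha_ew_desired / n_tests
print(round(alpha_tw, 4))     # 0.0167 -> use this alpha for each comparison
```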
When do you use MANOVA over an ANOVA?
MANOVA results can be used in place of the regular ANOVA results if the sphericity or normality assumptions are violated. Whilst these tests are more robust than ANOVA to the assumption violations, they are also less powerful
What rules apply to choosing contrasts?
- Use the control group as the reference point
- Only comparing 2 chunks of variation
- Independence (orthogonal)
How do you minimise communication between groups?
- Conduct blind study (i.e., participants do not know which condition they are in)
- Switch to a within-subjects design
- Limit possibility of communication between groups (e.g., different locations)
How do you minimise generalising across participants or subjects?
- Use a probability sampling method such as proportionate stratified random sampling, or a non-probability method which tries to achieve the same result
- Increase sample size
Define
Test-wise error rate (αTW)
The alpha level used for each comparison
If the homogeneity of variance assumption is violated and you want to report the Brown-Forsythe or Welch F-ratio because they don’t assume homogeneity of variance, how would you do this?
You need to state that the homogeneity of variance assumption was violated and that this is why you used the Brown-Forsythe or Welch F-ratio instead. You then simply report the results as usual, except that you use two decimal places for the second df value (since such F-ratio calculations are based on adjustments being made to the df)
What are the three criteria that must be met for a true experiment?
There are three criteria that must be met in this type of experiment
- Control group and experimental group
- Researcher-manipulated variable
- Random assignment
What test is used to assess homogeneity of variances?
Levene statistic
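A minimal sketch of running both assumption checks (Shapiro-Wilk for normality, Levene for homogeneity of variance) with SciPy; the three groups of scores are made-up illustrative data:

```python
from scipy import stats

# Hypothetical scores for three independent groups
group_a = [4, 5, 6, 7, 8]
group_b = [6, 7, 8, 9, 10]
group_c = [8, 9, 10, 11, 12]

# Normality: Shapiro-Wilk per group (p > .05 suggests the assumption is met)
for name, scores in [("A", group_a), ("B", group_b), ("C", group_c)]:
    w, p = stats.shapiro(scores)
    print(name, round(w, 3), round(p, 3))

# Homogeneity of variance: Levene's test across groups (p > .05 -> assumption met)
stat, p = stats.levene(group_a, group_b, group_c)
print(round(stat, 3), round(p, 3))
```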
What are reasons for within treatment variance?
Within each treatment, participants are treated the same, so chance would cause differences
η2 = .138 is what size effect?
Large
The t-test generates a ___-value, which is used to then determine a p-value
The t-test generates a t-value, which is used to then determine a p-value
What is the safest option during a post hoc test?
Bonferroni
Define
F statistic
F = variance between sample means / variance expected by chance
True or False:
A Greenhouse-Geisser correction changes the F ratio
False
A Greenhouse-Geisser correction changes the degrees of freedom
Define
One-way independent measures ANOVA
A test used to determine whether there are any statistically significant differences between the means of two or more independent (unrelated) groups
Define
One-way repeated-measures ANOVA
A type of ANOVA used to determine whether three or more group means are different where the participants are the same in each group
An F ratio close to 1 indicates what?
Strongly suggests little or no treatment effect
Chi-square goodness-of-fit test
A test used to compare an observed distribution to an expected distribution when there are two or more categories of discrete data. In other words, it compares multiple observed proportions to expected probabilities.
Chi-square test-for-independence
a procedure for testing if two categorical variables are related in some population
Kruskal-Wallis test
a rank-based nonparametric test that can be used to determine if there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable
Friedman test
a non-parametric statistical test similar to the parametric repeated-measures ANOVA; it is used to detect differences in treatments across multiple test attempts
What does statistical test selection depend on?
- How many dependent and independent variables there are
- What scales of measurement are used for each variable
- How many groups there are, and, whether these are independent- or repeated-measures
- Whether the assumptions have been met for a parametric statistical test
What type of data can be used for a non-parametric test?
Nominal/ordinal
What type of data can be used for a parametric test?
Interval/ratio
What is the non-parametric equivalent of a repeated-measures ANOVA?
Friedman test
What is the non-parametric equivalent for an independent-measures ANOVA?
Chi-square goodness-of-fit (nominal)
Kruskal-Wallis test (ordinal)
What is the non-parametric equivalent of a Pearson correlation?
Chi-square test-for-independence (nominal)
Spearman correlation (ordinal)
To examine the relationship between texting and driving skill, a researcher uses orange cones to set up a driving circuit. A group of probationary drivers is then tested on the circuit, once while receiving and sending text messages and once without texting. For each driver, the researcher records the number of cones hit while driving each circuit. (Based on Gravetter & Wallnau, 2013, p. 674)
Which of the following is a suitable inferential statistics test for these data?
a) Independent-samples t-test
b) Paired-samples t-test
c) Repeated-measures ANOVA
d) Linear regression
Answer: b) Paired-samples t-test (the same drivers are measured in both conditions, texting and not texting, so the scores are related)
“Hallam, Price, and Katsarou (2002) investigated the influence of background noise on classroom performance for children aged 10 to 12. In a similar study, students in one classroom worked on an arithmetic task with calming music in the background. Students in a second classroom heard aggressive, exciting music, and students in a third room had no music at all. The researchers measured the number of problems answered correctly for each student to determine whether the music conditions had any effect on performance.” (Gravetter & Wallnau, 2013, p. 674)
Which of the following would be an appropriate statistical test for these data?
a) Chi-square
b) Spearman correlation
c) Independent-samples t-test
d) Independent-measures ANOVA
Answer: d) Independent-measures ANOVA (three separate groups of students and an interval/ratio dependent variable)
“Belsky, Weintraub, Owen, and Kelly (2001) reported the effects of preschool childcare on the development of young children. One result suggests that children who spend more time away from their mothers are more likely to show behavioral problems in kindergarten. Suppose that a kindergarten teacher is asked to rank order the degree of disruptive behavior for the n = 20 children in the class.
Researchers then separate the students into two groups: children with a history of preschool and children with little or no experience in preschool. The researchers plan to compare the ranks for the two groups.” (Gravetter & Wallnau, 2013, p. 675)
Which of the following is the appropriate statistical test for these data?
a) Mann-Whitney U-test
b) Wilcoxon signed ranks test
c) Chi-square test-for-independence
d) Independent-samples t-test
Answer: a) Mann-Whitney U-test (two independent groups compared on ranked, i.e. ordinal, data)
“A researcher would like to determine whether infants, age 2 to 3 months, show any evidence of color preference. The babies are positioned in front of a screen on which a set of four colored patches is presented. The four colors are red, green, blue, and yellow. The researcher measures the amount of time each infant looks at each of the four colors during a 30 second test period. The color with the greatest time is identified as the preferred color for the child.” (Gravetter & Wallnau, 2013, p. 674)
Which of the following would be an appropriate statistical test for these data?
a) Single-sample t-test
b) Independent-measures ANOVA
c) Chi-square goodness-of-fit test
d) Chi-square test-for-independence
Answer: c) Chi-square goodness-of-fit test (each infant is classified into one of four colour categories, giving frequency data for a single nominal variable)
Chi-square tests are intended for research questions concerning the ________ of the population in different categories
Chi-square tests are intended for research questions concerning the proportion of the population in different categories
What type of data can be used for a chi-square test?
Nominal data
What is the difference between actual and predicted values called?
Residual
What is the difference between an observed frequency and an expected frequency called?
Residual
What are the two chi-square tests? How many variables do they examine?
Chi-square Goodness-of-Fit Test (1 nominal variable)
Chi-square Test-for-Independence (2 nominal variables)
The chi-square goodness-of-fit test uses ________ from a sample to test hypotheses about the shape or proportions of a population
The chi-square goodness-of-fit test uses frequency data from a sample to test hypotheses about the shape or proportions of a population
The numbers of individuals in each category of a chi-square goodness-of-fit test are called what?
Observed frequencies
What is the null hypothesis for a chi-square goodness-of-fit test?
The null hypothesis specifies the proportion of the population that should be in each category
The null hypothesis for the chi-square test for goodness of fit typically falls into one of two types:
- a no-preference hypothesis which states that the population is distributed evenly across the categories, or
- a no-difference hypothesis which states that the population distribution is not different from an established distribution
The proportions from the null hypothesis of a chi-square goodness-of-fit test are used to construct an ideal sample distribution, called _______________, that describes how the sample would appear if it were in perfect agreement with the null hypothesis
The proportions from the null hypothesis of a chi-square goodness-of-fit test are used to construct an ideal sample distribution, called expected frequencies (fe), that describes how the sample would appear if it were in perfect agreement with the null hypothesis
What is the formula for the expected frequency for each category in a chi-square goodness of fit?
fe = pn
Where:
- p = the proportion stated in H0
- n = sample size
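A minimal SciPy sketch of this formula and of the goodness-of-fit test itself; the observed frequencies and null-hypothesis proportions are hypothetical illustrative values:

```python
from scipy import stats

# Hypothetical data: 60 people each choose one of three options
observed = [30, 18, 12]            # observed frequencies (fo)
n = sum(observed)
p_h0 = [1/3, 1/3, 1/3]             # proportions stated in H0 (no-preference hypothesis)

expected = [p * n for p in p_h0]   # fe = p * n  -> [20.0, 20.0, 20.0]

chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(round(chi2, 2), round(p, 3))  # chi2 = 8.4
```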
True or False:
Expected frequencies can be decimal numbers
True
expected frequencies are hypothetical values
True or False:
χ2 can never be negative
True
χ2 can never be negative as the residuals (fo – fe ) are squared
Larger discrepancies between fo and fe produce _____ χ2 values
Larger discrepancies between fo and fe produce larger χ2 values
What effect size value do you use for chi-square goodness-of-fit?
Cohen’s w
w = sqrt(χ2 / N)
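A minimal Python sketch of the Cohen’s w calculation; the chi-square value and sample size are hypothetical (carried over from the goodness-of-fit example above):

```python
import math

# Cohen's w effect size for a chi-square goodness-of-fit test
chi2 = 8.4      # hypothetical chi-square statistic
n = 60          # hypothetical sample size

w = math.sqrt(chi2 / n)
print(round(w, 2))  # 0.37
```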
The chi-square test-for-independence is used to test whether or not there is a _______ between two categorical (nominal) variables
The chi-square test-for-independence is used to test whether or not there is a relationship between two categorical (nominal) variables
What is the null hypothesis for chi-square test-for-independence?
The null hypothesis for the chi-square test-for-independence can be phrased two ways:
- there is no relationship between the two variables (they are independent); or
- the distribution for one variable is the same (has the same proportions) for all the categories of the second variable
How do you calculate the degree of freedom for a chi-square test-for-independence?
df = (R - 1)(C - 1)
Where:
R = number of rows
C = number of columns
What are the steps for calculating the chi-square test-for-independence statistic?
- The null hypothesis is used to construct an idealised sample distribution of expected frequencies (fe) that describes how the sample would look if the data were in perfect agreement with the null hypothesis
- A chi-square statistic is then computed to measure the amount of discrepancy between the ideal sample (expected frequencies from H0) and the actual sample data (the observed frequencies, fo)
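A minimal SciPy sketch of these steps; the 2 x 2 table of observed frequencies is hypothetical illustrative data:

```python
from scipy import stats

# Hypothetical contingency table of observed frequencies (fo)
#               outcome A   outcome B
observed = [[20, 30],    # group 1
            [40, 10]]    # group 2

chi2, p, dof, expected = stats.chi2_contingency(observed)
print(round(chi2, 2), round(p, 3), dof)  # dof = (R - 1)(C - 1) = 1
print(expected)  # expected frequencies (fe) built from the null hypothesis
```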
When estimating effect size for a chi-square test-for-independence what coefficient should you use?
For 2 x 2 table use phi coefficient
For tables larger than 2 x 2 use Cramer’s V
What are the assumptions for chi-square tests?
- Independence of observations
- Expected frequencies should be at least 5
True or False:
Observed frequencies can be less than 5 in a chi-square test
True
What test do you use to compare two independent sets of ordinal scores or interval/ratio scores if independent-samples t-test assumptions are violated?
Mann-Whitney U-test
What test do you use to compare two sets of related or repeated-measures scores measured on an ordinal scale, or interval/ratio scores if related-samples t-test assumptions are violated?
Wilcoxon signed-ranks test
For Mann-Whitney and Wilcoxon tests, the _______ the test statistic, the larger the difference between groups or conditions
For Mann-Whitney and Wilcoxon tests, the smaller the test statistic, the larger the difference between groups or conditions
What is the null hypothesis for the Mann-Whitney U-test?
the ranks for one group are not systematically higher or lower than the ranks for another group
What is the null hypothesis for the Wilcoxon signed-ranks test?
difference scores are not systematically positive or negative
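A minimal SciPy sketch of both tests; the scores are made-up illustrative data:

```python
from scipy import stats

# Mann-Whitney U: hypothetical scores for two independent groups
group_1 = [3, 5, 6, 8, 9]
group_2 = [7, 9, 10, 12, 14]
u, p = stats.mannwhitneyu(group_1, group_2, alternative="two-sided")
print(u, round(p, 3))

# Wilcoxon signed-ranks: hypothetical repeated-measures scores
before = [10, 12, 9, 15, 11, 13]
after = [12, 14, 10, 18, 12, 17]
w, p = stats.wilcoxon(before, after)
print(w, round(p, 3))
```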
Which test is used to evaluate differences between three or more treatment conditions (or populations) using ordinal data from an independent-measures design?
Kruskal-Wallis test
What is the difference between a Kruskal-Wallis test and a one-way independent-measures ANOVA?
ANOVA requires interval or ratio scale scores that can be used to calculate means and variances
The Kruskal-Wallis test, on the other hand, simply requires that you are able to rank order the individuals for the variable being measured
A ___________ can be used as the nonparametric alternative to a one-way independent-measures ANOVA if the assumptions of the ANOVA are violated
A Kruskal-Wallis test can be used as the nonparametric alternative to a one-way independent-measures ANOVA if the assumptions of the ANOVA are violated
The Kruskal-Wallis test is similar to the Mann-Whitney test. However, the ___________ is limited to comparing only two treatments, whereas the __________ is used to compare three or more treatments
The Kruskal-Wallis test is similar to the Mann-Whitney test. However, the Mann-Whitney test is limited to comparing only two treatments, whereas the Kruskal-Wallis test is used to compare three or more treatments
What is the null hypothesis for the Kruskal-Wallis test?
There is no tendency for the ranks in any treatment population to be systematically higher or lower than the ranks in any other treatment population.
What is the alternative hypothesis for the Kruskal-Wallis test?
The ranks in at least one treatment population are systematically higher or lower than the ranks in another treatment population.
How do you calculate the Kruskal-Wallis H statistic?
- Combine the individuals from all the separate samples and rank order the entire group
- i.e., rank all scores without regard to treatment condition
- Regroup the individuals into the original samples and compute the sum of ranks (T) for each sample
- i.e., add up the ranks for each treatment condition
- The following formula is used to compute the Kruskal-Wallis statistic – which is distributed as a chi-square statistic with degrees of freedom equal to the number of samples minus one
H = [12 / (N(N + 1))] × Σ(T^2 / n) - 3(N + 1)
Where:
- N = total number of scores
- T = sum of ranks for each sample
- n = number of scores in each sample
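In practice the statistic is usually obtained from software; a minimal SciPy sketch with made-up illustrative data:

```python
from scipy import stats

# Hypothetical scores for three independent groups
group_a = [3, 5, 6, 7]
group_b = [6, 8, 9, 11]
group_c = [10, 12, 13, 15]

h, p = stats.kruskal(group_a, group_b, group_c)
print(round(h, 2), round(p, 3))  # H is evaluated against chi-square with k - 1 df
```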
If the null hypothesis of a Kruskal-Wallis test is true what do we expect?
If the null hypothesis is true, we would expect the sums of ranks (T’s) to be more or less equal (aside from differences due to the sizes (n’s) of the samples). Thus, the Kruskal-Wallis statistic measures the degree to which the T’s differ from one another.
True or False:
Kruskal-Wallis assumes normality and homogeneity of variance
False
Kruskal-Wallis does not assume normality and homogeneity of variance
When ranking scores for a Kruskal-Wallis test, what do you do to tied scores?
Give tied scores the average of the affected rank positions
Like with the Mann-Whitney U-test, the _________ provide information about which groups had larger values than others.
Like with the Mann-Whitney U-test, the Mean Ranks provide information about which groups had larger values than others.
How do you calculate the number of pairwise comparisons?
number of comparisons = k(k - 1) / 2
where k = number of treatment conditions
What type of post hoc test do you conduct for a Kruskal-Wallis test?
Pairwise Mann-Whitney U-tests (with a Bonferroni-adjusted alpha to control the experiment-wise Type I error rate)
The ___________ is used to evaluate differences between three or more treatment conditions using ordinal data from a repeated-measures design
The Friedman test is used to evaluate differences between three or more treatment conditions using ordinal data from a repeated-measures design
What is the difference between a one-way repeated-measures ANOVA and a Friedman test?
ANOVA requires interval or ratio scale scores that can be used to calculate means and variances
The Friedman test, on the other hand, simply requires that you are able to rank order the individuals across treatments
A __________ can be used as the nonparametric alternative to a one-way repeated-measures ANOVA if the assumptions of the ANOVA are violated
A Friedman test can be used as the nonparametric alternative to a one-way repeated-measures ANOVA if the assumptions of the ANOVA are violated
For both a Kruskal-Wallis and a Friedman test what must interval/ratio scale data be converted to?
Ordinal data
The Friedman test is similar to the Kruskal-Wallis test. However, the ___________ is used for independent-measures designs, whereas the _____________ is used for repeated-measures designs
The Friedman test is similar to the Kruskal-Wallis test. However, the Kruskal-Wallis test is used for independent-measures designs, whereas the Friedman test is used for repeated-measures designs
What is the null hypothesis for a Friedman test?
The ranks in one treatment condition should not be systematically higher or lower than the ranks in any other treatment condition.
What is the alternative hypothesis for a friedman test?
The ranks in at least one treatment condition should be systematically higher or lower than the ranks in another treatment condition.
How do you calculate the Friedman statistic, χF2?
- Each individual (or the individual’s scores) must be ranked across the treatment conditions
- i.e., for each participant, rank the scores in the treatment conditions from smallest to largest
- Compute the sum of ranks (R) for each treatment condition
- i.e., add up the ranks for each treatment condition
- The following formula is used to compute the Friedman statistic – which is distributed as a chi-square statistic with degrees of freedom equal to the number of treatments minus one
χF2 = [12 / (nk(k + 1))] × ΣR^2 - 3n(k + 1)
Where:
- n = number of participants
- k = number of treatment conditions
- R = sum of ranks for each treatment condition
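As with the Kruskal-Wallis test, the statistic is usually obtained from software; a minimal SciPy sketch with made-up illustrative data:

```python
from scipy import stats

# Hypothetical repeated-measures data: each list is one treatment condition,
# measured on the same 5 participants
cond_1 = [5, 6, 4, 7, 5]
cond_2 = [7, 8, 6, 9, 7]
cond_3 = [9, 9, 8, 11, 10]

chi2_f, p = stats.friedmanchisquare(cond_1, cond_2, cond_3)
print(round(chi2_f, 2), round(p, 3))  # evaluated against chi-square with k - 1 df
```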
If the null hypothesis of a Friedman test is true what do we expect?
If the null hypothesis is true, we would expect the sums of ranks (R’s) to be more or less equal. Thus, the Friedman statistic measures the degree to which the R’s differ from one another.
What post hoc tests do you conduct following a significant Friedman test?
Wilcoxon signed-ranks test