Statistics Flashcards
What is descriptive research?
- Aim: to describe characteristics of a sample (what kind, how much etc)
- Used to summarise, organise and simplify sample data
- Often based on measurement of a single variable (univariate statistics)
- Relies on measures of central tendency, frequencies, spread, distributional shape, etc.
What is inferential research?
- Null hypothesis testing
- Aim: to infer characteristics of the population
- Often interested in multiple variables (bivariate, multivariate)
- Relies on a wide range of different tests (e.g. correlation, regression, t-tests, ANOVA, chi-square etc.)
- Allows us to make probability statements about how confident we can be that our sample findings reflect the ”true” state of things in the population
Level of measurement: What are the two main types of variables?
- Categorical
- binary (2 levels)
- nominal (3+ levels)
- ordinal (ordered, no equal intervals)
- Continuous (Interval, ratio)
- Interval (ordered, equal intervals, no absolute zero)
- Ratio (ordered, equal intervals, absolute zero)
How can you keep error to a minimum?
- By making sure we use careful sampling strategies and use measures that are valid and reliable
- Validity+reliability=credibility
What are the critical values of z-scores?
- 95% of z-scores lie between -1.96 and 1.96
- 99% of z-scores lie between -2.58 and 2.58
- 99.9% of z-scores lie between -3.29 and 3.29
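These cut-offs can be checked with the normal quantile function. A minimal sketch, assuming scipy is available:

```python
from scipy.stats import norm

# Two-tailed critical values: the quantile leaving alpha/2 in each tail
for confidence in (0.95, 0.99, 0.999):
    alpha = 1 - confidence
    z_crit = norm.ppf(1 - alpha / 2)  # e.g. 1.96 for 95%
    print(f"{confidence:.1%}: ±{z_crit:.2f}")
```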
What does the z-score represent?
- The distance a particular observation is away from the mean, measured in standard deviations
- The standard normal distribution has a mean of 0 and a standard deviation of 1
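The calculation itself is simple. A sketch with made-up numbers (an exam score of 85 where the class mean is 70 with SD 10):

```python
def z_score(x, mean, sd):
    """Distance of observation x from the mean, in standard deviations."""
    return (x - mean) / sd

print(z_score(85, 70, 10))  # 1.5: the score is 1.5 SDs above the mean
```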
What are the two ways you can carry out inferential hypothesis-based research?
- Correlational research (observing what naturally happens without interfering)
- Experimental research (manipulating one variable and observing the effect on another variable – can be used to infer cause/effect)
What are the two types of experimental designs?
- Independent/between subject (different participants in different groups)
- Dependent/repeated measures (same participants exposed to all conditions)
What is systematic variance?
- Variation due to genuine effect
- Variance that can be explained by our model
- Signal/effect - What we want to measure
What is unsystematic variance?
- Noise/error
- Small differences in outcome due to unknown factors
What is the most important formula of all? :-)
- outcome=(model)+error
- the way that effect and error is measured varies for each type of statistical test
- But for a test to be ”significant”, effect should be considerably greater than error (chance)
What is the null hypothesis?
- What we actually test in statistics
- Assume there is no effect, then try to reject this assumption
- H0: no effect in the population
What is the alternative hypothesis?
- What we’re really interested in, when trying to reject the null hypothesis
- Can be:
- Non-directional: H1: There is an effect in the population
- Directional: H1: There is an effect in a specified direction in the population
What are significance tests for?
- For determining whether to reject or fail to reject the null hypothesis
- For determining with what level of confidence we reject the null hypothesis (typically 95%, 99% or 99.9%)
What are the z-distribution, t-distribution, F-distribution etc?
- Test statistics
- A statistic for which we know how frequently different values occur
- Theoretical sampling distributions that assume the null hypothesis
- Test statistic=variance explained by the model (effect)/variance not explained by the model (error)
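The signal-to-noise idea can be illustrated with an independent t-test, where t grows as the difference between group means (effect) grows relative to the within-group variability (error). A sketch with hypothetical data, assuming scipy:

```python
from scipy.stats import ttest_ind

group_a = [5, 6, 7, 6, 5, 7]    # hypothetical scores, condition A
group_b = [8, 9, 10, 9, 8, 10]  # hypothetical scores, condition B

# t = (difference between means) / (standard error of that difference)
t, p = ttest_ind(group_a, group_b)
print(t, p)  # large |t| with small p: the effect clearly exceeds the error
```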
What is the confidence level for p<.05, p<.01 and p<.001?
- p<.05 = 95%
- p<.01 = 99%
- p<.001 = 99.9%
If p is high (p>.05)…
- … the null applies! :-)
If p is low (p<.05)…
- … the null must go! :-)
What is the relationship between critical values, significance and confidence?
- As the critical value increases (gets further away from the null), confidence increases
- As confidence increases, p (probability of making a type 1 error) decreases
- Confidence+p=1.0 or 100%
What are critical cut-offs dependent on?
- Type of test
- 1- vs 2-tailed significance
- P level
- Degrees of freedom (calculated differently for different tests)
What is a type I error?
- False positive
- Saying there is an effect when there isn’t
- ”You’re pregnant” to a man :-)
What is a type II error?
- False negative
- Saying there isn’t an effect when there is
- ”You’re not pregnant” to a pregnant woman :-)
What is NHSTP?
- Null Hypothesis Significance Testing Procedure
- Black and white thinking -> limitations
- We should take a middle ground, combining NHSTP and effect sizes
What is the point of confidence intervals?
- Can be useful in helping us to estimate the range within which the true population mean (or some other parameter) would fall in most samples
- Typically 95% (p<.05)
What is an effect size?
- A standardized measure of the size of an effect
- Comparable across studies
- Not as reliant on the sample size
What are the effect sizes we’ve learned?
- Pearson’s r
- Cohen’s d
- R²
- Odds ratio
- Cramer’s V
When and how should we test for normality?
- For all parametric tests
- Using the K-S or Shapiro-Wilk test
When and how should we test for homogeneity of variance?
- Independent t-test
- Independent ANOVA
- Using Levene’s test
When and how should we test for sphericity?
- Dependent ANOVA
- Using Mauchly’s test
What do you usually assume, when assuming normality?
- That the sampling distribution of the parameter (e.g. means, or mean differences for a dependent t-test), or the residuals for regression, are normal in shape
- Not that the distribution of the sample data must be normal
What do the K-S and Shapiro-Wilk tests tell you?
- Significant test at p<.05 = violation of normality
- Non-significant test at p>.05 = normality is OK
- We want a non-significant test!
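The Shapiro-Wilk test can be run directly in scipy. A sketch with hypothetical sample data:

```python
from scipy.stats import shapiro

sample = [2.1, 2.5, 2.3, 2.8, 2.6, 2.4, 2.7, 2.2, 2.9, 2.5]

stat, p = shapiro(sample)
# p > .05: no significant departure from normality (what we want);
# p < .05: normality is violated
print(stat, p)
```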
What do you do, if there’s a difference between K-S and Shapiro-Wilks?
- Use Shapiro-Wilk :-)
What is the central limit theorem?
- As sample size increases, the random sampling distribution tends towards a normal distribution regardless of the shape of the sample data
- The tendency increases as sample size increases
- Can usually be argued with a sample size of >30
- For independent tests: at least 30 in each group
- For dependent tests: at least 30 overall!
- Can also be argued if K-S/Shapiro-Wilk tests show problems with normality
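The theorem can be demonstrated by simulation: draw many samples of size 30 from a heavily skewed population and look at the distribution of their means. A sketch using only the standard library (the exponential population here is a made-up example with mean 1):

```python
import random

random.seed(0)

# Draw many samples of size 30 from a skewed (exponential) population
sample_means = []
for _ in range(2000):
    sample = [random.expovariate(1.0) for _ in range(30)]  # population mean = 1
    sample_means.append(sum(sample) / 30)

# The sampling distribution of the mean clusters around the population mean,
# even though the population itself is strongly skewed
grand_mean = sum(sample_means) / len(sample_means)
print(grand_mean)  # close to 1
```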
What if there’s a problem with normality? :-(
- If large sample size, argue to meet normality assumptions on the basis of the central limit theorem
- If not: Use a transformation (consider bootstrapping)
- Or: use a non-parametric test
What is the homogeneity of variance?
- The assumption that the variance in the outcome variable is approximately equal at all levels (groupings) of the independent variable
- If variance is approximately equal for all groups, there is homogeneity of variance
- If variance is not equal across groups, there is heterogeneity of variance, and the assumption is violated
When is homogeneity of variance relevant?
- Independent designs
- For independent t (independent t-tests)
- For F tests (independent ANOVA)
How can we assess homogeneity of variance?
- Using Levene’s test
- Non-significant Levene’s test at p>.05=homogeneity of variance
- We want a non-significant Levene’s test!
What if we violate the assumption of homogeneity?
- For independent t-tests: if Levene’s test is significant, meaning there is heterogeneity of variance, we should report the t-statistic and degrees of freedom from the equal variances NOT assumed row in SPSS output
- For Independent ANOVA: If Levene’s test is significant, meaning there is heterogeneity of variance, report corrected F and df values such as Welch’s F
What is the assumption of Sphericity?
- Similar to the assumption of homogeneity, but for repeated measures designs
- The variances of the differences between groups are expected to be equal
How do you test for sphericity?
- Calculating the differences between each pair of conditions
- Calculating the variance of these differences
- Determining if the variances are approximately equal
- (If variance 1=variance 2=variance 3…., the assumption of sphericity is met)
How can we assess sphericity?
- Using Mauchly’s test
- Non-significant Mauchly’s test p>.05=sphericity
- We want a non-significant Mauchly’s test!
What if there’s a violation of sphericity?
- If Mauchly’s test for sphericity is significant at p<.05, you should report your findings from a corrected row in the SPSS output
- (e.g. Greenhouse-Geisser or Huynh-Feldt correction)
Why do assumptions matter?
- Many of the most common statistical tests (parametric tests) are only reliable if these assumptions are satisfied or corrections are made
- If we use uncorrected parametric tests with problematic data, there is a greater risk of drawing incorrect conclusions (type I error, especially)
What has more power? Non-parametric or parametric tests?
- Parametric tests!
What are the key factors in determining which test to use?
- Aim of research
- Level of measurement of IV and DV (categorical vs continuous)
- Research design (for group tests - independent vs repeated measures)
- Normality (e.g. K-S tests)
- Sample size (to argue the CLT for independent tests: 30 in each group; to argue the CLT for dependent tests: at least 30 overall)
- Homogeneity of variance and Sphericity
- Post hoc tests (if ANOVA)
What is correlation?
- Determining how two continuous variables are related
- E.g. what relationship, if any, exists between number of hours spent studying for an exam and exam performance?
- Correlation DOES NOT equal causation :-)
What is the most widely used correlation coefficient?
- Pearson’s r
- Ranges from -1 (perfect negative relationship) to +1 (perfect positive relationship)
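Pearson's r can be computed with scipy. A sketch using hypothetical hours-studied vs exam-score data (echoing the correlation example above):

```python
from scipy.stats import pearsonr

hours = [1, 2, 3, 4, 5, 6]          # hypothetical hours spent studying
score = [52, 55, 61, 64, 70, 75]    # hypothetical exam scores

r, p = pearsonr(hours, score)
print(r)  # close to +1: a strong positive relationship
```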
What is Cohen’s rule of thumb for Pearson’s r?
r ≥ .1 (small effect)
r ≥ .3 (medium effect)
r ≥ .5 (large effect)
What is Cohen’s rule of thumb for Cohen’s d?
d ≥ 0.2 (small effect)
d ≥ 0.5 (medium effect)
d ≥ 0.8 (large effect)
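Cohen's d for two independent groups is the mean difference divided by the pooled standard deviation. A sketch using the standard library only, with made-up groups:

```python
from statistics import mean, stdev
from math import sqrt

def cohens_d(a, b):
    """Standardised mean difference using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_sd = sqrt(((n_a - 1) * stdev(a) ** 2 + (n_b - 1) * stdev(b) ** 2)
                     / (n_a + n_b - 2))
    return (mean(a) - mean(b)) / pooled_sd

# Hypothetical groups with a clear mean difference
d = cohens_d([8, 9, 10, 9, 8, 10], [5, 6, 7, 6, 5, 7])
print(d)  # well above 0.8: a large effect by Cohen's rule of thumb
```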
What is Cohen’s rule of thumb for Odds Ratio?
OR > 1.49 (small effect)
OR > 3.49 (medium effect)
OR > 9.0 (large effect)
How do you generally report results using APA format?
1: State the type of analysis you conducted
2: State the overall finding in normal words (including mean or Mdn, SD)
3: Report the df and test statistic (F, t etc.)
4: Report the significance level (p)
5: Report the effect size (including direction) (e.g. r, d, OR)