Statistics Flashcards
What is descriptive research?
- Aim: to describe characteristics of a sample (what kind, how much etc)
- Used to summarise, organise and simplify sample data
- Often based on measurement of a single variable (univariate statistics)
- Relies on measures of central tendency, frequencies, spread, distributional shape, etc.
What is inferential research?
- Null hypothesis testing
- Aim: to infer characteristics of the population
- Often interested in multiple variables (bivariate, multivariate)
- Relies on a wide range of different tests (e.g. correlation, regression, t-tests, ANOVA, chi-square etc.)
- Allows us to make probability statements about how confident we can be that our sample findings reflect the ”true” state of things in the population
Level of measurement: What are the two main types of variables?
- Categorical
- binary (2 levels)
- nominal (3+ levels)
- ordinal (ordered, no equal intervals)
- Continuous (Interval, ratio)
- Interval (ordered, equal intervals, no absolute zero)
- Ratio (ordered, equal intervals, absolute zero)
How can you keep error to a minimum?
- By making sure we use careful sampling strategies and use measures that are valid and reliable
- Validity+reliability=credibility
What are the critical values of z-scores?
- 95% of z-scores lie between -1.96 and 1.96
- 99% of z-scores lie between -2.58 and 2.58
- 99.9% of z-scores lie between -3.29 and 3.29
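These cut-offs can be checked with the normal quantile function. A minimal sketch, assuming scipy is available:

```python
from scipy.stats import norm

# Two-tailed critical values: the quantile leaving alpha/2 in each tail
for confidence in (0.95, 0.99, 0.999):
    alpha = 1 - confidence
    z_crit = norm.ppf(1 - alpha / 2)  # e.g. 1.96 for 95%
    print(f"{confidence:.1%}: ±{z_crit:.2f}")
```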
What does the z-score represent?
- The distance a particular observation is away from the mean, measured in standard deviations
- The standard normal distribution has a mean of 0 and a standard deviation of 1
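The calculation itself is simple. A sketch with made-up numbers (an exam score of 85 where the class mean is 70 with SD 10):

```python
def z_score(x, mean, sd):
    """Distance of observation x from the mean, in standard deviations."""
    return (x - mean) / sd

print(z_score(85, 70, 10))  # 1.5: the score is 1.5 SDs above the mean
```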
What are the two ways you can carry out inferential hypothesis-based research?
- Correlational research (observing what naturally happens without interfering)
- Experimental research (manipulating one variable and observing the effect on another variable – can be used to infer cause/effect)
What are the two types of experimental designs?
- Independent/between subject (different participants in different groups)
- Dependent/repeated measures (same participants exposed to all conditions)
What is systematic variance?
- Variation due to genuine effect
- Variance that can be explained by our model
- Signal/effect - What we want to measure
What is unsystematic variance?
- Noise/error
- Small differences in outcome due to unknown factors
What is the most important formula of all? :-)
- outcome=(model)+error
- the way that effect and error is measured varies for each type of statistical test
- But for a test to be ”significant”, effect should be considerably greater than error (chance)
What is the null hypothesis?
- What we actually test in statistics
- Assume there is no effect, then try to reject this assumption
- H0: no effect in the population
What is the alternative hypothesis?
- What we’re really interested in, when trying to reject the null hypothesis
- Can be:
- Non-directional: H1: There is an effect in the population
- Directional: H1: There is an effect in a specified direction in the population
What are significance tests for?
- For determining whether to reject or fail to reject the null hypothesis
- For determining with what level of confidence we reject the null hypothesis (typically 95%, 99% or 99.9%)
What are the z-distribution, t-distribution, F-distribution etc?
- Test statistics
- A statistic for which we know how frequently different values occur
- Theoretical sampling distributions that assume the null hypothesis
- Test statistic=variance explained by the model (effect)/variance not explained by the model (error)
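The signal-to-noise idea can be illustrated with an independent t-test, where t grows as the difference between group means (effect) grows relative to the within-group variability (error). A sketch with hypothetical data, assuming scipy:

```python
from scipy.stats import ttest_ind

group_a = [5, 6, 7, 6, 5, 7]    # hypothetical scores, condition A
group_b = [8, 9, 10, 9, 8, 10]  # hypothetical scores, condition B

# t = (difference between means) / (standard error of that difference)
t, p = ttest_ind(group_a, group_b)
print(t, p)  # large |t| with small p: the effect clearly exceeds the error
```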
What is the confidence level for p<.05, p<.01 and p<.001?
- p<.05 = 95%
- p<.01 = 99%
- p<.001 = 99.9%
If p is high (p>.05)…
- … the null applies! :-)
If p is low (p<.05)…
- … the null must go! :-)
What is the relationship between critical values, significance and confidence?
- As the critical value increases (gets further away from the null), confidence increases
- As confidence increases, p (probability of making a type 1 error) decreases
- Confidence+p=1.0 or 100%
What are critical cut-offs dependent on?
- Type of test
- 1- vs 2-tailed significance
- P level
- Degrees of freedom (calculated differently for different tests)
What is a type I error?
- False positive
- Saying there is an effect when there isn’t
- ”You’re pregnant” to a man :-)
What is a type II error?
- False negative
- Saying there isn’t an effect when there is
- ”You’re not pregnant” to a pregnant woman :-)
What is NHSTP?
- Null Hypothesis Significance Testing Procedure
- Black and white thinking -> limitations
- We should take a middle ground, combining NHSTP and effect sizes
What is the point of confidence intervals?
- Can be useful in helping us to estimate the range within which the true population mean (or some other parameter) would fall in most samples
- Typically 95% (p<.05)
What is an effect size?
- A standardized measure of the size of an effect
- Comparable across studies
- Not as reliant on the sample size
What are the effect sizes we’ve learned?
- Pearson’s r
- Cohen’s d
- R²
- Odds ratio
- Cramer’s V
When and how should we test for normality?
- For all parametric tests
- Using the K-S or Shapiro-Wilk test
When and how should we test for homogeneity of variance?
- Independent t-test
- Independent ANOVA
- Using Levene’s test
When and how should we test for sphericity?
- Dependent ANOVA
- Using Mauchly’s test
What do you usually assume, when assuming normality?
- That the sampling distribution of the parameter (e.g. means, or mean differences for a dependent t-test), or the residuals for regression, are normal in shape
- Not that the distribution of the sample data must be normal
What do the K-S and Shapiro-Wilk tests tell you?
- Significant test at p<.05 = violation of normality
- Non-significant test at p>.05 = normality is OK
- We want a non-significant test!
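The Shapiro-Wilk test can be run directly in scipy. A sketch with hypothetical sample data:

```python
from scipy.stats import shapiro

sample = [2.1, 2.5, 2.3, 2.8, 2.6, 2.4, 2.7, 2.2, 2.9, 2.5]

stat, p = shapiro(sample)
# p > .05: no significant departure from normality (what we want);
# p < .05: normality is violated
print(stat, p)
```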
What do you do, if there’s a difference between K-S and Shapiro-Wilks?
- Use Shapiro-Wilk :-)
What is the central limit theorem?
- As sample size increases, the random sampling distribution tends towards a normal distribution regardless of the shape of the sample data
- The tendency increases as sample size increases
- Can usually be argued with a sample size of >30
- For independent tests: at least 30 in each group
- For dependent tests: at least 30 overall!
- Can also be argued if K-S/Shapiro-Wilk tests show problems with normality
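The theorem can be demonstrated by simulation: draw many samples of size 30 from a heavily skewed population and look at the distribution of their means. A sketch using only the standard library (the exponential population here is a made-up example with mean 1):

```python
import random

random.seed(0)

# Draw many samples of size 30 from a skewed (exponential) population
sample_means = []
for _ in range(2000):
    sample = [random.expovariate(1.0) for _ in range(30)]  # population mean = 1
    sample_means.append(sum(sample) / 30)

# The sampling distribution of the mean clusters around the population mean,
# even though the population itself is strongly skewed
grand_mean = sum(sample_means) / len(sample_means)
print(grand_mean)  # close to 1
```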
What if there’s a problem with normality? :-(
- If large sample size, argue to meet normality assumptions on the basis of the central limit theorem
- If not: Use a transformation (consider bootstrapping)
- Or: use a non-parametric test
What is the homogeneity of variance?
- The assumption that the variance in the outcome variable is approximately equal at all levels (groupings) of the independent variable
- If variance is approximately equal for all groups, there is homogeneity of variance
- If variance is not equal across groups, there is heterogeneity of variance, and the assumption is violated
When is homogeneity of variance relevant?
- Independent designs
- For independent t (independent t-tests)
- For F tests (independent ANOVA)
How can we assess homogeneity of variance?
- Using Levene’s test
- Non-significant Levene’s test at p>.05=homogeneity of variance
- We want a non-significant Levene’s test!
What if we violate the assumption of homogeneity?
- For independent t-tests: if Levene’s test is significant, meaning there is heterogeneity of variance, we should report the t-statistic and degrees of freedom from the equal variances NOT assumed row in SPSS output
- For Independent ANOVA: If Levene’s test is significant, meaning there is heterogeneity of variance, report corrected F and df values such as Welch’s F
What is the assumption of Sphericity?
- Similar to the assumption of homogeneity, but for repeated measures designs
- The variances of the differences between groups are expected to be equal
How do you test for sphericity?
- Calculating the differences between each pair of conditions
- Calculating the variance of these differences
- Determining if the variances are approximately equal
- (If variance 1=variance 2=variance 3…., the assumption of sphericity is met)
How can we assess sphericity?
- Using Mauchly’s test
- Non-significant Mauchly’s test p>.05=sphericity
- We want a non-significant Mauchly’s test!
What if there’s a violation of sphericity?
- If Mauchly’s test for sphericity is significant at p<.05, you should report your findings from a corrected row in the SPSS output
- (e.g. Greenhouse-Geisser or Huynh-Feldt correction)
Why do assumptions matter?
- Many of the most common statistical tests (parametric tests) are only reliable if these assumptions are satisfied or corrections are made
- If we use uncorrected parametric tests with problematic data, there is a greater risk of drawing incorrect conclusions (type I error, especially)
What has more power? Non-parametric or parametric tests?
- Parametric tests!
What are the key factors in determining which test to use?
- Aim of research
- Level of measurement of IV and DV (categorical vs continuous)
- Research design (for group tests - independent vs repeated measures)
- Normality (e.g. K-S tests)
- Sample size (to argue the CLT for independent tests: 30 in each group; to argue the CLT for dependent tests: at least 30 overall)
- Homogeneity of variance and Sphericity
- Post hoc tests (if ANOVA)
What is correlation?
- Determining how two continuous variables are related
- E.g. what relationship, if any, exists between number of hours spent studying for an exam and exam performance?
- Correlation DOES NOT equal causation :-)
What is the most widely used correlation coefficient?
- Pearson’s r
- Ranges from -1 (perfect negative relationship) to +1 (perfect positive relationship)
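Pearson's r can be computed with scipy. A sketch using hypothetical hours-studied vs exam-score data (echoing the correlation example above):

```python
from scipy.stats import pearsonr

hours = [1, 2, 3, 4, 5, 6]          # hypothetical hours spent studying
score = [52, 55, 61, 64, 70, 75]    # hypothetical exam scores

r, p = pearsonr(hours, score)
print(r)  # close to +1: a strong positive relationship
```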
What is Cohen’s rule of thumb for Pearson’s r?
r ≥ .1 (small effect)
r ≥ .3 (medium effect)
r ≥ .5 (large effect)
What is Cohen’s rule of thumb for Cohen’s d?
d ≥ 0.2 (small effect)
d ≥ 0.5 (medium effect)
d ≥ 0.8 (large effect)
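Cohen's d for two independent groups is the mean difference divided by the pooled standard deviation. A sketch using the standard library only, with made-up groups:

```python
from statistics import mean, stdev
from math import sqrt

def cohens_d(a, b):
    """Standardised mean difference using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_sd = sqrt(((n_a - 1) * stdev(a) ** 2 + (n_b - 1) * stdev(b) ** 2)
                     / (n_a + n_b - 2))
    return (mean(a) - mean(b)) / pooled_sd

# Hypothetical groups with a clear mean difference
d = cohens_d([8, 9, 10, 9, 8, 10], [5, 6, 7, 6, 5, 7])
print(d)  # well above 0.8: a large effect by Cohen's rule of thumb
```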
What is Cohen’s rule of thumb for Odds Ratio?
OR > 1.49 (small effect)
OR > 3.49 (medium effect)
OR > 9.0 (large effect)
How do you generally report results using APA format?
1: State the type of analysis you conducted
2: State the overall finding in normal words (including mean or Mdn, SD)
3: Report the df and test statistic (F, t etc.)
4: Report the significance level (p)
5: Report the effect size (including direction) (e.g. r, d, OR)