Assumptions Flashcards

Question 1

Q

Detecting departures from normality

Answer

A

1) Histograms

2) Normal quantile plots

Compares each observation in the sample with the corresponding quantile expected from a standard normal distribution

3) Formal test of normality (Shapiro-Wilk test)
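
As a rough illustration of what a normal quantile plot compares, the sketch below pairs each sorted observation with the quantile expected from a standard normal distribution. The function name, the example data, and the plotting position (i - 0.5) / n are assumptions made for illustration; other plotting positions are also in use. Points that fall close to a straight line suggest approximate normality.

```python
from statistics import NormalDist

def normal_quantile_points(sample):
    """Pair each sorted observation with its expected standard normal quantile."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n is one common convention (an assumption here).
    return [
        (std_normal.inv_cdf((i - 0.5) / n), y)
        for i, y in enumerate(ordered, start=1)
    ]

# Hypothetical sample with one large value in the right tail.
points = normal_quantile_points([4.1, 5.0, 5.2, 5.9, 6.3, 7.7, 12.4])
for expected_quantile, observed in points:
    print(f"{expected_quantile:6.2f}  {observed}")
```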

Question 2

Q

Ways to deal with violations of assumptions

Answer

A

1) Ignore the violation of the assumptions

2) Transform the data – use a mathematical transformation method to alter the distribution.

3) Use a non‐parametric method – these methods calculate probabilities in a way that does not depend on whether the response variable has a normal distribution.

4) Use a permutation test (or bootstrapping) – use a computer to repeatedly and randomly rearrange (or resample) your data to produce a null distribution built from a large number of simulated samples

Question 3

Q

Ignoring violations of assumptions

Answer

A

A statistical method is robust if violations of its assumptions do not greatly affect its results.

One-sample t-test:
- Robust to skew if the sample size is large
- Never robust to outliers

Two-sample t-test:
- Robust to skew if both samples are skewed in the same direction and the sample sizes are above 30
- Robust to skew in different directions if the sample sizes are above 500
- Robust to a difference in SD of up to 3-fold, as long as the sample sizes are equal and greater than 30
- Never robust to outliers

F-test:
- Always requires normally distributed data (not robust to departures from normality)

Question 4

Q

Transforming the data

Answer

A

Log transformation
-> Right-skewed data
-> Group with the larger mean also has the larger SD
-> Data span several orders of magnitude

Arcsine transformation
-> Proportion data

Square-root transformation
-> Count data
-> Right-skewed data
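
A minimal sketch of the three transformations, using only the standard library. The function names and example data are made up for illustration. Two conventions are assumed here and not stated on the card: the arcsine transformation is taken as arcsin(sqrt(p)), and the square-root transformation adds 0.5 to each count before taking the root. The example shows the log transformation equalizing the SDs of two groups whose spread grows with the mean.

```python
import math
from statistics import stdev

def log_transform(values):
    # Natural log; every value must be greater than 0.
    return [math.log(v) for v in values]

def arcsine_transform(proportions):
    # Assumed convention: arcsin of the square root of each proportion (0 <= p <= 1).
    return [math.asin(math.sqrt(p)) for p in proportions]

def sqrt_transform(counts):
    # Assumed convention: add 0.5 to each count before taking the square root.
    return [math.sqrt(c + 0.5) for c in counts]

# Hypothetical groups: group_b has a larger mean and a larger SD than group_a.
group_a = [1.0, 2.0, 4.0, 8.0]
group_b = [10.0, 20.0, 40.0, 80.0]

raw_ratio = stdev(group_b) / stdev(group_a)
log_ratio = stdev(log_transform(group_b)) / stdev(log_transform(group_a))
print(f"SD ratio before log: {raw_ratio:.2f}, after log: {log_ratio:.2f}")
```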

Question 5

Q

Non-parametric tests

Answer

A

These methods calculate probabilities in a way that does not depend on whether the response variable has a normal distribution.

1) The sign test (alternative to the paired / one-sample t-test)

This test compares the median of a sample to a constant specified in the null hypothesis

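Calculate the difference between each observation and the hypothesized median
Assign each difference a + or - sign (ignore differences of zero and reduce n accordingly)
Count the number of negative differences (below 0)
Use the binomial distribution with p = 0.5 to find the probability of a count at least that extreme
Multiply by 2 to get the two-sided P-value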
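
The steps above can be sketched as follows. The function name and the data are hypothetical. One assumption is added: the code uses whichever sign is rarer as the count, which matches "count the number below 0" whenever negatives are the minority, and it caps the doubled probability at 1.

```python
from math import comb

def sign_test(values, null_median):
    """Two-sided sign test P-value for the median of a sample."""
    # Differences from the hypothesized median; zeros are dropped, which reduces n.
    diffs = [v - null_median for v in values if v != null_median]
    n = len(diffs)
    n_negative = sum(1 for d in diffs if d < 0)
    # Assumption: use the rarer sign so the tail is always the small one.
    k = min(n_negative, n - n_negative)
    # Binomial probability of k or fewer with p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical data: one negative value, one exact zero, seven positive values.
sample = [-1.2, 0.4, 2.1, 3.3, 0.0, 1.8, 2.6, 0.9, 1.1]
p_value = sign_test(sample, null_median=0)
print(p_value)
```

With eight non-zero differences and one negative, the tail is (1 + 8) / 256, so the doubled value is 18 / 256.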

2) Wilcoxon signed-rank test (alternative to the paired t-test)

The Wilcoxon signed-rank test retains information about magnitudes—that is, how far above or below the hypothesized median each data point lies.

Assumes the distribution is symmetrical around the median
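
To show how the magnitudes enter the calculation, here is a small sketch of the signed-rank statistic. The function name and data are hypothetical. The exact P-value is found by enumerating every possible assignment of signs, which is an illustrative choice that is only practical for small samples; statistical tables or software are the usual route.

```python
from itertools import product

def wilcoxon_signed_rank(values, null_median):
    """Return the smaller rank sum W and an exact two-sided P-value."""
    diffs = [v - null_median for v in values if v != null_median]
    sizes = [abs(d) for d in diffs]
    # Rank the absolute differences; tied values share the average rank.
    ranks = [
        sum(s < x for s in sizes) + (sum(s == x for s in sizes) + 1) / 2
        for x in sizes
    ]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Under the null hypothesis each rank is equally likely to be + or -.
    total = sum(ranks)
    as_extreme = 0
    for signs in product((False, True), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= w:
            as_extreme += 1
    return w, as_extreme / 2 ** len(ranks)

# Hypothetical differences from a hypothesized median of 0.
w, p_value = wilcoxon_signed_rank([1.5, -0.5, 2.0, 3.0, 2.5], null_median=0)
print(w, p_value)
```

Here the only negative difference is also the smallest in magnitude, so it receives rank 1 and W is 1.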

3) Mann-Whitney U-test (alternative to the two-sample t-test)

This test compares the distributions of two groups.

Rank all the data together and sum the ranks for each group
Calculate U from the rank sums
Look up the critical value of U in a statistical table using the two sample sizes
Compare the smaller calculated U with the critical value (to be safe)
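
A minimal sketch of the ranking and U calculation. The function name and data are hypothetical, and the formula U1 = R1 - n1(n1 + 1)/2 is one common convention assumed here. The critical value still has to come from a statistical table, so the code stops at the two U values.

```python
def mann_whitney_u(group1, group2):
    """Return (U1, U2) computed from the pooled ranks of two groups."""
    pooled = list(group1) + list(group2)

    def rank(x):
        # Rank within the pooled data; tied values share the average rank.
        return sum(v < x for v in pooled) + (sum(v == x for v in pooled) + 1) / 2

    n1, n2 = len(group1), len(group2)
    r1 = sum(rank(x) for x in group1)  # rank sum for group 1
    u1 = r1 - n1 * (n1 + 1) / 2        # assumed convention for U1
    u2 = n1 * n2 - u1                  # U1 + U2 always equals n1 * n2
    return u1, u2

# Hypothetical groups with no overlap between them.
u1, u2 = mann_whitney_u([1.1, 2.3, 2.9], [3.4, 4.8, 5.0, 6.2])
smaller_u = min(u1, u2)
print(u1, u2, smaller_u)
```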

Question 6

Q

Assumptions of the Mann-Whitney U-test

Answer

A

It assumes that the data are randomly sampled.
Tests whether the data have different distributions. It is not a robust test of whether the data have the same measures of central tendency (i.e. means/medians).
Mann-Whitney can be used to test for a difference in means/medians only if the distributions have the same shape.
Lower power (higher Type II error rate) than the two-sample t-test, because ranks do not use all the information available in the data

Question 7

Q

Permutation test

Answer

A

A permutation test generates a null distribution for the association between two variables by repeatedly and randomly rearranging the values of one of the two variables in the data

Create a permuted set of data
Repeat at least 1000 times
Create the null distribution from the permuted results
Calculate the proportion of values in the null distribution that are as extreme as or more extreme than the observed value
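
The steps above can be sketched for a difference in means between two groups. The function name, the data, the default of 10,000 permutations, and the fixed seed are illustrative assumptions. Shuffling the pooled values and splitting them again is equivalent to randomly rearranging the group labels.

```python
import random
from statistics import mean

def permutation_test(group1, group2, n_permutations=10000, seed=0):
    """Return the observed difference in means and a two-sided P-value."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(group1) - mean(group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    as_extreme = 0
    for _ in range(n_permutations):
        # One permuted data set: shuffle, then split into the original group sizes.
        rng.shuffle(pooled)
        diff = mean(pooled[:n1]) - mean(pooled[n1:])
        # Count null values as extreme as or more extreme than the observed one.
        if abs(diff) >= abs(observed):
            as_extreme += 1
    return observed, as_extreme / n_permutations

# Hypothetical measurements for two groups.
observed, p_value = permutation_test(
    [12.1, 14.3, 13.8, 15.2, 14.9],
    [10.2, 11.5, 9.8, 12.0, 10.9],
)
print(observed, p_value)
```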

Question 8

Q

Assumptions of permutation test

Answer

A

The data must be a random sample from the population
For permutation tests that compare means or medians between groups, the distribution of the variable must have the same shape in every population

Question 9

Q

Parametric vs non-parametric

Answer

A

Parametric: Statistical methods—such as the one-sample, paired, and two-sample t-tests—that make assumptions about the distribution of variables

Non-parametric: Methods that make fewer assumptions about the distribution of variables (in particular, they do not assume a normal distribution)

Question 10

Q

Assumptions when using a normal distribution for statistical inference

Answer

A

1) Data are sampled at random (for response variables conditioned on explanatory variables)
2) Samples are independent.
3) The differences between observations and predictions (the residuals) are normally distributed.
4) The mean and variance of errors are independent of the explanatory variable(s).
5) One source of unmeasured random variance.
6) Variance among groups is equal (and if not, then you use an adjustment)

(10 cards)