Assumptions Flashcards

1
Q

Detecting deviations from normality

A

1) Histograms

2) Normal quantile plots

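Compares each observation in the sample with the corresponding quantile expected from a standard normal distribution. Points that fall close to a straight line suggest the data are approximately normal.

3) Formal test of normality (Shapiro-Wilk test)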
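
The normal quantile plot described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `normal_quantile_points`, the plotting position (i - 0.5) / n, and the sample values are assumptions made for the example, not part of the original card.

```python
from statistics import NormalDist

def normal_quantile_points(sample):
    """Pair each sorted observation with its expected standard normal quantile."""
    n = len(sample)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(sorted(sample), start=1):
        # (i - 0.5) / n is one common plotting-position convention (assumed here).
        expected = std_normal.inv_cdf((i - 0.5) / n)
        points.append((expected, value))
    return points

# Made-up, right-skewed values for illustration only.
data = [4.1, 5.0, 5.2, 5.9, 6.3, 7.8, 12.4]
for expected, observed in normal_quantile_points(data):
    print(f"{expected:6.2f}  {observed:5.1f}")
```

Plotting the observed values against the expected quantiles gives the normal quantile plot; strong curvature away from a straight line points to a departure from normality.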

2
Q

Ways to deal with violations of assumptions

A

1) Ignore the violation of the assumptions

2) Transform the data – use a mathematical transformation method to alter the distribution.

3) Use a non‐parametric method – these methods calculate probabilities in a way that does not depend on whether the response variable has a normal distribution.

4) Use a permutation test (or bootstrapping) – use a computer to repeatedly and randomly rearrange or resample the observed data, producing a null distribution from a large number of replicates.

3
Q

Ignoring violations of assumptions

A

 A statistical method is robust if violations of its assumptions do not greatly affect its results.

One-sample t-test:
- Robust to skew if the sample size is large
- Never robust to outliers

Two-sample t-test:
- Robust to skew if both samples are skewed in the same direction and the sample sizes are above 30
- Robust to skew in different directions if the sample sizes are above 500
- Robust to a difference in SD of up to 3-fold, as long as the sample sizes are equal and greater than 30
- Never robust to outliers

F-test:
- Always requires a normal distribution

4
Q

Transforming the data

A

Log transformation
-> Right-skewed data
-> Group with the larger mean also has the larger SD
-> Data span multiple orders of magnitude

Arcsine transformation
-> Proportion data

Square-root transformation
-> Count data
-> Right-skewed data
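
The three transformations above can be sketched in a few lines of standard-library Python. The function names are hypothetical. The arcsine transform is written in its usual arcsine-square-root form, and adding 0.5 to counts before taking the square root is one common convention that is assumed here rather than stated on the card.

```python
import math

def log_transform(values):
    # Natural log; only defined for strictly positive values.
    return [math.log(v) for v in values]

def arcsine_transform(proportions):
    # Arcsine square-root transform for proportions between 0 and 1.
    return [math.asin(math.sqrt(p)) for p in proportions]

def sqrt_transform(counts):
    # The + 0.5 offset is an assumed convention for count data.
    return [math.sqrt(c + 0.5) for c in counts]

# Made-up values for illustration only.
print(log_transform([1.0, 10.0, 100.0, 1000.0]))
print(arcsine_transform([0.1, 0.5, 0.9]))
print(sqrt_transform([0, 3, 12]))
```

After transforming, the assumptions should be rechecked on the transformed values, for example with a histogram or normal quantile plot.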

5
Q

Non-parametric tests

A

These methods calculate probabilities in a way that does not depend on whether the response variable has a normal distribution.

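1) The sign test (alternative to the paired/one-sample t-test)

This test compares the median of a sample to a constant specified in the null hypothesis.

  • Calculate the difference between each observation and the hypothesized median
  • Assign + or - to each difference (ignore differences of zero and reduce n accordingly)
  • Count the number of negative differences (below 0)
  • Use the binomial distribution (p = 0.5) to find the probability of a count that extreme
  • Multiply by 2 for a two-tailed P-value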
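
The sign test steps above can be sketched as follows, using only the standard library. The function name `sign_test` is hypothetical, and taking the smaller of the negative and positive counts (so that the summed tail is always the extreme one) is an assumption added for the example.

```python
from math import comb

def sign_test(values, hypothesized_median):
    """Two-tailed sign test P-value for a hypothesized median."""
    diffs = [v - hypothesized_median for v in values]
    diffs = [d for d in diffs if d != 0]  # ignore ties and reduce n
    n = len(diffs)
    n_negative = sum(1 for d in diffs if d < 0)
    # Assumed convention: work with the smaller of the two sign counts.
    k = min(n_negative, n - n_negative)
    # Binomial probability of k or fewer, with p = 0.5 under the null.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # multiply by 2, capped at 1

# Made-up data: are these values consistent with a median of 0?
print(sign_test([1.2, 0.4, 2.5, 3.1, 0.9, 1.7, 2.2, 0.3, 1.1, 4.0], 0))
```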

2) Wilcoxon signed-rank test (alternative to the paired t-test)

The Wilcoxon signed-rank test retains information about magnitudes—that is, how far above or below the hypothesized median each data point lies.

Assumes a symmetrical distribution

3) Mann-Whitney U-test (alternative to the two-sample t-test)

This test compares the distributions of two groups.

  • Rank all the data together and sum the ranks for each group
  • Calculate U for the group with the lower rank total
  • Find the critical value of U from a statistical table using the sample sizes
  • Compare it to the smaller of the calculated U values (to be safe)
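
The ranking and U calculation above can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes the convention U1 = n1*n2 + n1*(n1 + 1)/2 - R1 with U2 = n1*n2 - U1 (other texts define U1 the other way round, which simply swaps the two values), and it gives tied values their average rank. The critical value still has to come from a statistical table.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Return (U1, U2) for two groups under the convention stated above."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # rank sum of group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return u1, u2

# Made-up data for illustration only.
u1, u2 = mann_whitney_u([3.1, 4.5, 2.8, 5.0], [6.2, 7.1, 5.9, 8.4, 6.6])
print(u1, u2, min(u1, u2))
```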
6
Q

Assumptions of the Mann-Whitney U-test

A
  1. It assumes that the data are randomly sampled.
  2. It tests whether the data have different distributions. It is not a robust test of whether the groups have the same measure of central tendency (i.e. means/medians).
  3. Mann-Whitney can be used to test for similarity of means/medians only if the distributions have the same shape.
  4. It has lower power (a higher Type II error rate) than a parametric test, because ranks do not use all the information available in the data.
7
Q

Permutation test

A

A permutation test generates a null distribution for the association between two variables by repeatedly and randomly rearranging the values of one of the two variables in the data.

  • Create a permuted set of data
  • Calculate the test statistic for the permuted data
  • Repeat at least 1000 times
  • Use the results to create the null distribution
  • Calculate the proportion of values in the null distribution that are as extreme as or more extreme than the observed value
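
The permutation procedure above can be sketched for a two-group comparison as follows. This is a minimal illustration, not a definitive implementation: the function name `permutation_test`, the choice of the difference in means as the test statistic, the two-tailed comparison on absolute values, and the fixed seed are all assumptions made for the example.

```python
import random

def permutation_test(group1, group2, n_permutations=1000, seed=0):
    """Return (observed difference in means, two-tailed permutation P-value)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n1 = len(group1)
    pooled = list(group1) + list(group2)

    def mean_diff(values):
        # First n1 values play the role of group 1, the rest group 2.
        a, b = values[:n1], values[n1:]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = mean_diff(pooled)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        if abs(mean_diff(pooled)) >= abs(observed):
            extreme += 1  # as extreme as or more extreme than observed
    return observed, extreme / n_permutations

# Made-up data for illustration only.
obs, p_value = permutation_test([12.1, 14.3, 11.8, 13.5], [15.2, 16.8, 14.9, 17.1])
print(obs, p_value)
```

Shuffling the pooled values and splitting them at the original group size is equivalent to rearranging the group labels while leaving the measurements in place.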
8
Q

Assumptions of permutation test

A
  • The data must be a random sample from the population
  • For permutation tests that compare means or medians between groups, the distribution of the variable must have the same shape in every population
9
Q

Parametric vs non-parametric

A

Parametric: Statistical methods—such as the one-sample, paired, and two-sample t-tests—that make assumptions about the distribution of variables

Non-parametric: Methods that do not make assumptions about the distribution of variables

10
Q

Assumptions when using a normal distribution for statistical inference

A

1) Data are sampled at random (for response variables conditioned on explanatory variables)
2) Samples are independent.
3) The differences between observations and predictions (the residuals) are normally distributed.
4) The mean and variance of errors are independent of the explanatory variable(s).
5) One source of unmeasured random variance.
6) Variance among groups is equal (and if not, then you use an adjustment)
