8. Comparing means: assumptions and transformations Flashcards

Question 1

Q

What are the assumptions of statistical inference based on a normal distribution?

Answer

A

Data are sampled at random
■ for response variables conditioned on explanatory variables

Samples are independent.

The differences between observations and predictions (the residuals) are normally distributed.

The mean and variance of errors are independent of the explanatory variable(s).

One source of unmeasured random variance.

Variance among groups is equal
■ and if not, then you use an adjustment

Question 2

Q

What methods can you use when your response variable does not have a normal distribution?

Answer

A

First check how much it deviates from normality.
- use a normal quantile plot
- and a Shapiro-Wilk test

Then you can either:

ignore the violations of the assumptions
transform the data
use a non-parametric method
use a permutation test
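
The normal quantile plot mentioned above can be sketched by hand. This is a minimal illustration, not a reference implementation: the function name `normal_quantile_points`, the made-up data, and the plotting position (i - 0.5) / n are all assumptions (other plotting positions are in use). If the data are close to normal, the printed pairs fall roughly on a straight line.

```python
from statistics import NormalDist

def normal_quantile_points(values):
    """Return (theoretical z-score, observed value) pairs for a normal quantile plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, s.d. 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Plotting position (i - 0.5) / n is one common convention (assumed here).
        z = std_normal.inv_cdf((i - 0.5) / n)
        points.append((z, value))
    return points

# Made-up, right-skewed data purely for illustration.
data = [2.1, 2.4, 2.9, 3.3, 3.8, 4.6, 6.2, 9.7]
for z, value in normal_quantile_points(data):
    print(f"{z:6.2f}  {value:5.1f}")
```

Points that curve away from a straight line at one end suggest skew.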

Question 3

Q

What does a Shapiro-Wilk test do?

Answer

A

Evaluates the goodness of fit of a normal distribution to the data

Can quantify deviation from normality
■ A non-significant result doesn’t prove that the data are normally distributed, though

Question 4

Q

When is it sensible to ignore the violated assumptions of a normal distribution?

Answer

A

when the method is robust to the violation (e.g. large samples, thanks to the central limit theorem)
if the distributions of the groups have a similar shape
when using an accepted adjustment for a difference in variance/s.d. (Welch’s t-test)
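
As a rough sketch of the Welch adjustment named above: the t statistic uses each group's own variance instead of a pooled one, and the degrees of freedom come from the Welch-Satterthwaite approximation. The function name `welch_t` and the example numbers are hypothetical; a p-value would still need a t-distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Made-up groups with clearly different spreads.
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t, 3), round(df, 2))
```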

Question 5

Q

What is the central limit theorem?

Answer

A

The sum or mean of a large random sample from a non-normal population is approximately normally distributed
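
A small simulation can make this concrete. The sketch below assumes an exponential population (strongly right-skewed, mean 1, s.d. 1) and a hypothetical helper `sample_means`; the sample size of 50 and 2000 repeats are arbitrary choices. The means of the repeated samples should centre on 1 with a spread near 1/sqrt(50), as the theorem predicts.

```python
import random
from statistics import mean, stdev

def sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an exponential population; return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

means = sample_means(n=50, reps=2000)
print("mean of sample means:", round(mean(means), 3))   # expected to be near 1
print("s.d. of sample means:", round(stdev(means), 3))  # expected to be near 1/sqrt(50)
```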

Question 6

Q

How can we transform data and when do we do each?

Answer

A

log
- ratios and products of variables

natural log
- ratios and products of variables
- skewed frequency distribution
- group with the larger mean also has the larger s.d.
- data span several orders of magnitude

arcsine
- proportions

square root
- counts and right-skewed data

reciprocal
- right-skewed data

exponential
- left-skewed data

square
- left-skewed data
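
A few of the transformations above, sketched with the standard library. The function names and data are made up for illustration, and the arcsine version shown is the arcsine of the square root of the proportion, which is the usual form but an assumption about what the card intends. The example shows the case where the group with the larger mean also has the larger s.d.: after a log transform the two spreads match.

```python
import math
from statistics import stdev

def log_transform(values):
    """Natural log; values must be greater than 0."""
    return [math.log(v) for v in values]

def sqrt_transform(counts):
    """Square root; typically used for counts."""
    return [math.sqrt(c) for c in counts]

def arcsine_transform(proportions):
    """Arcsine of the square root; proportions must lie between 0 and 1."""
    return [math.asin(math.sqrt(p)) for p in proportions]

# Made-up groups: the second is the first multiplied by 10.
small = [1.0, 2.0, 4.0]
large = [10.0, 20.0, 40.0]
print(stdev(small), stdev(large))  # s.d. grows with the mean
print(stdev(log_transform(small)), stdev(log_transform(large)))  # now about equal
```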

Question 7

Q

What does a non-parametric method do?

Answer

A

Calculates probabilities in a way that does not depend on the normality of the response variable

Less powerful than the parametric equivalent, though

Question 8

Q

What are two non-parametric methods?

Answer

A

Sign test

Mann-Whitney U-test

Question 9

Q

How do you do a sign test?

Answer

A

Turn the differences between data points into a binomial data set

calculate each difference
assign ‘+’ or ‘-’ depending on whether it is > or < 0
count the number of ‘+’ and ‘-’
H0 expects the number of ‘+’ to equal the number of ‘-’
use the binomial distribution to calculate the p-value for the test
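
The steps above can be sketched as follows. The function name `sign_test` and the data are hypothetical. Two choices here are assumptions not stated on the card: differences of exactly 0 are dropped, and the p-value is two-sided (twice the smaller tail, capped at 1).

```python
from math import comb

def sign_test(differences):
    """Return (n_plus, n_minus, two-sided p-value) under H0: P('+') = 0.5."""
    n_plus = sum(1 for d in differences if d > 0)
    n_minus = sum(1 for d in differences if d < 0)
    n = n_plus + n_minus  # zero differences are dropped (assumed convention)
    k = min(n_plus, n_minus)
    # Binomial probability of k or fewer of the rarer sign when p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_plus, n_minus, min(1.0, 2 * tail)

# Made-up paired differences: 8 positive, 2 negative.
diffs = [1.2, 0.8, 2.1, -0.4, 1.5, 0.9, -0.2, 1.1, 0.7, 1.9]
print(sign_test(diffs))  # (8, 2, 0.109375)
```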

Question 10

Q

How do you do a Mann-Whitney U-test?

Answer

A

order all the data from smallest to largest
give each observation a rank, starting with 1
(tied observations share the average of their ranks, e.g. ranks 3 and 4 both become 3.5)
calculate the rank sum of each group
calculate the U statistic for each group
the larger U statistic is used as the test statistic
compare it to the critical value from a table
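
A minimal sketch of the ranking and U calculation described above, stopping short of the table lookup. The helper names are hypothetical, and U is computed as the rank sum minus n(n + 1)/2 for each group, which is one of the equivalent textbook formulas; the two U values always add up to n1 × n2.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # average of the ranks in the tied run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (larger U, U for group a, U for group b)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u_b = sum(ranks[n_a:]) - n_b * (n_b + 1) / 2
    return max(u_a, u_b), u_a, u_b

print(average_ranks([10, 20, 30, 30]))         # [1.0, 2.0, 3.5, 3.5]
print(mann_whitney_u([1, 3, 3], [3, 5, 6]))    # (8.0, 1.0, 8.0)
```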

Question 11

Q

What are the assumptions of the Mann-Whitney U-test?

Answer

A

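the data are randomly sampled
it tests whether the groups have different distributions
(it is not a robust test of equal central tendency unless the shapes match)
so the distributions of the groups should have the same shape
low power, because it uses only the ranks and not all the information in the data
(greater Type II error rate)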

Question 12

Q

What is a permutation test (what else is it known by)?

Answer

A

Use of a computer to repeatedly and randomly reshuffle the observed data
to produce a null distribution of the test statistic

aka a randomization test
(related to bootstrapping, but bootstrapping resamples with replacement)

Question 13

Q

What are the steps of a permutation test?

Answer

A

Create a permuted data set in which the values of the response variable are randomly re-ordered.

Calculate the measure of association for the permuted sample
● (e.g. the difference in means, medians, etc.)

Repeat the permutation process many times
● at least 1000 times, to create a null distribution

Compare the null distribution to the observed value of the test statistic calculated from the original data set
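
The steps above can be sketched as follows, using the difference in means as the measure of association. The function name `permutation_test`, the default of 5000 permutations, the fixed seed, and the two-sided comparison on absolute differences are all assumptions made for illustration.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Return (observed difference in means, approximate two-sided p-value)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly re-order the responses, then split back into two groups.
        rng.shuffle(pooled)
        permuted = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Count permuted differences at least as extreme as the observed one.
        if abs(permuted) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_permutations

# Made-up, clearly separated groups.
obs, p = permutation_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
print(obs, p)
```

The p-value is the fraction of the null distribution that is at least as extreme as the observed difference.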

(13 cards)