13. Dealing with assumptions Flashcards

Question 1

Q

Assumptions of t-tests

Answer

A

random samples
populations are normally distributed
TWO SAMPLE T-TEST: populations have equal variances

Question 2

Q

Can we statistically fix random sampling?

Question 3

Q

Use of histograms for normality

Answer

A

Not normal if skewed or otherwise asymmetrical
Especially clear with a large data set

Question 4

Q

How can you check if something is normally distributed?

Answer

A

Check previous data/theory

Plot on a histogram - bell curve shape!

Quantile plot (QQ plot) - expect the dots to fall along a straight line if normal!

Shapiro-Wilk test (formal test of normality) - not particularly useful in deciding what test to use
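The QQ plot idea can be sketched without a plotting library: pair each sorted observation with the matching quantile of a standard normal, then see how close the pairs are to a straight line. This is only an illustration. The plotting positions (i - 0.5) / n, the helper names `qq_points` and `pearson_r`, and the simulated samples are all assumptions, not part of the cards.

```python
import random
from statistics import NormalDist, mean

def qq_points(data):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    n = len(data)
    observed = sorted(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed

def pearson_r(x, y):
    """Pearson correlation, used here as a rough 'straightness' score."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)  # fixed seed so the sketch is repeatable
normal_sample = [rng.gauss(10, 2) for _ in range(500)]      # simulated, roughly normal
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]  # simulated, right skewed

r_normal = pearson_r(*qq_points(normal_sample))
r_skewed = pearson_r(*qq_points(skewed_sample))
print(f"straightness, normal sample: {r_normal:.3f}")
print(f"straightness, skewed sample: {r_skewed:.3f}")
```

The dots from the normal sample should hug a straight line (score very close to 1), while the skewed sample bends away from it. In practice you would still look at the plot itself rather than rely on a single number.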

Question 5

Q

what are quantiles

Answer

A

cut points that divide the range of a probability distribution into contiguous intervals w/ equal probabilities

10% below this value
20% below this value
30% below this value
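A quick way to see the "10% below, 20% below, ..." idea is to compute deciles with Python's standard library. The data set below is made up for illustration, and `statistics.quantiles` uses its default "exclusive" interpolation method, which is one of several conventions.

```python
from statistics import quantiles

data = list(range(1, 100))  # made-up data: 1, 2, ..., 99

# n=10 asks for the 9 cut points that split the data into
# 10 intervals of equal probability (deciles).
deciles = quantiles(data, n=10)
print(deciles)  # [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]

# Share of observations strictly below the first decile: roughly 10%.
share_below_first = sum(1 for y in data if y < deciles[0]) / len(data)
print(f"{share_below_first:.1%} of values fall below {deciles[0]}")
```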

Question 6

Q

Shapiro-Wilk test!

Answer

A

Used to test statistically whether a set of data comes from a normal distribution

NOT a good way to decide whether to use a t-test, BECAUSE with lots of data it is more likely to reject a false null hypothesis, so even small departures from normality get flagged. BUT mostly we care about the distribution of sample means being normal, and with lots of data it usually is. So parametric tests such as the t-test would be fine, BUT Shapiro-Wilk would say no!
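The point about sample means can be checked with a small simulation: draw from a clearly non-normal (right-skewed) population and compare the skewness of raw values with the skewness of means of n = 50. Everything here is an assumption made for illustration: the exponential population, the sample size, the number of repeats, and the helper name `skewness`.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Simple moment-based skewness; about 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n, reps = 50, 2000       # assumed sample size and number of repeated samples

# Raw draws from a right-skewed (exponential) population.
raw_values = [rng.expovariate(1.0) for _ in range(reps)]

# Means of many samples of size n from the same population.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)
]

skew_raw = skewness(raw_values)
skew_means = skewness(sample_means)
print(f"skewness of raw values:   {skew_raw:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

The sample means come out far less skewed than the raw values, which is why a test on means can be reasonable even when a formal normality test on the raw data rejects.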

Question 7

Q

Strategies if NOT normal

Answer

A

if sample size is large, sometimes parametric tests work OK anyway!

transformations (ex log) - new set of values that may fit assumptions

non-parametric tests - make fewer assumptions about the distributions the data came from, often based on ranks

permutation tests - ask if there's an association btwn two variables: shuffle the values of one variable to find the association you would get by chance, then compare to the actual association!

bootstrapping
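The last two strategies can be sketched in a few lines. The permutation test below treats "group" and "measurement" as the two variables and uses the difference in group means as the measure of association. The bootstrap resamples one sample with replacement to get a rough percentile interval for its mean. The function names, the data, and the choice of 2000 resamples are all assumptions for illustration.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # mix up which value belongs to which group
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add 1 to both counts so the observed arrangement is included.
    return (extreme + 1) / (n_perm + 1)

def bootstrap_ci_mean(sample, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean (resample with replacement)."""
    rng = random.Random(seed)
    boot_means = sorted(
        mean(rng.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return boot_means[lo_idx], boot_means[hi_idx]

# Made-up measurements for two groups.
treated = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]

p_value = permutation_p_value(treated, control)
ci_low, ci_high = bootstrap_ci_mean(treated)
print(f"permutation p-value: {p_value:.4f}")
print(f"bootstrap 95% interval for treated mean: {ci_low:.2f} to {ci_high:.2f}")
```

A small p-value means that shuffled data rarely show a difference as large as the real one, so the actual association is unlikely to be due to chance alone.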

Question 8

Q

rule of thumb for sample size

Answer

A

if sample size > ~50, the normal approximations may work

Question 9

Q

Really great test:

Answer

A

Welch’s t-test (two-sample t-test that does NOT assume equal variances)
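For reference, Welch's t-test compares two means without assuming equal variances: each sample keeps its own variance in the standard error, and the degrees of freedom are approximated with the Welch-Satterthwaite formula. The sketch below computes only the test statistic and degrees of freedom. The function name `welch_t` and the data are made up, and the p-value would still need a t distribution (a table or a statistics library).

```python
from statistics import mean, variance

def welch_t(sample_1, sample_2):
    """Return (t, df) for Welch's two-sample t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    # Squared standard error of each mean, using each sample's own variance.
    se1_sq = variance(sample_1) / n1
    se2_sq = variance(sample_2) / n2
    t = (mean(sample_1) - mean(sample_2)) / (se1_sq + se2_sq) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df

# Made-up samples where the second group is four times as variable.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]

t_stat, df = welch_t(group_1, group_2)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Note that the degrees of freedom are usually not a whole number and come out smaller than n1 + n2 - 2 when the variances differ.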

Question 10

Q

If sample sizes are equal and large, what sort of difference in variance is APPROXIMATELY OK

Answer

A

ten-fold difference!

Question 11

Q

Requirements of data transformations

Answer

A

Same transformation applied to each individual (for a specific variable)

One-to-one correspondence between original and transformed values - NOT absolute values (Y and -Y give the same result)!!! Have to be able to go backwards!

monotonic relationship w/original values (ex large values stay larger)

Question 12

Q

Non-Parametric Tests

Answer

A

Assume less than parametric tests do about the underlying distributions

Most often RANK each data point in all samples from lowest to highest

Question 13

Q

Log transformation

Answer

A

Y’ = ln[Y]

Good when variable is likely to be the result of multiplication or division of various components - what was multiplicative, becomes additive

EX. growth - things grow by 10% a year, not by +10 mm a year, so sizes are log-normal, not normal!

Good for RIGHT skewed data, not left!!!

Good when variance becomes larger in groups where mean is larger
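A tiny example shows what the log transformation does to multiplicative, right-skewed data. The values are made up (a quantity that doubles at each step), and the names are illustrative only.

```python
import math
from statistics import mean, median

sizes = [1, 2, 4, 8, 16, 32, 64]  # made-up data: doubles at each step

# Y' = ln[Y], applied to every individual.
log_sizes = [math.log(y) for y in sizes]

# On the original scale the mean sits well above the median (right skew).
print(f"original: mean = {mean(sizes):.2f}, median = {median(sizes)}")

# On the log scale the steps are additive (each is ln 2), so the data are symmetric.
steps = [b - a for a, b in zip(log_sizes, log_sizes[1:])]
print(f"log scale: mean = {mean(log_sizes):.3f}, median = {median(log_sizes):.3f}")

# The transformation can be undone with exp, so nothing is lost.
back_transformed = [math.exp(v) for v in log_sizes]

# Back-transforming the mean of the logs gives the geometric mean.
geometric_mean = math.exp(mean(log_sizes))
print(f"back-transformed mean of logs: {geometric_mean:.2f}")
```

The example also satisfies the general requirements for a transformation: the same function is applied to every value, it is one-to-one (exp reverses it), and it is monotonic (larger values stay larger).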

Question 14

Q

Test to compare central tendencies of two groups using ranks

Answer

A

Mann-Whitney U test
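The rank-based idea behind the Mann-Whitney U test can be sketched as follows: pool both groups, rank every value from lowest to highest (ties share the average rank), sum the ranks of one group, and convert that sum into U. The function name `mann_whitney_u` and the data are made up. The sketch stops at the U statistics and does not compute a p-value.

```python
def mann_whitney_u(group_1, group_2):
    """Return (U1, U2) computed from ranks of the pooled data."""
    pooled = sorted(
        [(v, 1) for v in group_1] + [(v, 2) for v in group_2]
    )
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n1, n2 = len(group_1), len(group_2)
    rank_sum_1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 1)
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2

# Made-up groups with no overlap: every value in the first is below the second.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # (0.0, 9.0)

# Made-up groups with one tie across groups.
print(mann_whitney_u([1, 2], [2, 3]))  # (0.5, 3.5)
```

U1 and U2 always add up to n1 * n2. A U of 0 for one group means the two groups do not overlap at all, while similar U values suggest the groups are well mixed.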

Question 15

Q