Lec 4/5/6/7 Flashcards

1
Q

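What are confidence intervals (CIs)?

A

A 95% CI is a range calculated from a sample such that, if sampling were repeated many times, about 95% of the resulting CIs would be expected to include the true population mean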
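
The "95% of sample CIs" idea can be checked by simulation. The sketch below is only an illustration: the population (normal, mean 10, SD 2), the sample size and the function names are all assumptions, and it uses the normal critical value 1.96 rather than a t value.

```python
import random
import statistics

def ci95_mean(sample):
    """Approximate 95% CI for the mean (normal critical value 1.96)."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - 1.96 * se, m + 1.96 * se

def coverage(true_mean=10.0, sd=2.0, n=50, n_samples=2000, seed=1):
    """Fraction of repeated-sample CIs that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw a fresh sample from the (assumed) population each time.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        lo, hi = ci95_mean(sample)
        if lo <= true_mean <= hi:
            hits += 1
    return hits / n_samples

print(coverage())  # expected to be near 0.95
```

Any single CI either contains the true mean or it does not; the 95% describes the long-run behaviour of the procedure.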

2
Q

What do we use for computer-based random sampling?

A

Monte Carlo methods

3
Q

What is sampling without replacement called?

A

Permutation - each value is used exactly once, only the order (or group label) is shuffled

4
Q

What is sampling with replacement called, and what does it mean?

A

Bootstrapping - each value can be picked at random more than once
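
As a rough illustration of re-sampling with replacement, here is a minimal percentile-bootstrap sketch. The counts are made-up data, and `bootstrap_ci` is a hypothetical helper name, not something from the lectures.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_boot=5000, seed=0):
    """Percentile bootstrap 95% CI for any statistic of the sample."""
    rng = random.Random(seed)
    n = len(sample)
    # Each bootstrap sample has the same size as the original and is drawn
    # WITH replacement, so a value can be picked more than once.
    stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lo = stats[int(0.025 * n_boot)]
    hi = stats[int(0.975 * n_boot) - 1]
    return lo, hi

# Made-up counts (e.g. animals per site), for illustration only.
data_rng = random.Random(42)
counts = [data_rng.randint(0, 12) for _ in range(60)]

print(statistics.mean(counts), bootstrap_ci(counts))
```

Passing `stat=statistics.median` gives a CI for the median with no other change.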

5
Q

What is a residual?

A

The difference between the observed value of the dependent variable (DV) and the value predicted by the model. Each data point has one residual; together the residuals are the error (unexplained variation)

6
Q

What is the line of best fit?

A

The line whose slope and intercept minimise the variation of the data around the line (in other words, minimise the sum of the squared residuals)
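
A minimal sketch of how the slope, intercept and residuals can be computed for a straight-line fit by least squares. The data are invented and the function names are illustrative.

```python
import statistics

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def residuals(x, y, slope, intercept):
    """Observed y minus predicted y, one residual per data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def sum_sq(values):
    return sum(v ** 2 for v in values)

# Invented example data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
res = residuals(x, y, slope, intercept)
print(slope, intercept, sum_sq(res))

# Any other line gives a larger sum of squared residuals.
print(sum_sq(residuals(x, y, slope + 0.1, intercept)) > sum_sq(res))
```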

7
Q

How do you calculate total variance?

A

Total variation = variation predicted (explained) by x + unexplained (residual) variation
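
Written as sums of squares, and assuming an ordinary least squares fit with an intercept (the notation $\hat{y}_i$ for the predicted value and $\bar{y}$ for the mean of y is introduced here, not taken from the cards), the split looks like this:

```latex
\underbrace{\sum_{i=1}^{n} (y_i - \bar{y})^2}_{\text{total}}
= \underbrace{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}_{\text{predicted by } x}
+ \underbrace{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}_{\text{unexplained}},
\qquad
R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

$R^2$ is close to 0 when almost all of the total is unexplained and close to 1 when almost all of it is predicted by x.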

8
Q

If x doesn't predict y, what makes up the total variance?

A

Almost entirely variation unexplained by x (residual variation)

9
Q

If x predicts y well, what makes up the total variance?

A

Mostly variation predicted by x, with little unexplained variation left over

10
Q

When should you use major axis or reduced major axis regression?

A

If both x and y have error and the ‘true’ relationship is of interest (they must be correlated):

Major Axis (MA) regression if the variances of x and y are similar

Reduced Major Axis (RMA) regression if the variances are unequal
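
To see how the three slopes differ, here is a sketch using the standard textbook formulas for the OLS, MA and RMA slopes. The function name and data are illustrative assumptions, and the code assumes x and y are correlated (a covariance of zero would divide by zero in the MA formula).

```python
import statistics

def ols_ma_rma_slopes(x, y):
    """Slopes from OLS (y on x), major axis and reduced major axis fits."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)   # variance of x
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)   # variance of y
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)

    # OLS: all error is assumed to be in y.
    ols = sxy / sxx
    # MA: slope of the first principal axis of the covariance matrix.
    ma = (syy - sxx + ((syy - sxx) ** 2 + 4 * sxy ** 2) ** 0.5) / (2 * sxy)
    # RMA: ratio of standard deviations, with the sign of the covariance.
    rma = (1 if sxy > 0 else -1) * (syy / sxx) ** 0.5
    return ols, ma, rma

# Invented data with scatter in both variables.
x = [1.0, 2.1, 2.9, 4.2, 5.0, 6.1]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.8]
print(ols_ma_rma_slopes(x, y))
```

With scatter in both variables the OLS slope is the shallowest of the three, because OLS attributes all the error to y.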

11
Q

What should you use to predict y from x?

A

Ordinary Least Squares (OLS) - this requires the residuals to be normally distributed, not x

12
Q

What can you do with a non-linear relationship?

A

Transform the data to make the relationship linear,
fit non-linear functions using maximum likelihood,
or use polynomial regression.

13
Q

Which contrast does this describe? ‘if squared, values will be independent’

A

polynomial

14
Q

Why is a balanced design good?

A

It is orthogonal (the effects of the factors can be estimated independently of each other) and so increases power

15
Q

What is an assumption of a mixed model?

A

Once the random effects (e.g. subject) are accounted for, the residuals of the repeated measures must be uncorrelated

16
Q

Why might you choose Monte Carlo methods over non-parametric tests?

A

When the data are skewed and the two groups have differently shaped distributions - Wilcoxon / Mann-Whitney U assume the groups have the same shape

17
Q

What is a permutation test?

A

It assumes that if the groups aren’t different, then group membership is arbitrary. It therefore re-labels the groups a large number of times; under the null hypothesis this shouldn’t make a difference to the group means or medians

18
Q

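What is the p value in a permutation test?

A

The probability of getting a difference at least as extreme as the observed one if the null hypothesis is true, estimated as the proportion of re-labellings that give a difference at least as large as the observed difference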
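
A two-group permutation test can be sketched as follows. The function name, the number of re-labellings and the data are assumptions for illustration. The `+ 1` in the p value counts the observed labelling itself, a common convention that is not taken from the cards.

```python
import random
import statistics

def permutation_test(a, b, stat=statistics.mean, n_perm=10000, seed=0):
    """Two-sided permutation p value for a difference in `stat` between groups."""
    rng = random.Random(seed)
    observed = abs(stat(a) - stat(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        # Re-label: shuffle the pooled values, then split into the original group sizes.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(stat(perm_a) - stat(perm_b)) >= observed:
            count += 1
    # Proportion of re-labellings at least as extreme as the observed difference.
    return (count + 1) / (n_perm + 1)

# Invented data for two groups.
group_a = [12, 15, 11, 14, 13, 16]
group_b = [18, 21, 17, 22, 19, 20]
print(permutation_test(group_a, group_b))
print(permutation_test(group_a, group_b, stat=statistics.median))
```

Because `stat` can be any function of a group, the same code works for medians, ranges or any other measure.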

19
Q

Why would you use a permutation test over parametric or non-parametric tests? What are the advantages?

A

It is not limited to measures of location like the mean and median; any measure calculated from the sample can be used as the test statistic

20
Q

Pros and cons of permutation test

A

Good - no distribution assumptions, can be customised for any problem, can see where p value comes from

Bad - unfamiliar to referees, sums can’t be readily checked, need to be customised

21
Q

How do you do a permutation test for an ANOVA-type problem (differences among multiple groups)?

A

Can use the average difference between group means/medians

Or can use F = (total variation - within-group variation) / within-group variation, each scaled by its degrees of freedom - F usually requires a normal distribution, but not in this case

Can also use the order of the means and then find the probability that this order occurs by chance
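
The F-based option might look like the sketch below. The function names and data are hypothetical, and the code assumes every shuffled grouping keeps some within-group variation (otherwise F would divide by zero).

```python
import random
import statistics

def f_stat(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return ((ss_total - ss_within) / df_between) / (ss_within / df_within)

def permutation_f_test(groups, n_perm=5000, seed=0):
    """Observed F and its permutation p value (no normality assumption)."""
    rng = random.Random(seed)
    observed = f_stat(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:          # re-split into the original group sizes
            shuffled.append(pooled[start:start + size])
            start += size
        # Tiny tolerance guards against floating-point rounding differences.
        if f_stat(shuffled) >= observed * (1 - 1e-9):
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Invented data for three groups.
groups = [[4, 5, 6, 5], [7, 8, 9, 8], [12, 13, 11, 12]]
print(permutation_f_test(groups))
```

The p value comes from the shuffled F values themselves rather than from the theoretical F distribution, which is why normality is not needed here.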

22
Q

Wilcoxon is distribution free but not ..

A

Assumption free

23
Q

If data can’t take values < 0, what do we do? E.g. number of frogs in a pond - not sensible to transform

A

Estimate the mean and CI, as this is more useful for seeing where the population will lie. Can use maximum likelihood - pick a known distribution that matches the sample and estimate the mean/median and CI
(The sample mean is a maximum likelihood estimate of the population mean)
Can still do this if the distribution is not normal! A stats package does it for you

Or, if the distribution isn’t like any off-the-shelf distribution, use bootstrapping (but this does require a large sample, and is not so good for hypothesis testing)

24
Q

What are CIs?

A

If 95% CIs were calculated from many repeated samples, about 95% of them would include the true population mean

25
Q

Permutation vs bootstrapping

A

Permutation shuffles the sample (sampling without replacement)

Bootstrapping re-samples with replacement

26
Q

How large does sample need to be for bootstrapping?

A

> 50

27
Q

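What is centering?

A

In polynomial regression, x and x² are strongly correlated (multicollinearity). Centering subtracts the mean before squaring, i.e. uses (x - mean(x))² instead of x², which removes or greatly reduces the multicollinearity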
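
The effect of centering can be seen by comparing the correlation between the linear and squared terms before and after subtracting the mean. The x values below are invented and `pearson_r` is a small helper written for this sketch.

```python
import statistics

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a) * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den

# Invented predictor values, evenly spaced.
x = [10, 11, 12, 13, 14, 15, 16]

# Raw terms: x and x squared rise together, so they are almost collinear.
r_raw = pearson_r(x, [xi ** 2 for xi in x])

# Centred terms: subtract mean(x) first, then square.
mean_x = statistics.mean(x)
xc = [xi - mean_x for xi in x]
r_centred = pearson_r(xc, [ci ** 2 for ci in xc])

print(r_raw, r_centred)
```

For x values that are symmetric about their mean, as here, the centred correlation is zero; for skewed x it is reduced but not necessarily eliminated.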

28
Q

Why do regression and ANOVA give different results when there are more than 2 groups?

A

If we have more than two groups, then the regression and the ANOVA will yield different results: regression will fit a straight line through all three groups (and thus not necessarily joining the means), whereas ANOVA fits separate means to each group.

ANOVA tests for significance of the two differences between the three means

Regression tests for significance of a single line fitted through all the data
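
A small sketch of that difference, using invented data for three groups coded 1, 2 and 3 on the x axis. The group means are chosen so that they do not fall on a straight line.

```python
import statistics

# Hypothetical data: group code -> observations.
groups = {1: [1, 2, 3], 2: [5, 6, 7], 3: [6, 7, 8]}

x = [code for code, values in groups.items() for _ in values]
y = [v for values in groups.values() for v in values]

# Regression: a single least-squares line through all the data.
mx, my = statistics.mean(x), statistics.mean(y)
slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
intercept = my - slope * mx
line_predictions = {code: intercept + slope * code for code in groups}

# ANOVA-style fit: a separate mean for each group.
group_means = {code: statistics.mean(values) for code, values in groups.items()}

print(group_means)       # one fitted value per group
print(line_predictions)  # fitted values constrained to one straight line
```

The line cannot pass through all three means, so the two models give different fitted values and different residual variation.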

29
Q

What is ordinary least squares?

A

A method for estimating the unknown parameters of a regression model by minimising the sum of the squares of the differences between the observed responses and the predicted responses