Lecture 2 Flashcards

Regression I

1
Q

Video 1

What is the general formula of all statistical models?

A

outcome = model + error

2
Q

Video 1

Why does the regression model use the predictor?

A

To predict the outcome: the model fits a linear association between the predictor and the outcome.

3
Q

Video 1

What is different about the linear regression model?

A

It uses two continuous variables.

4
Q

Video 1

What is the sampling theory (theory of distribution)?

A

When you draw many samples from a population, each sample gives a slightly different mean. These sample means form a sampling distribution that centers on the population mean, so the most common sample mean is the best estimate of the true mean.
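
The idea can be illustrated with a small simulation. This is only a sketch: the population (the integers 1 to 100), the sample size of 30, and the number of samples are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: the integers 1..100 (population mean = 50.5).
population = list(range(1, 101))
population_mean = statistics.mean(population)

# Draw many samples of n = 30 and record the mean of each sample.
sample_means = [
    statistics.mean(random.choices(population, k=30))
    for _ in range(1000)
]

# The sample means vary, but they cluster around the population mean.
print("population mean:      ", population_mean)
print("mean of sample means: ", round(statistics.mean(sample_means), 2))
print("SD of sample means:   ", round(statistics.stdev(sample_means), 2))
```

The spread of the sample means is much smaller than the spread of the population itself, which is what the standard error describes.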

5
Q

Video 1

What does the SE let you know?

A

How far a sample estimate (such as the mean) is expected to lie from the population value, i.e. how much you expect to be wrong. Calculated as SE = SD / √n.
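
A minimal sketch of the calculation, assuming a made-up sample of nine scores (the values are hypothetical):

```python
import math
import statistics

# Hypothetical sample of exam scores (illustrative values only).
scores = [62, 70, 74, 68, 81, 77, 65, 73, 79]

n = len(scores)
sd = statistics.stdev(scores)  # sample SD (divides by n - 1)
se = sd / math.sqrt(n)         # standard error of the mean

print(f"n = {n}, SD = {sd:.2f}, SE = {se:.2f}")
```

Because n = 9 here, the SE is exactly one third of the SD; a larger sample would shrink it further.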

6
Q

Video 1

When can you expect a normal distribution of a sample?

A

When n is larger than about 30 (a rule of thumb): the sampling distribution of the mean is then approximately normal.

7
Q

Video 1

How is the df calculated?

A

number of observations - number of parameters estimated.

8
Q

Video 2

What does the regression line look like in a formula?

A

y=a+bx+e (error).
a= y at x=0
b= slope
e = difference between the observed and the predicted value (the residual)
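
As a sketch of how a, b and e come out of data, the following fits the line by ordinary least squares. The five data points are hypothetical, and least squares is assumed as the fitting method.

```python
import statistics

# Hypothetical data (illustrative values only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = statistics.mean(x)
mean_y = statistics.mean(y)

# Least-squares slope b and intercept a.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x  # predicted y at x = 0

# Predicted values and errors e = observed - predicted.
predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(f"a = {a:.2f}, b = {b:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

With a least-squares fit the residuals always sum to zero (up to rounding).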

9
Q

Video 2

When do you use a small ‘i’ in the formula?

A

When the value used is predicted.

10
Q

Video 2

What is the formula for the regression coefficient?

A

b(y|x) = r * Sy / Sx
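
A short sketch of this formula on hypothetical data. The correlation r is computed by hand from the sample covariance so that only the standard library is needed.

```python
import statistics

# Hypothetical data (illustrative values only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)  # sample SDs

# Pearson correlation r = covariance / (Sx * Sy).
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
r = cov_xy / (s_x * s_y)

# Regression coefficient of y on x.
b = r * s_y / s_x

print(f"r = {r:.3f}, b = {b:.3f}")
```

If x and y are both standardized first, Sx = Sy = 1 and the slope equals r.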

11
Q

Video 2

When is a correlation symmetric?

A

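The correlation r itself is the same whichever variable is treated as the predictor. The regression slope only becomes symmetric when you standardize the variables, in which case it equals r.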

12
Q

Video 2

What is used in regression?

A

A predictor and the outcome, is scalable

13
Q

Video 2

Which three types of causality can you name?

A

Reverse causality, omitted variable, confounder (other factor)

14
Q

Video 3

How can the variance be explained?

A

By R^2. It tells you what proportion of the variance in the dependent variable is explained by the independent variable.

15
Q

Video 3

What do you test during a hypothesis test?

A

Whether the regression coefficient is significant (with a t-test), and whether the model is a good fit for the data (with an F-test).

16
Q

Video 3

Between which values does the R value range?

A

Between -1 and 1

17
Q

Video 3

With which formula do you get the R^2?

A

The variation in y explained by the regression line / the total variation in y

18
Q

Video 3

What is the use of an adjusted R^2?

A

It estimates how much variance the model would explain in the population rather than in the sample, by penalizing R^2 for the number of predictors.

19
Q

Video 3

What is the SSt?

A

The total sum of squares: the total squared deviation of y from its mean.

20
Q

Video 3

What is the SSr?

A

The residual sum of squares, shows what is not explained by the regression line.

21
Q

Video 3

What is the SSm?

A

The model sum of squares, it shows what is explained by the regression line. SSm = SSt-SSr
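
The three sums of squares can be sketched on a small example. The data are hypothetical and the line is assumed to be fitted by least squares.

```python
import statistics

# Hypothetical data (illustrative values only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x
predicted = [a + b * xi for xi in x]

ss_t = sum((yi - mean_y) ** 2 for yi in y)                # total: y around its mean
ss_r = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # residual: y around the line
ss_m = ss_t - ss_r                                         # model: what the line explains

r_squared = ss_m / ss_t  # proportion of the total variation that is explained

print(f"SSt = {ss_t:.2f}, SSr = {ss_r:.2f}, SSm = {ss_m:.2f}, R^2 = {r_squared:.2f}")
```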

22
Q

Video 3

In terms of SS…, what does the formula of R^2 look like?

A

R^2 = SSm / SSt = 1 - SSr / SSt

23
Q

Video 3

If the null hypothesis for the slope is b = 0, is the test one-sided or two-sided?

A

Then it would be two sided

24
Q

Video 3

If the null hypothesis for the slope is b <= 0, is the test one-sided or two-sided?

A

It would be one sided then.

25
# Video 3 Is regression mainly one sided or two sided?
Two sided
26
# Video 3 How can the t-value be calculated in a formula?
(b observed - b H0)/ SE (b observed)
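A sketch of this t-value on hypothetical data. It assumes simple regression with one predictor, where SE(b) is the square root of the residual variance (SSr / (N - 2)) divided by the sum of squares of x.

```python
import math
import statistics

# Hypothetical data (illustrative values only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
ss_x = sum((xi - mean_x) ** 2 for xi in x)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
a = mean_y - b * mean_x

# Residual sum of squares; df = n - 2 because a and b are both estimated.
ss_r = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = math.sqrt((ss_r / (n - 2)) / ss_x)

b_h0 = 0  # slope under the null hypothesis
t = (b - b_h0) / se_b

print(f"b = {b:.2f}, SE(b) = {se_b:.3f}, t = {t:.3f}, df = {n - 2}")
```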
27
# Video 3 What is a different word for the standardized coefficient?
The correlation coefficient (in simple regression with one predictor, the standardized slope equals r)
28
# Video 3 What is a different word for the residuals?
The errors
29
# Video 3 When is an F-test used?
To show whether a regression model is better than a means model.
30
# Video 3 Do you prefer a high or a low F-test value?
A high one, because the higher the number the better it is in comparison to the means model.
31
# Video 3 How can the F value be calculated in a formula?
MSm/MSr, where each mean square is the sum of squares divided by its df: (SSm/df m)/(SSr/df r)
32
# Video 3 There are two df's in the F-test, what is the formula to calculate them?
(df m, df r) = (p, N-p-1)
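A sketch of the F calculation with these degrees of freedom, on hypothetical data with one predictor (p = 1). F is computed here as the ratio of mean squares, meaning each sum of squares is divided by its df.

```python
import statistics

# Hypothetical data (illustrative values only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
p = 1  # number of predictors

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

ss_t = sum((yi - mean_y) ** 2 for yi in y)
ss_r = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
ss_m = ss_t - ss_r

df_m, df_r = p, n - p - 1
f_value = (ss_m / df_m) / (ss_r / df_r)  # mean square model / mean square residual

print(f"F({df_m}, {df_r}) = {f_value:.2f}")
```

With a single predictor, F equals the square of the t-value for the slope.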
33
# Video 4 What are the assumptions in regression?
A linear relationship between the IV and DV, normal residuals, independent observations, homoscedasticity, absence of outliers.
34
# Video 4 What is meant with normal residuals?
That the residuals follow a normal distribution. If this is violated, the SE is incorrect.
35
# Video 4 What is the consequence of not having independent observations?
It makes the SE too low. You can detect it by plotting the data against the ID, or by using a Durbin-Watson test.
36
# Video 4 What is meant with homoscedasticity?
That the variation in the DV is constant across values of the IV. You can see it by plotting the residuals against the IV. If it is violated, the SE is biased. You can correct this with WLS regression or by adjusting the SE.
37
# Video 4 Can regression handle minor violations?
Yes, it is robust to minor violations.
38
# Video 4 Why shouldn't you extrapolate beyond the collected data?
It is 'dangerous': the fitted relationship is only supported within the range of the data you observed.
39
# Notes How can the measurement be standardized?
Subtract the mean and divide by the SD (a z-score): the mean ends up in the middle at 0, and every difference from it is expressed in SD units.
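A minimal sketch of standardizing a variable into z-scores, using hypothetical values and the sample SD:

```python
import statistics

# Hypothetical measurements (illustrative values only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample SD (divides by n - 1)

# z-score: distance from the mean, expressed in SD units.
z_scores = [(v - mean) / sd for v in values]

print([round(z, 2) for z in z_scores])
```

After standardizing, the z-scores have a mean of 0 and an SD of 1.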