Chapter 3 lecture Flashcards

1
Q

difference between regression and correlation

A

a regression line predicts the dependent variable y on the basis of the independent variable x. It describes the relationship between x and the estimated values of y at the various levels of x.

a correlation describes the strength of the association between y and x. It indicates to what extent the data points deviate from the regression line.

2
Q

what happens if r increases (in absolute value)

A

then the data points will be closer to the regression line

3
Q

correlation cannot be used to..

A

describe what the linear relationship looks like (the intercept a and the slope b); it only indicates how strong the relationship is

4
Q

intercept =

A

a, the predicted value of y when x = 0

5
Q

slope =

A

b, how much the predicted y changes when x increases by 1.

6
Q

positive b = … association, negative b = … association

A

positive, negative

7
Q

you should draw the regression line so that…

A

there are as many dots above as below the line

8
Q

residual

A

difference between the observed value y and the predicted value ŷ (the point on the regression line)

9
Q

choose the line with the…

A

smallest sum of squares

10
Q

sum of squares

A

sum of (y − ŷ)²

11
Q

what is the line with the smallest sum of squares called

A

least squares line
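
Cards 8 to 11 can be tried out on a small example. The sketch below uses made-up data (not from the lecture) and the standard closed-form formulas for the least squares slope and intercept, which the cards themselves do not state. The names `least_squares` and `rss` are only illustrative.

```python
# Hypothetical example data, chosen only for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def least_squares(x, y):
    """Return intercept a and slope b of the least squares line y_hat = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def rss(x, y, a, b):
    """Sum of squared residuals (y - y_hat)^2 for the line y_hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a, b = least_squares(x, y)
print("intercept a:", a, "slope b:", b)
print("sum of squares of the fitted line:", rss(x, y, a, b))
# Any other line, for example one with a slightly shifted intercept,
# has a larger sum of squares than the least squares line.
print("sum of squares of a shifted line:", rss(x, y, a + 0.1, b))
```

Changing `a` or `b` by hand and recomputing `rss` is a quick way to see why this line is the one with the smallest sum of squares.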

12
Q

if we do not know x, what is the best guess for y

A

the average of y (ȳ)!

13
Q

so what is the idea behind a regression analysis

A

look at the difference between the average (the prediction without x) and the regression line: how much is the prediction error decreased by adding the predictor?

14
Q

what is r²

A

the proportional decrease in the prediction error

15
Q

what do large/small r² mean?

A

large r² -> large difference between the average and the prediction. This means you have a good prediction line!
small r² -> small difference between the average and the prediction. This means you have a poor prediction line: you might as well have used the mean, which gives the same information.

16
Q

r² formula

A

(total SS - RSS)/total SS

17
Q

what is SS

A

SS = total sum of squares = total variance:

measured from each data point to the average/mean (a horizontal straight line) = ȳ

18
Q

what is RSS

A

RSS = residual sum of squares

measured from each data point to the regression line = ŷ

RSS -> R, so measured from the REGRESSION (prediction) line

19
Q

what is the relationship between total SS and RSS

A

the RSS is part of the total SS

20
Q

total sum of squares (SS) = (short formula)

A

SS = RSS + SSR, i.e. Σ(y − ȳ)² = Σ(y − ŷ)² + Σ(ŷ − ȳ)²
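
A small numeric check of this decomposition, and of the r² formula from card 16, can make the three quantities concrete. This is only a sketch on made-up data (not from the lecture); the line is fitted with the standard least squares formulas.

```python
# Hypothetical example data, chosen only for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope b and intercept a
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sxy / sxx
a = mean_y - b * mean_x
y_hat = [a + b * xi for xi in x]

# Total SS: from each point to the mean (prediction without x)
total_ss = sum((yi - mean_y) ** 2 for yi in y)
# RSS: from each point to the regression line
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
# SSR: from the regression line to the mean (the part explained by x)
ssr = sum((fi - mean_y) ** 2 for fi in y_hat)

# Proportional decrease in prediction error
r_squared = (total_ss - rss) / total_ss

print("total SS:", total_ss)
print("RSS + SSR:", rss + ssr)
print("r squared:", r_squared)
```

If RSS is almost as large as the total SS, `r_squared` ends up close to 0, which matches the later cards on interpreting r².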

21
Q

drawing of the relationship between SS, SSR and RSS

A

---------------- SS (from point to mean/average)
----------- RSS (from point to regression line)
----- SSR (the rest: from regression line to mean)

22
Q

good regression model =

A

small RSS
large r²
regression model does better than simply predicting the mean.

23
Q

r² definition explained

A

the proportion of variance in y explained by x

24
Q

how else can you calculate r²

A

the correlation coefficient squared (equal to the standardized regression coefficient squared)

25
Q

the regression line must have the … possible RSS

A

LOWEST (then the data points lie closest to the prediction line)

26
Q

look at the summary drawing of RSS etc.

A

okay, really do this

27
Q

what if RSS is approximately equal to SS

A

then r² ≈ 0

28
Q

what if SS > RSS

A

r² > 0 (good)

29
Q

what is the predictor for SS

A

the mean

30
Q

what is the predictor for RSS

A

the regression line

31
Q

how can you interpret the value of r²

A

“the variance around the regression line is … % (/100!) less than the total variance.”

32
Q

what if you want to test the regression for the population

A

𝜇𝑦 =𝛼+𝛽𝑥

33
Q

assumptions of the hypothesis test for regression

A
  • random sample
  • x and y are linearly related
  • for every value of x, y is normally distributed with the same standard deviation
34
Q

what are H0 and Ha when testing the regression line

A

H0: 𝛽 = 0 vs Ha: 𝛽 ≠ 0

(or one-sided)

35
Q

what are the mean and sd of the sampling distribution of the slope b

A

mean = 𝛽
standard deviation = se

36
Q

how to calculate the p-value of the slope

A

2*t.dist.rt(|t|;df) in Excel!! (two-sided: twice the right-tail probability of |t|)

37
Q

how do you calculate 𝑡.025

A

t.inv(0,975;df) !!!!! (Excel with a decimal comma: 0,975 means 0.975)

38
Q

the closer the data points are to the regression line…

A

the smaller RSS
the larger r²
the larger SSR (because SS = RSS + SSR and SS stays the same)

39
Q

explained variance

A

Explained variance does not actually mean that we have explained anything, at least not in a causal sense. It simply means that we can use one or more variables to predict things more accurately than we could before. It is the proportion by which the variance of the prediction errors shrinks.

40
Q

difference between regression and correlation, in short

A

regression indicates what a relationship looks like and how you can predict the variable y.
correlation indicates how strong the relationship is.

41
Q

t becomes (slope and standard error)

A

larger with larger slope
smaller with larger standard error
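
To see why, note that t = b / se. The sketch below computes it on made-up data (not from the lecture). It assumes the usual simple-regression formulas, which the cards do not spell out: se = sqrt(RSS / (n − 2)) / sqrt(Σ(x − x̄)²), with n − 2 degrees of freedom.

```python
import math

# Hypothetical example data, chosen only for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope b and intercept a
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x

# Residual sum of squares around the fitted line
rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Assumed standard formulas (not stated in the cards)
df = n - 2                          # degrees of freedom
s = math.sqrt(rss / df)             # residual standard deviation
se_b = s / math.sqrt(sxx)           # standard error of the slope
t = b / se_b                        # test statistic for H0: beta = 0

print("slope b:", b)
print("standard error:", se_b)
print("t:", t, "with df =", df)
```

The two-sided p-value then follows from the t distribution with `df` degrees of freedom, for example with the Excel functions from cards 36 and 37.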

42
Q

r² interpretation, in short

A

..% of the variance of y can be explained by x

43
Q

what if the regression line is completely flat (horizontal)

A

r=0

44
Q

another word for residual

A

prediction error!

45
Q

relationship between b and r

A

correlation r is the standardized version of b
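
"Standardized" here can be read as r = b × (s_x / s_y), where s_x and s_y are the standard deviations of x and y. This formula is the standard one and is not given in the cards. A minimal sketch on made-up data, comparing it with r computed directly:

```python
import math

# Hypothetical example data, chosen only for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = sxy / sxx                       # least squares slope
s_x = math.sqrt(sxx / (n - 1))      # sample standard deviation of x
s_y = math.sqrt(syy / (n - 1))      # sample standard deviation of y

r_from_b = b * s_x / s_y            # slope rescaled to standard deviation units
r_direct = sxy / math.sqrt(sxx * syy)  # correlation computed directly

print("r from b:", r_from_b)
print("r direct:", r_direct)
print("r squared:", r_direct ** 2)
```

Because r is b expressed in standard deviation units, it always has the same sign as b, which matches card 6.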