Week 9 Lecture 9 - regressions Flashcards

1
Q

When is linear regression used?

A

when looking at the relationship between two variables
best described with a straight line

2
Q

How does linear regression differ from correlation?

A

linear regression proposes a model from which you can make estimates (predictions); correlation only describes the strength and direction of the relationship

3
Q

What are x and y in a linear regression?

A

y = variable being predicted (outcome variable)
x = variable used to predict (predictor variable)

4
Q

What does linear regression assume?

A

y is dependent on x (to some extent)

5
Q

Does dependency reflect causal dependency?

A

no, it just provides evidence for it

6
Q

What are the 3 stages of a linear regression?

A
  1. analyse relationship between variables
  2. propose a model to explain relationship
  3. evaluate model
7
Q

What does stage 1 of a linear regression involve?

A

view data in scatterplot, view r value
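
As a rough illustration of the "view r value" step, Pearson's r can be computed by hand. This is a minimal sketch; the function name `pearson_r` and the `hours`/`score` data are made up for the example and do not come from the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours revised (x) and test score (y)
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 3))
```

An r close to +1 or -1 suggests a straight line may describe the data well; the scatterplot should still be checked by eye.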

8
Q

What does stage 2 of a linear regression involve?

A

fitting the regression line (line of best fit) –> the line for which the sum of squared deviations from the data points is smallest

9
Q

What are the properties of a regression line?

A

a.) the intercept
b.) the gradient (slope)
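
The intercept and gradient can be found with the standard least-squares formulas. The sketch below assumes the same made-up `hours`/`score` data as an illustration; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (intercept a, gradient b) of the least-squares line y = bx + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # gradient: change in y per unit change in x
    a = mean_y - b * mean_x    # intercept: predicted y when x = 0
    return a, b

# Hypothetical data
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
a, b = fit_line(hours, score)
print(a, b)
```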

10
Q

What does stage 3 of a linear regression involve?

A

assessing goodness-of-fit
how much better the regression model is than the simplest model (the mean of y)

11
Q

What does SSt - SSr equal?

A

SSm
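
A small worked example may help make the identity concrete. It assumes the simplest model is the mean of y and uses made-up data; `sums_of_squares` is a hypothetical helper name.

```python
def sums_of_squares(xs, ys):
    """Return (SSt, SSr, SSm) for a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    predicted = [b * x + a for x in xs]
    ss_t = sum((y - mean_y) ** 2 for y in ys)                 # error of the simplest model (mean)
    ss_r = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # error left over after regression
    ss_m = ss_t - ss_r                                        # improvement due to the model
    return ss_t, ss_r, ss_m

# Hypothetical data
ss_t, ss_r, ss_m = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(ss_t, ss_r, ss_m)
```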

12
Q

The larger SSm is, the what?

A

the bigger the improvement in prediction using the regression model over the simplest model

13
Q

What do we use to evaluate SSm relative to SSr?

A

F-test (ANOVA)

14
Q

What values does an F-test work with?

A

mean squares (MS): sums of squares divided by their degrees of freedom

15
Q

What does the F-ratio provide?

A

a measure of how much the model has improved the prediction of y (outcome) relative to the level of inaccuracy of the model

16
Q

If the regression model is a good fit what will we see?

A

MSm will be large, MSr will be small
F value will be large (well above 1)
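
The F-ratio can be sketched from the sums of squares as follows. The degrees of freedom used are the usual ones (number of predictors for the model, n minus predictors minus 1 for the residuals); the input values are illustrative, and `f_ratio` is a hypothetical helper name.

```python
def f_ratio(ss_m, ss_r, n, n_predictors=1):
    """Return (MSm, MSr, F) from the model and residual sums of squares."""
    ms_m = ss_m / n_predictors              # mean square for the model
    ms_r = ss_r / (n - n_predictors - 1)    # mean square for the residuals
    return ms_m, ms_r, ms_m / ms_r

# Illustrative values: SSm = 3.6, SSr = 2.4, from 5 observations and 1 predictor
ms_m, ms_r, f = f_ratio(3.6, 2.4, n=5)
print(ms_m, ms_r, f)
```

Whether a given F is significant depends on its degrees of freedom, so it still has to be compared against the F distribution (for example in statistical software).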

17
Q

What is the null hypothesis in regressions?

A

the regression model predicts y no better than the simplest model (the mean of y)

18
Q

What are the assumptions for simple linear regression?

A
  • linearity
  • absence of outliers
  • normality, linearity, homoscedasticity, independence of residuals
19
Q

How do you assess normality, linearity, homoscedasticity, independence of residuals?

A

consider normal p-p plot –> looking for a straight diagonal

consider residual scatterplot –> looking for a rectangle with clusters towards the centre

20
Q

How can you check whether a data point is an outlier?

A

check residuals scatterplot
outlier if its standardised residual falls above 3.3 or below -3.3
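
The check can be sketched in code, assuming the ±3.3 cut-off applies to standardised residuals (each residual divided by the residual standard deviation). The residual values are made up for illustration, and `flag_outliers` is a hypothetical helper name.

```python
import math

def flag_outliers(residuals, n_predictors=1, cutoff=3.3):
    """Return the indices of residuals whose standardised value exceeds the cutoff."""
    n = len(residuals)
    ss_r = sum(e ** 2 for e in residuals)
    # Residual standard deviation: square root of the residual mean square
    sd = math.sqrt(ss_r / (n - n_predictors - 1))
    return [i for i, e in enumerate(residuals) if abs(e / sd) > cutoff]

# Illustrative residuals: 19 small ones and one very large one (index 19)
residuals = [0.5, -0.5] * 9 + [0.5, 10.0]
outliers = flag_outliers(residuals)
print(outliers)
```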

21
Q

Is there a non-parametric equivalent for linear regressions?

A

no

22
Q

What is r^2 in regressions?

A

the amount of variance in y explained by the model relative to the total variance in y
can be expressed as a percentage
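
In terms of the sums of squares, this is SSm divided by SSt. A minimal sketch with illustrative values (`r_squared` is a hypothetical helper name):

```python
def r_squared(ss_m, ss_t):
    """Proportion of the total variance in y that the model explains."""
    return ss_m / ss_t

# Illustrative values: SSm = 3.6, SSt = 6.0
r2 = r_squared(3.6, 6.0)
print(f"{r2 * 100:.0f}% of the variance in y is explained by the model")
```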

23
Q

What is the equation for the regression line in simple linear regression?

A

y = bx + a

24
Q

What are multiple regressions?

A
  • assess the influence of several predictors on y
  • obtain a measure of how much variance in y the predictor variables combined explain
  • obtain measures of how much variance in y the predictor variables explain when considered separately
25
Q

What is the regression equation for multiple linear regression?

A

y = b1x1 + b2x2 + b3x3 … + a
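
Once the coefficients are known, a prediction is just the intercept plus each predictor multiplied by its own b. The sketch below uses made-up coefficients; `predict` is a hypothetical helper name.

```python
def predict(intercept, coefficients, predictors):
    """Predicted y = b1*x1 + b2*x2 + ... + a for one observation."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))

# Made-up model: a = 1.0, b1 = 2.0, b2 = 0.5, applied to x1 = 3.0, x2 = 4.0
y_hat = predict(1.0, [2.0, 0.5], [3.0, 4.0])
print(y_hat)
```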

26
Q

What are the 3 stages of a multiple regression?

A

1.) analyse relationships
2.) propose a model (a plane of best fit)
3.) evaluate the model

27
Q

What are the assumptions for multiple regressions?

A
  • sufficient sample size
  • linearity
  • absence of outliers
  • absence of multicollinearity
  • normality, linearity, homoscedasticity, independence of residuals
28
Q

What are the formulas for determining sufficient sample size?

A

if considering combined effects only (m = number of predictors):
N >= 50 + 8 m

if also considering separate effects:
N >= 104 + m
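
The two rules of thumb can be turned into a quick calculation, taking m as the number of predictors. The function name `minimum_n` and its `separate_effects` flag are made up for this sketch.

```python
def minimum_n(m, separate_effects=False):
    """Minimum sample size for a multiple regression with m predictors."""
    if separate_effects:
        return 104 + m      # also looking at each predictor separately
    return 50 + 8 * m       # combined effect of the predictors only

# Example with 5 predictors
print(minimum_n(5))                          # combined effects only
print(minimum_n(5, separate_effects=True))   # separate effects as well
```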

29
Q

What is multicollinearity?

A
  • predictors (x’s) highly correlated with one another; ideally x’s are correlated with y but not with one another
  • check using correlation matrix
  • highly correlated x’s (r>.9) can either be combined or eliminated as you are effectively measuring the same thing
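
The correlation-matrix check can be sketched as below: correlate every pair of predictors and flag pairs above the .9 threshold. The predictor names and values are made up, and `pearson_r` and `collinear_pairs` are hypothetical helper names.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def collinear_pairs(predictors, threshold=0.9):
    """Return (name_a, name_b, r) for every predictor pair with |r| above threshold."""
    flagged = []
    for (name_a, xs), (name_b, ys) in combinations(predictors.items(), 2):
        r = pearson_r(xs, ys)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Made-up predictors: x1 and x2 measure almost the same thing, x3 does not
predictors = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 11],
    "x3": [5, 1, 4, 2, 3],
}
flagged = collinear_pairs(predictors)
print(flagged)
```

Any flagged pair is a candidate for combining the two predictors or dropping one of them.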