LECTURE 2 linear regression Flashcards

1
Q

what is regression

A

A way of predicting one variable (the outcome) from another (the predictor), based on a hypothetical model of the linear relationship between the two variables.

2
Q

equation of a straight line

A

outcome (y) = [model] + error

y = b0 + b1*X1 + error

b0 = intercept: the value of y when X1 = 0 (where the line crosses the y-axis)

b1 = regression coefficient for the predictor: the gradient of the line, whose sign gives the direction of the relationship
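
As a minimal sketch of how the equation is used, the snippet below plugs a predictor value into the model and treats whatever is left over as error. The coefficient values, the observed value, and the `predict` helper are all hypothetical, chosen only for illustration.

```python
# Hypothetical coefficients, chosen only for illustration.
b0 = 2.2  # intercept: predicted y when x = 0
b1 = 0.6  # slope: change in predicted y per one-unit increase in x


def predict(x):
    """Model part of the equation: b0 + b1 * x (no error term)."""
    return b0 + b1 * x


observed_y = 5.0                   # made-up observation at x = 3
predicted_y = predict(3)           # what the model predicts at x = 3
error = observed_y - predicted_y   # the part the model does not explain
print(predicted_y, error)
```

A positive b1 means y rises as x rises; a negative b1 would mean y falls as x rises.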

3
Q

fitting the model - method of least squares

A

The method of least squares finds the line of best fit by minimising the sum of squared differences (residuals) between the data points and the line.
The regression line may still not reflect reality, so its fit to the data must be tested.
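
A small sketch of least squares for a single predictor, using the standard closed-form estimates. The data are made up, and the `sum_squared_error` helper is a hypothetical name used only to show that nudging the fitted line does not reduce the squared error.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form least-squares estimates for one predictor.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x


def sum_squared_error(intercept, slope):
    """Sum of squared differences between the data and a candidate line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


best = sum_squared_error(b0, b1)
# Nudging the fitted line in any direction should not reduce the squared error.
nudged = [sum_squared_error(b0 + d0, b1 + d1)
          for d0 in (-0.1, 0, 0.1) for d1 in (-0.1, 0, 0.1)]
print(b0, b1, best, min(nudged) >= best - 1e-12)
```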

4
Q

define sum of squares

A

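The sum of squared deviations of data points from a mean, here their own group means.

On its own this does not account for much of the variance, because the deviations are not measured against the overall grand mean (the null hypothesis).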

5
Q

define total sum of squares (SSt)

A

The total variability in the data, measured against the grand mean: subtract the grand mean from each data point, square the differences, and add them up to give an idea of the total variance in the data.

6
Q

define model sum of squares (SSm)

A

The variability the model accounts for: the squared deviations between the regression model's predicted values and the grand mean, summed over all data points.

7
Q

define residual sum of squares (SSr)

A

Whatever variance is left unaccounted for by the model: the squared deviations of the data points from the regression line, summed (SSr = SSt - SSm).
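
The three sums of squares (SSt, SSm, SSr) can be computed side by side to check that they add up. This is a sketch on made-up data; the variable names are illustrative only.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
grand_mean = sum(y) / n

# Least-squares line for one predictor.
b1 = (sum((xi - mean_x) * (yi - grand_mean) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = grand_mean - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

# SSt: data vs grand mean; SSm: model vs grand mean; SSr: data vs model.
ss_t = sum((yi - grand_mean) ** 2 for yi in y)
ss_m = sum((pi - grand_mean) ** 2 for pi in predicted)
ss_r = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

print(ss_t, ss_m, ss_r)
print(abs(ss_t - (ss_m + ss_r)) < 1e-9)  # SSt = SSm + SSr for a least-squares fit
```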

8
Q

How do you test the regression model

A

Ask whether the regression model is a better reflection of the data than the grand mean (the null model).
If so, SSm should be large relative to SSr.

9
Q

ANOVA f value

A

F is the ratio of two mean squares (each sum of squares averaged by dividing by its degrees of freedom):
F = MSm / MSr

We want the model to account for more variance than error/chance, so F should be well above 1.
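
A sketch of the F calculation for a regression with one predictor, on made-up data. It assumes the usual degrees of freedom for this case: 1 for the model (one predictor) and n - 2 for the residual.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
grand_mean = sum(y) / n

b1 = (sum((xi - mean_x) * (yi - grand_mean) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = grand_mean - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

ss_m = sum((pi - grand_mean) ** 2 for pi in predicted)      # model sum of squares
ss_r = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # residual sum of squares

df_m = 1        # one predictor
df_r = n - 2    # observations minus the two estimated coefficients

ms_m = ss_m / df_m   # mean square for the model
ms_r = ss_r / df_r   # mean square for the residual (error)
f_value = ms_m / ms_r
print(f_value)
```

Judging whether an F this size is significant still requires comparing it against the F distribution with (df_m, df_r) degrees of freedom, which this sketch does not do.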

10
Q

ANOVA r2 value

A

The proportion of the variance in the outcome that the regression model accounts for. With a single predictor it equals the square of Pearson's correlation coefficient (r).

r2 = SSm (variance the model accounts for) / SSt (all variance)
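
The link between r2 and Pearson's r can be checked numerically. This is a sketch on made-up data with one predictor, with Pearson's r computed by hand from its standard formula.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
grand_mean = sum(y) / n

sxy = sum((xi - mean_x) * (yi - grand_mean) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
ss_t = sum((yi - grand_mean) ** 2 for yi in y)  # total sum of squares

# Least-squares fit and model sum of squares.
b1 = sxy / sxx
b0 = grand_mean - b1 * mean_x
ss_m = sum((b0 + b1 * xi - grand_mean) ** 2 for xi in x)

r_squared = ss_m / ss_t                   # proportion of variance explained
pearson_r = sxy / (sxx * ss_t) ** 0.5     # Pearson's correlation coefficient

print(r_squared, pearson_r ** 2)
```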

11
Q

why use a histogram of standardised residuals

A

To check for outliers: standardised residuals more than ±3 SD from the mean (a large residual means a mismatch between what is observed and what is predicted).

To check whether the residuals are normally distributed, which is an assumption of regression analysis.
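
A rough sketch of the outlier check on made-up data. It assumes one common definition of a standardised residual (the residual divided by the residual standard deviation, sqrt(SSr / (n - 2))); statistics packages may use a slightly different scaling. The text histogram is only a stand-in for a plotted one.

```python
# Made-up data: a perfect line y = 1 + 2x with one point pushed far off it.
x = list(range(1, 21))
y = [1 + 2 * xi for xi in x]
y[9] += 30  # hypothetical outlier at index 9

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Assumed definition: residual / sqrt(SSr / (n - 2)).
resid_sd = (sum(e ** 2 for e in residuals) / (n - 2)) ** 0.5
standardised = [e / resid_sd for e in residuals]

# Flag anything more than 3 SD from zero (the mean of the residuals).
outliers = [i for i, z in enumerate(standardised) if abs(z) > 3]

# Crude text histogram: count standardised residuals in one-SD-wide bins.
bins = {}
for z in standardised:
    lower = int(z // 1)
    bins[lower] = bins.get(lower, 0) + 1
for lower in sorted(bins):
    print(f"{lower:+d} to {lower + 1:+d} SD: {'#' * bins[lower]}")
print("outliers at indices:", outliers)
```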

12
Q

problem with regression

A

It is not symmetric: regressing Y on X is not the same as regressing X on Y, so you CAN'T simply FLIP THE EQUATION.
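
The asymmetry can be seen by fitting both directions on the same made-up data and comparing the X-on-Y fit with the Y-on-X equation rearranged algebraically. The `fit` helper is a hypothetical name for the standard one-predictor least-squares formulas.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def fit(predictor, outcome):
    """Least-squares intercept and slope for outcome regressed on predictor."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_o = sum(outcome) / n
    slope = (sum((p - mean_p) * (o - mean_o) for p, o in zip(predictor, outcome))
             / sum((p - mean_p) ** 2 for p in predictor))
    return mean_o - slope * mean_p, slope


b0_yx, b1_yx = fit(x, y)   # Y regressed on X
b0_xy, b1_xy = fit(y, x)   # X regressed on Y

# Rearranging y = b0 + b1 * x algebraically gives x = -b0/b1 + (1/b1) * y.
flipped_intercept = -b0_yx / b1_yx
flipped_slope = 1 / b1_yx

print("X on Y fit:      ", b0_xy, b1_xy)
print("flipped Y on X:  ", flipped_intercept, flipped_slope)
```

The two lines differ because each regression minimises errors in its own outcome variable; they would coincide only if the points fell exactly on a straight line.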
