Lecture #9 (Regression) Flashcards

1
Q

What is linear regression?

A

Linear regression finds the straight line that best fits the points in a scatter plot

2
Q

What formula underpins linear regression?

A

y = a + bx

3
Q

What does each term in the linear regression equation represent?

A

y = dependent variable
x = independent variable
b = slope of the line
a = intercept (the value of y where the line crosses the y-axis, at x = 0)

4
Q

How do we determine the centroid?

A

The centroid is the point (x̄, ȳ): the mean of the x values and the mean of the y values
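
As a minimal sketch of this definition, the function below averages the x values and the y values separately. The function name `centroid` and the sample data are made up for illustration and are not from the lecture.

```python
def centroid(xs, ys):
    """Return (mean of x, mean of y) for paired observations."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    return sum(xs) / n, sum(ys) / n


# Hypothetical sample data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(centroid(xs, ys))  # (3.0, 4.0)
```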

5
Q

What is the centroid?

A

The centroid is the point (x̄, ȳ) at the center of the data; the least squares regression line always passes through it

6
Q

How is the slope determined?

A

The slope is chosen so that the sum of the squared vertical distances between each point and the line is minimized

7
Q

What is the line of best fit called?

A

Least squares regression line (LSRL)

8
Q

What are the steps to determine the Least Squares Regression Line (LSRL)?

A

1) For each (x, y) point, calculate x² and xy
2) Sum all x, y, x² and xy, which gives Σx, Σy, Σx² and Σxy
3) Calculate the slope b
4) Calculate the intercept a
5) Assemble the equation of the line: y = a + bx
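
The five steps above can be sketched in plain Python as follows. This is an illustrative sketch, not the lecture's own code: the function name `least_squares_line` and the sample data are assumptions, and the slope and intercept use the standard sum-based formulas, b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and a = (Σy − bΣx) / n.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    # Steps 1 and 2: compute x^2 and x*y per point, then sum everything.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Step 3: slope b. The denominator is zero when all x values are equal.
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denominator

    # Step 4: intercept a.
    a = (sum_y - b * sum_x) / n

    # Step 5: the caller assembles y = a + b*x from the returned pair.
    return a, b


# Hypothetical sample data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")  # y = 2.20 + 0.60x
```

For this sample, Σx = 15, Σy = 20, Σx² = 55 and Σxy = 66, so b = (330 − 300) / (275 − 225) = 0.6 and a = (20 − 9) / 5 = 2.2.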

9
Q

How do you calculate the slope b?

A

b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²)

10
Q

How do you calculate the intercept a?

A

a = (Σy − bΣx) / n

11
Q

What are residuals?

A

Residuals are the differences between the observed values and the values predicted by the model (the errors): residual = observed y − predicted y
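
A minimal sketch of computing residuals for a line y = a + bx is shown below. The function name `residuals`, the sample data, and the coefficients a = 2.2 and b = 0.6 are assumptions chosen for illustration.

```python
def residuals(xs, ys, a, b):
    """Return observed y minus predicted y for each point, given y = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical sample data and coefficients, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
for x, r in zip(xs, residuals(xs, ys, a=2.2, b=0.6)):
    print(f"x = {x}: residual = {r:+.1f}")
```

A positive residual means the point lies above the line, and a negative residual means it lies below.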

12
Q

What does a larger sum of squared residuals mean?

A

A poorer fit: the line sits further from the data points overall
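
For a least squares line with an intercept, the raw residuals sum to roughly zero because positive and negative errors cancel, so fit is usually compared using the sum of squared residuals. The sketch below compares two candidate lines on made-up data; the function name `sum_squared_residuals` is an assumption, and a = 2.2, b = 0.6 are the least squares coefficients for this particular sample.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Return the sum of squared residuals for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sse_fitted = sum_squared_residuals(xs, ys, a=2.2, b=0.6)  # least squares line
sse_flat = sum_squared_residuals(xs, ys, a=4.0, b=0.0)    # flat line at mean of y

print(f"fitted line: {sse_fitted:.2f}")  # fitted line: 2.40
print(f"flat line:   {sse_flat:.2f}")    # flat line:   6.00
```

The flat line has the larger sum of squared residuals, so it fits this sample less well than the fitted line.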

13
Q

How should residuals be distributed?

A

Approximately normally, centered on zero

14
Q

What is homoscedasticity?

A

Having the same scatter: the points are spread about the same distance from the regression line across all values of x

15
Q

What is heteroscedasticity?

A

Having different scatter: the spread of the points around the regression line varies widely across values of x
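
One rough way to see the difference is to compare the spread of the residuals at low x values with the spread at high x values. The sketch below is a crude heuristic under stated assumptions, not a formal statistical test: the function name `spread_ratio`, the half-and-half split, and the sample numbers are all made up for illustration.

```python
from statistics import pvariance


def spread_ratio(xs, residuals):
    """Return variance of residuals in the upper half of x divided by
    the variance in the lower half. Values near 1 suggest similar scatter."""
    if len(xs) != len(residuals) or len(xs) < 4:
        raise ValueError("need at least four paired values")
    pairs = sorted(zip(xs, residuals))  # order the residuals by x
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    var_low = pvariance(low)
    if var_low == 0:
        raise ValueError("no spread in the lower half; ratio is undefined")
    return pvariance(high) / var_low


# Hypothetical residuals whose scatter grows with x.
xs = [1.0, 2.0, 3.0, 4.0]
res = [1.0, -1.0, 2.0, -2.0]
print(spread_ratio(xs, res))  # 4.0
```

A ratio far from 1 hints at heteroscedasticity, while a ratio close to 1 is consistent with homoscedasticity. A residual plot against x is still the usual first check.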
