Week 2: finding the quantitative relationship between 2 variables Flashcards

1
Q

What principle do we use when we estimate b0 and b1 (using their formulas)?

A

the least squares principle

2
Q

What does the least squares principle guarantee?

A

that the regression line is the best fit to the data

3
Q

What are the b0 and b1 equations derived from?

A

minimising the sum of the squared vertical distances between the observed values Yi and the predicted values Ŷi of the dependent variable:

min ∑(Yi − Ŷi)^2 = min ∑(Yi − (b0 + b1Xi))^2
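As a rough illustration of this minimisation, the sketch below uses the standard closed-form least squares solution for one X variable (b1 = ∑(Xi − x̄)(Yi − ȳ) / ∑(Xi − x̄)^2 and b0 = ȳ − b1·x̄). These formulas are not quoted on the cards, and the function name and data are hypothetical, invented only for this example.

```python
def fit_least_squares(x, y):
    """Return (b0, b1) that minimise the sum of squared vertical distances."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of X around its mean.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # intercept
    return b0, b1


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_least_squares(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # b0 = 2.20, b1 = 0.60
```

Any other choice of intercept and slope for this data would give a larger sum of squared distances than the pair returned here.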

4
Q

What does the least squares principle guarantee?

A
  • that the regression line obtained has the smallest sum of squared residuals
  • that the regression line is the best linear approximation to the quantitative relationship existing between the variables X and Y
5
Q

What assumptions underlie linear regression? (4)

A

Linearity

Independence of Errors

Normality of Error

Equal Variance (AKA homoscedasticity)

6
Q

What is the linearity assumption?

A

the relationship between X and Y is linear

7
Q

What is the ‘independence of errors’ assumption?

A

error values are statistically independent

this is particularly important when data is collected over a period of time

8
Q

What is the ‘normality of error’ assumption?

A

error values are normally distributed

9
Q

What is the ‘Equal Variance’ assumption?

A

the probability distribution of the errors has constant variance

10
Q

What is the residual ei for observation i?

A

the difference between its observed and predicted value

ei = Yi - Ŷi
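A minimal sketch of the residual calculation, assuming a small hypothetical data set and the least squares estimates b0 = 2.2 and b1 = 0.6 that fit it (the data and variable names are invented for illustration):

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares estimates for this particular data

# Predicted values from the fitted line.
y_hat = [b0 + b1 * xi for xi in x]

# Residual e_i = observed Y_i minus predicted Y_i.
residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]

for xi, yi, yhi, ei in zip(x, y, y_hat, residuals):
    print(f"X = {xi}  Y = {yi}  predicted = {yhi:.1f}  residual = {ei:+.1f}")
```

With least squares estimates and an intercept in the model, the residuals sum to zero (up to rounding error).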

11
Q

How do you check the assumptions of regression?

A

by examining the residuals:

- check the linearity assumption
- evaluate the independence assumption
- evaluate the normality assumption
- check for constant variance at all levels of X (homoscedasticity)

12
Q

How would you do a graphical analysis of residuals to investigate the assumptions?

A

plot residuals vs X
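One possible way to draw such a plot is sketched below. It assumes matplotlib is installed, and the data, coefficients and function names are hypothetical, chosen only for this example.

```python
def residuals_vs_x(x, y, b0, b1):
    """Pair each X value with its residual e_i = Y_i - (b0 + b1 * X_i)."""
    return [(xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]


def plot_residuals(points):
    """Scatter the residuals against X (matplotlib is an assumed dependency)."""
    import matplotlib.pyplot as plt

    xs, es = zip(*points)
    plt.scatter(xs, es)
    plt.axhline(0, linestyle="--")  # reference line at residual = 0
    plt.xlabel("X")
    plt.ylabel("Residual")
    plt.show()


# Hypothetical data and its least squares estimates, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
points = residuals_vs_x(x, y, b0=2.2, b1=0.6)

# plot_residuals(points)  # uncomment to display the plot
```

A patternless band of points around zero is consistent with the linearity and equal variance assumptions. A curved pattern suggests non-linearity, and a funnel shape suggests unequal variance.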

13
Q

What happens to the histogram of the residuals when the assumption of Normality is satisfied?

A

the histogram of the residuals approximates the bell shape of a normal distribution

14
Q

Why do we need to compare two or more different regression models?

A

because the models may come from:

different estimation methods (different formulas to calculate the slope and intercept)

different populations, different samples or different variables

15
Q

What statistical instruments can be used to make a comparison?

A

total sum of squares

R^2

standard error

16
Q

What equation do you use to work out total variation?

A

SST = SSR + SSE

Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares

17
Q

What does SST stand for?

A

Total Sum of Squares

18
Q

What does SSR stand for?

A

Regression Sum of Squares

19
Q

What does SSE stand for?

A

Error Sum of Squares

20
Q

How do you work out SST (Total Sum of Squares)?

A

SST = ∑(Yi - ȳ)^2

21
Q

How do you work out SSR (Regression Sum of Squares)?

A

SSR = ∑(Ŷi - ȳ)^2

22
Q

How do you work out SSE (Error Sum of Squares)?

A

SSE = ∑(Yi - Ŷi)^2
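The three sums of squares can be checked numerically. The sketch below assumes a small hypothetical data set and the least squares estimates b0 = 2.2 and b1 = 0.6 that fit it, and confirms that SST = SSR + SSE for that fit.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares estimates for this particular data

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
# SST = 6.00, SSR = 3.60, SSE = 2.40
```

The decomposition holds exactly only when b0 and b1 are the least squares estimates. With an arbitrary line, SSR + SSE generally differs from SST.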

23
Q

What type of variation is SST (Total Sum of Squares)?

A

Total Variation

Measures the variation of the Yi values around their mean ȳ

24
Q

What type of variation is SSR (Regression Sum of Squares)?

A

Explained Variation

Variation attributable to the relationship between X and Y

25
Q

What type of variation is SSE (Error Sum of Squares)?

A

Unexplained Variation

Variation in Y attributable to factors other than X

26
Q

What is the coefficient of determination?

A

the proportion of the total variation in the dependent variable that is explained by variation in the independent variable

27
Q

What is the coefficient of determination also known as?

A

R-square, denoted as R^2

28
Q

What is the equation for R^2 (the Coefficient of Determination)?

A

R^2 = SSR / SST = regression sum of squares / total sum of squares

–> R^2 = ∑(Ŷi - ȳ)^2 / ∑(Yi - ȳ)^2
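A short sketch of this ratio, assuming hypothetical data and the least squares estimates that fit it (the function name `r_squared` is invented for this example):

```python
def r_squared(x, y, b0, b1):
    """Coefficient of determination R^2 = SSR / SST for the line b0 + b1 * X."""
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)  # regression sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
    return ssr / sst


# Hypothetical data and its least squares estimates, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r2 = r_squared(x, y, b0=2.2, b1=0.6)
print(f"R^2 = {r2:.3f}")  # R^2 = 0.600
```

Here 60% of the variation in Y is explained by variation in X.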

29
Q

What does R^2 have to be between?

A

0 ≤ R^2 ≤ 1

30
Q

If R^2 = 1
describe the relationship between X and Y and the variation.

A

there is a perfect linear relationship between X and Y:
100% of the variation in Y is explained by variation in X

31
Q

If R^2 = 0
describe the relationship between X and Y and the variation

A

no linear relationship between X and Y:
none of the variation in Y is explained by variation in X

32
Q

If R^2 = 0.6
describe the relationship between X and Y and the variation.

A

A fairly strong linear relationship between X and Y:
most of the variation in Y (60%) is explained by variation in X

33
Q

If R^2 = 0.4
describe the relationship between X and Y and the variation

A

A weaker linear relationship between X and Y:
some, but not all, of the variation in Y (40%) is explained by variation in X

34
Q

What is another way to work out R^2?

A

by working out the correlation coefficient (R) between X and Y and then squaring it (this works for regression with a single X variable)
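As a hedged check of this shortcut, the sketch below computes the Pearson correlation coefficient for a hypothetical data set and squares it. The function name and data are invented for illustration, and the Pearson formula itself is not quoted on the cards.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between X and Y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.4f}, r^2 = {r ** 2:.2f}")  # r = 0.7746, r^2 = 0.60
```

For this data the squared correlation matches the value of SSR / SST obtained from the least squares line.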

35
Q

If R^2 = 0.576, how could this be expressed as a proportion or percent?

A

57.6 percent of the variation in the Y variable is explained by the variation in the X variable

36
Q

What is the equation for the standard deviation of the observations around the regression line?
(What does Syx equal?)

A

Syx = √(SSE / (n − 2)) = √(∑(Yi − Ŷi)^2 / (n − 2))

where SSE = error sum of squares
n = sample size
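A minimal sketch of this calculation, assuming hypothetical data and the least squares estimates that fit it (the function name is invented for this example):

```python
import math


def standard_error_of_estimate(x, y, b0, b1):
    """Syx = sqrt(SSE / (n - 2)) for the line b0 + b1 * X."""
    n = len(x)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


# Hypothetical data and its least squares estimates, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

syx = standard_error_of_estimate(x, y, b0=2.2, b1=0.6)
print(f"Syx = {syx:.4f}")  # Syx = 0.8944
```

The divisor is n − 2 rather than n because two quantities, b0 and b1, are estimated from the same data.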

37
Q

What are the steps for working out regression in Excel?

A

1) select DATA from the title bar menu
2) click on the DATA ANALYSIS button
3) select REGRESSION from the contextual menu
4) enter the Y range and the X range, and select the desired options
5) read the coefficient values from the output: the Intercept coefficient is b0 and the X Variable 1 coefficient is b1

Ŷi (e.g. sales) = Intercept coefficient + X Variable 1 coefficient × Xi (e.g. calls)

38
Q

How do you get R^2 in Excel?

A

1) select DATA from the title bar menu
2) click on the DATA ANALYSIS button
3) select REGRESSION from the contextual menu
4) enter the Y range and the X range, and select the desired options
5) look at the R Square value in the Regression Statistics table
6) OR look at the ANOVA table: the SS value for Regression is SSR and the SS value for Total is SST
7) put these values into the equation R^2 = SSR / SST

39
Q

How do you get the value for Syx (standard error) in Excel?

A

1) select DATA from the title bar menu
2) click on the DATA ANALYSIS button
3) select REGRESSION from the contextual menu
4) enter the Y range and the X range, and select the desired options
5) look at the Regression Statistics table: the Standard Error value is Syx

40
Q

How do you add the prediction line to the Plot of Fitted and observed data?

A

1) click on one of the observed values (blue dots)
2) right-click and select “Add Trendline” from the contextual menu