1.7 Linear Regression Flashcards

1
Q

the dependent variable or the explained variable

A

refers to the variable whose variation is being explained. It is typically denoted by Y

2
Q

the independent variable or the explanatory variable

A

refers to the variable whose variation is being used to explain the variation of the dependent variable

Denoted by X

3
Q

If there is only one independent variable, then the regression is known as

A

Simple Linear Regression

4
Q

If there are two or more independent variables, the regression is known as

A

Multiple Regression

5
Q

The four assumptions underlying the simple linear regression model are:

A
  • Linearity
  • Homoskedasticity
  • Independence
  • Normality
6
Q

Linearity

A

The relationship between the dependent variable and the independent variable is linear

7
Q

Homoskedasticity

A

The variance of the residuals is constant for all observations

8
Q

Independence

A

The pairs (X, Y) are independent of each other. This implies the residuals are uncorrelated across observations

9
Q

Before running a regression model, an analyst states two of the underlying assumptions for linear regression analysis:

Assumption 1: The variance of the error term has an expected value of zero
Assumption 2: The independent variable is not random
What is the most accurate assessment of the analyst’s description of the assumptions that must be satisfied to draw valid conclusions from a simple linear regression model?

A
Assumption 1 and Assumption 2 are both correct

B
Assumption 1 is correct and Assumption 2 is incorrect

C
Assumption 1 is incorrect and Assumption 2 is correct

A

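C
Assumption 1 is incorrect and Assumption 2 is correct

The error term itself, not its variance, has an expected value of zero; the variance of the error term is assumed to be constant (homoskedasticity).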

10
Q

The sum of squares total (SST), which is the total variation in Y, can be broken into two components:

A
  1. Sum of squares error (SSE), which is the unexplained variation in Y
  2. Sum of squares regression (SSR), which is the explained variation in Y
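
The decomposition SST = SSR + SSE can be checked numerically. The sketch below is only an illustration: the data are hypothetical, and the least-squares formulas for b0 and b1 are the standard ones rather than anything stated on this card.

```python
# Hypothetical sample data, chosen only for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Standard least-squares estimates for a one-variable regression.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
b0 = y_bar - b1 * x_bar
Y_hat = [b0 + b1 * x for x in X]  # fitted values

sst = sum((y - y_bar) ** 2 for y in Y)                 # total variation in Y
ssr = sum((yh - y_bar) ** 2 for yh in Y_hat)           # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))    # unexplained variation

print(sst, ssr + sse)  # the two values should agree up to rounding
```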
11
Q

Measures of Goodness of Fit

A

Measures to evaluate how well the regression model fits the data include:

  • The coefficient of determination
  • The F-statistic for the test of fit
  • The standard error of the regression
12
Q

The coefficient of determination (a.k.a. R-squared or R^2)

A

measures the fraction of the total variation in the dependent variable that is explained by the independent variable

13
Q

F-Statistic

A

Tests whether the regression model is statistically meaningful, i.e. whether the slope coefficient is significantly different from zero

14
Q

The objective of linear regression

A

to understand what explains the variation of Y, also known as the sum of squares total (SST), or the total sum of squares

15
Q

how to find the variation of Y, also known as the sum of squares total (SST)

A

SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2

16
Q

how to find the variation of X

A

\sum_{i=1}^{n} (X_i - \bar{X})^2
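
The sum of squared deviations on this card and the previous one is the same calculation applied to X and to Y. A minimal sketch, using hypothetical data and a helper name (`sum_of_squared_deviations`) invented for illustration:

```python
def sum_of_squared_deviations(values):
    """Sum of squared differences between each value and the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Hypothetical sample data, chosen only for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

variation_x = sum_of_squared_deviations(X)  # variation of X
sst = sum_of_squared_deviations(Y)          # variation of Y (SST)

print(variation_x, sst)
```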

17
Q

The regression equation

A

expresses the linear relationship between X and Y

Y = b0 + b1 * X + e

b0 = the intercept

b1 = the slope coefficient of the regression line.

e = the error term, which represents the difference between the observed value of Y and its expected value from the true underlying population.

In other words, e is the part of Y that the linear relationship b0 + b1 * X does not explain.
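
In practice b0 and b1 are estimated from a sample. The sketch below uses the standard least-squares formulas, which this card does not state; the function name `fit_simple_regression` and the data are hypothetical.

```python
def fit_simple_regression(X, Y):
    """Return (b0, b1) for Y = b0 + b1 * X using ordinary least squares."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
        (x - x_bar) ** 2 for x in X
    )
    # Intercept: forces the fitted line through the point (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical sample data, chosen only for illustration.
b0, b1 = fit_simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(b0, b1)  # roughly 2.2 and 0.6
```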

18
Q

The line plotted by the regression equation represents

A

The average relationship between the dependent variable and the independent variable

19
Q

The difference between the observed and estimated values of the dependent variable

A

the residual

Observed value of Y - estimated value of Y
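
A short sketch of the residual calculation. The data and the coefficients b0 = 2.2 and b1 = 0.6 are hypothetical; the coefficients are assumed to be the least-squares estimates for this particular sample.

```python
# Hypothetical sample data and assumed least-squares coefficients for it.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

# Residual = observed value of Y - estimated value of Y.
residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]

print(residuals)
print(sum(residuals))  # close to zero for a least-squares fit with an intercept
```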

20
Q

coefficient of determination (R^2) formula

A

SSR / SST

21
Q

In a simple linear regression with only one independent variable, the coefficient of determination is equal to

A

the square of the correlation between X and Y.

R^2 is a descriptive measure, not a statistical test.
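
The equality between R^2 = SSR / SST and the squared correlation can be checked on a small sample. This is only a sketch with hypothetical data; the least-squares and correlation formulas are the standard ones, not taken from the cards.

```python
import math

# Hypothetical sample data, chosen only for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

sxx = sum((x - x_bar) ** 2 for x in X)                       # variation of X
syy = sum((y - y_bar) ** 2 for y in Y)                       # variation of Y (SST)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))   # co-variation

# Sample correlation between X and Y.
r = sxy / math.sqrt(sxx * syy)

# R^2 from the regression: SSR / SST.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
ssr = sum((b0 + b1 * x - y_bar) ** 2 for x in X)
r_squared = ssr / syy

print(r_squared, r ** 2)  # the two values should agree up to rounding
```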

22
Q

the F test statistic for a simple linear regression model is:

A

F = (SSR / 1) / (SSE / (n - 2)) = MSR / MSE
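
A minimal sketch of the F-statistic as the ratio of mean square regression (SSR divided by 1) to mean square error (SSE divided by n - 2). The function name and the input values are hypothetical.

```python
def f_statistic(ssr, sse, n):
    """F-statistic for a simple linear regression with n observations."""
    msr = ssr / 1        # one independent variable, so 1 degree of freedom
    mse = sse / (n - 2)  # n - 2 degrees of freedom for the error
    return msr / mse


# Hypothetical values: SSR = 3.6, SSE = 2.4, n = 5 observations.
f = f_statistic(3.6, 2.4, 5)
print(f)  # roughly 4.5
```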

23
Q

The standard error of the estimate (se)

A

the square root of MSE

se = sqrt(MSE), where MSE = SSE / (n - 2)

also known as the standard error of the regression or the root mean square error.

A smaller value indicates a more accurate regression.
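
A short sketch of the calculation, assuming MSE = SSE / (n - 2) as in a simple linear regression. The function name and the input values are hypothetical.

```python
import math


def standard_error_of_estimate(sse, n):
    """Square root of the mean square error for a simple linear regression."""
    mse = sse / (n - 2)
    return math.sqrt(mse)


# Hypothetical values: SSE = 2.4 with n = 5 observations.
se = standard_error_of_estimate(2.4, 5)
print(se)  # roughly 0.894
```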

24
Q

The estimated variance of a regression model’s prediction error is determined by several factors:

A

The squared standard error of the estimate, s_e^2

The number of observations, n

The value of the independent variable X relative to its mean, \bar{X}

The variance of the independent variable, s_X^2
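
These four factors combine in the usual textbook expression for the estimated variance of the prediction error, s_f^2 = s_e^2 * [1 + 1/n + (X_f - \bar{X})^2 / ((n - 1) * s_X^2)]. The card does not state this formula, so treat the sketch below as an assumption; the function name, the forecast point X_f, and the input values are hypothetical.

```python
import math


def prediction_error_variance(se, n, x_f, x_bar, var_x):
    """Estimated variance of the prediction error at X = x_f.

    se    : standard error of the estimate
    n     : number of observations
    x_f   : value of X used for the forecast
    x_bar : sample mean of X
    var_x : sample variance of X (n - 1 in the denominator)
    """
    return se ** 2 * (1 + 1 / n + (x_f - x_bar) ** 2 / ((n - 1) * var_x))


# Hypothetical inputs: se^2 = 0.8, n = 5, mean of X = 3, variance of X = 2.5.
variance_near_mean = prediction_error_variance(math.sqrt(0.8), 5, 3, 3, 2.5)
variance_far_from_mean = prediction_error_variance(math.sqrt(0.8), 5, 6, 3, 2.5)

# The variance grows as the forecast point moves away from the mean of X.
print(variance_near_mean, variance_far_from_mean)
```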