1.7 linear regression Flashcards
the dependent variable or the explained variable
refers to the variable whose variation is being explained. It is typically denoted by Y
the independent variable or the explanatory variable
refers to the variable whose variation is being used to explain the variation of the dependent variable. It is typically denoted by X
If there is only one independent variable, then the regression is known as
Simple Linear Regression
If there are two or more independent variables, the regression is known as
Multiple Regression
The four assumptions underlying the simple linear regression model are:
- Linearity
- Homoskedasticity
- Independence
- Normality
Linearity
The relationship between the dependent variable and the independent variable is linear
Homoskedasticity
The variance of the residuals is constant for all observations
Independence
The pairs (X, Y) are independent of each other. This implies the residuals are uncorrelated across observations
Normality
The residuals are normally distributed
Before running a regression model, an analyst states two of the underlying assumptions for linear regression analysis:
Assumption 1: The variance of the error term has an expected value of zero
Assumption 2: The independent variable is not random
What is the most accurate assessment of the analyst’s description of the assumptions that must be satisfied to draw valid conclusions from a simple linear regression model?
A. Assumption 1 and Assumption 2 are both correct
B. Assumption 1 is correct and Assumption 2 is incorrect
C. Assumption 1 is incorrect and Assumption 2 is correct
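Answer: C. Assumption 1 is incorrect and Assumption 2 is correct. It is the error term itself, not its variance, that has an expected value of zero; the variance of the error term is assumed to be constant (homoskedasticity)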
The sum of squares total (SST), which is the total variation in Y, can be broken into two components:
- Sum of squares error (SSE), which is the unexplained variation in Y
- Sum of squares regression (SSR), which is the explained variation in Y
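The decomposition SST = SSE + SSR can be checked numerically. The following is a minimal sketch in plain Python; the sample data are hypothetical (not from the source), and the line is fitted with the usual least-squares formulas for slope and intercept.

```python
# Hypothetical sample data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for a simple linear regression.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

# Total, unexplained, and explained variation in Y.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)

print(round(sst, 4), round(sse, 4), round(ssr, 4))  # 6.0 2.4 3.6
```

For these numbers, 2.4 of the total variation of 6.0 is left unexplained and 3.6 is explained by the fitted line.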
Measures of Goodness of Fit
Measures to evaluate how well the regression model fits the data include:
- The coefficient of determination
- The F-statistic for the test of fit
- The standard error of the regression
The coefficient of determination (a.k.a. R-squared or R^2)
measures the fraction of the total variation in the dependent variable that is explained by the independent variable
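As a rough illustration, R^2 can be computed as SSR / SST, or equivalently 1 - SSE / SST. The function name `r_squared` and the sample data below are hypothetical, introduced only for this sketch.

```python
def r_squared(x, y):
    """Return R^2 for a simple linear regression of y on x (least squares)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    # Unexplained variation (SSE) and total variation (SST).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


# Hypothetical data: 60% of the variation in Y is explained by X.
r2 = r_squared([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])
print(round(r2, 4))  # 0.6
```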
F-Statistic
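Used to evaluate whether the regression model is statistically meaningful. In a simple linear regression, it tests whether the slope coefficient is different from zero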
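A minimal sketch of the F-statistic and the standard error of the regression for the one-variable case follows. It assumes the standard textbook definitions, which the cards do not spell out: F = (SSR / 1) / (SSE / (n - 2)) and standard error = sqrt(SSE / (n - 2)). The function name `regression_diagnostics` and the data are hypothetical.

```python
import math


def regression_diagnostics(x, y):
    """Return (F-statistic, standard error) for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    y_hat = [intercept + slope * xi for xi in x]

    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)            # explained
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained

    msr = ssr / 1          # one independent variable
    mse = sse / (n - 2)    # n - 2 degrees of freedom for the residuals
    return msr / mse, math.sqrt(mse)


# Hypothetical data, for illustration only.
f_stat, std_err = regression_diagnostics([1.0, 2.0, 3.0, 4.0, 5.0],
                                         [2.0, 4.0, 5.0, 4.0, 5.0])
print(round(f_stat, 4), round(std_err, 4))  # 4.5 0.8944
```

The computed F-statistic would then be compared with a critical value from the F-distribution with 1 and n - 2 degrees of freedom; that lookup is left out of this sketch.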
The objective of linear regression
to understand what explains the variation of Y. This total variation is known as the sum of squares total (SST), or the total sum of squares
how to find the variation of Y, also known as the sum of squares total (SST)
$SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$