Week 6: Linear Regression 1 Flashcards
What is the difference between the independent and dependent variable?
Dependent variable = outcome or response variable
Independent variable = control, explanatory, or input variable
What does correlation measure?
Measures the degree of linear association between 2 variables
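As an illustration, the degree of linear association is usually reported as the Pearson correlation coefficient r, which ranges from -1 to +1. The sketch below is one minimal way to compute it; the function name `pearson_r` and the `hours`/`scores` data are made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations (numerator) and squared deviations (denominator)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: hours studied vs. quiz score
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 3))  # 0.775
```

A value near +1 or -1 indicates a strong linear relationship; a value near 0 indicates little linear association (a non-linear relationship may still exist).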
What are the 3 steps to describe how a regression works?
1) Develop an estimating equation (y = f(x)), and learn the pattern of the relationship
2) Determine the degree to which the variables are related
3) Examine how well the estimating equation describes the relationship
What is the standard estimating equation?
y = mx + c
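How do you determine the best-fit line?
Choose the line that minimizes the sum of squared errors, SSE = Σ(y_actual - y_hat)^2, also called the sum of squared residuals. Note this is a different quantity from the SSR = Σ(y_hat - y_mean)^2 used to calculate R^2.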
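A minimal sketch of finding the best-fit line for y = mx + c, assuming the usual least-squares criterion (minimize the sum of squared differences between actual and predicted y). The helper names `fit_line` and `sum_squared_errors` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return slope m and intercept c of the least-squares line y = m*x + c."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    m = s_xy / s_xx
    c = y_mean - m * x_mean  # the fitted line passes through (x_mean, y_mean)
    return m, c

def sum_squared_errors(xs, ys, m, c):
    """Sum of squared differences between actual y and predicted y."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, c = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, m, c)
other = sum_squared_errors(xs, ys, 1.0, 1.0)  # an arbitrary competing line
print(round(m, 2), round(c, 2))  # 0.6 2.2
print(best < other)              # True
```

Any other choice of m and c gives a larger sum of squared errors on the same data, which is what "best fit" means under this criterion.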
Why do we square the errors?
1) To magnify larger errors
2) To cancel the effect of positive and negative values
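Both reasons can be seen in a small numeric sketch. It assumes made-up data and the line y = 0.6x + 2.2, which is the least-squares line for that data; all variable names are illustrative.

```python
# Hypothetical data and its least-squares line y = 0.6x + 2.2
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, c = 0.6, 2.2

errors = [y - (m * x + c) for x, y in zip(xs, ys)]
raw_sum = sum(errors)                       # positives and negatives cancel
squared_sum = sum(e ** 2 for e in errors)   # every term is non-negative

print(abs(raw_sum) < 1e-9)    # True: raw errors sum to (numerically) zero
print(round(squared_sum, 2))  # 2.4

# Squaring weights large errors more heavily: an error of 1.0 contributes
# 25 times as much as an error of 0.2, not 5 times as much.
ratio = (1.0 ** 2) / (0.2 ** 2)
```

The raw errors of this line sum to zero, so an unsquared total would wrongly suggest a perfect fit.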
How do you calculate SST and SSR?
SST = Σ(y_actual - y_mean)^2, summed over all data points
SSR = Σ(y_hat - y_mean)^2, summed over all data points
What is R^2 and how do you calculate it?
Coefficient of determination. Calculated by taking SSR/SST
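A short worked sketch of the calculation, summing each squared term over all data points. The data is made up, and `y_hat` holds the predictions of the least-squares line y = 0.6x + 2.2 for x = 1..5.

```python
# Hypothetical actual values and the predictions of a fitted line
y_actual = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

y_mean = sum(y_actual) / len(y_actual)

# SST: total variation of the actual values around their mean
sst = sum((y - y_mean) ** 2 for y in y_actual)
# SSR: variation of the predicted values around the mean (explained part)
ssr = sum((yh - y_mean) ** 2 for yh in y_hat)

r_squared = ssr / sst
print(round(sst, 2), round(ssr, 2), round(r_squared, 2))  # 6.0 3.6 0.6
```

Here R^2 = 0.6, meaning the line accounts for 60% of the variation in y for this data.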
What are some possible ways to assess relationships?
- Graphical visualizations like scatter plots
- Correlation coefficient
- Linear regression
What are the types of relationships between 2 variables?
1) Strong, positive/negative linear
2) positive/negative linear
3) Perfect positive/negative linear
4) Parabolic
5) Curvilinear (Exponential)
6) No relationship
How do you calculate the total variation of error?
Total variation (SST) = explained variation (SSR) + unexplained variation (SSE)
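The decomposition can be checked numerically. This sketch assumes the line was fitted by least squares with an intercept, which is the case in which the identity holds exactly; the data and names are hypothetical.

```python
# Hypothetical data and the predictions of its least-squares line
y_actual = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
y_mean = sum(y_actual) / len(y_actual)

total = sum((y - y_mean) ** 2 for y in y_actual)               # total variation
explained = sum((yh - y_mean) ** 2 for yh in y_hat)            # explained by the line
unexplained = sum((y - yh) ** 2 for y, yh in zip(y_actual, y_hat))  # left in the errors

print(round(total, 2), round(explained, 2), round(unexplained, 2))  # 6.0 3.6 2.4
print(abs(total - (explained + unexplained)) < 1e-9)                # True
```

For a line that was not fitted by least squares, the explained and unexplained parts generally do not add up to the total.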
What is R^2?
A statistical measure of the proportion of variance in the dependent variable that can be explained by the independent variable(s) in a regression model. In other words, R^2 measures how well the regression line fits the data.