Lecture 7 - Correlation, Simple Linear Regression, R-Squared Flashcards
Correlation, r_xy
Correlation DOES NOT imply causation
Rescaled version of covariance (same sign), which lies in the interval [-1, 1]
If |r_xy| = 1, we say x and y have a perfectly linear relationship.
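A minimal sketch of the "rescaled covariance" idea, using only the Python standard library. The data values and the function names `covariance` and `correlation` are illustrative assumptions, not part of the lecture.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance with the usual n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    s_x = sqrt(covariance(xs, xs))  # the covariance of x with itself is its variance
    s_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (s_x * s_y)

# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_line = [2.0 * x + 1.0 for x in xs]   # exact line with positive slope
ys_noisy = [2.9, 5.2, 6.8, 9.1, 11.0]   # roughly linear

r_line = correlation(xs, ys_line)       # equals 1 up to floating-point rounding
r_noisy = correlation(xs, ys_noisy)     # close to, but below, 1
print(r_line, r_noisy)
```

Dividing by the standard deviations changes the scale but not the sign, so r_xy is positive exactly when the covariance is positive.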
Question
The correlation between x and y is 1. This implies:
A) We have a positive relationship
B) We have a perfectly linear relationship
C) We have a strong relationship
D) All of the above
E) None of the above
D) All of the above (r_xy = 1 is positive, perfectly linear, and as strong as a linear relationship can be)
Question
The correlation between x and y is 0. This implies:
A) We have no relationship
B) We have no linear relationship
C) x or y is zero
D) Both x and y are zero
E) None of the above
B) We have no linear relationship
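To see why a correlation of 0 rules out only a linear relationship, here is a small sketch with made-up data in which y is completely determined by x (y = x^2), yet the correlation is 0. The helper `correlation` is an illustrative name.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation computed from centered sums of products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: x values symmetric around 0, y an exact function of x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(r)  # the positive and negative products cancel, so there is no linear trend
```

The relationship here is strong but curved, which is why "no relationship" is the wrong reading of r_xy = 0.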
Simple Linear Regression
Regression Line
- Referred to as the "line of best fit" or least-squares regression line
- Found by minimizing the sum of the squared residuals, where a residual is the vertical distance between a data point and the line
Uses of this line are to:
1. Identify associations
2. Make predictions
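A rough sketch of a least-squares fit and a prediction, assuming the line is written as y-hat = b0 + b1 * x. The names `b0`, `b1`, `fit_line`, and the data are assumptions made for illustration. It uses the standard closed-form least-squares solution for the slope and intercept.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return intercept, slope

def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the points and a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]

b0, b1 = fit_line(xs, ys)
sse_best = sum_squared_residuals(xs, ys, b0, b1)
sse_shifted = sum_squared_residuals(xs, ys, b0 + 0.5, b1)  # any other line does worse

prediction = b0 + b1 * 6.0  # predicted y at x = 6, just beyond the observed x range
print(b0, b1, sse_best, sse_shifted, prediction)
```

The sign of `b1` identifies the direction of the association, and plugging a new x into the line gives the prediction.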
Simple Linear Regression
Regression Line Equation
Notes
Regression Line
R-Squared, r^2
coefficient of determination
The squared correlation, r^2 = (r_xy)^2, is a statistic called the coefficient of determination
- Describes the fraction (percentage) of variability in the data which is explained by the regression model.
- If r^2 is large (roughly 80% or more), the model fits the data well and the regression line will generally give reliable predictions
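The "fraction of variability explained" reading can be checked numerically. The sketch below, on made-up data, computes 1 - SSE/SST from the fitted line and compares it with the squared correlation. The function name and the abbreviations SSE and SST are assumptions, not notation from the lecture.

```python
from math import sqrt

def r_squared_two_ways(xs, ys):
    """Return (1 - SSE/SST, squared correlation) for a simple linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)  # total variability in y (SST)

    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # Variability left unexplained by the line (SSE).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    explained_fraction = 1.0 - sse / s_yy
    r_xy = s_xy / sqrt(s_xx * s_yy)
    return explained_fraction, r_xy ** 2

# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]

explained, squared_r = r_squared_two_ways(xs, ys)
print(explained, squared_r)  # the two values agree up to rounding
```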
Question
A) Determine whether the association is linear or non-linear
A simple linear regression assumes a linear relationship between the variables, so fitting one cannot tell you whether the relationship is actually linear or non-linear. For that, you need other methods, such as plotting the data or fitting non-linear models.
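One way to illustrate this point: fit a line to made-up curved data (y = x^2) and inspect the residuals. The fit itself raises no warning and r^2 is high, but the residuals follow a clear pattern: positive, then negative, then positive. This is a sketch under those assumptions, and in practice a residual plot would serve the same purpose.

```python
# Hypothetical curved data: y is exactly x squared.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x ** 2 for x in xs]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
s_yy = sum((y - mean_y) ** 2 for y in ys)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
r_squared = s_xy ** 2 / (s_xx * s_yy)

# Residual = observed y minus the y predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(slope, intercept, r_squared)
print(residuals)  # U-shaped pattern: the line misses the curvature
```

A systematic pattern like this in the residuals suggests the linear model is the wrong shape, even when r^2 looks large.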