Simple Linear Regression Flashcards
used to show or predict the relationship (not cause-and-effect) between two variables
Linear regression
the dependent variable whose value is being predicted
factor (Y)
the independent variable used to predict the value of Y
factor (X)
gives the straight line that best fits the data, called the least-squares regression line
Linear Regression Equation
Y = a + bX or Y = β0 + β1X
i. Y – dependent variable (plotted on the Y axis)
ii. X – independent variable (plotted on the X axis)
iii. a – or β0, the regression constant (Y intercept). Eq. a = ȳ − b·x̄
iv. b – or β1, the regression coefficient (slope). Eq. b = r(sY/sX)
Assumptions for Linear Regression
a. Both the independent and dependent variables should be measured on a continuous (interval or ratio) scale.
b. There needs to be a linear relationship between the two variables.
c. The residuals should be approximately normally distributed, and the data should have no significant outliers.
d. The observations must be independent of one another.
e. The data need to show homoscedasticity.
The variances along the line of best fit remain similar as you move along the line; that is, the residual errors have roughly the same spread across all values of the predictor.
Homoscedasticity
the variances are not the same across all values of the predictor.
Heteroscedasticity
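One rough way to see the difference between the two terms above is to compare the spread of the residuals in the lower and upper halves of the predictor range; a minimal sketch with simulated, heteroscedastic errors (the data and threshold are made up for illustration):

```python
# Minimal sketch: a crude homoscedasticity check.
# We simulate residuals whose spread grows with x (heteroscedastic),
# then compare residual spread in the two halves of the x range.
import random
import statistics as st

random.seed(0)
x = [i / 10 for i in range(1, 101)]
# Error standard deviation grows with x, so this data is heteroscedastic.
resid = [random.gauss(0, xi) for xi in x]

lower = [e for xi, e in zip(x, resid) if xi <= 5]
upper = [e for xi, e in zip(x, resid) if xi > 5]
# For homoscedastic data these two spreads would be similar;
# here the upper-half spread is clearly larger.
print(st.stdev(lower), st.stdev(upper))
```

In practice this check is usually done visually, with a residuals-versus-fitted plot.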
refers to the proportion of the variance in the dependent variable that is predictable from the independent variable
coefficient of determination (R2)
An R2 of 0
dependent variable can’t be predicted from the independent variable
R2 of 1
dependent variable can be predicted without error from the independent variable.
R2 between 0 and 1
indicates the extent to which the dependent variable is predictable; for example, an R2 of 0.40 means that 40% of the variance in the dependent variable is predictable from the independent variable