Simple Linear Regression Flashcards
Regression analysis
Regression analysis is used to:
predict the value of a dependent variable (Y) based on the value of at least one independent variable (X)
explain the impact of changes in an independent variable on the dependent variable
Dependent variable (Y)
Dependent variable (Y): the variable we wish to predict or explain (response variable)
Independent variable (X)
Independent variable (X): the variable used to explain the dependent variable (explanatory variable)
Simple linear regression
Only one independent variable, X
Relationship between X and Y is described by a linear function
Changes in Y are assumed to be related to changes in X
b0 and b1
b0 and b1 are the values that minimise the sum of the squared differences between the actual values (Y) and the predicted values (Ŷ)
b0
b0 is the estimated average value of Y when the value of X is zero
b1
b1 is the estimated change in the average value of Y as a result of a one-unit change in X
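The least-squares definitions of b0 and b1 above can be sketched in a few lines of Python. This is only an illustration: the function name fit_line and the five data points are hypothetical, and the closed-form formulas used are the standard ones for simple linear regression (b1 = Sxy / Sxx, b0 = Ȳ - b1 X̄).

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of X
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # estimated change in mean Y per one-unit change in X
    b0 = y_bar - b1 * x_bar   # estimated mean Y when X is zero
    return b0, b1


# Hypothetical sample data, chosen only to keep the arithmetic simple
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # b0 = 2.20, b1 = 0.60
```

With these made-up points, each one-unit increase in X is associated with an estimated 0.6 increase in the average value of Y.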
SST
Total Sum of Squares
Measures the variation of the Yi values around their mean, Ȳ
SSR
Regression Sum of Squares
Explained variation attributable to the relationship between X and Y
SSE
Error Sum of Squares
Variation attributable to factors other than the relationship between X and Y
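As a rough illustration of how SST, SSR and SSE relate, the sketch below computes all three for a small hypothetical data set and shows that SST = SSR + SSE. The function name sums_of_squares and the data are assumptions made for this example.

```python
def sums_of_squares(xs, ys):
    """Return (SST, SSR, SSE) for a simple least-squares fit of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Least-squares slope and intercept
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * x for x in xs]

    sst = sum((y - y_bar) ** 2 for y in ys)                # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # unexplained variation
    return sst, ssr, sse


# Hypothetical sample data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sst, ssr, sse = sums_of_squares(xs, ys)
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
# SST = 6.00, SSR = 3.60, SSE = 2.40
```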
Coefficient of Determination, r²
The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
The coefficient of determination is also called r-squared and is denoted as r²
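A minimal sketch of the calculation, assuming the regression and total sums of squares are already known: r² = SSR / SST. The function name r_squared and the input values are hypothetical.

```python
def r_squared(ssr, sst):
    """Coefficient of determination: share of total variation that is explained."""
    if sst == 0:
        raise ValueError("SST is zero: Y has no variation to explain")
    return ssr / sst


# Hypothetical sums of squares: SSR = 3.6, SST = 6.0
r2 = r_squared(3.6, 6.0)
print(f"r-squared = {r2:.2f}")  # r-squared = 0.60
```

With these made-up numbers, 60% of the variation in Y would be explained by variation in X. Because SSR can never exceed SST, r² always lies between 0 and 1.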
ASSUMPTIONS OF REGRESSION
Linearity
Independence of errors
Normality of errors
Equal variance
Linearity
The underlying relationship between X and Y is linear
Independence of errors
Error values are statistically independent
Normality of errors
Error values (ε) are normally distributed for any given value of X
Equal variance
The probability distribution of the errors has constant variance
Residual for observation i
The residual for observation i, ei, is the difference between its observed and predicted values: ei = Yi - Ŷi
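The residual definition can be sketched as follows. The coefficients b0 = 2.2 and b1 = 0.6 and the data points are hypothetical (they are the least-squares fit for this made-up sample), and the function name residuals is an assumption for the example.

```python
def residuals(xs, ys, b0, b1):
    """Return the residuals e_i = Y_i - Y-hat_i for the line Y-hat = b0 + b1 * X."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical sample data and fitted coefficients
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
e = residuals(xs, ys, b0=2.2, b1=0.6)

for x, y, e_i in zip(xs, ys, e):
    print(f"X = {x}, Y = {y}, residual = {e_i:+.2f}")

# For a least-squares line with an intercept, the residuals sum to zero
# (up to floating-point rounding).
print(f"sum of residuals = {sum(e):.6f}")
```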
Check the assumptions of regression by examining the residuals:
Examine for linearity assumption
Evaluate independence assumption
Evaluate normal distribution assumption
Examine for constant variance for all levels of X (homoscedasticity)
Graphical Analysis of Residuals
Can plot residuals vs. X
Pitfalls of regression analysis
Lacking an awareness of the assumptions underlying least-squares regression
Not knowing how to evaluate the assumptions
Not knowing the alternatives to least-squares regression if a particular assumption is violated
Using a regression model without knowledge of the subject matter
Extrapolating outside the relevant range
Concluding that a significant relationship in an observational study is due to a cause-and-effect relationship
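One way to guard against the extrapolation pitfall listed above is to refuse predictions outside the range of X values used to fit the model. The sketch below is only one possible approach. The function name predict, the coefficients and the data are hypothetical.

```python
def predict(x_new, xs, b0, b1):
    """Predict Y at x_new, but only inside the relevant range of the observed X."""
    lo, hi = min(xs), max(xs)
    if not lo <= x_new <= hi:
        raise ValueError(
            f"X = {x_new} is outside the relevant range [{lo}, {hi}]; "
            "predicting here would be extrapolation"
        )
    return b0 + b1 * x_new


# Hypothetical observed X values and fitted coefficients
xs = [1, 2, 3, 4, 5]
b0, b1 = 2.2, 0.6

print(f"{predict(3, xs, b0, b1):.2f}")  # 4.00 (interpolation, inside the range)

try:
    predict(10, xs, b0, b1)             # outside the range of observed X
except ValueError as err:
    print(err)
```

Raising an error is a deliberately strict choice. A warning would also work where extrapolation is sometimes acceptable.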
Types of relationships
Equation
Equation explained
Sample equation and least squares method
Example 1
Interpolation vs. extrapolation
Measures of variation
r-squared
Standard error
Residual analysis
Slope inferences
t test
F test
Confidence interval