W2 Regression Analysis Single Flashcards
What is the difference between correlation and causation
Correlation means two factors move together, which may just be coincidence
Causation means a change in one factor actually produces the change in the other
How is causation established
Logic and theory
What is regression analysis
Examines whether, and how strongly, independent variables affect a dependent variable
What is the dependent variable Y
The variable we wish to predict (explain)
What is the independent variable X
The variable used to predict (explain) the dependent variable
What is simple linear regression
1 independent variable (X) and 1 dependent variable (Y)
Linear relationship between X and Y
Changes in Y are related to changes in X
Equation for simple linear regression
Y = B0 + B1X1 + e
What is the B0 in the regression formula
Y intercept
What is the B1 in the regression formula
The slope coefficient, i.e. for each one-unit increase in the independent variable X, Y is predicted to change by B1
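A minimal sketch of how B0 and B1 could be estimated by ordinary least squares in plain Python. The function name fit_simple_regression and the data are hypothetical, chosen only so the answer is easy to check by hand.

```python
def fit_simple_regression(xs, ys):
    """Return (b0, b1) for the least-squares line Y = b0 + b1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data lying exactly on Y = 1 + 2X (no error term).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = fit_simple_regression(xs, ys)
print(b0, b1)  # 1.0 2.0
```

Here B0 = 1 is the value of Y when X = 0, and B1 = 2 is the change in Y for each one-unit increase in X.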
Example of a positive linear relationship
Number of customers signed up to the emailing list and number of total sales
Example of negative linear relationship
Demand curve
Example of positive curvilinear relationship
Age and maintenance costs of a washing machine: costs rise fast at first but eventually plateau
Example of a u shaped relationship
Entrepreneurial activity and GDP per capita
Entrepreneurial activity occurs most in countries at the high end and low end of GDP per capita
Example of exponential relationship
Value of car and its age
Interpret this equation for grades (dependent) and absences (independent)
Y = 85 - 5X
The Y intercept is 85, meaning that a student with no absences has a predicted grade of 85%
The slope is -5, meaning that with each additional absence the predicted grade falls by 5 percentage points
Using the regression formula, how can you predict outcomes for specific values of the independent variables
Substituting them into the formula
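Substitution into the formula can be sketched in Python using the grades and absences equation Y = 85 - 5X. The function name predict_grade is hypothetical.

```python
def predict_grade(absences):
    """Predicted grade (%) from the fitted line Y = 85 - 5X."""
    return 85 - 5 * absences


# Substitute a few values of X into the equation.
for x in (0, 2, 6):
    print(x, predict_grade(x))
# 0 85
# 2 75
# 6 55
```

Predictions are generally safest for X values inside the range the regression was fitted on.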
What does the coefficient of determination show us
How good is the regression
The proportion of the variation in the dependent variable that is explained by variation in the independent variable
What is the coefficient of determination also known as
R^2
If R^2 is closer to 1, does this mean the regression is stronger or weaker
Stronger
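A minimal sketch of computing R^2 as 1 - SSE/SST, where SSE is the unexplained (residual) sum of squares and SST is the total sum of squares. The function name r_squared and the grades are hypothetical; the predictions are assumed to come from the line Y = 85 - 5X for 0 to 4 absences.

```python
def r_squared(actual, predicted):
    """Share of the variation in Y explained by the regression: 1 - SSE/SST."""
    mean_y = sum(actual) / len(actual)
    # SSE: variation left unexplained by the regression.
    sse = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # SST: total variation of Y around its mean.
    sst = sum((y - mean_y) ** 2 for y in actual)
    return 1 - sse / sst


# Hypothetical observed grades and the grades predicted by Y = 85 - 5X.
actual = [84, 81, 74, 71, 65]
predicted = [85, 80, 75, 70, 65]
print(round(r_squared(actual, predicted), 3))  # 0.983
```

A value this close to 1 says that almost all of the variation in grades is accounted for by absences in this made-up sample.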
What are the assumptions of regression
Linear
Independent error values
Normally distributed error values
Equal variance
How to check errors (residual analysis)
e = actualY - predictedY (the residual); plot the residuals to check the assumptions
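A minimal sketch of computing residuals for a fitted line. It assumes the usual signed convention, residual = actual Y minus predicted Y, so that the direction of each error is kept. The function name residuals and the data are hypothetical.

```python
def residuals(xs, ys, b0, b1):
    """Signed residuals e = actual Y - predicted Y for the line b0 + b1 * X."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical absences and grades, checked against the line Y = 85 - 5X.
absences = [0, 1, 2, 3, 4]
grades = [84, 81, 74, 71, 65]
for x, e in zip(absences, residuals(absences, grades, 85, -5)):
    print(x, e)
# 0 -1
# 1 1
# 2 -1
# 3 1
# 4 0
```

Plotting these residuals against X, against time, or on a normal probability plot is what the assumption checks are based on.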
What is autocorrelation
Exists if residuals in one time period are related to residuals in another time period
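One rough way to see whether residuals in one period are related to those in the next is to correlate each residual with the one before it. This is an illustrative sketch only, not a formal test for autocorrelation; the function name lag1_correlation and both residual series are hypothetical.

```python
def lag1_correlation(res):
    """Pearson correlation between each residual and the one before it."""
    prev, curr = res[:-1], res[1:]
    n = len(prev)
    mean_p, mean_c = sum(prev) / n, sum(curr) / n
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(prev, curr))
    var_p = sum((p - mean_p) ** 2 for p in prev)
    var_c = sum((c - mean_c) ** 2 for c in curr)
    return cov / (var_p * var_c) ** 0.5


# Hypothetical residuals ordered by time period.
trending = [-3, -2, -1, 0, 1, 2, 3]       # each residual follows the last
alternating = [1, -1, 1, -1, 1, -1, 1]    # each residual flips sign
print(lag1_correlation(trending))     # close to +1: positive autocorrelation
print(lag1_correlation(alternating))  # close to -1: negative autocorrelation
```

A value near 0 would suggest little relationship between neighbouring residuals, which is what the independence assumption requires.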
In regression how do you check the normality assumption
The normal probability plot of the residuals should be approximately linear
What would show a potential violation of assumptions on the equal variances plot
A fan shape
When is the independence plot important
In data collected over time
What does a fan shaped residual plot mean
Potential violation of the equal variances assumption
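A fan shape means the residuals spread out as X grows. As a rough illustration (a heuristic invented for this sketch, not a formal test), the average size of the residuals in the upper half of X can be compared with the lower half. The function name spread_ratio and the data are hypothetical.

```python
def spread_ratio(xs, res):
    """Mean absolute residual in the upper half of X divided by the lower half."""
    pairs = sorted(zip(xs, res))  # order the residuals by X
    half = len(pairs) // 2
    low = [abs(e) for _, e in pairs[:half]]
    high = [abs(e) for _, e in pairs[-half:]]
    # Assumes the lower-half residuals are not all exactly zero.
    return (sum(high) / len(high)) / (sum(low) / len(low))


xs = [1, 2, 3, 4, 5, 6]
fan = [0.5, -0.5, 1, -1, 3, -3]   # spread grows with X
even = [1, -1, 1, -1, 1, -1]      # spread stays constant
print(spread_ratio(xs, fan))   # well above 1: fan shape
print(spread_ratio(xs, even))  # about 1: roughly equal variance
```

A ratio far from 1 is only a hint; the residual plot itself remains the main check.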