Quan - Regression Flashcards
Regression =
use one variable to predict the other
- used to investigate a causal relationship between 2 interval variables
- find the effect of the IV on a DV
Regression analysis needs what variables?
2 interval-level variables
- normally distributed (check with a histogram)
- share a linear relationship (check with a scatterplot)
What visual aid is used in regression?
Scatterplot
Why does regression use scatterplots?
Line of best fit
- calculated by measuring the distance between all the data points and every possible line that can be drawn through them
- the line with the lowest total distance (i.e. the lowest error; in ordinary least squares, the smallest sum of squared vertical distances) is chosen and used for prediction
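A minimal sketch of how the line of best fit can be found, assuming ordinary least squares (the usual method, though the card does not name it). The function names and the hours/score data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def total_squared_error(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and a line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: hours studied (IV) and exam score (DV)
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, score)
best_error = total_squared_error(hours, score, slope, intercept)

# Any other line through the same data has a larger total squared error
other_error = total_squared_error(hours, score, slope + 0.5, intercept)

# Prediction: expected score for a student who studies 6 hours
predicted = slope * 6 + intercept
print(slope, intercept, best_error, other_error, predicted)
```

The closed-form formulas replace the idea of trying "every possible line": they give the line with the lowest total squared error directly.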
REGRESSION is the process of
drawing the line of best fit through the data points to summarize the relationship between two variables
Line of best fit is important because
- it is used to understand how changes in the independent variable cause changes in the dependent variable
- it can predict the values of the dependent variable from the values of the independent variable. This is known as regression analysis
Regression is particularly sensitive to outliers
They significantly influence the slope of the regression line
We can easily visualise outliers with scatterplots
Do not delete outliers without a theoretical justification that accounts for the significance of their presence
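A small illustration of how a single outlier can shift the slope, assuming an ordinary least-squares fit. The data and the helper name `slope_of` are hypothetical.

```python
def slope_of(xs, ys):
    """Slope of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up data that lie exactly on the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope_clean = slope_of(xs, ys)

# Same data plus one extreme point at (6, 40)
slope_with_outlier = slope_of(xs + [6], ys + [40])

print(slope_clean, slope_with_outlier)
```

In this example one added point out of six is enough to triple the slope, which is why outliers are worth inspecting on the scatterplot before interpreting the line.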
R² (coefficient of determination)
= measure of 'model fit'
- Pearson's correlation coefficient squared (in regression with one independent variable)
- R² tells us what proportion of the variance in the dependent variable is explained by the independent variable
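A minimal sketch showing the two readings of the coefficient of determination agreeing, assuming a regression with one independent variable fitted by ordinary least squares. The data and variable names are invented for illustration.

```python
import math

# Hypothetical data: hours studied (IV) and exam score (DV)
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(score) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, score))
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in score)  # total variation in the DV

# Reading 1: Pearson's correlation coefficient, squared
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Reading 2: share of the DV's variation explained by the fitted line
slope = sxy / sxx
intercept = mean_y - slope * mean_x
unexplained = sum((y - (slope * x + intercept)) ** 2
                  for x, y in zip(hours, score))
explained_share = 1 - unexplained / syy

print(r_squared, explained_share)
```

Both values come out the same, so squaring the correlation and measuring the explained share of variance are two ways of arriving at one number.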