W7 Regression Analysis Flashcards
What is Regression Analysis?
- Regression is concerned with predicting one variable from another related variable
What are three ways regression is different to correlation?
- The purpose
- How we describe our variables
- The inferential tests
a. R2 (squared) value
b. Regression coefficient (b value)
c. Intercept (a value)
What is the point of the regression line (line of best fit)?
- The regression line minimises the vertical deviations of the points from the line.
- In other words, minimising the vertical distance between the points and the line minimises the error of prediction
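A minimal Python sketch of this idea. The scores are made up for illustration (they are not from the lecture), and the function names `fit_line` and `sum_squared_error` are just labels chosen here. It fits the line using the usual least-squares formulas, then shows that nudging the line away from the fitted values makes the total squared vertical distance bigger.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How X and Y vary together, and how X varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # Y intercept
    return a, b


def sum_squared_error(xs, ys, a, b):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative scores
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best = sum_squared_error(xs, ys, a, b)

# Moving the slope or the intercept away from the fitted values
# always gives a larger error of prediction.
worse_slope = sum_squared_error(xs, ys, a, b + 0.1)
worse_intercept = sum_squared_error(xs, ys, a + 0.1, b)

print(f"a = {a:.2f}, b = {b:.2f}, error = {best:.2f}")
print(best < worse_slope and best < worse_intercept)
```

For these scores the fitted line is roughly Y = 2.2 + 0.6X.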
What is the Bivariate Regression Equation?
- The bivariate regression equation uses one variable (X) to predict another variable (Y)
- Y= a + bX
Y = dependent variable
a = constant (Y intercept) - occasionally represented as ‘c’
b = regression coefficient (slope)
X = independent variable
How do you find the ‘a’ value (Y intercept) and the ‘b’ value (the slope of the line)?
- To find the ‘a’ value, look along the Y axis (vertical) and see where the line crosses it (this is the value of Y when X = 0)
- To calculate the ‘b’ value, divide the change in Y by the change in X between two points on the line: (Y2 - Y1) / (X2 - X1)
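A short Python sketch of finding ‘b’ and ‘a’ and then using Y = a + bX to predict. The two points are hypothetical values read off an imaginary line of best fit, and the function names are made up for this example.

```python
def slope_from_points(x1, y1, x2, y2):
    """b = change in Y divided by change in X, using two points on the line."""
    return (y2 - y1) / (x2 - x1)


def intercept_from_point(x, y, b):
    """a = the value of Y where the line crosses the Y axis (X = 0)."""
    return y - b * x


def predict(a, b, x):
    """Bivariate regression equation: Y = a + bX."""
    return a + b * x


# Hypothetical points on the line: (2, 7) and (6, 15)
b = slope_from_points(2, 7, 6, 15)   # (15 - 7) / (6 - 2) = 2.0
a = intercept_from_point(2, 7, b)    # 7 - 2.0 * 2 = 3.0
y_hat = predict(a, b, 10)            # 3.0 + 2.0 * 10 = 23.0

print(a, b, y_hat)
```

Reading ‘a’ off the graph and calculating it from a point on the line should give the same answer, as long as the X axis starts at zero.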
How is variability checked within regression analysis?
- You use R2 (squared)
- R2 (squared) shows how much of the variance in the dependent variable is explained by the independent variable
- To get it, square the correlation coefficient (R)
E.g. R = 0.62
R2 (squared) = 0.38, so the independent variable explains 38% of the variance in the dependent variable
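The arithmetic can be checked with a few lines of Python. This is only a sketch: the first part squares the R = 0.62 from the card, and the second part works out R and R2 (squared) from made-up scores using the standard Pearson formula (the function name `pearson_r` is a label chosen here).

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient (R) between two lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# The example from the card: R = 0.62
card_r_squared = round(0.62 ** 2, 2)   # 0.38, i.e. 38% of variance explained

# Made-up illustrative scores
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
r_squared = r ** 2                     # about 0.60, i.e. about 60%

print(f"R = {r:.2f}, R squared = {r_squared:.2f}")
```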
What are the 7 steps to perform a regression analysis?
- Null hypothesis (the key thing with steps 1 & 2 is that they are worded in terms of predicting)
- Alternative hypothesis
- Select a level of significance (sig. is set at p<0.05)
- Collect & Summarise data
- Make sure to check all 6 assumptions
- Run statistical test
- Interpret significance of result
Assumptions check for bivariate regression (step 5 of the 7 steps to performing a regression analysis):
- This is how we check our data to make sure it fits the criteria for parametric testing & is normally distributed
What are the 6 assumptions to check for?
1. Normal distribution
2. Homogeneity of variance
3. Interval/ratio
4. Independence
5. Linear relationship
6. Residuals are normally distributed (SPSS plots a histogram of the residuals so you can check that the bars fit roughly within a bell-shaped curve)
What does the R value on SPSS represent?
- R value = simple correlation value (e.g. strong/positive)
- R value = Pearson’s correlation coefficient
What does the R2 (squared) value on SPSS represent?
- R2 (squared) value = coefficient of determination OR the amount of variance explained.
- So if R2 (squared) was 0.81, the independent variable explains 81% of the variance in the dependent variable (lots)
What does the Sig. value on SPSS represent?
- The Sig. value shows whether the result is statistically significant or not.
- Sig. is the same thing as the P value
- For a result to be significant, the Sig. value must be below 0.05 (P<0.05)
Describe where you can find the ‘a’ & ‘b’ values on an SPSS data sheet.
- ‘a’ & ‘b’ are both in the column labelled ‘B’
- The one by ‘(Constant)’ = ‘a’ (Y intercept)
- The one by the name of the independent variable (‘Sum.5skinfold’ in this example) = ‘b’ (slope of the regression line, or steepness of the line of best fit)
What does the ‘t’ statistic show you?
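- The ‘t’ statistic shows you if the slope of the line of best fit is significantly different from zero, i.e. whether there is a real positive or negative relationship rather than a flat line.
- Read the ‘t’ statistic together with its Sig. value: if Sig. is below 0.05, the slope is significantly different from zero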
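To see where the ‘t’ statistic comes from, here is a rough Python sketch using the standard formula t = b / (standard error of b). The scores are made up and the function name is a label chosen here. SPSS reports the Sig. value directly; as an assumption for this sketch, the code instead compares ‘t’ with the two-tailed critical value from a t table (3.182 for p = 0.05 with 3 degrees of freedom), which leads to the same decision.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, t), where t = b divided by the standard error of b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Leftover (residual) error around the line of best fit
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope, with n - 2 degrees of freedom
    se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b, b / se_b


# Made-up illustrative scores (n = 5, so degrees of freedom = 3)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, t = slope_t_statistic(xs, ys)

# Two-tailed critical t for p = 0.05 with 3 degrees of freedom (from a t table)
T_CRITICAL = 3.182
significant = abs(t) > T_CRITICAL

print(f"b = {b:.2f}, t = {t:.2f}, significant = {significant}")
```

With these five made-up scores the slope is positive, but ‘t’ is smaller than the critical value, so the slope would not count as significantly different from zero.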