W7 - Regression Flashcards
What is regression analysis used for?
To predict one variable based upon the score in the other variable.
What is regression concerned with?
Prediction of 1 variable from a RELATED variable
In what 3 ways does regression + correlation analysis differ?
In its purpose
How variables are described
The inferential tests
3 ways in which regression + correlation analysis differ
What is meant by "in its purpose"?
When talking about correlations = talk about relationships, associations + correlations.
Regression analysis = Clearly talking about prediction.
3 ways in which regression + correlation analysis differ
What is meant by "how variables are described"?
For correlation analysis = makes no difference which variable is on X or Y axis of scatterplot.
Regression analysis = Independent always on X axis, dependent variable always on Y axis.
3 ways in which regression + correlation analysis differ
What is meant by "the inferential tests"?
Correlation analysis = Primarily interested in the r value.
Regression analysis = Interested in 3 bits of info;
- r^2 value (shared variance between 2 variables)
- Intercept (a)
- Regression coefficient (b)
What does the regression line on a graph do?
Minimises the (squared) vertical deviations of the points from the line
What is the result when the vertical distances between the line + the points on a scatter plot are minimised?
We are minimising the error of prediction
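Worked sketch (not from the original cards): a quick check that the line of best fit gives a smaller total squared vertical error than other candidate lines. The data are made up for illustration, and the helper name sum_squared_errors is hypothetical.

```python
def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between each point and the line Y = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustration data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# The least-squares line for this data is Y = 2.2 + 0.6X.
best = sum_squared_errors(xs, ys, a=2.2, b=0.6)

# Two other candidate lines: one steeper, one shifted upwards.
steeper = sum_squared_errors(xs, ys, a=2.2, b=0.8)
shifted = sum_squared_errors(xs, ys, a=2.5, b=0.6)

print(best, steeper, shifted)  # best is the smallest of the three
```

Any other choice of a or b tried here increases the total, which is what "minimising the error of prediction" means in practice.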
For a scatter plot with regression analysis, what is the dependent variable on the y axis also known as?
Predicted variable
For a scatter plot with regression analysis, what is the independent variable on the x axis also known as?
Predictor variable
What is the bivariate regression equation?
Algebraic equation expressing the prediction of 1 variable by another
How is the bivariate regression equation usually written?
Y = a + bX
Also written as:
Y = bX + c
^^ Where c = a
BIVARIATE REGRESSION EQUATION
What does the Y represent?
Dependent variable
BIVARIATE REGRESSION EQUATION
What does the a represent?
Constant (Y intercept)
Where the line of best fit would cross through the y axis
BIVARIATE REGRESSION EQUATION
What does the X represent?
Independent variable
BIVARIATE REGRESSION EQUATION
What does the b represent?
Regression coefficient (slope)
= Change in y / change in x
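Worked sketch (not from the original cards): one common way to get a and b for Y = a + bX is ordinary least squares, where b is the sum of cross-product deviations divided by the sum of squared X deviations, and a follows from the two means. The data and the function name fit_line are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-product deviations and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope: change in Y per 1-unit change in X
    a = mean_y - b * mean_x    # intercept: value of Y where X = 0
    return a, b

# Made-up illustration data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
predicted = a + b * 6  # predicted Y for a new X of 6
print(round(a, 3), round(b, 3), round(predicted, 3))  # 2.2 0.6 5.8
```

Here b = 0.6 means each increase of 1 on X predicts an increase of 0.6 on Y.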
Steps to perform regression analysis
1+2. Consider Null + Alternative hypothesis (make sure to include WILL or will NOT predict)
- Select level of significance
- Collect + summarise data
- Check assumptions
- Run Statistical test
- Interpret significance of result
Steps to perform regression analysis
Give a null hypothesis example for:
Sum of skin fold measures + body fat %
There’s no significant relationship between the variables.
More specifically, the sum of skin folds will NOT predict % body fat.
Steps to perform regression analysis
Give an alternative hypothesis example for:
Sum of skin fold measures + body fat %
There is a significant relationship between the variables.
More specifically, the sum of 5 skin folds will predict % body fat.
Steps to perform regression analysis
- Select level of significance
Give an example using:
Sum of skin fold measures + body fat %
Set the significance level at 0.05. If p < 0.05, the variability in DXA % body fat explained by the sum of 5 skin folds is greater than one would expect by chance if there were no explained variability in the population (a result this large would occur less than 5% of the time under the null hypothesis).
Steps to perform regression analysis
- Assumptions for bivariate regression
Need to ensure data is parametric
What comes under data being parametric?
Normal distribution
Homogeneity of variance
Interval/ratio (continuous)
Independence
Linearity
Residual values are normally distributed
What is the vertical distance between the data points + the line in scatter plots also known as?
Residual distance
Does each point on a scatter plot have a vertical/residual distance?
YES
What does R represent?
Simple Pearson correlation coefficient
What does R^2 represent?
Coefficient of determination
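Worked sketch (not from the original cards): computing Pearson's r by hand and squaring it to get the proportion of shared variance. The data and the function name pearson_r are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustration data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2  # coefficient of determination

print(round(r, 4), round(r_squared, 4))  # 0.7746 0.6
```

An R^2 of 0.6 would be read as X explaining 60% of the variance in Y for this made-up sample.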
What must you do when running a statistical test for regression analysis?
Find:
- R^2 value
- a value
- b value
In the SPSS output, what can be found in the analysis of variance (ANOVA) box?
Titles:
- F
- Sig.
Where can the a + b values be found in the SPSS output?
Under the unstandardised coefficients in the coefficients box
In column B
What in the coefficients box of the SPSS output tells us whether the b-value (slope) is significantly different from 0?
The t statistic + sig. value columns at the end.
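Worked sketch (not from the original cards): how the F value in the ANOVA box and the t value for the slope can be reproduced by hand for a bivariate regression. It assumes the usual degrees of freedom (1 for regression, n - 2 for residual). The data are made up, and the Sig. (p) values are left to the software.

```python
import math

# Made-up illustration data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
b = sxy / sxx              # slope
a = mean_y - b * mean_x    # intercept

# Split the variability in Y into explained and unexplained parts.
ss_regression = sum(((a + b * x) - mean_y) ** 2 for x in xs)
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

ms_regression = ss_regression / 1      # 1 predictor -> 1 df
ms_residual = ss_residual / (n - 2)    # n - 2 residual df

f_stat = ms_regression / ms_residual   # F in the ANOVA box
se_b = math.sqrt(ms_residual / sxx)    # standard error of the slope
t_stat = b / se_b                      # t for the slope in the coefficients box

print(round(f_stat, 4), round(t_stat, 4))  # 4.5 2.1213
```

With a single predictor, t squared equals F, so both tests lead to the same conclusion about whether the slope differs from 0.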
List ways in which regression analysis can be used in Exercise + sport sciences
Predicting skill perf from self-efficacy
Predicting obesity risk from daily PA levels
Way in which regression analysis can be used in the real world
Pre-London 2012, regression techniques were used to predict the no. of GB medals
What is the SD of the residual/vertical distances known as?
Standard Error of the estimate (SEE)
How is a prediction interval created using SEE?
Predicted Y ± (z-score x SEE), e.g. z = 1.96 for a 95% interval
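Worked sketch (not from the original cards): building an approximate 95% prediction interval as predicted Y plus or minus 1.96 x SEE. It assumes SEE is computed with n - 2 in the denominator, which is the usual convention for bivariate regression. The data are made up for illustration.

```python
import math

# Made-up illustration data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
a = mean_y - b * mean_x

# Residual = actual Y minus predicted Y (the vertical distance to the line).
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Standard error of the estimate: SD of the residuals, using n - 2 df.
see = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

x_new = 3
predicted = a + b * x_new
margin = 1.96 * see
lower, upper = predicted - margin, predicted + margin

print(round(see, 4), round(lower, 4), round(upper, 4))  # 0.8944 2.2469 5.7531
```

The smaller the SEE, the narrower the interval and the more precise the prediction.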
What is a regression equation used for?
To estimate the value of the DV based on the value of the IV
In bivariate regression, if the slope of the regression line was 2.1, this would mean….
That for every increase of 1 on the X axis there is an increase of 2.1 on the Y axis
What does variance of a single variable represent?
The avg squared amount by which the data vary from the mean
How do you calculate the exact similarity between the patterns of differences of the 2 variables in the single-variable case?
Square the deviations
- to eliminate the problem of +ive + -ive deviations cancelling each other out.
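Worked sketch (not from the original cards): why the deviations are squared. The raw deviations from the mean always sum to zero, while the squared deviations do not. It assumes the sample variance with N - 1 in the denominator. The data are made up for illustration.

```python
# Made-up illustration data (an assumption, not from the cards).
ys = [2, 4, 5, 4, 5]
n = len(ys)
mean_y = sum(ys) / n

deviations = [y - mean_y for y in ys]        # [-2.0, 0.0, 1.0, 0.0, 1.0]
raw_sum = sum(deviations)                    # +ive and -ive cancel out
squared_sum = sum(d ** 2 for d in deviations)

# Sample variance: average squared deviation, using N - 1.
variance = squared_sum / (n - 1)

print(raw_sum, squared_sum, variance)  # 0.0 6.0 1.5
```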
How do you calculate the exact similarity between the patterns of differences of the 2 variables in the 2-variable case?
Multiply the deviation for 1 variable by the corresponding deviation for the 2nd variable.
To get the cross-product deviations.
What is covariance?
The avg of the combined (cross-product) deviations, i.e. their sum divided by N - 1
What does a +ive covariance indicate?
That as 1 variable deviates from the mean, the other variable deviates in the same direction.
What does a -ive covariance indicate?
That as 1 variable deviates from the mean, the other deviates from the mean in the opposite direction.
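Worked sketch (not from the original cards): covariance as the sum of cross-product deviations divided by N - 1 (the sample version is an assumption here), with one made-up data set that gives a +ive result and one that gives a -ive result. The function name covariance is hypothetical.

```python
def covariance(xs, ys):
    """Sample covariance: sum of cross-product deviations divided by N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(cross_products) / (n - 1)

# Made-up illustration data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]    # tends to go up as X goes up
falling = [5, 4, 5, 4, 2]   # tends to go down as X goes up

cov_positive = covariance(xs, rising)
cov_negative = covariance(xs, falling)

print(cov_positive, cov_negative)  # 1.5 -1.5
```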
Between what values must the correlation coefficient lie?
-1 and +1
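Worked equation (not from the original cards): the correlation coefficient is the covariance standardised by the two standard deviations, which is what keeps it inside this range regardless of the units of measurement. The symbols below are standard notation rather than the cards' own.

```latex
r = \frac{\operatorname{cov}_{xy}}{s_x \, s_y}
  = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x \, s_y}
```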
What is Spearman’s correlation coefficient?
Non-parametric statistic
Useful to minimise the effects of extreme scores or the effects of violations of the assumptions.
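Worked sketch (not from the original cards): Spearman's coefficient can be obtained by ranking each variable and then running Pearson's r on the ranks, which is why one extreme score has much less influence. The data and the function names are made up for illustration.

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data with one extreme X score (an assumption, not from the cards).
xs = [1, 2, 3, 4, 50]
ys = [1, 2, 3, 4, 5]

pearson = pearson_r(xs, ys)
spearman = pearson_r(average_ranks(xs), average_ranks(ys))

print(round(pearson, 4), round(spearman, 4))  # 0.7433 1.0
```

The extreme score pulls Pearson's r down, while the rank order is perfectly consistent, so Spearman's coefficient stays at 1.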
In regression analysis, what assumptions do we need to check to see if the data is parametric?
Normality
Linearity
Homogeneity of variance
Independence
Interval/ratio data
Normally distributed residuals
Which assumptions can be examined using a scatterplot graph?
Linear relationship
Homogeneity of variance