Practical 6: Correlation & Regression Flashcards
When is correlation conducted?
When we have two continuous variables and we need to assess their relationship
- graphically represented using a scatterplot
How is a scatterplot produced in SPSS?
Graphs –> legacy dialogs –> scatter/dot –> define groups and label cases by ID
Labelling cases by ID ensures we can identify any outlying cases
To add a line of best fit –> choose add fit line at total –> select linear
What is correlation?
A measure of statistical association that assesses the strength of a linear relationship - it shows both the direction and the magnitude of the relationship
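As a rough illustration of what the coefficient measures, Pearson's r can be computed by hand as the shared variation of the two variables divided by the product of their individual variations. This is a minimal sketch, not what SPSS runs internally; the function name `pearson_r` and the height/weight values are made up for the example.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two paired lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations: how x and y vary together
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sums of squared deviations: how much each variable varies on its own
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: heights (cm) and weights (kg) for five participants
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 85]
r = pearson_r(heights, weights)
print(round(r, 3))
```

The sign of r gives the direction (positive or negative) and its absolute size gives the magnitude.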
How is the strength of a correlation defined?
- 0.00 - 0.19 –> very weak
- 0.20 - 0.39 –> weak
- 0.40 - 0.59 –> moderate
- 0.60 - 0.79 –> strong
- 0.80 - 1.00 –> very strong
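The bands can be turned into a small lookup. This sketch assumes the commonly used cut-offs 0.00-0.19, 0.20-0.39, 0.40-0.59, 0.60-0.79 and 0.80-1.00, applied to the absolute value of r; the function name `correlation_strength` is made up for the example.

```python
def correlation_strength(r):
    """Return the verbal label for a correlation coefficient r (between -1 and 1)."""
    size = abs(r)  # strength ignores the sign; the sign only gives the direction
    if size > 1:
        raise ValueError("r must lie between -1 and 1")
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"

print(correlation_strength(-0.45))
```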
What are the assumptions for Pearson’s correlation?
Both variables are normally distributed
Each participant has a pair of values
There is a linear relationship between the values
No outliers
Observations are randomly and independently drawn
How do we select Pearson’s in SPSS?
Analyse –> correlate –> bivariate –> Pearson
Choose Spearman’s correlation instead if the data are not normally distributed
When is Spearman’s correlation used?
Non-normally distributed data (or ordinal data - assume it is skewed)
Monotonic relationship between the two variables - i.e. as one increases, the other consistently increases (or consistently decreases), but not necessarily at a constant rate
Random independent observations
Each participant has a pair of values
(non-parametric version of Pearson’s correlation)
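One way to see why Spearman's suits monotonic relationships is to compute it as Pearson's correlation applied to the ranks of the data. The sketch below assumes that approach, with tied values given the average of their ranks; the helper names and the data are made up for the example.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's correlation coefficient for two paired lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but non-linear data: y rises faster and faster with x
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)  # the ranks agree perfectly
r = pearson_r(x, y)       # below 1 because the relationship is curved
print(rho, r)
```

Because only the ordering matters, Spearman's is less affected by skew and extreme values than Pearson's.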
What are the components of a simple linear regression?
y = B0 + B1x + e
x –> predictor, independent variable (continuous or categorical)
y –> dependent variable, outcome (always continuous)
B0 –> intercept: the value that y takes when x is 0
B1 –> slope: the change in y when x changes by one unit
e –> residual: the vertical distance between an observed point and the line of best fit
What is the ordinary least squares method?
Method of making the residuals as small as possible
The line of best fit is the one for which the sum of the squared residuals is smallest. This sum is worked out by squaring each residual and adding the squares together
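For a single predictor, the least squares line has a closed-form solution: the slope is the sum of cross-products divided by the sum of squares of x, and the intercept makes the line pass through the two means. The sketch below assumes that standard formula; the function names and the height/weight data are made up for the example.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1) for the line that minimises the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: line passes through (mean_x, mean_y)
    return b0, b1

def sum_squared_residuals(x, y, b0, b1):
    """Square each residual (observed y minus fitted y) and add them together."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data: heights (cm) and weights (kg) for five participants
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 85]
b0, b1 = fit_simple_linear_regression(heights, weights)
best = sum_squared_residuals(heights, weights, b0, b1)
# A slightly steeper line fits worse, so its sum of squares is larger
worse = sum_squared_residuals(heights, weights, b0, b1 + 0.05)
print(b0, b1, best, worse)
```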
What are the assumptions for a simple linear regression?
There is a linear relationship between the dependent and independent variables
Residuals are independent of one another
Residuals follow a normal distribution with mean 0 and constant SD
Homogeneity of variance - i.e. the size of the error does not change noticeably across different values of the independent variable
How is a simple linear regression run in SPSS?
Analyse –> regression –> linear
Under statistics choose estimates and confidence intervals
What does R square refer to in the simple linear regression output?
The proportion of the variance in the dependent variable that is explained by the independent variable
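R square can be checked by hand as 1 minus the residual sum of squares divided by the total sum of squares. The sketch below assumes that definition for a single predictor; the function name `r_squared` and the data are made up for the example.

```python
def r_squared(x, y):
    """Proportion of the variance in y explained by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Variation left over after fitting the line
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of y around its mean
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: heights (cm) and weights (kg) for five participants
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 85]
r2 = r_squared(heights, weights)
print(round(r2, 3))
```

With one predictor, this value equals the square of Pearson's r between x and y.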
What does the ANOVA table refer to on simple linear regression?
Whether the SLR model explains a significant amount of the variation in the data
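The F value in the ANOVA table compares the variation explained by the regression with the variation left in the residuals. This is a minimal sketch assuming a single predictor (1 regression degree of freedom, n - 2 residual degrees of freedom). It stops at F, because the p-value (Sig. in SPSS) comes from the F distribution, which is not computed here; the function name and data are made up for the example.

```python
def regression_f_statistic(x, y):
    """F = mean square for regression / mean square for residuals (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_reg = ss_tot - ss_res          # variation explained by the line
    ms_reg = ss_reg / 1               # 1 degree of freedom for one predictor
    ms_res = ss_res / (n - 2)         # n - 2 residual degrees of freedom
    return ms_reg / ms_res

# Hypothetical data: heights (cm) and weights (kg) for five participants
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 85]
f_value = regression_f_statistic(heights, weights)
print(round(f_value, 2))
```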
How is a SLR reported?
A significant relationship was found between X and Y, with a 1-unit increase in X (e.g. 1 cm) associated with a B1 increase in Y (B1 = , t = , p = , 95% CI = )
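The B1, t and 95% CI values in that sentence can be reproduced by hand from the slope and its standard error. This is a sketch under stated assumptions: the critical t value is passed in by the user (3.182 is the two-sided 95% value for 3 degrees of freedom, matching the five made-up data points), and the p-value is not computed. The function name `slope_summary` is made up for the example.

```python
import math

def slope_summary(x, y, t_crit):
    """Return (b1, t, ci) for the slope; t_crit is the critical t for n - 2 df."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope: residual SD divided by the spread of x
    se_b1 = math.sqrt(ss_res / (n - 2)) / math.sqrt(sxx)
    t = b1 / se_b1
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return b1, t, ci

# Hypothetical data: heights (cm) and weights (kg) for five participants
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 85]
b1, t, ci = slope_summary(heights, weights, t_crit=3.182)
print(round(b1, 2), round(t, 2), [round(v, 3) for v in ci])
```

A confidence interval for B1 that excludes 0 corresponds to a significant slope at the 5% level.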
In SPSS how is a prediction model made for a simple linear regression?
Type in new height (X value)
Click analyse –> regression –> linear –> define groups –> in save select unstandardised predicted values and 95% prediction intervals
Select mean if you want to estimate the average y in the population at that X value
Select individual if you want to estimate y for an individual
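The difference between the mean and individual options can be seen by computing both intervals by hand. The sketch below assumes the standard formulas for simple linear regression, takes the critical t value from the user (3.182 for 3 degrees of freedom, matching the five made-up data points), and uses a hypothetical new height of 175 cm. The function name is made up for the example.

```python
import math

def predict_with_intervals(x, y, x_new, t_crit):
    """Predicted y at x_new with intervals for the mean and for an individual."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_res / (n - 2))       # residual standard deviation
    y_hat = b0 + b1 * x_new               # unstandardised predicted value
    spread = 1 / n + (x_new - mean_x) ** 2 / sxx
    se_mean = s * math.sqrt(spread)       # uncertainty in the average y at x_new
    se_indiv = s * math.sqrt(1 + spread)  # adds the scatter of one person around the line
    return {
        "predicted": y_hat,
        "mean_interval": (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean),
        "individual_interval": (y_hat - t_crit * se_indiv, y_hat + t_crit * se_indiv),
    }

# Hypothetical data: heights (cm) and weights (kg) for five participants
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 85]
result = predict_with_intervals(heights, weights, x_new=175, t_crit=3.182)
print(result)
```

The individual interval is always wider than the mean interval, because a single person varies around the line in addition to the uncertainty in the line itself.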