W6 Correlation Analysis Flashcards
How can we use correlation analysis and what does it measure?
- We can use it to examine the extent to which any two variables are related.
- It measures the strength and direction of the relationship between the two variables
- What is negative correlation?
- What is positive correlation?
- What is little systematic tendency?
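- Negative correlation = the points run from top left to bottom right of the graph (as one variable increases, the other decreases)
- Positive correlation = the points run from bottom left to top right of the graph (as one variable increases, so does the other)
- Little systematic tendency = a correlation of about 0, i.e. no relationship between the variables; the line of best fit is roughly horizontal.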
If there are lots of data points on a graph clumped together around the line of best fit, what does that tell you about the strength of the relationship?
- The more tightly the data points are clumped around the line of best fit, the stronger the relationship is
What is the definition of correlation coefficient?
- A numerical value that indicates the extent to which two variables are related.
- Thus, it is a numerical summary of a bivariate (two-variable) relationship
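To make the definition concrete, here is a minimal sketch of how Pearson's r could be calculated by hand. The function name `pearson_r` and the example numbers are hypothetical; in practice SPSS does this calculation for you.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: the covariation of x and y scaled by their variation."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for illustration
hours_trained = [1, 2, 3, 4, 5]
test_score = [2, 4, 5, 4, 5]
print(f"r = {pearson_r(hours_trained, test_score):.2f}")
```

The result always falls between -1 and +1: the sign gives the direction and the size gives the strength.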
Give the 7 steps for testing a correlation coefficient:
1 & 2: State the null (H0) and alternative (H1) hypotheses for the relationship.
3: Set the level of significance (the alpha level that the p-value will be compared against)
4: Collect & summarise data
5: Check the assumptions before attempting a correlation coefficient
6: Run the statistical test
7: Interpret the significance of the results
What are the 5 assumptions that need to be tested before attempting a correlation coefficient?
(if our data meets all of these assumptions then we can run a correlation coefficient)
To make sure the data is suitable for a parametric test:
- Normal distribution
- Homogeneity of variance
- Interval/ratio
- Independence
- Linear relationship
What is the parametric test that we will be using?
- The parametric test we will be using is called a Pearson’s Correlation test
Assumptions:
What is normal distribution?
- We are looking to see if our data is normally distributed.
- Do the skewness and kurtosis values both fall between -1.0 and +1.0?
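A minimal sketch of this check is shown below. It assumes the sample-adjusted skewness and excess kurtosis formulas commonly used by statistics packages; the function names and the data are hypothetical, and the ±1.0 limit is the rule of thumb from this card.

```python
from statistics import mean, stdev


def skewness(data):
    """Sample-adjusted skewness (0 for a perfectly symmetric sample)."""
    n, m, s = len(data), mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in data)


def excess_kurtosis(data):
    """Sample-adjusted excess kurtosis (around 0 for normal data)."""
    n, m, s = len(data), mean(data), stdev(data)
    z4 = sum(((v - m) / s) ** 4 for v in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction


def within_normal_range(data, limit=1.0):
    """True when skewness and kurtosis both fall between -limit and +limit."""
    return abs(skewness(data)) <= limit and abs(excess_kurtosis(data)) <= limit


# Hypothetical scores, invented for illustration
bell_shaped = [1, 2, 2, 3, 3, 3, 4, 4, 5]
one_extreme_score = [1, 1, 1, 1, 1, 1, 1, 1, 20]
print(within_normal_range(bell_shaped))
print(within_normal_range(one_extreme_score))
```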
Assumptions:
Can you describe what Independence is?
- This is checking participants aren’t influencing other participants’ data
Assumptions:
What is meant by Interval/ratio?
- You have to make sure that the data is either interval or ratio data
Interval = Equal units or intervals between data points on a scale, but there is no absolute zero point, e.g. body temperature in °C
Ratio = Equal units of measurement with an absolute zero point, e.g. speed
Assumptions:
Can you describe what homogeneity of variance is?
- This checks that the variance of one variable stays stable across all levels of the other variable
- The way to see if data has good homogeneity of variance is to ask “Does the data stay a similar distance from the line of best fit along its whole length?”
- Homogeneity = The same
- Heterogeneity = Different
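One informal way to look at this in code is to fit a line of best fit, then compare how spread out the points are around it at low and high values of x. This is only a rough sketch, not a formal test: the function name `residual_variance_ratio`, the median split, and the data are all hypothetical choices made for illustration.

```python
from statistics import mean, variance


def residual_variance_ratio(x, y):
    """Spread around the line of best fit in the upper half of x,
    divided by the spread in the lower half (about 1 = similar spread)."""
    mean_x, mean_y = mean(x), mean(y)
    # Least-squares line of best fit
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    # Distances from the line, ordered from the lowest to the highest x
    residuals = [b - (intercept + slope * a) for a, b in sorted(zip(x, y))]
    half = len(residuals) // 2
    return variance(residuals[-half:]) / variance(residuals[:half])


# Hypothetical data, invented for illustration
x = [1, 2, 3, 4, 5, 6, 7, 8]
even_spread = [2, 1, 4, 3, 6, 5, 8, 7]                  # same scatter throughout
fanning_out = [1.1, 1.9, 3.1, 3.9, 8.0, 3.0, 10.0, 5.0]  # scatter grows with x

print(f"even spread ratio: {residual_variance_ratio(x, even_spread):.2f}")
print(f"fanning out ratio: {residual_variance_ratio(x, fanning_out):.2f}")
```

A ratio close to 1 suggests homogeneity; a ratio far from 1 suggests heterogeneity.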
Assumptions:
What is Linear relationship?
- You can check this by plotting all the data on a scatter plot and seeing whether the points follow a straight line or show an obvious curve
Interpreting SPSS:
- If you are looking for the number of participants what is that represented as in SPSS?
- If you are looking for the Correlation Coefficient how will that be represented in SPSS?
- What is the significance value / p-value represented as in SPSS?
- ‘N’ represents the number of participants
- Correlation coefficient is represented as ‘Pearson Correlation’ or ‘r’ in SPSS
- Significance value / p-value is represented as ‘Sig.’ in SPSS
Interpreting SPSS:
- SPSS provides you with the following data. Write a short paragraph to describe the data you are provided with.
'Pearson Correlation = .659', 'Sig. (2-tailed) = .000', 'N = 25'
- Looking at this output, I would interpret a highly significant positive linear relationship between the two variables (r = .66, p < .001, N = 25)
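The write-up can also be sketched as a small helper that turns the three SPSS values into a sentence. The function name `summarise_spss_correlation` and the default alpha of 0.05 are assumptions made for illustration. Note that a 'Sig.' of .000 is a rounded value, so it is reported as p < .001 rather than p = 0.

```python
def summarise_spss_correlation(r, sig, n, alpha=0.05):
    """Build a short write-up from Pearson Correlation, Sig. (2-tailed) and N."""
    # A correlation of exactly 0 is treated as "negative" here for simplicity
    direction = "positive" if r > 0 else "negative"
    verdict = "significant" if sig < alpha else "non-significant"
    # Drop the leading zero, as is usual when reporting r and p
    r_text = f"{r:.2f}".replace("0.", ".", 1)
    if sig < 0.001:
        p_text = "p < .001"  # SPSS shows .000 for very small values
    else:
        p_text = "p = " + f"{sig:.3f}".replace("0.", ".", 1)
    return f"{verdict} {direction} relationship (r = {r_text}, {p_text}, N = {n})"


print(summarise_spss_correlation(r=0.659, sig=0.000, n=25))
```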