Week 1- lecture notes (correlations) Flashcards
What is the definition of a correlation?
the extent to which 2 variables are linearly related, e.g. as X increases, Y tends to increase (positive) or decrease (negative)
What is a scatterplot?
a graph we use to visually assess the relationship between 2 variables
What does a scatterplot show in terms of strength and direction?
Look at STRENGTH of the relationship (closeness of points to line of best fit) and DIRECTION of the relationship (positive, negative, null)
What do scatterplots allow us to do?
- Familiarise ourselves with the data
- Identify the distribution of data and any initial relationships
- Identify any outliers
What is Pearson’s Product Moment Correlation Coefficient (r)?
- quantifies the linear correlation between 2 variables, ranging from -1 to +1
- i.e. where along the line from -1 to +1 does the correlation fall
- Looks at STRENGTH (ignoring -/+) of the relationship and DIRECTION of the relationship (positive, negative, null)
positive correlation
- Perfect positive correlation, r=1
- When X increases, Y increases
negative correlation
- Perfect negative, r=-1
- When X increases, Y decreases
- E.g. when seminar absence increases, WBA scores decrease
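What counts as a weak, moderate and strong correlation for Pearson’s Product Moment Correlation Coefficient?
- Small/weak: |r| > 0.1
- Medium/moderate: |r| > 0.3
- Large/strong: |r| > 0.5
- These rule-of-thumb cut-offs apply to the size of r ignoring its sign, so r = -0.5 is as strong as r = +0.5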
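A minimal Python sketch of these cut-offs, for self-testing. The function name `describe_strength` and the label "negligible" for values at or below 0.1 are assumptions made for illustration; they are not from the lecture. The absolute value is used so that the sign of r is ignored.

```python
def describe_strength(r: float) -> str:
    """Label the size of a correlation using the 0.1 / 0.3 / 0.5 cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength ignores the sign of r
    if size > 0.5:
        return "large/strong"
    if size > 0.3:
        return "medium/moderate"
    if size > 0.1:
        return "small/weak"
    return "negligible"  # assumed label; the notes give no name for this range


print(describe_strength(-0.42))  # medium/moderate
```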
What is covariance?
The extent to which 2 variables vary together
- linked to variance (instead of just x, it’s x and y)
- calculated with the covariance formula
- from the covariance we can calculate Pearson’s r by putting it into the correlation formula
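The formula itself is not written out on this card. For reference, the standard sample versions are sketched here, assuming the usual N - 1 denominator; the symbols (\bar{x} and \bar{y} for the means, s_x and s_y for the standard deviations) are notation chosen here and may differ from the lecture slides.

```latex
\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}
\qquad
r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y}
```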
Why should we use Pearson’s r instead of covariance to measure correlation?
- Correlation coefficients describe the strength and direction of an association between variables. Pearson’s correlation specifically measures the linear association between 2 normally distributed random variables.
- The size of the covariance depends on the units and variances of the 2 separate variables, which makes comparisons difficult
- The correlation formula fixes this by standardising: the covariance is divided by the product of the 2 standard deviations (SDs)
- The result always ranges from -1 to +1
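A short Python sketch of why covariance is hard to compare but r is not. The data are made up for illustration, and the helper names `covariance` and `pearson_r` are hypothetical; the calculation assumes the standard sample formulas with an N - 1 denominator.

```python
from math import sqrt


def covariance(x, y):
    """Sample covariance: sum of cross-products of deviations over N - 1."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / (len(x) - 1)


def pearson_r(x, y):
    """Covariance standardised by the two standard deviations."""
    sd_x = sqrt(covariance(x, x))  # variance is the covariance of x with itself
    sd_y = sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)


# Made-up illustrative data: hours revised and exam mark for five students.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 60, 70]
minutes = [h * 60 for h in hours]  # same information, different unit

# Covariance is 60 times larger when hours are expressed in minutes...
print(covariance(hours, marks), covariance(minutes, marks))
# ...but r is the same (up to rounding) whichever unit is used.
print(pearson_r(hours, marks), pearson_r(minutes, marks))
```

Changing the unit of one variable changes the covariance but leaves r alone, which is what makes r comparable across studies.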
What is the coefficient of determination?
The coefficient of determination (R²) tells us the proportion of variance in one variable that can be accounted for by the other variable.
- this is established by squaring r; because r lies between -1 and +1, the squared value lies between 0 and 1 and is never larger than the size of r
-derived from the correlation coefficient
How do you establish the coefficient of determination?
Square r (R² = r²)
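A quick worked example in Python. The value r = -0.5 is hypothetical and the function name is an assumption for illustration; the point is that the sign disappears when squaring and the result reads as a proportion of variance.

```python
def coefficient_of_determination(r: float) -> float:
    """Return R^2, the proportion of shared variance, from Pearson's r."""
    return r ** 2


r = -0.5  # hypothetical correlation, e.g. seminar absence vs. score
r_squared = coefficient_of_determination(r)
print(f"r = {r}, R^2 = {r_squared:.2f} ({r_squared:.0%} of variance shared)")
# prints: r = -0.5, R^2 = 0.25 (25% of variance shared)
```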
Summary- key things to remember!
-Correlation measures the relationship between two numerical or continuous variables.
- A scatterplot is useful to construct before the correlation analysis to interpret the relationship and check assumptions (more on this next week).
-Pearson’s correlation coefficient gives us information on the strength and direction of the relationship.
-The significance of the correlation is partly dependent on the sample size.
-The coefficient of determination tells us the proportion of variance in one variable that can be accounted for by the other variable.
-Do not confuse correlation with causation. Think: what else could be influencing the correlation?
How do we know if a correlation is significant?
The size of r (ignoring its sign) needs to be bigger than the critical value for the sample size and chosen significance level
- can check if the correlation coefficient is significant by comparing it to a table of critical values
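A hedged Python sketch of the table look-up. It assumes the standard link between r and the t distribution, critical r = t_crit / sqrt(t_crit² + df) with df = N - 2. The value 2.228 is the two-tailed critical t at alpha = .05 for df = 10 from a standard t table; the function names and the sample correlations are made up for illustration.

```python
from math import sqrt


def critical_r(t_crit: float, df: int) -> float:
    """Convert a critical t value into the matching critical r (df = N - 2)."""
    return t_crit / sqrt(t_crit ** 2 + df)


def is_significant(r: float, t_crit: float, n: int) -> bool:
    """True when |r| exceeds the critical r for a sample of n pairs."""
    return abs(r) > critical_r(t_crit, n - 2)


# N = 12 pairs gives df = 10; 2.228 is the two-tailed critical t at
# alpha = .05 for df = 10, taken from a standard t table.
print(round(critical_r(2.228, 10), 3))   # 0.576
print(is_significant(0.62, 2.228, 12))   # True
print(is_significant(-0.40, 2.228, 12))  # False
```

Because df sits in the denominator, the critical value shrinks as the sample grows, which is why significance partly depends on sample size.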
Quiz question: how does the covariance formula differ from the variance formula?
Variance multiplies each score’s deviation from the mean by itself (i.e. squares it), whereas covariance multiplies each deviation on one variable by the corresponding deviation on the other variable
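The difference can be seen side by side in a small Python sketch. The function names and data are made up for illustration, and both functions assume the sample (N - 1) versions of the formulas.

```python
def covariance(x, y):
    """Sample covariance: multiply each x deviation by the paired y deviation."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def variance(x):
    """Sample variance: multiply each x deviation by itself (square it)."""
    mean_x = sum(x) / len(x)
    return sum((a - mean_x) * (a - mean_x) for a in x) / (len(x) - 1)


scores = [2, 4, 4, 6, 9]  # made-up data
print(variance(scores))            # 7.0
print(covariance(scores, scores))  # 7.0, same calculation with y = x
```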