Correlation and Regression Flashcards
Is there a relationship between the amount of time spent revising for an exam and exam performance?
Correlation
After controlling for exam anxiety, is there an association between revision time and exam performance?
Correlation (specifically a partial correlation, because a third variable is held constant)
When seeing the terms relationships/associations/controlling, what general category of analysis would you choose?
Correlation and regression
When would you use a correlation over a regression considering they measure similar things?
Use a correlation when you want a quick summary of the direction and strength of the relationship between two or more numeric variables.
When you want to PREDICT, optimise or explain a numeric response (how X influences Y), you are looking at regression.
Regression = how one variable affects another.
Correlation = the degree of relationship between two variables, i.e. its strength and direction.
What does a -1 correlation tell us regarding the association between variables?
There is a perfect negative correlation. Therefore there is an association.
What does a +1 correlation tell us regarding the association between variables?
There is a perfect positive correlation.
What does a positive correlation mean?
As the FIRST (X) variable increases, the SECOND (Y) variable also increases.
What does a negative correlation mean?
As one variable increases, the second one decreases.
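A minimal sketch of how the sign of a correlation follows the data. The numbers are made up for illustration, and pearson_r is a hand-written helper, not a function from any particular statistics package.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: shared deviation from the means, rescaled to -1..+1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (the numerator of the covariance)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for five students
revision_hours = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 70]   # goes up as revision goes up
hours_of_tv = [9, 8, 6, 5, 3]       # goes down as revision goes up

r_positive = pearson_r(revision_hours, exam_score)
r_negative = pearson_r(revision_hours, hours_of_tv)
print(round(r_positive, 3), round(r_negative, 3))
```

The first coefficient comes out positive and the second negative, matching the two definitions above.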
What do correlations measure?
The pattern of responses across variables
As a correlation gets closer to a perfect negative or positive value (-1 or +1), does the association get weaker or stronger?
Stronger
What does an association of 0.0 indicate?
No linear association, which is what the null hypothesis states.
How many tails can the alpha be?
Either one tailed or two tailed
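How are the sample size and the alpha/error rate important regarding whether a correlation is statistically significant or not?
Because the critical value that the correlation must exceed depends on both: the degrees of freedom (set by the sample size) and the chosen alpha level.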
In a one-tailed correlation what is the alpha level?
0.05, all in one tail. It tests for an effect in one direction only.
In a two-tailed correlation, the alpha level is 0.025 in each tail. Why?
Because it is testing the correlation in EITHER direction, so the overall 0.05 is split across the two tails.
Which is more powerful - one tailed or two tailed correlation?
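One-tailed. All of the alpha sits in one tail, so a smaller correlation in the predicted direction can reach significance. Use it only when empirical evidence makes you more sure about the direction of your hypothesis.
Why would you use a two-tailed correlation?
When you are uncertain about the direction of your hypothesis.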
Sample size and alpha value need to be considered when looking at correlation significance. What is true regarding assessing whether a Pearson r is significant, in terms of NUMBER OF DEGREES OF FREEDOM?
The size of the correlation (regardless of direction) must be MORE THAN the critical value given for that degree of freedom.
Would you expect to need a higher Pearson r value to reach significance with a big n or a small n?
A higher r value for a small n. With few people in a study, a moderate correlation is more likely to be due to chance than in a design with a large sample, so the critical value is higher.
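A rough sketch of how the critical value shrinks as the sample grows. It assumes a two-tailed test at alpha = 0.05 and df = n - 2 for a Pearson correlation. The t values are copied from a standard t table, and critical r is obtained from t with r = t / sqrt(t^2 + df). The names T_CRIT_TWO_TAILED_05, critical_r and is_significant are made up for this example.

```python
from math import sqrt

# Two-tailed critical t values at alpha = 0.05, from a standard t table.
# Only these degrees of freedom are covered in this sketch.
T_CRIT_TWO_TAILED_05 = {5: 2.571, 10: 2.228, 20: 2.086, 30: 2.042, 60: 2.000}

def critical_r(df):
    """Smallest |r| that reaches significance for the given degrees of freedom."""
    t = T_CRIT_TWO_TAILED_05[df]
    return t / sqrt(t ** 2 + df)

def is_significant(r, n):
    """Compare the size of r (ignoring its sign) with the critical value for df = n - 2."""
    return abs(r) > critical_r(n - 2)

for df in sorted(T_CRIT_TWO_TAILED_05):
    print(f"df = {df:>2}  critical r = {critical_r(df):.3f}")

# The same r = 0.5 is judged differently depending on sample size.
print(is_significant(0.5, 12), is_significant(0.5, 32))
```

The printed critical values fall as df rises, which is why a correlation of a given size can be non-significant in a small sample yet significant in a large one.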
What does variance tell us?
A. How much scores deviate from the mean of the distribution
B. Variance is the average squared distance from the mean
C. Both
C
It is essentially the measure of how far away the data points are from the mean.
Why do we have to square the distance from the average of the distribution when it comes to variance?
Because the data points will be BOTH above and below the mean. If we average those distances WITHOUT squaring them, the positive and negative distances cancel each other out and the average is always zero.
So why is the standard deviation (SD) the square root of the variance?
Because the distances were squared before averaging, the variance is in squared units. Taking the square root undoes the squaring, so the SD is back in the original units of the scores.
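A small worked sketch of the three cards above, using made-up scores. It follows the "average squared distance" definition, so it divides by n. Many textbooks and packages divide by n - 1 for a sample instead, which gives a slightly larger value.

```python
from math import sqrt

scores = [4, 6, 7, 9, 14]              # hypothetical exam scores
mean = sum(scores) / len(scores)

# Raw distances from the mean: some negative, some positive
deviations = [s - mean for s in scores]
sum_of_deviations = sum(deviations)    # positives and negatives cancel out

# Squaring removes the sign, so the distances no longer cancel
squared_deviations = [d ** 2 for d in deviations]
variance = sum(squared_deviations) / len(scores)

# Square root brings the measure back to the original units
sd = sqrt(variance)

print(sum_of_deviations, variance, round(sd, 3))
```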
How is the covariance of the two variables similar to the variance?
It tells us how much two variables differ from their means together.
So instead of variance telling us how far data points are from the mean for one variable, covariance shows whether TWO variables deviate from their own means in the same direction (positive covariance) or in opposite directions (negative covariance).
When dealing with covariance, why is it important to standardise it?
Because covariance depends on the units of measurement: the same data recorded in different units gives a different covariance.
How do we standardise the covariance to deal with the problem of different measurements leading to different covariance outcomes?
We divide by the product of the standard deviations of both variables. This gives the correlation coefficient, which is unaffected by the units of measurement.
The standardised version of covariance is known as…?
The correlation coefficient (Pearson)
Covariance is unstandardised and correlation is standardised. Why?
Because the problem with covariance is units of measurement: with raw scores the covariance changes whenever the units change. Standardising gets around this, and the covariance of standardised variables is a correlation.
This is advantageous because standardising continuous variables puts them on the SAME scale, which makes it easier to compare two or more variables. You might also hear it called SCALING. It makes the variances equal (each becomes 1), so you are no longer looking at a raw COVARIANCE but at a correlation.
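A quick sketch of the units problem, with made-up data: the same revision times are recorded once in hours and once in minutes. The helper functions are written by hand for illustration and use the sample (n - 1) formulas.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance: average cross-product of deviations (n - 1 version)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sd(xs):
    """Sample SD: square root of a variable's covariance with itself."""
    return sqrt(covariance(xs, xs))

def correlation(xs, ys):
    """Covariance divided by the product of the two SDs."""
    return covariance(xs, ys) / (sd(xs) * sd(ys))

revision_hours = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 70]
revision_minutes = [h * 60 for h in revision_hours]   # same data, new units

cov_hours = covariance(revision_hours, exam_score)
cov_minutes = covariance(revision_minutes, exam_score)
r_hours = correlation(revision_hours, exam_score)
r_minutes = correlation(revision_minutes, exam_score)

print(cov_hours, cov_minutes)                 # covariance changes with the units
print(round(r_hours, 4), round(r_minutes, 4)) # correlation does not
```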
What is the covariance of standardised variables, essentially?
A correlation. When you calculate this you get a correlation coefficient.
How do we standardise a variable?
By subtracting the mean and dividing by the SD
If we already have the covariance of X and Y, what do we need to do to standardise it?
Divide by the product of the SD of X and the SD of Y
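The two routes to a correlation described above can be checked against each other. This is only a sketch with invented data and hand-written helpers (mean, sample_sd, sample_cov and standardise are illustrative names), using the n - 1 formulas throughout.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def sample_cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def sample_sd(xs):
    return sqrt(sample_cov(xs, xs))

def standardise(xs):
    """Turn raw scores into z-scores: subtract the mean, divide by the SD."""
    m, s = mean(xs), sample_sd(xs)
    return [(x - m) / s for x in xs]

x = [2, 4, 5, 7, 12]      # hypothetical scores on variable X
y = [10, 9, 14, 13, 20]   # hypothetical scores on variable Y

# Route 1: covariance of the raw scores divided by both SDs
r_from_cov = sample_cov(x, y) / (sample_sd(x) * sample_sd(y))

# Route 2: standardise first, then take the covariance of the z-scores
zx, zy = standardise(x), standardise(y)
r_from_z = sample_cov(zx, zy)

print(round(r_from_cov, 4), round(r_from_z, 4))
```

Both routes print the same value, and the z-scores have a mean of 0 and an SD of 1.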
What does dividing the covariance by the SD do in terms of the RANGE of the correlation coefficient?
We force it to lie between -1 and +1, where the value reflects how well a straight line fits the data. So correlation coefficients can't be less than -1 or more than +1. BUT covariance can be anything!
True or false: correlations AND covariance must be between -1 and +1
False - covariance can be anything
Why is keeping correlations between -1 and +1 advantageous?
For comparisons
What does a Pearson correlation measure between variables?
It measures the DIRECTION and DEGREE of the linear relationship between two interval/ratio variables. The + or - denotes the direction of the relationship, so whether it is positive or negative.
Can both covariance and correlation tell us about the direction of any linear relationship?
Yes
What is the difference between correlation and covariance regarding what they show about linear relationships?
Correlation shows not only the direction of a linear relationship but also its STRENGTH. Covariance shows direction only, because its size depends on the units of measurement.
What type of data is required for covariance/correlation?
Continuous (interval or ratio)
What is more important when looking at a correlation coefficient: how well the data points fit the line, or whether the association is positive or negative?
How well the data fit the line. This tells us how strong the relationship is, so how closely a change in one variable is associated with a change in another variable (it says nothing about what caused it).
Why would we use a correlation matrix?
Because in a dataset with many variables there is a coefficient for each pair of variables, and the matrix shows them all at once.
It gives the correlation for each pairwise combination of variables in the whole dataset.
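A minimal sketch of building such a matrix by hand. The dataset and its column names are invented, and correlation_matrix is an illustrative helper, not a library function. Statistics packages usually provide this in one call.

```python
from itertools import combinations
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return a nested dict holding r for every pairwise combination of columns."""
    names = list(columns)
    # Every variable correlates perfectly with itself (the diagonal)
    matrix = {name: {name: 1.0} for name in names}
    for a, b in combinations(names, 2):
        r = pearson_r(columns[a], columns[b])
        matrix[a][b] = r   # the matrix is symmetric,
        matrix[b][a] = r   # so each pair is filled in twice
    return matrix

# Hypothetical dataset: one list of scores per variable
data = {
    "revision_hours": [1, 2, 3, 4, 5],
    "exam_score": [52, 55, 61, 64, 70],
    "exam_anxiety": [8, 7, 7, 4, 3],
}

matrix = correlation_matrix(data)
for row in data:
    print(row, [round(matrix[row][col], 2) for col in data])
```

With three variables there are three distinct pairs, and each coefficient appears twice, once above and once below the diagonal.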