Stats - correlation and regression Flashcards
What is correlation used for?
Correlation is used to test for association between variables (e.g. whether salary and IQ are related).
How is regression used, and how does it relate to correlation?
Once a correlation between two variables has been shown, regression can be used to predict values of the dependent variable from the independent variable. Regression is not used unless the two variables have first been shown to correlate.
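As an illustration of predicting a dependent variable from an independent one, here is a minimal sketch of simple linear regression by least squares. The function name fit_line and the hours/score data are made up for the example and do not come from the cards.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (independent), exam score (dependent).
hours = [1, 2, 3, 4, 5]
score = [52, 54, 61, 63, 70]

slope, intercept = fit_line(hours, score)
# Predict the score for a value of the independent variable not in the data.
predicted = intercept + slope * 6
print(slope, intercept, predicted)
```

The fitted line is only worth using for prediction if the two variables have already been shown to correlate.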
What are the 3 basic categories of correlation?
Linear
Non-linear
No correlation
Variables that are correlated through a linear relationship can display either positive or negative correlation.
What is the difference between these two?
Positively correlated variables vary directly (as one increases, so does the other).
Negatively correlated variables vary inversely (as one increases, the other decreases).
How do you measure the strength of correlation?
The strength of the association can be estimated by inspecting a scatter graph of the variables. The type of correlation is independent of its strength.
The strength can be described as strong, moderate or weak.
How do you measure the strength of a linear relationship?
What symbols are given to the sample and the population correlation coefficients?
Correlation coefficient (Pearson’s correlation coefficient).
The sample correlation coefficient is given the symbol r.
The population correlation coefficient has the symbol ρ (rho).
The sign of the correlation coefficient tells us the direction of the linear relationship. How do positive and negative correlations appear?
If r is negative (< 0), the correlation is negative and the trend line slopes down. If r is positive (> 0), the correlation is positive and the trend line slopes up.
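To see how the sign of r follows the direction of the trend, here is a small sketch that computes Pearson's sample correlation coefficient from its usual definition. The function name pearson_r and both data sets are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: one series that tends to rise with x, one that falls.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [6, 4, 5, 4, 2]

r_up = pearson_r(x, rising)     # positive: trend line slopes up
r_down = pearson_r(x, falling)  # negative: trend line slopes down
print(round(r_up, 3), round(r_down, 3))
```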
The size (magnitude) of the correlation coefficient tells us the strength of a linear relationship.
What value does r have in a:
1) very strong linear association
|r| = 0.8-1
The size (magnitude) of the correlation coefficient tells us the strength of a linear relationship.
What value does r have in a:
2) strong correlation
|r| = 0.6-0.79
The size (magnitude) of the correlation coefficient tells us the strength of a linear relationship.
What value does r have in a:
3) moderate correlation
|r| = 0.4-0.59
The size (magnitude) of the correlation coefficient tells us the strength of a linear relationship.
What value does r have in a:
4) weak correlation
|r| = 0.2-0.39
The size (magnitude) of the correlation coefficient tells us the strength of a linear relationship.
What value does r have in a:
5) very weak linear association
|r| = 0-0.19
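The five bands above can be collected into one lookup on the magnitude of r. This is only a sketch: the function name describe_strength is made up, and treating each lower bound as inclusive (so that values such as 0.795 fall into the lower band) is an assumption, since the cards give the bands to two decimal places only.

```python
def describe_strength(r):
    """Label the strength of a linear association from the magnitude of r."""
    size = abs(r)  # the sign gives direction only, so it is ignored here
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "very weak"


print(describe_strength(-0.65))
print(describe_strength(0.1))
```

A negative coefficient gets the same label as a positive one of equal size, because strength and direction are read separately.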
Parametric statistical procedures rely on assumptions about the shape of the distribution.
What 3 characteristics do parametric procedures assume of the data?
1) normal distribution
2) measured on an interval or ratio scale
3) conditions or groups have equal variance
How is a complete absence of correlation expressed?
r = 0
How do we summarise correlation using:
a) parametric variables
b) non-parametric variables
What are the symbols for:
c) the samples
d) the population
a) Pearson’s
b) Spearman’s rank
c) parametric - r, non-parametric - rs
d) ρ (rho) for both
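To compare the two summaries, here is a sketch that computes Spearman's rank coefficient as Pearson's coefficient applied to the ranks of the data, with tied values given their average rank. The helper names average_ranks, pearson_r and spearman_rs and the cubic data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank coefficient: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

r = pearson_r(x, y)
rs = spearman_rs(x, y)
print(round(r, 3), round(rs, 3))
```

Because the ranks of x and y match exactly, rs reaches 1 while r stays below 1, which shows that the rank-based summary depends only on the ordering of the values.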