Correlations Flashcards
What is the best way to look at residuals?
PP plot
Linear function
Same variable but different units of measurement -> slopes become arbitrary
Would have complete shared variance
What are the differences between partial and semi-partial correlations?
Semi-partial - used to examine the additional predictive value of a predictor, residualises one variable
Partial - used to statistically control other predictors, residualises both variables
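The "residualises one variable" versus "residualises both variables" distinction can be sketched in code. This is a minimal illustration, not a definitive implementation: the data are made up, the helper names (pearson, residuals) are hypothetical, and a single control variable z is assumed.

```python
import math

def mean(v):
    return sum(v) / len(v)

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def residuals(target, control):
    """What is left of target after a simple linear regression on control."""
    mt, mc = mean(target), mean(control)
    slope = (sum((c - mc) * (t - mt) for c, t in zip(control, target))
             / sum((c - mc) ** 2 for c in control))
    intercept = mt - slope * mc
    return [t - (intercept + slope * c) for t, c in zip(target, control)]

# Made-up data: y is the response, x the predictor of interest, z the control
y = [3.0, 5.0, 4.0, 8.0, 9.0, 11.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 7.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

x_res = residuals(x, z)  # x with z removed
y_res = residuals(y, z)  # y with z removed

semi_partial = pearson(y, x_res)   # only x is residualised
partial = pearson(y_res, x_res)    # both x and y are residualised
```

The partial correlation is never smaller in magnitude than the semi-partial, because the partial also removes the control variable's share of the variance in y.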
When do you use multiple regression?
If you want to predict a response variable using many predictors
Can also be used to assess each predictor's contribution when we have more than one predictor
When do you use semi-partial correlation?
If you want to determine how much benefit a predictor gives you on top of several other predictors
When do you use partial correlation?
If you want to examine the strength of a relationship between variables while holding other variables constant
What would you expect if you correlate a z score and percentile of a variable?
Not complete correlation but very close
Perfect Kendall’s Tau
Very highly correlated = collinearity (not linear function though)
Entering both as predictors radically changes the p value, standard error, etc. - cannot identify a unique effect of the z score
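A small sketch of the z score versus percentile case, assuming made-up skewed scores with no ties. The helper names are hypothetical, and the percentile is taken here as the percentage of scores at or below each value, which is one of several possible definitions.

```python
import math
import statistics
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kendall_tau(x, y):
    """Tau for data without ties: (2 * proportion of consistent pairs) - 1."""
    pairs = list(combinations(range(len(x)), 2))
    consistent = sum(1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) > 0)
    return 2 * consistent / len(pairs) - 1

# Made-up, positively skewed scores with no ties
scores = [2, 3, 5, 8, 13, 21, 34, 55]

m, sd = statistics.mean(scores), statistics.pstdev(scores)
z_scores = [(s - m) / sd for s in scores]

# Percentile: percentage of scores at or below each score (an assumption)
percentiles = [100 * sum(1 for t in scores if t <= s) / len(scores)
               for s in scores]

r = pearson(z_scores, percentiles)        # high, but below 1
tau = kendall_tau(z_scores, percentiles)  # every pair is consistent
```

Because the percentile is a monotone but non-linear function of the z score, every pairwise movement is consistent, while the points do not fall on a straight line.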
What is multi-collinearity?
Two or more predictors are highly correlated with each other, whether or not they are correlated with the DV
What is VIF?
Variance inflation factor - collinearity diagnostic
Increases with the correlation among the predictors
VIF >9 is considered problematic (>3 for the square root of VIF)
Calculated by regressing each predictor on the other predictors, ignoring the DV
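A rough sketch of the VIF calculation for the simplest case of exactly two predictors, where regressing one predictor on the other gives R squared equal to r squared, so VIF = 1 / (1 - r^2). The variable names and data are made up, and the function vif_two_predictors is hypothetical. Note that no DV appears anywhere in the calculation.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when there are exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Made-up predictors
hours_studied = [1, 2, 3, 4, 5, 6]
pages_read = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1]  # nearly a copy of hours_studied
shoe_size = [5, 2, 1, 1, 2, 5]               # uncorrelated with hours_studied

vif_high = vif_two_predictors(hours_studied, pages_read)
vif_low = vif_two_predictors(hours_studied, shoe_size)

# The square root of VIF is the factor by which the standard error is inflated
se_inflation_high = math.sqrt(vif_high)
```

With more than two predictors, the r squared above would be replaced by the R squared from a multiple regression of each predictor on all the others.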
In what circumstances can’t you have a linear relationship?
Between a predictor and a discrete DV
What is correlation?
It’s all about prediction - if there is a relationship between two variables we can use x to estimate y
How can we characterise a relationship?
Strength - how well one variable can predict another
Form - what is the shape of the relationship
Direction (if form is monotone) - is the direction positive or negative
What are the criteria for strength?
There are none - it's a subjective idea
What is Kendall’s Tau?
A non parametric correlation test
Used when the data set is small and has a large number of tied ranks
How is Kendall’s Tau useful?
Can draw more accurate generalisations with Kendall’s Tau than Spearman’s
Helps us understand strength and direction of monotone relationships
Resistant to outliers
Tau-b is the version used to handle the problem of tied ranks
How are movements between points characterised?
Consistent - as you go up in x you go up in y (positive)
Inconsistent - as you go up in x you go down in y (negative)
How do you calculate Tau?
- Calculate the proportion of consistent movements (con/total)
- Tau = (2 × proportion of consistent movements) - 1
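The two steps can be sketched as follows, assuming made-up data with no tied ranks (ties would need the Tau-b correction, which this sketch does not attempt). The function name tau_from_movements is hypothetical.

```python
from itertools import combinations

def tau_from_movements(x, y):
    """Kendall's Tau for data without ties.

    A movement between two points is consistent when x and y change in the
    same direction, and inconsistent when they change in opposite directions.
    """
    pairs = list(combinations(range(len(x)), 2))
    consistent = sum(1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) > 0)
    proportion_consistent = consistent / len(pairs)
    return 2 * proportion_consistent - 1, consistent, len(pairs)

# Made-up data: 10 possible movements, two of which are inconsistent
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau, consistent, total = tau_from_movements(x, y)
# 8 of the 10 movements are consistent, so Tau is about (2 * 0.8) - 1 = 0.6
```

If every movement is consistent the proportion is 1 and Tau is 1; if every movement is inconsistent the proportion is 0 and Tau is -1.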
What makes Kendall’s Tau non-parametric?
Slope and intercept aren't needed, so it doesn't assume a parametric form for the relationship
What is standardisation?
Converting scores into a standard set of units (SDs) so that results no longer depend on the measurement scale
Pearson’s correlation
Coefficient = r, which ranges between -1 and 1 (0 = no linear relationship)
For linear relationships only
Highly sensitive to outliers
Strong when big x standardised scores are paired with big y standardised scores
Positive when positive x standardised scores are paired with positive y standardised scores (and negative with negative); negative when the signs are opposite
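The pairing of standardised scores can be made concrete: with z scores computed from the population SD, Pearson's r is the mean of the products of paired z scores. A minimal sketch with made-up data and hypothetical function names:

```python
import math
import statistics

def z_scores(values):
    """Standardise: distance from the mean in SD units (population SD)."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - m) / sd for v in values]

def pearson_from_z(x, y):
    """Pearson's r as the mean product of paired z scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Made-up data in two different measurement scales
height_cm = [150, 160, 165, 172, 180, 191]
weight_kg = [52, 58, 63, 70, 77, 95]

r = pearson_from_z(height_cm, weight_kg)

# Changing the units of x (cm to m) leaves r essentially unchanged,
# because standardisation removes the measurement scale
height_m = [h / 100 for h in height_cm]
r_rescaled = pearson_from_z(height_m, weight_kg)
```

Each product is positive when the two z scores share a sign and large when both are far from the mean, which is why such pairs push r towards 1.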
What is a z-score?
A score expressed in SD units - how many SDs it lies above or below the mean
How do you compare independent correlations?
Transform the r's into z values using Fisher's z transformation
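A sketch of one common way to carry out this comparison, assuming two independent samples and the usual large-sample standard error of 1 / sqrt(n - 3) for each transformed correlation. The correlations, sample sizes and function names are made up for illustration.

```python
import math

def fisher_z(r):
    """Fisher's z transformation: 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def compare_independent_correlations(r1, n1, r2, n2):
    """Return (test statistic, two-sided p value) for H0: the correlations are equal."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z_stat = (fisher_z(r1) - fisher_z(r2)) / se
    # Two-sided p value from the standard normal distribution
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return z_stat, p_value

# Made-up example: r = .60 in a sample of 50 versus r = .30 in a sample of 60
z_stat, p_value = compare_independent_correlations(0.60, 50, 0.30, 60)
```

The transformation is used because the sampling distribution of r is skewed when the true correlation is far from zero, whereas the transformed values are approximately normal.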
What is the first step in regression?
Units must be unstandardised