Investigating relationships Flashcards
correlation coefficient, r
measures the strength and direction of the linear relationship between 2 continuous variables
r=0.9
strong positive linear relationship
r=0.01
virtually no linear relationship
r=-0.9
strong negative linear relationship
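A minimal sketch of how r can be computed by hand, assuming Pearson's r is the coefficient meant here. The data and the names `pearson_r`, `hours`, `score_up`, `score_down` and `score_flat` are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]    # rises perfectly with hours
score_down = [10, 8, 6, 4, 2]  # falls perfectly with hours
score_flat = [2, 1, 3, 1, 2]   # no linear trend

r_up = pearson_r(hours, score_up)
r_down = pearson_r(hours, score_down)
r_flat = pearson_r(hours, score_flat)
print(r_up, r_down, r_flat)
```

The sign of r gives the direction of the relationship and its distance from 0 gives the strength.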
regression
models the association between 2 variables by estimating the line of best fit through the data, i.e. the line that minimises the sum of squared residuals
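A rough sketch of the line of best fit for simple linear regression, assuming one independent variable and the usual least-squares formulas. The data and the helper names `fit_line` and `sum_sq_residuals` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_sq_residuals(x, y, slope, intercept):
    """Sum of squared differences between observed and predicted y."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
best = sum_sq_residuals(x, y, slope, intercept)
# Nudging the slope away from the fitted value makes the fit worse
worse = sum_sq_residuals(x, y, slope + 0.1, intercept)
print(slope, intercept, best, worse)
```

Any other line through the same points gives a larger sum of squared residuals than the fitted one.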
when is regression useful
1. look for significant relationships btw 2 variables
2. predict a value of one variable for a given value of the other
residuals
differences btw observed and predicted values of the dependent variable (observed minus predicted)
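A small worked example of residuals, using made-up observations. The line predicted = 0.05 + 1.99x is the least-squares fit for these particular points, so the residuals sum to approximately zero.

```python
# Made-up observations and their least-squares line: predicted = 0.05 + 1.99 * x
x = [1, 2, 3, 4, 5]
observed = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = 0.05, 1.99

predicted = [intercept + slope * xi for xi in x]
# Residual = observed value minus predicted value
residuals = [obs - pred for obs, pred in zip(observed, predicted)]

for xi, obs, pred, res in zip(x, observed, predicted, residuals):
    print(f"x={xi}  observed={obs:.2f}  predicted={pred:.2f}  residual={res:+.2f}")
```

A positive residual means the point lies above the line, a negative one means it lies below.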
how to check the relationship btw independent & dependent variable is linear?
scatter plot of the dependent variable against the independent variable
how to check the variance of the residuals around the predicted responses is the same?
scatter plot of standardised residuals against standardised predicted values
how to check the residuals are independently normally distributed?
plot residuals in a histogram (shape should look roughly normal)
what does funnelling shape show
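a problem when residuals are plotted against predicted values: the spread of the residuals changes with the predicted value, so their variance is not constant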
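A rough numerical stand-in for eyeballing a funnel shape, not a formal test: compare the spread of the residuals for low and high predicted values. The data and the helper name `spread_ratio` are made up for illustration.

```python
import statistics

def spread_ratio(predicted, residuals):
    """Residual spread for the upper half of predicted values divided by
    the spread for the lower half. A value near 1 suggests similar spread;
    a value far from 1 suggests a funnel."""
    pairs = sorted(zip(predicted, residuals))  # order by predicted value
    half = len(pairs) // 2
    low = [res for _, res in pairs[:half]]
    high = [res for _, res in pairs[half:]]
    return statistics.pstdev(high) / statistics.pstdev(low)

# Made-up residuals that fan out as the predicted value grows
predicted = [1, 2, 3, 4, 5, 6, 7, 8]
funnel_residuals = [0.1, -0.1, 0.2, -0.2, 0.8, -0.9, 1.5, -1.6]
# Made-up residuals with roughly constant spread
even_residuals = [0.5, -0.5, 0.4, -0.4, 0.5, -0.5, 0.4, -0.4]

funnel_ratio = spread_ratio(predicted, funnel_residuals)
even_ratio = spread_ratio(predicted, even_residuals)
print(funnel_ratio, even_ratio)
```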
independent variable
explanatory/predictor variable
dependent variable
outcome variable
parametric tests
assume data follows normal distribution
non-parametric tests
based on ranks
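One way to see what "based on ranks" means is Spearman's rank correlation, used here only as an example of a rank-based method: it is Pearson's r computed on the ranks instead of the raw values. The data and the helper names `ranks` and `pearson_r` are hypothetical, and the simple ranking below assumes no tied values.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) upwards. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y always increases with x, but the last value is extreme
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

r_raw = pearson_r(x, y)                   # pulled below 1 by the extreme value
r_ranks = pearson_r(ranks(x), ranks(y))   # Spearman: only the ordering matters
print(r_raw, r_ranks)
```

Because only the ordering is used, the extreme value has no extra influence on the rank-based result.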
when are non-parametric tests used
- data is ordinal
- data doesn't follow a shape/distribution, e.g. normal
- skewed plot of data
- influential outliers
- small sample size
what to do when data is not normally distributed?
1. use non-parametric tests
2. transform dependent variable
how to deal w positively skewed data
take log of dependent variable (values must be positive) to produce more nearly normally distributed values
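A small sketch of the effect of a log transform, using made-up positive values chosen so the logged data comes out symmetric. The helper name `skewness` is hypothetical and uses the simple moment-based definition; real data will rarely transform this cleanly.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment / (variance ** 1.5).
    Positive means a long right tail, about 0 means symmetric."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up positively skewed data (a few very large values)
raw = [1, 10, 10, 100, 100, 100, 1000, 1000, 10000]
logged = [math.log10(v) for v in raw]  # log requires values > 0

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print(skew_raw, skew_logged)
```

The long right tail in the raw values is pulled in by the log, so the skewness drops towards 0.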