3. Bivariate Correlation and Regression Flashcards
What variables can Pearson’s correlation coefficient be used with?
- Binary/categorical variables
- Continuous variables
Strength of correlations (Pearson’s r):
Weak =
Moderate =
Strong =
Weak = +/- .1
Moderate = +/- .3
Strong = +/- .5
What does r squared tell us?
Amount of shared variance
AKA: The coefficient of determination
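To make the link between r and r squared concrete, here is a minimal Python sketch. The data values and the helper name pearson_r are invented for illustration and are not part of the flashcards.

```python
import math

def pearson_r(x, y):
    """Pearson's r: co-variation of x and y scaled by their separate variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up example data (hypothetical hours studied and exam scores)
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)   # roughly 0.77
r_squared = r ** 2            # 0.6 up to rounding: 60% shared variance
print(r, r_squared)
```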
What does the F ratio tell you (SPSS output)?
The ratio of how much the prediction of the DV has improved by fitting the model to how much error still remains
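A rough sketch of that ratio for a regression with one predictor, assuming the usual definition F = (SSM / df model) / (SSR / df residual). The function name f_ratio and the data are hypothetical; SPSS reports the same quantity in its ANOVA table.

```python
def f_ratio(x, y):
    """F for a one-predictor regression: model mean square over residual mean square."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    predicted = [intercept + slope * a for a in x]
    ss_model = sum((p - my) ** 2 for p in predicted)            # improvement over the mean
    ss_resid = sum((b - p) ** 2 for b, p in zip(y, predicted))  # error that remains
    df_model, df_resid = 1, n - 2
    return (ss_model / df_model) / (ss_resid / df_resid)

# Made-up example data
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
F = f_ratio(hours, score)
print(F)
```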
What is a correlation?
It is a way of measuring the extent to which two variables are linearly related
- It measures the pattern of responses across variables
What assumptions is the validity of Pearson’s correlation based on?
- data is at continuous (scale/interval/ratio) level
- data values are independent of each other; i.e. only one pair of readings per participant is used
- a linear relationship is assumed when calculating Pearson’s coefficient of correlation
- observations are random samples from normal or symmetric distributions
Non-parametric correlation
Spearman’s ρ (rho, rs)
- Used when variables are not normally distributed and the measures are on an ordinal scale (e.g. grades)
- Works by first ranking the data (numbers converted into ranks), and then running Pearson’s r on the ranked data
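The rank-then-correlate idea can be sketched in a few lines of Python. The helper names average_ranks and pearson_r and the data are made up for illustration; tied values are given the average of their ranks, which is the common convention.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def average_ranks(values):
    """Convert values to 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho computed as Pearson's r on the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data: perfectly monotonic but not linear
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]
rho = spearman_rho(x, y)   # 1.0, because the rank orders agree exactly
r = pearson_r(x, y)        # below 1, because the relationship is not a straight line
print(rho, r)
```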
Kendall’s τ (tau)
- For small datasets with many tied ranks
- Better estimate of correlation in population than Spearman’s ρ
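For comparison, a minimal sketch of Kendall’s tau that counts concordant and discordant pairs. It assumes the tau-b variant, which adjusts the denominator for tied pairs; the function name and data are hypothetical.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1
        if dx * dy > 0:
            concordant += 1      # pair ordered the same way on both variables
        elif dx * dy < 0:
            discordant += 1      # pair ordered in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / math.sqrt((n_pairs - tied_x) * (n_pairs - tied_y))

# Made-up ranks: one swapped pair out of six
tau = kendall_tau_b([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```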
Biserial correlation
When one variable is dichotomous, but there is an underlying continuum (e.g. pass/fail on an exam)
- Point-biserial and biserial correlations are both used to correlate a dichotomy with an interval-scaled variable. The difference is that the point-biserial correlation is used when the dichotomous variable is a true (discrete) dichotomy, and the biserial correlation is used with an artificial dichotomy
Point-biserial correlation
When one variable is dichotomous, and it is a true dichotomy (e.g. gender(?), pregnancy)
- Point-biserial and biserial correlations are both used to correlate a dichotomy with an interval-scaled variable. The difference is that the point-biserial correlation is used when the dichotomous variable is a true (discrete) dichotomy, and the biserial correlation is used with an artificial dichotomy
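As an illustration, the point-biserial correlation is numerically the same as Pearson’s r when the dichotomy is coded 0 and 1. The sketch below checks this against the textbook formula r = (M1 - M0) / s * sqrt(p * q), where s is the standard deviation computed with n (not n - 1) in the denominator. The data and names are invented; the biserial correlation needs an extra normal-curve adjustment and is not shown.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Made-up data: a true dichotomy coded 0/1 and a continuous score
group = [0, 0, 0, 1, 1, 1]
score = [2, 3, 4, 6, 7, 8]

# Route 1: Pearson's r on the 0/1 coded variable
r_pb = pearson_r(group, score)

# Route 2: the point-biserial formula
n = len(score)
ones = [s for g, s in zip(group, score) if g == 1]
zeros = [s for g, s in zip(group, score) if g == 0]
p = len(ones) / n
q = 1 - p
mean_all = sum(score) / n
sd_pop = math.sqrt(sum((s - mean_all) ** 2 for s in score) / n)
r_formula = (sum(ones) / len(ones) - sum(zeros) / len(zeros)) / sd_pop * math.sqrt(p * q)

print(r_pb, r_formula)   # the two routes agree
```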
Partial correlation
In partial correlation the effect of the third variable on BOTH variables is controlled
- focuses on unique contributions: compares the unique variation of one variable to the unique variation of the other
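One way to picture "controlling for the third variable on both sides" is to remove the straight-line effect of z from x and from y, then correlate what is left. This is a sketch under that assumption; the data and helper names are hypothetical.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def residuals(y, z):
    """What is left of y after a straight-line prediction from z."""
    n = len(y)
    mz, my = sum(z) / n, sum(y) / n
    slope = (sum((a - mz) * (b - my) for a, b in zip(z, y))
             / sum((a - mz) ** 2 for a in z))
    return [b - (my + slope * (a - mz)) for a, b in zip(z, y)]

# Made-up data for two variables of interest (x, y) and a third variable (z)
x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 2, 5, 6, 4]
z = [2, 1, 4, 3, 5, 6]

# z is removed from BOTH x and y before correlating
partial = pearson_r(residuals(x, z), residuals(y, z))

# Same quantity from the three pairwise correlations
r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
partial_formula = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
print(partial, partial_formula)
```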
Semipartial correlation
The effect of the third variable is controlled ONLY FOR ONE of the variables
- compares the unique variation of one variable with the unfiltered variation of the other
- focuses on the predictive value of all variables combined; shows the increment in the correlation of one variable above and beyond another
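A small sketch of the semipartial correlation computed from the three pairwise correlations, next to the partial correlation for comparison. The correlation values are made up. The last lines check a standard identity: the squared semipartial correlation equals the gain in R squared when x is added to a model that already contains z.

```python
import math

def semipartial_r(r_xy, r_xz, r_yz):
    """Correlation of y with the part of x that is independent of z (z removed from x only)."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z removed from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations
r_xy, r_xz, r_yz = 0.5, 0.4, 0.3

sr = semipartial_r(r_xy, r_xz, r_yz)
pr = partial_r(r_xy, r_xz, r_yz)

# R squared for predicting y from z and x together (two-predictor formula)
r2_full = (r_xy ** 2 + r_yz ** 2 - 2 * r_xy * r_yz * r_xz) / (1 - r_xz ** 2)
increment = r2_full - r_yz ** 2   # what x adds above and beyond z

print(sr, pr, increment, sr ** 2)
```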
What is a regression?
A way of predicting things that you have not measured
- Predicting an outcome variable from one predictor variable.
OR
- Predicting a dependent variable from one independent variable
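A minimal least-squares sketch of predicting an outcome from one predictor. The function name fit_line and the data are invented for illustration; the fitted line is then used to predict a value that was never measured.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from a single predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return intercept, slope

# Made-up data: predictor (hours studied) and outcome (exam score)
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

b0, b1 = fit_line(hours, score)    # intercept about 2.2, slope about 0.6
predicted_for_6 = b0 + b1 * 6      # prediction for an unobserved value of the predictor
print(b0, b1, predicted_for_6)
```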
What is SSM?
SSM (the model sum of squares) is the difference between SST and SSR, i.e. SSM = SST - SSR. It represents the amount of improvement in prediction when using the best-fitting model rather than the most basic model (the mean).
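The three sums of squares can be sketched for a one-predictor regression as follows. The function name and data are hypothetical; SST uses the mean as the model, SSR uses the fitted line, and SSM is what the line gains over the mean.

```python
def sums_of_squares(x, y):
    """Return (SST, SSR, SSM) for a least-squares line predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    predicted = [intercept + slope * a for a in x]
    ss_total = sum((b - my) ** 2 for b in y)                    # error of the mean-only model
    ss_resid = sum((b - p) ** 2 for b, p in zip(y, predicted))  # error of the fitted line
    ss_model = ss_total - ss_resid                              # improvement from fitting the line
    return ss_total, ss_resid, ss_model

# Made-up example data
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
sst, ssr, ssm = sums_of_squares(hours, score)
print(sst, ssr, ssm, ssm / sst)   # the last value is r squared
```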
Differences between group means can be characterized as a regression (linear) model if:
The experimental groups are represented by a binary variable (i.e. coded 0 and 1).
In this case the predictor variables are categorical and can be expressed in a linear regression model if they are replaced with dummy variables
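A short sketch of why this works, using made-up data and a hypothetical helper fit_line: with a 0/1 dummy predictor, the fitted intercept is the mean of the group coded 0 and the slope is the difference between the two group means.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from a single predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Made-up data: group membership as a dummy variable (0 = control, 1 = experimental)
group = [0, 0, 0, 1, 1, 1]
score = [2, 3, 4, 6, 7, 8]

b0, b1 = fit_line(group, score)

mean_0 = sum(s for g, s in zip(group, score) if g == 0) / group.count(0)
mean_1 = sum(s for g, s in zip(group, score) if g == 1) / group.count(1)

print(b0, mean_0)            # intercept matches the mean of group 0
print(b1, mean_1 - mean_0)   # slope matches the difference between group means
```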